The Study Program in Biophysical Science, Boulder,
Colorado, 20 July-16 August 1958, was organized and
conducted by the Biophysics and Biophysical Chemistry
Study Section of the National Institutes of Health,
Public Health Service, United States Department of
Health, Education, and Welfare, under research grant 
RG-5048 from that agency. About 120 senior research 
contributors and selected younger scientists took part in 
this program, which was held in an environment con- 
ducive to informal interchange of ideas. 

The general objective of the Study Program was to 
aid and encourage the further blending of concepts and 
methods of physical science with those of life science in 
the investigation of biological problems. The Study 
Section conceived this activity as an experiment in the 
exchange of information and ideas among representa- 
tives of different branches of science. The core of the 
Study Program was a carefully integrated series of 
about 60 lectures, constituting compact summaries of 
certain key problems and critical evaluations of recent 
advances. The lecture material provided a framework 
for other activities such as planned workshop sessions, 
spontaneous discussion groups, and library study. 

The results of the experiment reinforced the view of 
the Study Section that the same lecture material, appro- 
priately edited and published, would also provide a timely 
research guide and a useful base on which to build new 
courses and seminars in biophysical science. Thus, the 
values of this effort would be made available to a much 
broader audience than could be accommodated in the 
informal atmosphere desired for the Study Program. 

The method of publication reflects the objectives of 
the Study Program. Presentation in the Reviews of 
Modern Physics brings the material before the community
of physicists, who may find interest and stimulation
here. In this respect, the publication is a sequel
to “Borderland Problems in Biology and Physics” by 
John R. Loofbourow, which appeared in the Reviews of 
Modern Physics during 1940. Simultaneous publica- 
tion in book form by John Wiley and Sons, Inc., who 
have prepared the illustration drawings, brings the 
material to the attention of scientists in all fields and 
countries. Through the combined publication procedure,
the results of the Study Program are being disseminated
promptly, to a wide audience, at the lowest
practicable cost.

The entire Study Section devoted long hours and 
much thought to the formulation of the Study Program. 

Encouragement and help were given by former members
P. Hill and G. B. B. M. Sutherland. An invaluable aid
in formulating the program was the experience gained
from the “Conference on Certain Fundamental Aspects
of Biophysical Science,” conducted by the Study Section 
in January 1958 at Bethesda, Maryland, and planned 
by a committee: P. H. Abelson, R. C. Williams, and 
H. Neurath, chairman.

Biophysical Science—A Study Program


Physical and chemical approaches to problems in
biology have become increasingly productive in recent
years. Major advances in the understanding of life
processes have been made through research in such
specialties as biophysical chemistry, molecular biology,
biophysics, and electrophysiology. Continuing progress
will require an ever more perceptive study of the
interactions of matter, energy, and information in biological
systems.

This publication grew out of a special activity, de- 
signed to aid and stimulate the further blending of the 
concepts and methods of physics and chemistry with 
those of the life sciences in the study of biological 
problems. The papers in this volume were presented in
the Study Program in Biophysical Science, held in 
Boulder, Colorado, during the summer of 1958. 

A living system is a self-perpetuating combination of 
atoms, organized in a highly specific manner and inter- 
acting with its environment in such a way as to meta- 
bolize, reproduce, grow, and adapt. The Study Program 
was designed to present biological problems as “viewed 
through physical spectacles and investigated by physical
ideas and methods.”* Biological processes can be
examined, for example, in terms of atomic structures, 
energy levels, and binding forces. The engineer might 
prefer to specify the design and operation in terms of 
components and “black boxes.” The chemist discusses 
molecular configurations and kinetic and thermodynamic
details of biochemical reactions. These and other
modes for describing biological processes are found in
this volume.

The Study Program included a number of “case
studies” in which topics were chosen to point up the 
need and value of physical approaches at all levels of 
biological organization. Particular emphasis centered
upon muscle and nerve, where considerable information 
is available at many of these levels. The series of papers 
related to nerve, for example, encompasses the molecular
organization of the nerve fiber, the nature of the
nerve impulse, nerve metabolism, sensory performance 
of organisms, receptor mechanisms, and integrated 
responses of higher neural centers. Another series deals 
with genetics, replication, and synthesis of nucleic-acid 
and protein macromolecules. Here again, the topics
range from descriptions of the biological processes of 
replication, through physical and biochemical details
of relevant structures, to mathematical considerations
in the coding of information.

* A. V. Hill, “Why Biophysics?” Lectures on the Scientific Basis
of Medicine (Athlone Press, London, 1956), Vol. IV, p. 1. Reprinted
in Science 124, 1233 (1956).


The subjects were selected with emphasis on fundamental
concepts, to broaden the base of common language
and understanding among a heterogeneous group
of biologists, physicists, and chemists. This emphasis 
restricted the range of topics that could be presented
within the limited duration of the Study Program.
Also, the interests of those who organized the program
are unavoidably reflected in its contents.

This particular choice of topics is not intended to 
define or delimit the field of biophysics, but, rather, to 
illustrate the power and success of the integrated physi- 
cal-biological approach. Future conferences and publi- 
cations of this kind will undoubtedly emphasize alter- 
native subjects such as, for example, those at higher 
levels of biological complexity. 

The arrangement of the material presented in the 
Study Program was chosen to provide a logical un- 
folding of the selected subjects, along with a continuous 
injection of background material. The initial chapters 
give many of the fundamental concepts, whereas the 
later chapters make use of developments introduced in 
the intervening chapters. Since references may be 
needed to supplement the necessarily abridged dis- 
cussions in the papers, a Selected General Bibliography 
is given on pages 2 to 4. The entries are divided into 10
subject categories, together with a designation of those 
papers that are most closely related to each category.

PRESENT-DAY research in the life sciences is characterized
by unprecedented advances in the biochemical
and biophysical analysis of the mechanisms
underlying life processes. Biochemistry has passed from
the early phase, in which the determination of the 
chemical composition of complex biomolecules was its 
major task, through the period of investigation of the
detailed processes of intermediary metabolism, whereby
“energy-rich” compounds become available to provide 
the energy for the building, maintenance, and repair of 
the organismic machinery. Biochemistry is turning its 
attention more and more to a study of the ways in 
which this available energy is coupled with physiologi- 
cal processes and particularly with biosynthetic reac- 
tions, as in the synthesis of the more critical biomolecules 
such as the proteins and nucleic acids and their deriva- 
tives. In these studies, biochemists require, in addition 
to the molecules that react in a given process, also an 
enzyme or enzymes whose role it is to catalyze the 
reaction in a very specific manner. Biochemistry thus 
has become to some extent the organic chemistry of
enzyme-catalyzed reactions. But it has become obvious 
also that, although each energy-yielding or energy- 
requiring reaction in the cell is catalyzed by a specific 
enzyme, reactions in the cell involve enzymes organized
in specific patterns or structural arrays. This organiza- 
tion or ordering of enzymes permits cycles of reactions 
to occur that would be highly improbable or impossible
if the individual steps in the reaction were catalyzed by 
individual enzymes randomly distributed in the cell. 
The biochemist is now striving to isolate not only 
individual enzymes required for particular reactions
but also specifically organized groups or “assemblies” 
of enzymes. These are found readily available in partial
systems of cell particulates such as the mitochondria 
and microsomes that can be isolated from fragmented 
cells and purified in vitro by differential centrifugation. 
With such subcellular, organized systems of enzymes it
has become possible to achieve spectacular syntheses 
in the test tube of proteins, nucleic acids, and other 
critical biomolecules. With the help of isotopic tracers, 
it is possible also to determine the ways in which indi- 
vidual groups are shuttled about in these biosyntheses. 
Biophysics, which is rapidly developing to the status
of a major branch of the life sciences, is following the
pattern set by biochemistry, and biophysicists are learning
to operate in close harmony with biochemists. Just
as biochemists had to determine first the composition 
and structural chemistry of the complex biomolecules,
so in the early development of the field, biophysicists
have to determine first, with the aid of crystallographic 
and physicochemical methods, the detailed configuration
of the molecular chains of which the complex
macromolecules are constructed. It will then be neces- 
sary to investigate the forces between the macromole- 
cules and the smaller molecules in the protoplasmic 
environment and between various types of macromole- 
cules. By now, it is fully apparent that the interaction 
properties of the macromolecules crucial for life processes
depend upon the highly specific manner in which
the covalent chains are structured and integrated with 
each other by secondary bonds within individual macro- 
molecules. It is equally apparent that these interaction 
properties also spontaneously tend to form super- 
macromolecular aggregates whose stability and inter- 
action properties depend sensitively upon their chemical 
environment of small molecules and ions. These super- 
macromolecular aggregates are, in fact, not only the 
active machinery by which muscles contract, glands 
secrete, and energy is mobilized by spatially organized 
enzymes in mitochondria, but also the means by which 
the genetic material is segregated and recombined with 
mathematical precision and by which the genetic coding
is preserved and transmitted to posterity. Such speci- 
ficity of interaction properties underlies the process by 
which the coded biochemical and biophysical instruc- 
tions are passed from the germ cells to the various 
tissue cells, causing them to develop into the mature 
organism and to maintain and repair the organism 
until processes of aging and degenerative disease disturb 
the nice balance of interactions sufficiently to cause
death.

Thus biophysics, like biochemistry, has to reckon 
with hierarchies of organization and with the 
properties that are characteristic of systems no less
complex than those provided by living organisms at
each particular level of organizational complexity, viz.,
molecular, macromolecular, subcellular, cellular,
supercellular, organismic, and superorganismic.

It may be apparent that some of the foregoing is
based on prognostication of advances presently “in the
works” or still to come, as well as being a statement
likely to be made by the historian of mid-twentieth
century life science. However, the characterization
makes it clear that there are two rather different ways
of investigating life processes, both of them highly
important. One way is primarily analytical. It is


illustrated by the biochemist’s search for pure enzymes
which specifically catalyze individual reactions. This
leads not only to an understanding of the individual
reactions, but also eventually to the discovery that
these reactions constitute complex cycles of metabolic
or biosynthetic reactions. The analytical method is
characterized also by the biophysicist’s striving to
isolate fibrous macromolecules in monodisperse condition
and to determine the physicochemical properties
and intramolecular-chain configuration of these macromolecules.
It is inevitable that such investigations must
precede successful attempts to determine how the
macromolecules interact to cause biological function.
The analytical method has made possible most of the
striking advances in the biophysical and biochemical
sciences which, together with the parent science of
general physiology, have come to be known as “molecular
biology.”

The hallmark of the alternative approach, which may
be called the “organismic” or “systems” approach, is
to allow the cell, tissue, organism, or groups of organisms
to remain intact in their normal environment and
to observe the interaction properties of these intact
entities, their adaptation to their environment, and their
tendency to form still more highly organized entities or
systems with new properties, many of which are
seemingly unpredictable from a knowledge of less complex
systems.

Most of the current advances in biophysical science,
and certainly the major portion of the present Study
Program in Biophysical Science, involve primarily the
analytical approach of molecular biology. This may
consist in the isolation and examination of particular
cellular constituents at the molecular level; in the
demonstration of the role of free radicals or “excitons”
in particular cell processes; or in the application of
polyelectrolyte theory in an understanding of muscle
contraction, to cite a few examples. Actually, the
tendency of physicists who have become biologists or
biophysicists has been to search for simplified models
of complex biological processes and, by a sophisticated
study of such models, to discover fundamental new
principles in biology. In this approach, the numerous
complexities and vast areas of ignorance that lie between
the simplified model system and the end biological
process under study are characteristically and
purposely avoided or neglected. It is recognized that
there is a formidable “black box” between the molecular
effectors, as studied in the model system, and the
final behavior of the cells or organism under study. It
is quite possible that this approach may lead to important
breakthroughs in biological theory. One of the
best illustrations of its effectiveness is the phage-microorganism
studies and the modern biophysical approach
to genetics generally. This subject forms an important
part of this Study Program. These advances have not
been limited to genetics, narrowly conceived as concerned
alone with mechanisms of inheritance, but also
have illuminated some of the basic biosynthetic processes
by which the genetic determiners guide cellular
differentiation [...] contents of
the “black box” are [...] possible.
Actually, much of [...] analytical biology would, in
these terms, be considered [...] “black box” science, i.e.,
[...] intermediary detailed mechanisms as well as with eventual
“basic” mechanisms. However, particularly in the
framework of a Study Program such as this, it is important
that the dynamic, adaptive, regulatory, homeostatic
properties of organisms and cells be kept clearly
in mind, as well as the more readily studied properties 
of simpler partial systems. For such purposes, textbook 
descriptions of the “typical” cell are of little help.
Rather, the living cells themselves must be studied, and
it is the purpose of the first papers in the Study Program
to demonstrate some of these properties.

In carrying out my own assignment of discussing the
role of cell constituents in life processes, I should like to
sketch very briefly the present situation in cellular
biology in the light of some of the great accomplishments
and controversies of the past. It may be hoped
that this not only will provide much needed perspective
concerning the “systems” or organismic properties of 
living organisms but also may stimulate some to con- 
sider the vast opportunities in theoretical biology, a 
field that is almost nonexistent at the present time, at 
least if judged by the standards of theoretical physics. 
Of necessity, such a theoretical biology must deal not
only with the properties of cellular constituents but
also with the properties of the organism as a whole. It
must seek to identify and characterize those properties
of living systems which, like the quantum of action, are
ingrained in nature itself and must be assumed as a
principle of operation of the living system.

HISTORICAL PERSPECTIVES


In characterizing the search for the physicochemical
basis of life processes, it is pertinent to mention a few
concepts and controversies as well as several of the
major theories and principles. In these one may trace,
from the very beginnings of Greek thought, the thread
of the conflict between the atomistic or analytical and
the holistic or organismic (molecular biology contrasted
with organismic or evolutionary biology).

speculative controversies about the possibility of
“preformation,” i.e., the possibility that the egg or sperm
contains within it an embryo fully formed in miniature,
that in development the miniature organs unfold into
the mature form. Many of the great scientists of the day 
adhered to this patently impossible concept, differing 
among themselves chiefly as to whether the sperm
(Leeuwenhoek, Boerhaave, Leibnitz) or the egg (Swammerdam,
Malpighi, Haller, Bonnet) contained the
miniature structures. This kind of speculation was
terminated by the demonstration by the early embryologists
of the progressive manner in which new structures
are formed from previously existing simpler
systems. The investigations of the continuity of genetic 
determiners, as well as the pluripotential properties of 
the cells of the early embryo, were enriched greatly by
the ingenious experimental work in developmental 
biology in the last few decades. 

Obviously, a certain amount of basic information 
must be carried in the egg and sperm, and this is cur- 
rently regarded as being the “coding” or instructions of 
how to construct the structures rather than being 
microeditions or templates of the structures themselves. 
It must be remembered that, in addition to the genetic 
coding which must be preserved intact in the germ cells 
for transmission to the progeny, there is also the coding 
that goes to every cell of the body and that directs the 
chemical and structural differentiation of the various 
tissues and organs during development. For example, 
certain cells of the developing embryo must be in- 
structed to become differentiated from their fellows as 
limb buds and these in turn must cause the orderly 
development not merely of the skeleton and muscles
of the limb but also the appropriate innervation from
the developing nervous system, as well as the various
secondary patterns of organization of the limb such as
the color and patterns of feathers, hair, and so on.

Great emphasis in recent years has been placed on 
the mechanism by which deoxyribonucleic-acid (DNA) 
macromolecules uncoil, replicate, and recombine, in 
explanation of the nature and action of the genetic
determiners, and on the manner in which ribonucleic 
acid (RNA) is coded by the DNA and is, in turn, made 
the code for the biosynthesis of proteins and other 
complex biomolecules in the cytoplasm. There is, how- 
ever, little definitive evidence presently available con- 
cerning the manner in which this coding is regulated, 
modulated, activated, and inhibited so as to make pos- 
sible the adaptive, pattern-directed reactions that 
characterize the normal processes of development, 
maintenance, and repair. There is a tendency among
biophysicists to take all of this more or less for granted 
as being primarily a problem of complexity—properties 
of the black box. But this may well turn out to be as
much a part of the essential problem of life as the
definitive coding and biosynthetic processes as such. 
We may, in fact, be dealing here with a bit of the 


essence of the biological problem generally—i.e., the 
difficulty in arranging not only to provide primary, 
secondary, and derivative codes for particular reactions,
but also to see to it that the proper substrate and 
enzyme molecules are ready at hand at the right places 
and all at the right moment so as to make possible
orderly development, regulation, maintenance, and repair.
As Delbrück* has indicated, there is still a great
gap between the DNA map and the genetic map;
probably also between many other complex biological 
processes and the macromolecular coding that appar- 
ently directs the process. The nature of this regulatory, 
adaptive, feedback mechanism also is discussed in this 
Study Program, though regrettably not nearly so 
thoroughly as the importance of the subject demands. 


Cell Theory 


Formulation of the cell theory in 1838-1840 by 
Schleiden and Schwann proved to be one of the greatest 
generalizations in biology, ranking in importance with 
the theory of evolution, because it made possible the 
explanation of biological process in terms of the cells 
of which the organism is composed. The rapid rise of 
physiology and of pathology as well as an understand- 
ing of the special significance of the germ cells followed 
as soon as the concept of the cell as the unit of life was 
fully apprehended. 

In the century that followed, it became obvious that 
the idea of the cell as the unit of life must be modified. 
Any particular cell is what it is by virtue of its relation- 
ship to all the other cells in the body. The organism is 
an ideal federation of cells, each with a high degree of 
specialization, division of labor, and a corresponding 
mutual interdependence. It is the organism that is alive 
in the strict sense of the word, though individual con- 
stituent cells may readily be removed and cultivated 
in vitro under appropriate conditions.


Theory of Organic Evolution 


Mention is made of this most important hypothesis 
primarily to emphasize the fact that it was the result 
of naturalistic observations on whole organisms in 
whole communities. It was developed quite independent 
of the then current tendency to focus attention on the 
cells and their parts. Darwin was, in fact, singularly
misleading and ineffective as an analytical “molecular
biologist” (witness his theory of pangenesis and his
hypothetical “gemmules”). However, the evolutionary
theory strongly reinforced the need to explain the
mechanism of inheritance and provided a challenge to
molecular biologists for more than a century to come.

Mendel’s work provided a beautifully clear set of
basic relationships. But, unfortunately, they were soon
forgotten and had to be reintroduced at the turn of the
century.

* M. Delbrück, “Atomic physics in 1910 and molecular biology
in 1957,” a lecture delivered at the Massachusetts Institute of
Technology in connection with the Karl Taylor Compton Lectures
by Professor Niels Bohr.


Theories of the Fundamental Structure of 
Protoplasm as the Physical
Basis of Life


Soon after the formulation of the cell theory, atten- 
tion was focused upon the protoplasm (so called by 
Purkinje in 1840) as the physical basis of life.† The
search was on to discover what is actually alive in the
cell. With the rapid development of optical instrumen- 
tation, methods of “preserving” (fixing), sectioning, and 
staining tissues, there was ushered in the golden age of 
descriptive morphology. Investigators were in hot pur- 
suit of the “living subcellular particles,” the ultimate 
living units. 

To discoveries made during this very active period 
one owes the development of the knowledge of some of 
the most fundamental cell phenomena. In this active 
expansion of descriptive cytology from 1870 to 1890, 
three general theories were proposed for the funda- 
mental structure of protoplasm as the physical basis of 
life, as follows: 


1. The Fibrillar Theory (“One-Dimensional” Arrays)


According to this view, proposed most succinctly by 
Flemming (1882), “The essential energies on which life 
depends have their seat in fibrillae.” Heidenhain’s book 
Plasma und Zelle abounds in examples of fibrous struc- 
tures in cells (muscle, mitotic apparatus, cilia, flagella, 
sperm tails). Two fibrous types were distinguished: the
unbranched (filar) and the branched (reticular). The
fibrils are immersed in a clear “hyaloplasm” (Leydig).


2. Alveolar or Foam Theory (Two-Dimensional,
Membrane-Limited Structures) 


The chief proponent of this theory, Bütschli (1878-1892),
believed that protoplasm is a diphasic system
containing a continuous clear, viscous fluid (hyalo- 
plasm) and a discontinuous phase of microsomes (sic) 
which are probably lamellar in life. In this theory, fibers 
were ignored. However, since fibers undoubtedly do 
exist, Strasburger (1892) proposed a compromise. Protoplasm
contains two plasms: a “trophoplasm” which
is membranous or alveolar and whose function is metabolic
and nutritive, and a “kinoplasm” which is fibrillar
and which mediates movement or contraction of any
kind.


3. Granule Theory


Between 1880 and 1890, Altmann exhaustively described
certain granules (“plastidules”) in protoplasm
which he regarded as the organic units of the cell, as the
cells are the organic units of the organism. He considered
the granules to be “living [...]
(bioplasts).” In 1890, he coined the phrase “omne
granulum e granulo,” paraphrasing Virchow’s earlier
(1855) dictum “omnis cellula e cellula.”

† For a review of these early studies, the reader is
referred to the book of E. B. Wilson, The Cell in Development and
Heredity (The Macmillan Company, New York, 1925).

However, as is now known, the cytologists were
led astray by their insistence that the structures they
observed in fixed tissues were present as such in the
living cells. A halt was called, particularly by W. B.
Hardy, at the turn of the century, to the speculative
race by descriptive morphologists to discover the physical
basis of life through observation of structures in
fixed and stained tissues. He showed that many of the
structures claimed by cytologists to be the physical
basis of life are, in fact, coagulation artifacts. He was
able to imitate many of them in artificial systems subjected
to precipitation under controlled conditions.

J. Loeb’s trail-blazing work on the physicochemical 
and colligative properties of proteins helped remove the 
needless complexity with which the early colloid en- 
thusiasts had cloaked their studies of protoplasm. 

The startling work of Langmuir, Harkins, Rideal, 
Adam, and Gorter on molecular orientation in mono- 
and multi-molecular films laid a basis for an under- 
standing of the properties of interfacial films and factors 
stabilizing polyphasic systems. Studies of the para- 
crystalline, mesomorphic state, beginning with the work 
of Lehmann on fluid crystals, led to an understanding of 
molecular ordering in anisotropic systems. 

The return to the study of living cells by means of 
micromanipulation (Peterfi, Chambers) and by tissue 
culture (Harrison, Levi, Weiss) helped offset the eclipse 
into which descriptive morphology had been sent by 
the skeptical, physicochemically minded physiologists,
and revealed new and important properties of cells and 
tissues. 

With the introduction of electron microscopy, and
[...] the development of techniques for
[...] in ultrathin (100 to 500 Å) sections
by high-resolution (ca 10 Å) electron microscopy, we
seem to have entered upon a new golden age of
ultracytology. From a study of tissues fixed by only a few
types of fixatives (because of limitations imposed by
plastic embedding and other preparative techniques, as
well as because of the intrinsic qualities of preservation
of the cells), descriptive [...] increasing num[...]
[...] electron-optical equipment [...]
for the nonphysicist to operate
and is being installed in more and more laboratories the
world over.

[...] analogous period of development of light
microscopy in the last century, the new flood of
electron-optical [...] has brought with it important new
[...] concerning the structure of protoplasm and
its subcellular “submicroscopic” organ[...]
[...] danger that one may again be misled in matters of
interpretation because of the fixation
artifacts that are now being observed at a hundred
times the magnification available to the morphologists 
of the last century. However, the danger of this is 
greatly reduced because other biophysical and bio- 
chemical advances make it possible to obtain independ- 
ent evidence concerning the molecular organization of 
protoplasm.


CURRENT CONCEPTS OF CELLULAR
ULTRASTRUCTURE IN RELATION
TO FUNCTION 


Electron microscopy has revealed highly significant 
new facts about each of the three types of protoplasmic
structures that figured prominently in early theories of 
histology: the granular, the membranous, and the 
fibrous structures. For present purposes, it suffices to 
indicate briefly the upshot of the more significant of
these discoveries. 

By appropriate fragmentation or maceration of cells, 
followed by differential centrifugation and purification, 
it is possible to prepare, in separate fractions, certain of 
the cellular organelles believed vital by the early cytol- 
ogists. Their biochemical properties and ultrastructure 
also may be studied in one and the same preparation. 

In this way, it was shown that mitochondria are the 
“power plants” or energy source of the cell by which, 
through oxidative phosphorylation, phosphate-bond 
energy is made available through the splitting of adeno- 
sine triphosphate (ATP), the metabolic fuel of the cell. 

The most characteristic aspect of mitochondrial 
organization is the internal layered structure. Biochemical
studies show that both the outer enclosing membrane
and the internal lamellae contain the enzymes
required for the citric-acid cycle. From the work of 
Lehninger and others, it seems possible that the various 
individual enzymes involved in such biochemical cycles 
occur as clusters or “assemblies” anchored in or upon 
the membranous structures. The direct demonstration 
of such assemblies and of the molecular organization of 
the layers now becomes a challenging problem in 
electron-optical technique. Thus, although mitochon-
dria play a vital metabolic role in protoplasmic func-
tion, they are far from being the elementary units of
life that Altmann pictured them to be. 

In the presence of appropriate substrate molecules, 
cofactors, and activators, fragments of macerated cells 
have proved capable of rather spectacular biosynthesis
of peptides, steroids, and other complex biomolecules. 
Apparently, sufficient amounts of enzymatically active 
cell surface are retained in such preparations to produce 
biosynthesis. This has proved to be a very valuable 
biochemical tool particularly when used in conjunction 
with isotopic tracers.

Similar results were obtained with the so-called
“microsome” preparations from fragmented cells sub-
jected to differential centrifugation. The microsomes
occur in the fraction lighter than that containing the 
mitochondria. Electron-microscopic examination shows
these microsomes to be fragments of the membrane-
limited structures in the cell (“endoplasmic reticulum,”
“cytomembranes,” “ergastoplasm”) which, when as-
sociated with RNA-rich particles (“ribosomes”), are
thought to constitute the biosynthetic center of the
cell. Electron-microscopic examination of very thin
sections of actively synthesizing cells suggests that the
cytoplasm may be divided into two phases by such
membrane-limited structures. One phase is continuous
with the space between the nuclear membranes and
perhaps also with the extracellular fluid by means of
micropores in the limiting cell membrane. This phase
contains a relatively homogeneous fluid material which
doubtless contains biosynthesized materials. The second
phase contains the mitochondria and the RNA-rich
granules (ribosomes), which are the biochemical and
structural apparatus of biosynthesis. Thus, there may
actually be two plasms in cytoplasm (one of which
might perhaps be called trophoplasm) as adherents of
the alveolar theory supposed. However, the membranes
that separate the two are themselves metabolically
very active ones which, by partitioning off regions of the
cytoplasm, facilitate the biosynthetic process.

Other membrane-limited structures are also of funda- 
mental significance. Of these, perhaps only two need be 
mentioned: the Golgi system and the limiting envelope 
or plasma membrane of the cell. 

The Golgi “apparatus,” which was considered by 
cytologists to play an essential role in secretion proc- 
esses, came under severe criticism as a possible fixation 
artifact. However, electron-microscopic observations 
indicate that it is a multilayered structure frequently 
seen in the secretogenous zone of the cell. It is supposed 
by some to be the “wrapping and packaging” depart- 
ment of the cell in which secretion granules and other 
particulates may be enclosed in a membranous wrap- 
ping which, by fusion with the cell membrane, allows 
the granules to traverse the plasma membrane. 

The plasma membrane serves not only to enclose the 
cell and to direct the molecular traffic into and out of
the cell but also, presumably, to mount the biochemical
mechanism by which solutes, such as sodium ions, may 
be transported or “pumped” against an activity gradi- 
ent. Electron-microscopic observations indicate that 
such membranes have a structure similar to that of the 
typical unit membrane in cytoplasm generally. This 
consists of two dense lines separated by a less dense 
area. The total thickness is 60 to 70 Å and may consist
of a bimolecular layer of mixed lipids with a monolayer 
of protein or other material on either interface. As in the 
case of mitochondrial membranes, it remains to be
seen whether or not improved techniques can reveal
the presence of specialized, possibly enzymatic, struc-
ture having a role in transport or other biochemical
processes mediated at the cell surface. 

One comes finally to the fibrous types of arrays which
are ubiquitous in protoplasm, being particularly con-
spicuous in the connective tissue, in the mitotic appara-
tus of dividing cells, and in tissues such as muscle where
rapid and reversible interaction of the macromolecular 
lattice causes contraction or produces tension. 

Since the early pioneering work in the twenties and 
thirties, particularly that of Astbury, many fibrous
proteins have been isolated and investigated with 
respect to their structural and physicochemical proper- 
ties. It has been found that these elongate macromole- 
cules manifest a high degree of specificity in their ability 
to aggregate in ordered arrays, to change the pattern of 
interaction with change in the chemical environment, 
and possibly to change the intramolecular configuration 
(helical, supercoiled, and uncoiled, random chains). 
This matter is dealt with in detail in this Study Program 
in the case of certain proteins such as collagen and 
certain muscle proteins which manifest a specific band 
structure in the electron microscope and by means of 
which their aggregation states can be determined. 

The production of fibrous lattices in cells and tissues 
apparently depends upon such built-in properties of the
macromolecules. There are no microworkmen to go 
about in the cells joining up the proper molecular beams 
of the scaffolding according to some predetermined 
blueprint! Rather, after the elongate macromolecules 
have been synthesized, and possibly activated by pre- 
liminary enzyme action, they automatically polymerize 
and aggregate with each other according to their 
built-in specificity, to form the structure or microlattice
appropriate in each protoplasmic situation. Thus, the 
genetic code may be thought of as arranging for the 
biosynthesis of the complex macromolecules of many 
kinds, but interaction of the macromolecules under 
specified conditions occurs spontaneously. In many 
instances, for example, characteristic fibrogenesis can 
occur in vitro.

Thus, fibrous arrays are essential, not only to provide 
structural and contractile elements but also, as in the 
case of the nucleic acids, to provide a means of arrang-
ing specific linear sequences of determiners correspond-
ing to particular chemical codes. However, even the
most enthusiastic exponent of DNA and viruses as
linear codes would hardly ascribe to them the proper-
ties of life, as was suggested by Flemming and the
adherents of the fiber school.

Thus, as the specialized biophysical and biochemical
functions of morphologically identifiable components,
such as the mitochondria, ribosomes, Golgi system, and
other membrane-limited structures, come to be better
understood (chiefly by a study of partial systems in
vitro), the importance of the organization of the cell as
a whole becomes more and more [evident]. It is [the]
highly coordinated temporal and [spatial ...] [that]
provides the dynamic, regulatory, homeo[static, and]
adaptive properties of the cell which are the essence of
life. Somehow, the biosynthetic capabilities of the cell
are regulated and geared into the energy-yielding re-
actions of metabolism so as to produce the substances
that may be required not just locally within the syn-
thesizing cell but by cells at remote loci in the organism.
One has only to think of the complex ordering that
controls the processes of growth and development in
which the individual cells seem to be governed by
master plans or codes (morphogenetic “fields” of gen-
eral or local specificity) but in which the individual
cells obviously must be constantly producing substances
which make it possible for them to react meaningfully
to the general and local fields. Indeed, the very existence
of the field depends upon such processes in the indi-
vidual cells. Yet the master plan (genetic coding) is
characteristic of the entire organism as a whole, not
only during development and growth but in all of the
processes of aging.

One sees illustrated here in especially impressive
fashion a general property that seems to characterize
living systems at all levels of complexity and that has in
it an indeterminacy which is of fundamental significance.
Somehow, the appropriate reactions occur at the right
places and at the right time, in individual cells and in
the organism as a whole, to produce the eventual result
preordained in the coding of the DNA transmitted in
the germ cells.

It is easy to see, therefore, why it is difficult at the
present stage of scientific development to deal with
problems of this nature at the molecular level with
precise methods of physics and chemistry, and at the
same time to characterize with equal precision the
ordering and the reactions that govern the over-all
directive behavior of the organism as a whole.

To a certain degree, the difference in mode of attack
[...] papers in this Study Program
[...] by the professional orientation of
[... auth]ors [...] are primarily in the area of “molecular
biology” as compared with those dealing with the
larger systems aspects of cell, tissue, and organismic [...]


BIOLOGY is now in an exciting phase marked by
the confluence of different disciplines of research
in the attack on focal problems. But the enthusiasm
raised by the spectacular results of combined physical 
and chemical approaches to biology sometimes has out- 
raced people’s ability to keep pace conceptually with 
the technical developments. As a result, we frequently 
try to fit our questions to the very limited answers which 
our fragmentary knowledge has been able to provide, 
instead of boldly facing the much broader questions
posed by living systems and phrasing them in such a
way that still more penetrating answers may be ob-
tained in the future. In order to do this, one needs to
focus on the real living objects, rather than on the 
somewhat fictitious and oversimplified models that one 
is prone to formulate, as intended targets for physical 
and chemical attack. Models are necessary, but they
must bear more than a coincidental resemblance to the 
real object if they are to serve as meaningful aids to 
analysis. 

As the result of this extensive use of overly simple 
models, notions about the cell have become at times 
slightly vague and unrealistic. It would be presumptu- 
ous in one single chapter to try to do more than to just 
give a few illustrative examples of what the real cell is 
like. The best that can be hoped is to show the change
that has occurred in our thinking about the cell from 
the static to the dynamic—that is, from static organi- 
zation to organized behavior. Much of the knowledge 
of what the cell is has come from ruling out erroneous 
conceptions of what it is not. Progress has come from 
narrowing the margin of error. By being exposed to a 
few examples of the living cell in action, the reader can 
judge for himself whether or not his mental picture of 
the cell corresponds to the real thing. 

The first example deals with one of the most promi- 
nent characteristics of the cell—its shape. Figure 1 (a) 
shows a textbook picture of a particular cell found in 
the cerebellum. One sees that the cell body has elabo- 
rate ramifications. This is the way one usually learns 
about a cell—through pictures in a book; and since the 
picture looks the same in all of the thousands of copies 
of a textbook of microscopic anatomy, one forms the 
notion that all such cells are like tin soldiers stamped 
out according to a standard pattern. Thus, the mental 
habit of cell form as something static and rigid becomes
engrained. The truth is, however, that no two cells are
ever strictly alike, nor is any one cell quite the same at
different times of its life history. It is this history which 
a static textbook picture fails to reveal. To stress this 
fact, Fig. 1(b) shows by comparison a Chinese brush
drawing of some shrub. Here common experience tells
us that the bush has not been stamped out in the shape 
in which one finds it. It has grown into that shape from 
seed. So, what one sees as pattern in the shrub is merely 
the residual record of prior activities of that particular 
protoplasmic system. In other words, shape is simply 
an index of antecedent processes by which that shape 
has come about. 

Something else is lacking, however, in both of the 
pictures besides the account of prior events. The objects 
are portrayed against a blank background as if they 
were in a vacuum. Again, in the case of the plant, one 
knows from daily life how vital the invisible air is for 
its existence as a provider of chemical necessities. 

Now, in the case of the cell, the medium is involved 
even more intricately: it provides not only chemical 
components for nutriment, but also a physical frame- 
work that integrates the separate cells into a structural 
continuum. The existence of this continuum usually 
remains unrecognized because staining techniques de- 
liberately leave the substratum out of sight. Yet, to the 
classical morphologist only seeing was believing and, 
what is worse, not seeing amounted to not believing. 
This attitude is undergoing radical change. 

Processes as such are not visible. What is visible is a 
constellation of elements at different stages of the 
process. Visible form, a pattern at any one stage, must 
be viewed as the product of antecedent formative 
processes. The cell thus appears as a system of highly 
complex, but ordered, molecular populations grouped 
in a hierarchy of supramolecular complexes, in constant 
interactions among themselves and with their environ- 
ment (that is, the space beyond the cell border), leading 
to features, some permanent, others transitory, the 
visible expression of which is recorded as shape. If the 
environmental conditions are reasonably constant for a 
group of cells of the same type, the behavioral history 
of the latter will be reasonably similar so as to end up
with reasonably similar and classifiable shapes of the
sort that have made it possible for sciences of microscopic
anatomy and microscopic pathology to develop. As soon
as there is a change in the conditions, the behavioral
response of the cell likewise changes and the familiar
shape derived from normal standard conditions ceases
to be a diagnostic sign.

FIG. 1. Comparison between a nerve cell (a) and a plant (b).

FIG. 2. Samples of shapes assumed by cells of the same
connective-tissue cell strain in tissue culture. Panels (a) through (e).

A most dramatic illustration of this situation is seen 
when cells are taken out of their normal site in an organ- 
ism and transferred into an extraneous medium in tissue 
culture. As an example, a time-lapse phase-contrast 
microcinematograph* of human-liver cells spread on 
glass in horse serum (film made in my laboratory by 
A. Cecil Taylor and Albert Bock) shows up im-
pressively the lack of fixity and the incessant reshuffling
of cell content and contour. No static description can 
do justice to this vivid record of ever-changing activity. 
These liver cells in culture look quite different from 
those one would be used to seeing in stained sections 
through an intact liver. Except for shape, however,
they still possess most of the essential properties of liver 
cells. In a third type of setting, for instance, in suspen- 
sion, they would assume still other shapes. 

Thus, one realizes that there is no way of getting a 
fully valid description of a cell except by studying its 
behavior under as wide a spectrum of conditions as is
feasible. Cells of different kinds behave differently.
While the transfer to tissue culture alters their mor-
phological expressions markedly, they do retain their
constitutional distinctions of behavior.


In conclusion, one is led to the thesis that cell shape
is the result of a distinctive behavioral reaction of a
living cell to its environment.
To make this concrete, consider a specific example.


* The motion pictures referred to in the text were shown at the
Study Program in Biophysical Science in conjunction with the
lecture on which this article is based.
A colony of fibroblast cells cultured in dilute blood
plasma yields a wide spectrum of shapes. The same cell
can appear in any of the series of forms pictured in
Fig. 2, ranging from the bipolar spindle at one extreme
[Fig. 2(e)] to the multipolar star at the other [Fig. 2(a)].
The shape thus depends upon the number of directions
in which the cell border shows radial extensions. The
tips of these extensions are the active mobile organs of
the cell. They push outward and thus distort the origi-
nally rounded surface of the cell. Evidently, if there are
only two processes tugging in opposite directions, the
cell body is drawn out between them into the shape of
a spindle [Fig. 2(e)]. If there are three major protru-
sions, the cell assumes a tricornered shape [Fig. 2(b)],
and with even more processes along its circumference
it approaches more and more a star shape [Fig. 2(a)].

Must one accept this spectrum merely as a given
descriptive fact, or can it be explained causally in the
way physical systems are treated? The answer is that,
to a certain extent, the whole series can be expressed in
terms of a single function derived from a study of cellu-
lar behavior. Cells do not live in a structural vacuum
as is the illusion created by standard histological prepa-
rations, which, by stressing only those features which
happen to be stained, obliterate the structural con-
tinuum within which the cells reside. In tissue culture
in a blood-plasma clot, for instance, this continuum is
provided by a network of fibrin fibers, aggregates of
molecular chains of varying diameter from submicro-
scopic to microscopic dimensions, the meshes of the
network being filled with serum (Fig. 3). It is in this
fibrous jungle that these cells live and move, applying
themselves to the interfaces between the fibers and the
liquid medium as to a trellis. As was mentioned before,
the shape of these cells is determined by the number
of protrusions from their surface. One can go one step
further and prove that the number of such processes,
in turn, is a function of the fibrous constitution of the
medium.


FIG. 3. Electron micrograms of plasma clots coagulated at
different pH values (from left to right: alkaline, neutral, acid).
The relevant interaction is between the cell surface
and the fibers in its microenvironment. To understand
such surface reactions, one must give up in the first place
the outdated notion that the cell surface is a sort of
static cellophane-like bag. This may be true of some
specialized cell types, for instance, the cellulose mem-
brane of a plant cell, the capsule of a bacterial cell, or
the envelope of a red blood corpuscle. But in most types
of cells, the surface is far from stable and is by no means
of identical composition and state all over the cell. In
tissue culture, this state of disequilibrium manifests
itself in the continual thrusting forth and withdrawal
of surface processes at the expense of cellular energy,
showing great variations of the contractile force along
the surface. Temporarily weaker points along the sur-
face thus become outlets for thrusts. Such microleaks
or “herniations” may occur at random or they may be
determined systematically by outside factors, of which
one of the most important is the encounter of a fibrous-
liquid interface with the cell surface. The fiber contact,
in a sense, pricks the cell surface locally. The strength
of the resulting strain can be shown to vary with the size
of the fibers. Hydrodynamic, viscous, and elastic compe-
tition for outflow favors fibers which have a larger
diameter (Fig. 4). Consequently, the prevalence of a
few major protrusions over minor ones may be expected
to be the greater, the larger the average fiber size is in
the medium.


These predictions have been tested (jointly with
B. Garber) by culturing cells in plasma clots containing
fibrin fibers of different average dimensions. The average
diameter of such fibers is a function both of pH (Fig. 3)
and of plasma concentration, larger fibers being formed
at either lower pH or higher plasma concentrations. It
actually was found, in line with expectations, that the
ratio of bipolar cells (few processes) over multipolar
stellate forms (many processes) increased as a steady
linear function as the plasma concentration was raised
or as the pH during clotting was lowered (Fig. 5). Other
criteria of cell shape, such as the ratios of length over
width of the cell bodies and of the cell nuclei, showed
correspondingly systematic changes. In other words,
the whole gamut of shapes displayed by this particular
cell strain could be written in a single formula derived
from insight into the mechanisms by which deforma-
tions into one or another shape come about. The point
to stress is that one gets further by studying realistically
the formative process rather than by dwelling upon
pictorial samples of forms already achieved.

FIG. 4. Microherniations of cell content at intersections of fibrin
fibers of various sizes with cell surface. Arrows indicate proto-
plasmic outflow.

FIG. 5. Distribution of cell shapes in populations cultured in
plasma clots of different average fiber sizes.

At the same time, it must be stressed that the formula 
is a probabilistic one, for to predict just what any indi- 
vidual cell will look like is impossible because of the 
accidental nature of the details of its surroundings. The 
microclimate and the microenvironment of the indi- 
vidual cell are unique, unknown in each particular in- 
stance, and this establishes a certain degree of variance 
for the actual expressions within each cellular system 
which is built into its nature.
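
The probabilistic character of such a formula can be pictured with a toy simulation. The sketch below is not the formula referred to in the text, which is not stated there. It assumes, purely for illustration, that every cell has two dominant processes and that each of a fixed number of further candidate sites becomes a lasting process with a probability that falls as the average fiber diameter rises. The function names, `d_half`, `n_sites`, and the diameter units are all hypothetical, and the toy is not tuned to reproduce the linear trend reported above.

```python
import random


def protrusion_probability(fiber_diameter, d_half=1.0):
    """Hypothetical rule: chance that a minor contact site survives as a process.

    Falls toward zero as the average fiber diameter grows (arbitrary units).
    """
    return d_half / (d_half + fiber_diameter)


def sample_process_count(fiber_diameter, n_sites=6, rng=random):
    """Draw the number of processes of one cell (assumed: never fewer than two)."""
    p = protrusion_probability(fiber_diameter)
    extra = sum(1 for _ in range(n_sites) if rng.random() < p)
    return 2 + extra


def bipolar_to_multipolar_ratio(fiber_diameter, n_cells=5000, seed=0):
    """Ratio of bipolar cells (two processes) to multipolar cells in a population."""
    rng = random.Random(seed)
    counts = [sample_process_count(fiber_diameter, rng=rng) for _ in range(n_cells)]
    bipolar = sum(1 for c in counts if c == 2)
    multipolar = n_cells - bipolar
    return bipolar / multipolar if multipolar else float("inf")


if __name__ == "__main__":
    for diameter in (0.25, 1.0, 4.0):
        print(diameter, bipolar_to_multipolar_ratio(diameter))
```

Only the population ratio is predictable in this sketch; the process count of any single simulated cell varies from draw to draw, which is the sense in which the text calls the formula probabilistic.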

In a different medium, the same cell strain would give 
different responses. For instance, these cells, when sus- 
pended in a liquid medium without interlaced fibers, 
would manifest the inequalities along their surfaces by 
blunt herniations rather than by the pointed protrusions 
noted along filaments. As a result, such cells appear to
be blistering and boiling along their surfaces, as can be
seen in cinematographs of unattached single cells, and
is well known from the loose cells in the late stages of
cell division. Conversely, cells of different tissue types
would show morphological responses different from the
strain just exemplified. The more a cell is given to
producing internal structures that serve as a cytoskele-
ton (such as in sperm cells, higher forms of protozoans,
or muscle fibers), the less will its shape be codetermined
by physical constellations in its environment, although
even in the most extreme case the polarization of the
axiated system of the cell is presumably still a response
to external gradients or to other inhomogeneities of the
environment. There is an enormous task laid out here
for future detailed investigations on the physical factors
involved in cytogenesis and morphogenesis.

FIG. 6. Effect of regionally varying degrees of tensions on the
organization of a fibrin network and, through it, on the mor-
phology and orientation of enclosed cells.

As a further example of the complexities involved, 
one can again cite experiments with connective-tissue 
cells in tissue culture. A long time ago, I became pre- 
occupied with the role of external factors in guiding the 
oriented movements of cells. In 1927, I succeeded in 
orienting cells in tissue culture by applying stretch to 
the blood-plasma clot; the cells assumed a common 
orientation along the lines of tension. The analysis of 
this phenomenon over the years has led to the following 
summary conclusion: The primary effect is on the 
orientation of molecular chains which become aligned
in the direction of the stretch. When cells are contained 
in such a medium, their shape—that is, behavioral 
deviation from a sphere—simply reflects the degree of 
structural organization of the underlying submicro- 
scopic fibrous network. Where all fibers run parallel 
and, therefore, where one direction only is open to the 
cells, they naturally become bipolar, all pointing in the 
same direction (Fig. 6). With decreasing amounts of 
stretch and correspondingly less rigorous orientation of 
the fibrin, cell forms grade over from the strictly de-
termined spindles at one extreme to the only proba-
bilistically describable cells of the multipolar sort men-
tioned in the earlier example. In the present case,
however, the average shape does not vary from clot to
clot in accordance with the average constitution of the
clot, but varies locally within the same clot in accord-
ance with a systematic variation of an extrinsic factor,
namely, stretch.

FIG. 7. Elongation of erstwhile-round connective-tissue cells
[on a] sheet of parallel collagen fibers from the interior of
[a fish scale].


However, the immediately relevant thing for the
orientation of the cells is the orientation of the
pathways which they are bound to follow, and it is
immaterial that, in those earlier experiments, the agent
for producing such fibrous orientation was stretch. Al-
though the normal organism frequently uses stretch in
that capacity, other forces also are at work to produce
oriented fiber patterns, which likewise act as guides for
cells. The inside of a fish scale, for instance, contains
layers of collagen fibers beautifully arrayed in parallel
lines, and when loose spherical cells are deposited on
such a sheet they immediately assume spindle shapes,
with the axes strictly aligned along the fibrous sub-
stratum (Fig. 7). Therefore, to carry the analysis
further, one must study why a round cell on a linear
track becomes correspondingly deformed.

FIG. 8. Phase-contrast photomicrogram of cells which have
become elongated along streaks of silicone paste on glass (round
white and dark splotches are clusters of cells over nonadhesive
silicone where they have been unable to attach themselves).

A clue to the mechanism operating in this case may
be obtained from the following experiment. A glass slide
first is streaked with silicone paste or with cholesterol.
A loose cell suspension then is placed on top in an ap-
propriate liquid medium. As the cell surface does not
adhere to cholesterol but does adhere to glass, the cell
becomes drawn out along the striplets of bare glass;
it is oriented parallel to the streaks (Fig. 8).
Shape is determined here by the differential adhesion of
different parts of the cell surface to the substratum. An
even more impressive example of this kind is provided
by cells which have been seeded out on glass scored with
a [...] lathe (experiments performed with A. Cecil
Taylor). These cells become deformed from a spherical
to an elongate shape in the direction of the micro-
grooves (Fig. 9).


What then precisely is the mechanism of this re-
sponse? To understand it, consider an older experiment
in which spindle-shaped cells were made to spread out
flat when they were forced to splash against a smooth
surface from above. Such a surface offers the cell in-
numerable directions in which to radiate at the same
time. Thus, the periphery of the cell actually expands
concentrically, flattening the cell into a disk (Fig. 10).
This transformation is accompanied by profound
changes in function (the cell becomes phagocytic) and
in the distribution of intracellular materials, further
emphasizing the intimate dependence of chemical ac-
tivity on the physical constellation in a cell. Similarly,
a freshly isolated cell set down on scored glass spreads
out at first in all directions, but only those sectors of
the cell border which lie in a linear direction of the
substratum retain a foothold and continue to advance.
The other sectors are retracted and become consolidated
as cell flanks.

Each cell surface thus acquires radically different
properties at the ends and along the flank. The two
“engines,” one at either end, remain active and by ex-
tending in opposite directions simply draw the cell out
into an elongate form. Such elongation again has further
consequences for the cell. For instance, the mitotic
spindle preparatory to cell division mostly will become
aligned with the long axis of the cell so as to make
elongation a major factor in the orientation of cell
growth. But because of the opposing tugs, to which
this elongate cell is subjected from its two active ends,
there is no net forward movement. Such a cell simply
shuttles back and forth about a stationary position,
comparable to Brownian motion, depending upon which
one of the two ends happens to have the upper hand
at the moment.

FIG. 9. Cells which have become elongated
along the microgrooves of scored glass.
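
The shuttling just described can be pictured as a one-dimensional random walk. The following is a minimal sketch, assuming (hypothetically) that at each time step one of the two ends gains the upper hand and shifts the cell one unit its way. The unit step, the parameter `p_forward`, and the function names are illustrative assumptions, not measured quantities.

```python
import random


def shuttle_path(n_steps, p_forward=0.5, rng=None):
    """Positions of one elongate cell along its axis over n_steps time steps.

    At each step the "forward" end wins with probability p_forward (assumed
    rule); otherwise the opposite end wins and the cell shifts back one unit.
    """
    rng = rng or random.Random()
    position = 0
    path = [position]
    for _ in range(n_steps):
        position += 1 if rng.random() < p_forward else -1
        path.append(position)
    return path


def mean_final_position(n_cells, n_steps, p_forward=0.5, seed=0):
    """Average end position over a population of independent cells."""
    rng = random.Random(seed)
    finals = [shuttle_path(n_steps, p_forward, rng)[-1] for _ in range(n_cells)]
    return sum(finals) / n_cells


if __name__ == "__main__":
    # Evenly matched ends: individual cells wander, the population mean stays near zero.
    print(mean_final_position(n_cells=1000, n_steps=400))
```

With evenly matched ends (`p_forward=0.5`) each simulated cell strays back and forth while the population shows no net displacement. Moving `p_forward` away from 0.5 is one hypothetical way to represent one end dominating, in which case the mean position drifts steadily.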

Thus, although the foregoing has brought some deeper
understanding of cellular orientation, it tells nothing
about the mechanism of cellular locomotion, which re-
mains one of the basic unsolved biophysical problems.
Something is known about those cases in which cells
have special locomotor organs, such as cilia or flagella,
but when it comes to cells moving with their free, un-
stable surfaces devoid of structural specializations that
could be related to motility, ignorance is profound.
Cyclosis in plant cells, the gliding of slime molds, the
shift of a sheet of skin to cover a raw wound, the invasion
of tissues by metastasizing cancer cells, or the penetra-
tion of leucocytes through capillary walls to converge on
a focus of infection: in none of these cases is it known
just how the cell achieves these movements, except that
it is suspected that gelation-solvation cycles or con-
traction-relaxation alternations may somehow be in-
volved. This is one of the most neglected areas of
physical approaches to biology. Not only is the mode
of locomotion shrouded in ignorance, but also there is
equal uncertainty about the reasons why a free cell,
which can extend in many directions, often advances
steadily in one direction to the exclusion of others. As
was just said, a cell left to its own devices in an isotropic
environment strays at random with no net dislocation.

FIG. 10. Spindle cell expanding into a flat disk
on impact with glass surface.

There have been many theories and speculations
about the directive movement of free cells. Once again, 
one can illustrate how progress has come from 
eliminating among such competing concepts the ones 
ruled out by factual analysis. Turning again to the
sample object of cells in tissue culture, contrary to the 
scattered population of stationary isolated cells dis- 
cussed before, the cells of a solid fragment of tissue 
explanted into culture behave quite differently. They 
move in droves from the explanted piece into the empty 
medium, giving rise to the well-known phenomenon of 
“peripheral outgrowth.” Why do they move centrifu- 
gally? For a long time, I had considered this question 
synonymous with that of cell orientation and had in- 
voked the fibrous guide rails as explanations of “oriented 
movement.” But, as was just explained, orientation and
displacement are two different things, and for the displacement
there has been no crucial explanation.
“Tropisms” and gradients of various kinds have been
proposed to explain the phenomenon. It has been assumed
that cells respond either positively or negatively
to differential concentrations of hypothetical “attrac- 
tive” or “repelling” substances emanating from point 
sources, even though it has never been possible to 
demonstrate just how a cell could translate such direc- 
tional cues into actual convection towards or away from 
the source. Recent observations on our tissue-culture
strain have, however, turned up a wholly different
story, and it is this.

The only way to get locomotion of a cell with two 
motor engines at opposite ends is to stop one of the 
engines, at least for a while. Then the remaining engine 
can tow the cell away without opposition. This is precisely
what happens in tissue cultures whenever free
tips of two cells make accidental contact with each
other: the colliding ends become temporarily paralyzed.
A wave of retraction runs over the affected processes, 
which become partly detached from the substratum, 
and the two cells thus come under the exclusive pull 
from their remaining motile ends facing away from each 
other. Thus, they move apart in opposite directions. 
After some time, the paralyzed ends gradually recover 
their motility and again take hold on the ground, so 
that the cells are stalled once more, but at a greater 
distance from each other. 

Extrapolating this process to a cell population with 
a gradient of densities, such as a tissue explant, it is 
evident that, of a pair of cells moving apart, the one 
shifting peripherally has a lower probability of encountering
another cell than has the one which moves
toward the explant. Statistically, this leads to a prevalence
of outward migration even though cells are
actually free to move in any direction. Eventually, a 
situation obtains in which the cells have become so 
widely spaced that random collisions are no longer 
likely; at this point further migration ceases. Thus, the
only gradient which plays any role is a gradient of population
density. The phenomenon is formally comparable
to the diffusion of molecules from regions of higher to 
lower concentration, except that one deals with the 
random collisions not of molecules but of complex
entities which may be treated as units for the purposes 
of description. Again, a close observation of the be- 
havior of the real living object has brought answers far 
more concrete than what could be anticipated from such 
generalities as “attractive” or “repulsive” forces be- 
tween cells. Parenthetically, it should be stressed that 
the type of contact separation between cells mentioned 
above is characteristic of the species of connective-tissue 
cells here described, as well as of several similar strains, 
but it does not apply to other cell strains, for instance
of the epithelial variety, where reaction of two cells on 
contact can be just the opposite—namely, their draw- 
ing much closer together, provided they are both of 
the same kind. Further details on this behavior are 
given by Weiss (p. 449).
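
The statistical argument above can be made concrete with a toy model. The Python sketch below is an illustration added here, not part of the original observations. It places cells on a line, treats any two neighbors closer than a hypothetical contact distance as colliding, and lets each colliding partner retreat one hypothetical step away from the other. A cell touched on both sides receives opposite pulls and stays stalled, so movement begins at the sparse periphery and ends once every gap exceeds the contact distance. The function name, the parameters `contact` and `step`, and the deterministic update rule are all assumptions; collisions between real cells are random.

```python
def spread_by_contact(positions, contact=1.0, step=0.25, max_rounds=10_000):
    """Toy 1-D model of migration driven only by contact separation.

    positions  -- starting coordinates of the cells (arbitrary units)
    contact    -- hypothetical distance below which two neighbors "collide"
    step       -- hypothetical distance each partner retreats per collision
    Returns the final positions and the number of rounds with movement.
    """
    cells = sorted(positions)
    for rounds in range(max_rounds):
        moves = [0.0] * len(cells)
        touched = False
        for i in range(len(cells) - 1):
            if cells[i + 1] - cells[i] < contact:
                moves[i] -= step       # left partner is towed to the left
                moves[i + 1] += step   # right partner is towed to the right
                touched = True
        if not touched:
            # No pair is in contact any more: migration has ceased.
            return cells, rounds
        cells = sorted(c + m for c, m in zip(cells, moves))
    return cells, max_rounds


if __name__ == "__main__":
    explant = [0.0, 0.1, 0.2, 0.3, 0.4]   # a dense cluster of five cells
    final, rounds = spread_by_contact(explant)
    print("final positions:", [round(x, 2) for x in final])
    print("rounds with movement:", rounds)
```

In this caricature the central cell never moves, the outermost cells travel farthest, and the only "gradient" in the model is the gradient of population density.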

As still another instance of the dangers inherent in
dealing with the living cell in terms of broad generalities,
one may once more cite the behavior of cells of diverse
shapes observed in plasma clots of different pH or concentrations.
As was reported before, such cells have a
variable number of processes, each of which now may
be thought of as a train behind an “engine.” In such
cases it is possible to calculate the average rates of
advance of the various cell types in a given direction
by dividing the total distance spanned in a given period
by the time elapsed. Since such cells move neither
steadily nor in a straight course, however, any such
average value for “rate of advance” is wholly unrepresentative
of the true velocity of cell motility. As plasma
concentration is increased, the cells tend to become bipolar.
This means, inevitably, that the average rate of
locomotion increases simply because the course is less
tortuous and the cell is stalled less frequently by simultaneous
divergent pulls from multiple processes to
which multipolar cells are subjected. Consequently, in 
the lower concentration range, where most cells are 
multipolar, there is a progressive increase of the average 
rate as the number of processes declines toward the 
liminal value of two (Fig. 11). Once the great majority 
of cells have attained bipolarity, the average rate of 
advance remains constant, expressing more nearly the 
true velocity. By comparing Fig. 11 with Fig. 5, it can
be seen that, at a plasma concentration above 50%, 
where the “rate” curve levels off, more than three- 
fourths of the cell population are actually bipolar. Just 
as one cannot tell the true speed of a railroad train if 
the times of departure and arrival at the terminal are 
the only data available and the frequency and duration 
of station stops on the way are not taken into account, 
so the mere establishment of the average rate of loco- 
motion of a cell under various conditions has little 
practical meaning. Yet, the literature is full of examples 
in which such average rates have been used to assess 
the effects of a variety of agents or drugs on cells, with- 
out due attention to the effects of these agents on the 
medium, which may alter the whole setting in which
the cells move. Without knowledge of these effects, no
meaningful comparisons can be made. This illustrates 
some of the hazards of operating with average quantities 
when one deals with systems such as cells, the composi- 
tion and behavior of which are inhomogeneous in space 
and time. 
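
The railroad analogy can be put in numbers. The short Python sketch below is an added illustration with invented tracks, not data from the cultures described here. It compares the "average rate of advance" (net distance spanned divided by elapsed time) with the speed of a cell while it is actually moving, for a straight bipolar track and for a tortuous, repeatedly stalled multipolar track. The function names, the sampling interval, and the coordinates are hypothetical.

```python
import math


def average_rate_of_advance(track, total_time):
    """Net distance between first and last position divided by elapsed time."""
    (x0, y0), (x1, y1) = track[0], track[-1]
    return math.hypot(x1 - x0, y1 - y0) / total_time


def true_speed(track, dt, eps=1e-12):
    """Path length divided by the time actually spent moving.

    track -- positions sampled every dt time units
    Intervals in which the cell did not move (stalls) are left out.
    """
    path = 0.0
    moving_time = 0.0
    for (xa, ya), (xb, yb) in zip(track, track[1:]):
        d = math.hypot(xb - xa, yb - ya)
        if d > eps:
            path += d
            moving_time += dt
    return path / moving_time if moving_time else 0.0


if __name__ == "__main__":
    # Hypothetical tracks, one position per minute, in microns.
    bipolar = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0)]
    multipolar = [(0, 0), (1, 0), (1, 0), (1, 1), (1, 1), (0, 1), (0, 2)]
    for name, track in (("bipolar", bipolar), ("multipolar", multipolar)):
        avg = average_rate_of_advance(track, total_time=6.0)
        speed = true_speed(track, dt=1.0)
        print(f"{name}: average rate {avg:.2f}, speed while moving {speed:.2f}")
```

Both invented cells move at the same speed whenever they move; only the stalls and turns of the multipolar cell lower its average rate, which is the distinction drawn above.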

The foregoing discussion has illustrated how a con- 
trolled modification of the medium can indirectly modify 
cell morphology and behavior, including locomotion. It
has explained the orienting effect of tensions on the fiber 
systems of the medium; how such structures evoke 
conforming organization in the cell population residing 
in them; and how cell-to-cell interaction and, in the 
last analysis, population dynamics govern cell loco- 
motion. In this chain of events, an outside experimenter 
applying tensions to a culture medium appeared as the 
primary agent. This, of course, immediately raises the 
question as to what factors serve this organizing function
within the living body. The sole agency of the body
is its own cells and it was comforting, therefore, to find 
that cells, by their own activity, can create the type of 
orderly structural patterns in the intercellular spaces 
which had been imitated crudely by extraneous tensions. 
The cells engender in their own environment physical con- 
ditions and orderly restraints which then in turn play 
back on them as guides and regulators of their own 
behavior. Thus, a further step of complexity is added 
to the picture. There are innumerable examples, mostly 
poorly understood, all showing how, through an enor- 
mous variety of mechanisms, the same basic principle 
is served, which is that the cell population, through its
products and interactions, sets up conditions modifying 
the behavior of the enclosed cells, and thus often leading 
to new settings and interactions which may cause 
further alterations of the cells, and so forth, in sequences 
of interactions of ever increasing complexity. 

To illustrate this, consider a piece of tissue embedded 
in a fibrous network and let the boundary of the tissue

FIG. 12. Effect of an expanding (top) or contracting (bottom)
center on the architecture of the surrounding meshwork.

FIG. 13. Effect of two simultaneously contracting
centers on a common meshwork.

expand, as happens, for instance, in a vesicle or tube or
cyst swelling by the secretion of liquid into its closed 
lumen. Evidently, as shown in Fig. 12, top, the meshes 
of the fibrous medium become circumferentially com- 
pressed so as to assume a predominantly tangential 
orientation around the fragment. Cells happening upon 
such a territory would obviously be forced to circle 
round and round producing an envelope or tunic. Many 
of the connective-tissue sheaths and capsules in the meshwork
of the body owe their origin to this mechanism. By
contrast, if somewhere in the meshwork there arises an 
area which contracts, the fibrous components will be 
gathered toward that center in a predominantly radial 
orientation (Fig. 12, bottom). Cells in such zones then 
become likewise disposed radially. This type of effect 
obtains frequently in the vicinity of rapidly proliferating 
cell groups. It seems that chemical agents, as yet un- 
defined, are being discharged by such cells which cause 
intensive synaeresis of the surrounding colloids, which 
means condensation of the fibrous components with loss 
of bound water, the resulting local shrinkage being much 
greater than the gain of mass by cell growth. In other 
words, a purely scalar change in a piece of tissue— 
increase or decrease of volume—can, through the inter- 
mediary of a fibrous continuum, translate itself into 
marked vectorial effects, establishing well-defined geo- 
metrical and structural patterns. A further degree of 
ordered complexity is introduced if two or more cell 
masses are explanted together in a common clot 
(Fig. 13). Through their joint constriction effects, they 
deflect the fibrous meshes of the intervening medium 
into a line connecting the active centers, and this, of 
course, constitutes a path for direct cell traffic between 
them. The cells growing out from the two centers simply 
follow the submicroscopic bridge which has been laid 
down for them automatically by the tension-engender- 
ing chemical activity of their sources (Fig. 14). Here is 
one primitive example of how chemical action can translate
itself into physical organization.
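
How a scalar change of volume can produce a vectorial pattern may be sketched with a simple idealization, added here and not taken from the experiments. Assume a thin, incompressible sheet of meshwork around a circular center, so that a point at distance r is displaced radially by u(r) = k / r, with k > 0 for an expanding center and k < 0 for a contracting one. The small strains of a mesh element are then du/dr = -k / r² in the radial direction and u / r = k / r² in the tangential direction. The Python lines below merely evaluate these two expressions; the function names and the constant k are hypothetical.

```python
def mesh_strains(k, r):
    """Small strains of a mesh element at distance r from the center.

    Assumes the radial displacement u(r) = k / r of an incompressible
    thin sheet: k > 0 for an expanding center, k < 0 for a contracting one.
    Returns (radial strain, tangential strain).
    """
    if r <= 0:
        raise ValueError("r must be positive")
    radial = -k / r**2      # du/dr
    tangential = k / r**2   # u / r
    return radial, tangential


def preferred_orientation(k, r):
    """Direction in which the mesh element is stretched, if any."""
    radial, tangential = mesh_strains(k, r)
    if tangential > radial:
        return "tangential"
    if radial > tangential:
        return "radial"
    return "none"


if __name__ == "__main__":
    for label, k in (("expanding center", 1.0), ("contracting center", -1.0)):
        print(label, "->", preferred_orientation(k, r=2.0), "stretch of the mesh")
```

Under these assumptions an expanding center stretches the meshes tangentially and a contracting center stretches them radially, in line with the two halves of Fig. 12, although the sign of k is the only thing that changes.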
Extrapolating briefly from these model systems of
living cells, one may assume that the same sort of intimate
interdependence shown here for the microscopic
dimension repeats itself both in submicroscopic and in
higher supracellular dimensions. One is led to the conclusion
that there is a tie between physical structure
and chemical activity which is [...] and which

FIG. 14. Automatically established triangular cell-bridge connections
between three separate tissue cultures (dark masses) in
a common blood-plasma membrane.

it is not enough to assert, but which has to be explored
systematically. Where, in order to study a particular 
metabolic reaction in isolation, the biochemist provides 
optimum conditions for that process, the cell produces 
the unique prerequisites for that same reaction only in 
certain strictly confined localities. It does it through 
the metabolic products of other chemical reactions going 
on in another equally confined sample of cell space. 

Thus, one system feeds another and, in turn, is fed by
a third, in a vast system of mutually interdependent,
symbiotic, and harmonized partial reactions subsidiary
one to another. This interdependence, characteristic of
the living state, is what I once termed “molecular 
ecology.” As a field of investigation, it barely has 
reached infancy. Yet its significance is pointed up by 
the fact that, in the living cell, the various biochemical 
partial systems coexist in a common space, without 
preformed rigid partitions, but rather compartmentalize 
themselves by structural effects of their own activities 
of the sort exemplified in crude and elementary fashion 
by the case just cited. Physical structure and physico- 
chemical conditions limit the types of enzymatic and 
other chemical reactions that can go on in a given spot 
—for instance, along interfaces of fibrillar, lamellar, or 
corpuscular systems—while the resulting reactions, in 
turn, modify the physical substratum; and by continual 
interplay of this sort between physical structure and 
chemical action, the cell system passes first through its 
progressive developmental transformations and then is 
stabilized in the steady state of maturity. To some 
extent, physical structure then is frozen into static 
arrangements of cytoskeletons, but even then physical 
structures still are regenerated continuously by cellular
activity, leaving at least part of the cellular system in 
a state of incessant development and self-renewal. 

A realistic concept of cell behavior also must take
into account the limits set to interactions between distant
parts by the formation of compartments within
compartments. A plan of the organism (Fig. 15) [...]
in the diagram indicate the comp[...]
that one must bear in mind. Oversimplified mental
pictures form when, for instance, the simple statement
is heard that a gene “controls” a particular feature of
the organism, without allowing that it can do so only
through interactions with the outer shells, which in 
themselves have become progressively modified in their 
long developmental history by countless chains of inter- 
actions with other shells, including, of course, the inner- 
most, the gene. Since it has been my assignment to 
sketch the living cell as it truly is in all its highly ordered 
complexity, I feel compelled to caution against the illu- 
sion that a simple statement, such as, “a gene controls 
a character” reflects any similar degree of simplicity 
in the phenomenon covered by the statement. 

A final example projects the principle of interaction 
between physical structure and chemical activity up- 
ward into the realm of supracellular order, a field so 
baffling in its problems that many an investigator 
prefers to look the other way when he encounters them. 
The example is chosen from an almost diagrammatically
simple object which, because of this, holds at least some
promise of more penetrating analysis by the combined 
physical and chemical tools now at one’s disposal. It 
refers to the origin of the internal architecture of cartilage.
Cartilage is formed by groups of cells producing
ground substance between them. That ground substance
has been identified partly as a mucopolysaccharide— 
chondroitin sulphuric acid—and collagen. But cartilage 
is not just chemical substances; it assumes characteristic 
shapes and configurations, depending upon its sites, 
owing to developmental processes under genic control. 
The nature of the problem becomes clear when one 
contemplates that the peculiar pattern of convolutions 
of one’s earlobes, which are but covered cartilage, is an
individual inherited characteristic. How do such patterns
form? Cartilage of the limb is different from cartilage
around the eye; the former grows in compact,
whorl-shaped masses, the latter develops as a flat plate. 
It has long been known that, if the cell group that is 
destined to form limb cartilage is reared in tissue culture, 
it will grow according to the standard limb pattern,
giving rise to recognizable skeletal elements. Similarly, 
the author observed years ago (jointly with R. Amprino) 
that the precursor cells of the scleral cartilage around 
the eye, after explantation as an intact group, go on to 
form cartilage in the shape of a plate. Evidently, in 
either case, the explanted tissue complex contained some 
physical properties that guided the cells contained in it 
into their respective typical arrangements and growth 
patterns. The crucial property differentiating between 
limb and eye cartilage had to be thought of as inherent 
in the block of tissue as a whole, the further development 
in vitro merely amplifying some distinctive architectural 
pattern already present in the tissue fragments at the 
time of their isolation from the embryo. 

To test this conjecture, the author (with A. Moscona) 
recently resorted to a technique that permits one to 
disintegrate a tissue into its constituent cells and thereby
to destroy any intercellular structures and supracellular
arrangements that could have acted as guides
for subsequent development. Cells can be separated 
from one another and from their matrix by trypsin, 
washed in a loose suspension, and then seeded out in a 
tissue culture where they can aggregate into random 
clusters. More about the manner of aggregation is re- 
ported in another article (Weiss, p. 449). When 
a piece of prospective limb cartilage and a piece of
prospective eye cartilage were dissociated in this manner 
into their component cells and the cells of each type 
were permitted then to aggregate in tissue culture and 
to continue with their actual cartilage-forming activities,
it turned out that the cells from the limb site
produced cartilage of the typical whorl-shaped massive 
limb pattern (Fig. 16, top), while the cells that had 
originated at the eye site produced the plate-shaped 
laminated structure to which they would have given 
rise prior to their disaggregation (Fig. 16, bottom). In 
other words, the “blueprint” of the architecture of the
group performance is ingrained in each individual cell
and not just carried over into the culture by the block
of tissue as a whole. Now, how to convert this figure of
speech into concrete terms is problematical. But one 
may speculate that the architecture lies in the specific

FIG. 16. Cross sections through cartilages developed in tissue
culture from dispersed and reaggregated precartilage cells from
a prospective limb (top) and from a prospective eye coat (bottom)
of the chick embryo, forming whorl-shaped and lamellar structures,
respectively.


ground substances secreted by the cells. One would have 
to suppose that each cell type secretes a complex ground 
substance of a distinctive pattern of macromolecular 
stacking or crystallinity of such a kind that it would 
predispose a planar array of the mass in the form of
layers in one case, and a more massive isotropic arrange- 
ment in the form of three-dimensional whorls in the 
other. The cells then would dispose themselves in con- 
formity with these structural patterns of ground sub- 
stance for which they had furnished the elements in 
the first place.

This is sheer conjecture. No facts are known that 
would either support or contradict it. Yet this very 
uncertainty helps to point up the immensity of ignorance
in matters of supracellular organization. If one is
convinced that such higher-order organization is to be 
explained solely in terms of cellular dynamics, then one 
must raise one’s sights to the domain where those
phenomena occur, which is no longer the intracellular
microcosm.

The study of tissue architecture, used here as an
example, may be one of the easier inroads into the maze
of perplexing problems presented by the orderliness of
the complex living organism, as against the relative
disorder and simplicity of its shattered fragments, on
which one preferably concentrates, mostly after homogenizing
them either physically or conceptually.

The lessons of this story repeat themselves as the
viewpoint is shifted from the organization of the body 
in its structural aspects to the supracellular coordination 
and ordering systems of its functional activity, whether 
this be the homeostatic maintenance of blood compo- 
sition, the integrative action of the nervous system in 
behavior, or the mobilization of defense and repair
mechanisms in response to pathological disturbances. 
Of course, the operative tools in all of these performances 
are the individual component cells, which in turn oper- 
ate through physically ordered subsystems in incessant 
chemical interaction. But unless one remains cognizant 
of the fact that the level of the organism is reached from 
the level of molecular biology not in one single jump 
over the conceptual gap customarily bridged by the 
word “organization,” and unless one learns to think in
terms of a hierarchy of ordered systems, each one with 
some degree of identity and stability even on supra- 
molecular levels, one obscures rather than enucleates 
the problems to which thinking and research must be 
directed. The most impressive feature of a cell is not 
the constant flux, reshuffling, and variability of its 
population of molecules and particles (of which our 
cinematographs presented at the Study Program have 
given a clear expression), but the fact that, in spite of 
this ever-present change, each cell remains so remarkably 
invariant in its total behavior; that indeed, as an entity 
each behaves so much like millions of other entities 
equally variable in inner detail, that one comes to 
recognize them as essentially alike. Such relative in- 
variance of the whole presupposes the harmonious sub- 
ordination of the behavior of the parts to the conditions 
of the collective group. It presupposes that the free 
interactions among the subunits are subject to restraints,
the nature and direction of which vary adaptively with 
the state of the system as a whole. I can think of no
more propitious introduction to a biophysics program
than by re-emphasizing that the restraints in question
are not the sole province of chemistry, especially the
[...] branch, but that they include crucial
[...] of physical nature, with nonrandom molecular
[...] most prominent and analytically most promising.
[...] of the existence of these [...] in no
way detracts from the pragmatic value of analyzing
each elementary component process in its [...],
and of using whatever models, however simplified, may
be found to be constructive as aids to [...].
But it is precisely the phenomenal advances made in
the study of the partial and isolated system, in contrast
to the dearth of information on the organized interactions
of such systems in the living cell and organism,
that lead me to conclude with a plea for a more bal- 
anced effort and more extensive occupation with the 
latter problems than there are at present. Cellular and
tissue dynamics are population dynamics of species of 
molecules, of cellular subunits, and of cells. Such group 
dynamics cannot be derived solely from looking at the 
members of the population in isolation, but only from 
a study of the ecology and technology of their behavior 
in the group. It is hoped that this modest effort at 
presenting the case of the living cell not only has given 
a fair introduction to the problem but also has shown
that practical techniques for its eventual solution are 
at hand, with physical and chemical approaches indis- 
solubly interwoven. 


This paper is concerned with some aspects of the
organization of cells, particularly as seen with the
light microscope, and the results of some experiments on
dividing cells in tissue culture as shown by time-lapse
motion pictures. The paper is illustrated by portions of
a few key frames from the motion pictures.

There have been several important periods in the 
study of cellular organization. One of them was the 
period of descriptive histology, embryology, and cellular
pathology of the nineteenth century. In it a variety of
important processes in the development and function of 
cells were portrayed largely, but not exclusively, on the 
basis of dead material. This was followed by a period of 
extreme skepticism in which a number of investigators 
claimed that nearly everything seen after fixation and 
staining was artifact; they believed that only what could
be seen in the living cell existed in it. There were even 
claims that nuclei could not be seen in living cells and 
hence were artifact. Actually, nuclei are easily demon- 
strable during life. This period of skepticism was helpful 
in some respects but it also did great harm because 
some of its claims were too destructively critical. One 
of the rewards of living in the present period is to see 
that many of the so-called artifacts of the previous 
century have been shown to exist in living cells by our 
more modern methods of investigation. I remember
distinctly the first time I saw a living cell with the
newly developed phase-contrast microscope. What I 
saw was like an iron-hematoxylin stained cell moving 
around under the microscope and offering complete
verification, at the light-microscope level, of the exist- 
ence of certain structures previously known mainly in 
preparations of dead cells. 

As our methods of morphologic, chemical, and physical
analysis become more powerful and as the study of the 
minute structure of living systems reaches ever closer to 
the molecular level, it becomes more and more obvious 
that the old problem of what is artifact in our observa- 
tional material is still with us.

I shall show a series of newt cells cultured in diluted 
chicken plasma. In the time-lapse movies, constituents 
of a typical cell are shown as it goes through the process 
of cell division (mitosis). In the film, the reality becomes 
apparent of some structures that to some of you may 
have been only words. Three cells show the effects of 
either ionizing or ultraviolet total cell irradiation. The
succeeding five dividing cells demonstrate some of the 
effects of localized irradiation of small parts of them. In
the cells which were irradiated throughout their extent, 
several abnormalities appear, while in those irradiated 
locally, the number of abnormalities is much smaller.
The comparison of local versus total cell irradiation gives
a little insight into some aspects of the machinery of
cell division, a problem which Raymond Zirkle and the
author began to work on some eight or nine years ago
with the aid of our proton microbeam.¹ A few years
later, Robert Uretz began to work with us in our study
of mitosis; the simple ultraviolet microbeam, which he
devised, has been of great help.²,³ With it, one can aim
directly at a structure and watch it as it is being
irradiated.⁴,⁵ Others in our laboratory, especially E. W.
Taylor and R. P. Perry, have also contributed to our
knowledge of what happens in dividing cells. Zirkle⁶ has
recently reviewed the methods and results of microbeam
irradiation which started with the work of Tschachotin
with ultraviolet light in 1912.⁷

In our proton microbeam, the protons start with an 
energy of 2.0 Mev but lose perhaps a quarter of this as 
they pass through two layers of mica, each 5 μ thick. One
of these seals the delivery opening of the Van de 
Graaff generator; the other is the cover slip for the 
culture. Although there is a small percentage of scatter, 
if the culture is essentially in contact with the microaperture,
85 to 96% of the protons bombard an area
about 2.5 μ in diameter, and practically all of the protons
pass within a circle of 8 to 10 μ diameter at the
target. (The indicated variations reflect slight differ- 
ences in approximation of the mica cover slip to the 
microaperture.) The scatter is not important with small 
numbers, but obviously becomes serious with hundreds 
of thousands. With a more powerful generator, it would 
have been possible to deliver a single proton into a 
target of 1 μ, but the smallest number we have been
able to deliver with our generator is a burst of about 20. 
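
The figures quoted above lend themselves to a short worked calculation, added here as an illustration. The Python sketch below takes the stated starting energy of 2.0 Mev, the loss of "perhaps a quarter" in the two mica layers, and the stated 85 to 96% of protons striking the 2.5 μ spot, and estimates the energy at the target and the number of protons falling outside the spot for a burst of 20 and for 200,000 protons. The latter figure is only a stand-in for "hundreds of thousands", and the function names are hypothetical.

```python
def energy_at_target_mev(initial_mev=2.0, fraction_lost=0.25):
    """Proton energy remaining after the two mica layers (rough estimate)."""
    return initial_mev * (1.0 - fraction_lost)


def protons_outside_spot(n_protons, fraction_in_spot):
    """Expected number of protons scattered outside the 2.5-micron spot."""
    return n_protons * (1.0 - fraction_in_spot)


if __name__ == "__main__":
    print(f"energy at target: about {energy_at_target_mev():.1f} Mev")
    for n in (20, 200_000):          # 200,000 is an assumed example value
        low = protons_outside_spot(n, 0.96)
        high = protons_outside_spot(n, 0.85)
        print(f"{n} protons: roughly {low:.0f} to {high:.0f} outside the spot")
```

With a burst of 20 protons only one to three stray from the spot, whereas with hundreds of thousands the stray protons number in the thousands, which is why the scatter becomes serious only at high doses.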

The Uretz microbeam makes use of a reflecting ob- 
jective which brings all wavelengths of visible and 
ultraviolet light into approximately the same plane. 
This device has an important advantage over the proton 
microbeam in that the size and shape of the area to be 
irradiated can be varied easily from circles 2 μ in diameter
to much larger ones, or to slits, or to irregular shapes
of various dimensions. We can also compare the effects 
of microbeams of monochromatic and heterochromatic 
ultraviolet light with those of the proton microbeam.

The machinery by which a plant or animal cell becomes
two cells is probably much the same in all,
although much more is known about the process in
nucleated cells than in bacteria. Much of what is shown
here has been known in the descriptive sense since about
1880 in the work of Strasburger and especially
Flemming, among others. But we know very little more
today than these early workers knew about the
mechanisms involved in this complicated, meticulous
process by which the two daughter cells inherit equal
amounts of genetic material.

FIG. 1. This sequence [(A) to (I)] shows some of the main stages in mitosis. The nucleus at (A) is in a very early stage in the change
of a relatively nongranular nucleus into chromosomes. The two large bodies are nucleoli. At (B), the cell has moved slightly and three
nucleoli are visible. The nuclear granules begin to form vague cords which are much more definite at (C) and especially so at (D) where
they have become typical chromosomes. (E) shows the change in arrangement of chromosomes from prophase into metaphase at (F).
In (G), each chromosome is dividing into two daughter chromosomes (as at arrow) which are beginning to move to opposite poles of the
cell (anaphase). At (H), the chromosomes have completely separated and in (I) reconstruction of the daughter nucleoli is beginning and
the cell body is constricting (arrows). The arrow in (E) points to a centriole.

In the films, the area of the cells shown is approximately
40 by 60 μ, and the motion pictures are speeded
[...] 100 [...]. When present, the clock indicates
[...]. Figures 1 to 9 are from the time-lapse
[...] at the presentation of the paper,
photographed with medium dark, phase-contrast microscopy;
1400X. All cells are from [...]
in a mixture of one part chicken plasma and [...]
amphibian Tyrode solution. The division process in
these cells at 70°F may take as long as 6 or 7 hr or as
little as 4 hr. These differences, at the same temperature,
are unexplained.

UNTREATED CELL IN MITOSIS


In cell 1, a typical mitosis, one can see tiny fat 
droplets surrounding the nuclear membrane, the fila- 
mentous and granular mitochondria, the nucleus with 
large nucleoli, and numerous, irregular small granules 
[Fig. 1(A)]. The author used to question whether the 
movement of granules in the cytoplasm was Brownian 
motion, since one sees it only in the speeded-up time-lapse
movies, and not by direct observation of the living
cell. But Taylor studied this phenomenon and convinced
him that it was Brownian motion.

As the film continues, the small granules in the 
nucleus become more numerous and darker and gradu- 
ally form long snake-like bodies—the chromosomes 
[Figs. 1(A)-1(D)]. They are about 2 μ thick and some
30 μ or more long. The nucleoli disappear suddenly and
we do not see what happened to them [between Figs.
1(B) and 1(D)]. The nuclear membrane also disappears
and about the time that it does the chromosomes begin
a continuous, irregular movement. Warren Lewis, who 
has spent many years studying living cells, called this 
the “Dance of the Chromosomes.” 

The cell contains two other important small struc- 
tures called centrioles, one above and another below the 
group of chromosomes. The poles of the spindle are 
connected with the centrioles. The area with practically 
no granules is part of the developing spindle [Fig. 1(E)].
After some time, a pale stripe appears within each
chromosome and extends through its length. Suddenly,
the chromosomes divide along this stripe and the
daughter chromosomes (or chromatids), each a half of
an original chromosome, move apart [Fig. 1(G)] and
form the two separate daughter nuclei [Figs. 1(H) and
1(I)]. Then the cell body constricts, the whole mass
separating into two new cells, each with a nucleus. The
temporary excrescences on the surfaces of the cell are
normal when they are not more extensive than in this
cell. In the irradiated cells, which are seen next, they 
may be much larger, more numerous, more violent in 
movement, and have longer persistence. 

The process of cell division goes on in our bodies
hundreds of millions of times a day in order to replace
those cells which are normally lost for various reasons. 
In the vast majority of cases, these mitoses are executed 
as meticulously as in this cell. 


RESPONSES TO TOTAL CELL IRRADIATION 


The next three cells show a number of different phe- 
nomena as a result of total cell irradiation. Cell 2 re- 
ceived 200 r of 200-kv x-rays. The film starts with the 
cell in an early stage of mitosis. As a consequence of the
irradiation, an abnormality develops instead of the
orderly cell division which occurred in the untreated cell.
Chromosomes start to separate but the separation is not
complete [Fig. 2(B)]. Then cell constriction occurs
about the mass of chromosomes [Fig. 2(C)] and forms a
dumbbell-shaped nucleus connected by a long bridge of
chromosomal material extending between the two
daughter cells [Fig. 2(D)].

Cell 3 received 350 r of 200-kv x-rays. Instead of
producing a relatively simple malformation, as produced
by 200 r in the previous cell, several abnormalities occur.
The chromosomes make an abortive attempt to separate
into their daughter chromosomes [Fig. 3(A)], violent
“bubbling” appears on the cell surface, and two constrictions
of cytoplasm occur [Fig. 3(B)]. We have seen
bubbling go on for days in cytoplasm pinched off as in
this case. Nuclear material now separates into two
irregularly sized groups [Fig. 3(C)], but at one point
there is a tiny connection between them, and the
smaller group of chromosomes spurts over into the 
larger one, making a bi-lobed nucleus. 

Cell 4 received ultraviolet light over its whole body. 
Despite the marked abnormalities which result, the cell 
will eventually give rise to daughter cells which are able 
to exist for a time at least in tissue culture. Immediately 
after being irradiated, some of the motion of structures 
within the cell ceases, but some of it gradually returns. 
For a short period, there is an irregular arrangement of 
chromosomes and then terrific bubbling movements 
begin to appear [Fig. 4(B)]. The chromosomes and 
cytoplasmic granules move in and out of the bubbles. 
We call this “type 4 bubbling,” the most extreme degree. 
The term “bubbles” is a misnomer as they are pro- 
tuberances of cytoplasm and not bubbles; their nature 
is completely unknown. Through it all, the chromosomes 
stick together, more or less. The bubbling motion 
gradually begins to subside. Then the chromosomes 
separate into one large and one small group which move 
apart and a constriction of the cytoplasm develops 
[Fig. 4(C)]. As nuclear reconstruction begins, it be- 
comes apparent that the two nuclear masses are con- 
nected by a thin chromosomal bridge. 


RESPONSES TO PARTIAL CELL IRRADIATION 


In the next 5 cells, very small areas of nucleus or 
cytoplasm were irradiated by protons or ultraviolet 
light.

There is a widely held view in the radiobiological 
literature that chromosomes are damaged secondarily as 
a result of irradiation of the cytoplasm. The experiments 
on the next two cells clearly refute this claim for the 
newt cells in our culture. In cell 5, the area indicated by 
cross-hairs [Fig. 5(A)], about 14 μ from the nearest
chromosomes (arrows), was irradiated with 28 300 protons.
The intensity of irradiation of this bit of the cell is
enormous when one realizes that each proton equals 600
rep. Nevertheless, the resulting mitosis is perfectly
normal [Figs. 5(B) and 5(C)].


FIG. 2. This cell received 200 r of 200-kv x-rays. (A) is immediately after irradiation and shows no change. (B) was [...] during the [...] is the maximum that the daughter chromosomes have separated. At (C), the constriction of the cell (arrows) [...] chromosomal mass and produces a dumbbell-shaped nucleus (D) with a thin bridge (arrow) connecting two main portions.

Cell 6 received 60 protons localized over a small area
of two chromosomes [see cross-hairs and arrowhead in
Fig. 6(A)]. The daughter chromosomes on the nonirradiated
side separate cleanly, while those on the
irradiated side are stuck together. The daughter chromosomes
move apart as a group, much as in an opening
hinge pinioned on the irradiated parts of the chromosomes
[Fig. 6(B)]. This produces a temporary chromosomal
bridge [Fig. 6(C)]. The failure of the 28 300 protons
delivered to cytoplasm to produce any visible change
in the chromosomes of the previous cell is in sharp
contrast to the effect of 60 protons delivered locally
to chromosomes in this cell.
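
The contrast between cells 5 and 6 can be put in numbers with a short calculation. The sketch below is illustrative only: it assumes that the figure of 600 rep per proton, which the text quotes for the cytoplasmic spot in cell 5, also applies to the chromosomal spot in cell 6, and the function and variable names are hypothetical.

```python
# Assumption: 600 rep per proton (quoted in the text for cell 5) is also
# used for the chromosomal spot in cell 6; the text does not state this.
REP_PER_PROTON = 600


def local_dose_rep(n_protons, rep_per_proton=REP_PER_PROTON):
    """Dose delivered to the irradiated spot, in rep."""
    return n_protons * rep_per_proton


cytoplasm_dose = local_dose_rep(28_300)   # cell 5: cytoplasm, mitosis normal
chromosome_dose = local_dose_rep(60)      # cell 6: chromosomes, bridge formed
proton_ratio = 28_300 / 60                # how many times more protons in cell 5

print(f"cell 5 (cytoplasm):   {cytoplasm_dose:,} rep")
print(f"cell 6 (chromosomes): {chromosome_dose:,} rep")
print(f"ratio of proton counts: {proton_ratio:.1f}")
```

On these assumptions the cytoplasmic spot received several hundred times as many protons as the chromosomal spot, yet only the chromosomal irradiation produced a visible abnormality.
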
Areas 8 μ in diameter of the next three cells were
irradiated with ultraviolet light. The first of these cells
(number 7) is in early mitosis (prophase). The spindle
has not yet formed and, as a result of cytoplasmic
irradiation, will not form. There is an irregular arrange- 
ment of chromosomes characteristic of this stage. After 
the localized, cytoplasmic, ultraviolet irradiation, the 
irregular configuration persists [Fig. 7(B)]. The chromosomes
shift around from time to time and there are
pulsations and temporary extrusions of the cell surface. 
After several hours, a regrouping of the chromosomes 
occurs suddenly. They do not give rise by longitudinal 
splitting to a double number of daughter chromosomes, 
as in normal mitosis. Instead, haphazard numbers of 
whole chromosomes aggregate into two groups which 
separate one from another [Fig. 7(C), arrows]. Then 
the cytoplasm constricts and two daughter cells are 
formed. Such cells obviously differ completely in their 
genetic complements from each other and from those 
cells arising from a normal mitosis. 

In the next cell (number 8), it is possible to follow the 
aggregation of the two groups of chromosomes more 
easily than in the previous one wherein cytoplasmic 
irradiation had prevented the appearance of the spindle. 
In this cell, which is in the mid-phase of mitosis, the 
spindle is fully developed. The area to be irradiated is 
shown by the cross-hairs [Fig. 8(A)]. After irradiation
of a small part of the cytoplasm (8 μ) with ultraviolet
light, the spindle breaks down and the chromosomes
lose their characteristic, butterfly-like arrangement,
becoming irregularly disposed in the center of the cell
[Figs. 8(B) and 10]. A pale longitudinal line becomes
visible in each chromosome and marks the usual line of 
cleavage of each chromosome into its two daughter 
chromosomes (chromatids). The pale line disappears, 
however, before the whole chromosomes aggregate into 
two irregular groups [Fig. 8(C), arrows]. In cells so 
treated, the number of chromosomes in each new nucleus
may vary from 9 to 15 or even more, instead of 22 which 
is characteristic of this species. Constriction occurs and 
the daughter cells move apart. 

Since, in the experiments on cells 7 and 8 (and nearly 
100 others similarly treated), irregularly sized groups of
whole chromosomes move apart in the absence of the 
spindle, it would seem that the spindle serves primarily 
as a mechanism by which daughter chromosomes are 
guided as they are moved apart by unknown forces. 

From the days of Flemming on, there have been many 
theories about the machinery of mitosis. Some of the 
models, such as those based on rubber bands, have some 
resemblance to what is seen in some stages of mitosis,
but they contribute nothing to an understanding of the
forces determining the processes in mitosis. It has been 
claimed that the daughter chromosomes become sepa- 
rated by electrostatic forces or by a localized imbibition 
of water, or that the spindle is made of a contractile 
material which pulls them apart. Although the author is cognizant of
many claims to the contrary, it is probably fair to say
that there has been no delineation at all of the nature or 
amount of the forces involved in mitosis. 

Nor do we have any idea of the nature of the force 
which moves the irregular groups of whole chromosomes 
apart in the absence of the spindle. It is known that, in 
the early stages of mitosis, the two centrioles separate 
and take position at the apices of the spindle, and that 
there is a temporary attraction between the centrioles 
and the kinetochores of some of the chromosomes. 
Then, this attraction ceases and all the chromosomes 
move into a position midway between the two centrioles 
(metaphase). In the next stage of division (anaphase), 
the daughter chromosomes separate from their mates 
and move with their kinetochores foremost toward the 
centrioles. In those cells in which the spindle is abolished,
we assume that, during the separation of the groups of 
whole chromosomes, those chromosomes close to each of 
the centrioles go with it to a daughter cell. 

The last cell (number 9) shows a different phenomenon
after localized irradiation (8 μ in diameter) of
chromosomes with ultraviolet light [Fig. 9(A)]. A 
dramatic change in the refractive index of the chromo- 
somes appears in the irradiated area and produces what 
we call a “pale” spot [Fig. 9(B), arrows]. This change 
does not appear if much more than half of the chromo- 
some mass is irradiated. However, irradiation of as little 
as 2 μ of a chromosome will produce it. The mass of
chromosomes immediately around the pale area becomes 
sticky and cannot separate into chromatids, while those 
some distance away can. From this there results an 
irregular, lobated single nucleus with malformed chromosomes
and a pale spot in the center [Fig. 9(C)].

In fixed preparations, the pale area also appears pale 
with most of the usual stains. A similar appearance 
FIG. 3. This cell was irradiated with 350 r of 200-kv x-rays. In it the anaphase movement did not progress as much as in the previous one [compare with Fig. 2(B)]. Shortly after (A), the cell begins to show violent “bubbling” and multiple constrictions. As a result, portions of cytoplasm with or without chromosomal material become separated. At (C), the uppermost part of the picture shows a newly formed daughter nucleus and attached below it is an irregular chromosomal mass (arrow).

FIG. 4. This cell received a 2.5-min exposure of a germicidal ultraviolet lamp 2.5 cm from the culture. At the time of irradiation the cell showed typical metaphase configuration. At (A), the chromosomes show an irregular rosette appearance. A few minutes later the cell begins to undergo violent bubbling and at (B) the chromosomal mass is being extruded into one of the bubbles. Later, there are multiple constrictions (C).

results from the application of the Feulgen stain for
deoxyribonucleic acid. This suggests that, in the development
of the pale spot, the nucleic-acid moiety of
the irradiated part of the chromosomes may have been
so altered that it does not respond to the stain or has
actually left the area. Ultraviolet-absorption studies of
FIGS. 5 AND 6 show the difference in
effect of localized irradiation of cyto- 
plasm versus localized irradiation of 
chromosomes in dividing cells. The cell 
shown in Fig. 5 received 28 300 pro- 
tons at the area indicated by the 
double cross-hairs. The nearest chro- 
mosomes are indicated by the arrows. 
This cell divided into two by a per- 
fectly normal mitosis as shown in (B) 
and (C). Figure 6 shows a totally 
different picture as a result of irradi- 
ating small parts of two chromosomes 
(indicated by cross-hairs and arrow) 
with 60 protons during metaphase. As 
shown in Fig. 6(B), the chromosomes 
separated normally into daughter 
chromatids except those at the point 
of irradiation where, as indicated by 
the arrow, the chromosomes are stuck 
together. As a result, the two daughter 
nuclei (N,N) are joined by a chromo- 
somal bridge indicated by the three 
arrows.
FIG. 7. This cell in prophase received localized, ultraviolet irradiation at the area indicated by the cross-hairs. The nearest chromosomes are shown by the arrows. The spindle did not form and the chromosomes occupied an irregular clumped area (B) and later separated as whole chromosomes into two unequal groups, indicated by the arrows in (C).

FIG. 8. The cell, in metaphase with a clearly marked spindle, received localized, ultraviolet irradiation of the area indicated by the cross-hairs. The spindle disappeared promptly and the chromosomes show a haphazard clumping (B) much like that in 7(B). In cell 8, as in the previous one, whole chromosomes separate into two daughter clumps, shown by the arrows in (C).

FIG. 9. This cell, in metaphase, received ultraviolet, localized irradiation of an area of chromosomes 8 μ in diameter, indicated by the cross-hairs. Within a [...] the chromosomes showed a typical “paling” reaction (B). The cell attempted to divide but was unable to execute a complete anaphase as the chromosomes immediately around the pale area were very sticky. There resulted a single irregularly lobed nucleus with a paled spot (arrows) in (C).

general absorption spectrum typically shown by proteins
containing aromatic amino acids. There is a minimum
at 250 mμ and a broad maximum between 260
and 280 mμ.

The relative effectiveness of the two types of irradiation
is roughly as follows: Irradiation with ultraviolet
light of an 8-μ spot of a chromosome will require 5 sec to
make it sticky and about 10 sec to make it pale, while
spindle destruction will need about 150 sec of irradiation
of an 8-μ spot of cytoplasm. Chromosomes irradiated
locally with 2 Mev protons become sticky with 20
protons and pale with tens of thousands. It seems to
require a million or more protons delivered to a small
area of cytoplasm to destroy the spindle; we have not
done many such experiments and this is a very approximate
figure.

FIG. 10. Newt cell in metaphase as seen with polarized light. (A) The spindle is prominent before irradiation. (B) The spindle has disappeared shortly after localized, ultraviolet irradiation of the cytoplasm. Courtesy R. B. Uretz.

FIG. 11. Newt cell in prophase. (e) Before irradiation; (f) shortly after ultraviolet irradiation of 4-μ diam circle of nucleus. The paled area at the place of irradiation is shown by the arrow. (g) After irradiation, photographed by 260 mμ light, and (h) photographed by 310 mμ light. The diminished absorption of the paled area at 260 mμ is obvious (after R. P. Perry).

SUMMARY

(1) Irradiation of dividing newt cells in culture with 200-kv x-rays or with ultraviolet light produces few or many abnormalities depending on the amount of radiation applied.

(2) Localized irradiation of chromosomes with a few tens of protons makes them sticky. But no abnormalities in the mitotic process result when a small area of cytoplasm is irradiated with 28 300 protons.

(3) Localized irradiation of cytoplasm with ultraviolet light prevents formation of the spindle or destroys [...]

[...] absence of a spindle, groups of whole chromosomes can move apart and form the nuclei for the daughter cells.

(5) Localized ultraviolet irradiation of chromosomes changes the index of refraction at the affected area. Such pale spots stain very faintly for DNA with the Feulgen method and absorb very little at 260 mμ.

[...]
3 R. B. Uretz and R. P. Perry, Rev. Sci. Instr. 28, 861 (1957).

TABLE I. Continued.

Columns: residue symbol; (a) residue name, (b) amino-acid name (trivial), (c) amino-acid name (systematic); residue formula; pKa.

-cyS- (paired with a second -cyS-)
    (a) cystinyl; (b) cystine; (c) β,β'-dithiobis(α-aminopropionic acid)
    -CO-CH(NH-)-CH2-S-S-CH2-CH(NH-)-CO-

-asp(NH2)-
    (a) asparaginyl; (b) asparagine; (c) α-aminosuccinamic acid
    -CO-CH(NH-)-CH2-CO-NH2

-glu(NH2)-
    (a) glutaminyl; (b) glutamine; (c) α-aminoglutaramic acid
    -CO-CH(NH-)-CH2-CH2-CO-NH2

Amino-acid residues usually found in proteins

Acidic (ionized form)

-OH
    terminal carboxyl
    -O⁻ (-CO2⁻); pKa [...]

-asp-
    (a) aspartyl; (b) aspartic acid; (c) aminosuccinic acid
    -CO-CH(NH-)-CH2-CO2⁻; pKa = 3.9-4.7

-glu-
    (a) glutamyl; (b) glutamic acid; (c) α-aminoglutaric acid
    -CO-CH(NH-)-CH2-CH2-CO2⁻; pKa = 3.9-4.7

-tyr-
    (a) tyrosyl; (b) tyrosine; (c) α-amino-β-(p-hydroxyphenyl)propionic acid
    -CO-CH(NH-)-CH2-C6H4-O⁻; pKa = 8.5-10.9

-cySH-
    (a) cysteinyl; (b) cysteine; (c) α-amino-β-mercaptopropionic acid
    -CO-CH(NH-)-CH2-S⁻; pKa = ca 10

Basic (ionized form)

-his-
    (a) histidyl; (b) histidine; (c) α-amino-β-imidazolyl-propionic acid
    -CO-CH(NH-)-CH2-(imidazolium ring); pKa = 6.4-7.0

H-
    terminal amino
    -H2⁺ (-NH3⁺); pKa = 7.4-8.5

[...]
    -CO-CH(NH-)-CH2-CH2-CH2-CH2-NH3⁺ [...]

TABLE I. Continued.

Columns: residue symbol; (a) residue name, (b) amino-acid name (trivial), (c) amino-acid name (systematic); residue formula; pKa.

Amino-acid residues occasionally found in proteins

Neutral

-hypro-
    (a) hydroxyprolyl; (b) hydroxyproline; (c) hydroxypyrrolidine-2-carboxylic acid
    -CO-CH-CH2-CH(OH)-CH2-N- (pyrrolidine ring closed between the first CH and N)
    (Found only in collagen)

Acidic (ionized form)

[...]
    (a) diiodotyrosyl; (b) diiodotyrosine; (c) 3,5-diiodotyrosine
    -CO-CH(NH-)-CH2-C6H2I2-O⁻; pKa = 6.5
    (Found only in marine organisms and thyroglobulin)

[...]
    (a) dibromotyrosyl; (b) dibromotyrosine; (c) 3,5-dibromotyrosine
    -CO-CH(NH-)-CH2-C6H2Br2-O⁻; pKa = ca 7
    (Found only in marine organisms)

[...]
    (a) thyroxyl; (b) thyroxine; (c) 3,3',5,5'-tetra-iodothyronine
    -CO-CH(NH-)-CH2-C6H2I2-O-C6H2I2-O⁻; pKa = ca 6.5
    (Found only in thyroglobulin)

Basic (ionized form)

-hylys-
    (a) hydroxylysyl; (b) hydroxylysine; (c) α,ε-diamino-hydroxy-caproic acid
    -CO-CH(NH-)-CH2-CH2-CH(OH)-CH2-NH3⁺; pKa = ca 11
    (Found only in collagen)



oxidized to form the disulfide linkage,

    2 -NH-CH(CH2-SH)-CO-  ⇌  -NH-CH(CO-)-CH2-S-S-CH2-CH(NH-)-CO-
    (left to right: oxidation; right to left: reduction)

or

    2 -cySH-  ⇌  -cyS- paired with -cyS-
    (left to right: oxidation; right to left: reduction)

is found in all but a few proteins. This linkage may be
between two otherwise separate peptide chains, or it
may be an additional intrachain link. Both of these
types of disulfide bonds are found in the insulin molecule,
shown diagrammatically in Fig. 3. Calvin has
recently discussed the geometry of such C-S-S-C
linkages. The actual spatial arrangement of a disulfide
bond is shown in Fig. 4. The intrachain disulfide of
insulin leads to a “link” in the polypeptide chain. This
link contains six amino-acid residues (20 atoms), and
it is of great interest to find that links of exactly the
same number of residues are found in the peptide
hormones oxytocin, arginine vasopressin, and lysine
vasopressin (see Stetten, p. 563).
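
The figure of 20 atoms for a six-residue link can be checked with a short tally. The sketch below assumes that the ring is traced along the peptide backbone from one half-cystine residue to the other and closed through the two side chains and the S-S bond; the function name and the breakdown into terms are illustrative and are not taken from the original.

```python
def disulfide_ring_atoms(n_residues: int) -> int:
    """Number of atoms in the ring closed by an intrachain disulfide bond.

    n_residues counts every residue in the link, including the two
    half-cystine residues at its ends.
    """
    if n_residues < 2:
        raise ValueError("a link needs at least the two half-cystine residues")
    first_half_cystine = 2            # C-alpha and carbonyl C (its N is outside the ring)
    last_half_cystine = 2             # N and C-alpha (its carbonyl C is outside the ring)
    between = 3 * (n_residues - 2)    # N, C-alpha, carbonyl C of each residue in between
    side_chains = 4                   # two C-beta atoms and two S atoms
    return first_half_cystine + last_half_cystine + between + side_chains


# Insulin A-chain link, positions 6 to 11 inclusive: six residues.
print(disulfide_ring_atoms(6))  # → 20
```

The same count applies to any six-residue disulfide link, which is why the rings in oxytocin and the vasopressins have the same size as the one in insulin.
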

Recent studies by Perlmann have done much to
elucidate the various types of crosslinkages which
involve phosphoric-acid residues. Orthophosphate
(-O-PO2⁻-O-) and pyrophosphate (-O-PO[...]


[Figure 5. Curves labeled: positive charge, net charge, negative charge. Ordinate: charge resulting from reactions with protons. Legible abscissa ticks: 10, 11, 12, 13.]


FIG. 5. Titration curve of bovine serum albumin. Also represented
is the total positive charge and the total negative charge.
These values are computed for 1 C-terminal carboxyl residue, 
99 aspartyl and glutamyl residues, 19 tyrosyl residues, 1 cysteinyl 
residue, 16 histidyl residues, 1 N-terminal amino residue, 57 lysyl 
residues, and 22 arginyl residues. 


figuration of these molecules. Each amino acid contains
α-NH3⁺ and α-COO⁻ groups, and the acidic or basic
amino acids also contain an additional charged group.
When amino acids are combined in peptides or proteins,
the α-NH3⁺ and α-COO⁻ groups are lost through the
formation of the peptide bond, and only the terminal
-NH3⁺ and -COO⁻ residues can become ionized. But
the acidic and basic groups of the amino-acid residues
of category two (Table I) remain capable of being
ionized, and it is primarily these residues which give
the dipolar and ionic properties to the protein molecule.
Titration of a protein solution with strong acid or base
allows a calculation of the number of available acidic
and basic groups and an estimation of their pK's. Such
a titration curve for bovine serum albumin calculated
from the measurements of Tanford is shown in Fig. 5.
It can be seen that, when the net charge of the protein
is zero, there are approximately 94 positively charged
groups (basic groups of histidyl, terminal amino, lysyl,
and arginyl residues) and an equal number of negatively
charged groups (carboxyl groups of terminal carboxyl,
aspartyl, and glutamyl residues). The properties of a
protein are greatly influenced by the presence of these
many charged groups, and the dipole moment of the
uncharged protein is a measure of the symmetry of the
distribution of these charges. Studies of the titration
curves of many proteins have indicated that most of
the acidic and basic amino-acid residues are available
for reaction. [Rice (p. 69) discusses some of the difficulties
in the analysis of titration curves of proteins
[...] polyelectrolytes.] In certain instances, however,
some of these residues appear to be unavailable
for reaction unless the protein is taken to extremes of
pH. Two well-established examples of this behavior
are found in hemoglobin and in ribonuclease. In hemoglobin,
[...] of aspartyl and glutamyl residues, and 18 ε-amino
groups of lysyl residues) are found to be unreactive at
neutral pH values, whereas in ribonuclease, 3 of the
6 tyrosyl residues are found to be unreactive at their
normal pK.
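
The relation between residue counts and the titration curve can be sketched numerically. The example below is a simplified model, not Tanford's calculation: it treats every group as titrating independently with a single pK (no electrostatic interaction between charges), takes the residue counts from the caption of Fig. 5, and uses assumed round pK values chosen within or near the ranges of Table I. All function and variable names are hypothetical.

```python
# Residue counts are those listed in the Fig. 5 caption.
# The pK values are ASSUMED round numbers, not fitted to any measurement.
ACIDIC_GROUPS = {          # charge goes from 0 to -1 as pH rises
    "C-terminal carboxyl": (1, 3.6),
    "aspartyl + glutamyl": (99, 4.3),
    "tyrosyl": (19, 10.0),
    "cysteinyl": (1, 10.0),
}
BASIC_GROUPS = {           # charge goes from +1 to 0 as pH rises
    "histidyl": (16, 6.7),
    "N-terminal amino": (1, 8.0),
    "lysyl": (57, 10.0),
    "arginyl": (22, 12.0),
}


def negative_charge(pH):
    """Total number of negative charges at the given pH."""
    return sum(n / (1 + 10 ** (pK - pH)) for n, pK in ACIDIC_GROUPS.values())


def positive_charge(pH):
    """Total number of positive charges at the given pH."""
    return sum(n / (1 + 10 ** (pH - pK)) for n, pK in BASIC_GROUPS.values())


def net_charge(pH):
    return positive_charge(pH) - negative_charge(pH)


def zero_net_charge_pH(lo=2.0, hi=12.0, tol=1e-6):
    """Bisection on net_charge, which falls steadily as pH rises."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


pH0 = zero_net_charge_pH()
print(f"pH of zero net charge: {pH0:.2f}")
print(f"positive charges there: {positive_charge(pH0):.1f}")
print(f"negative charges there: {negative_charge(pH0):.1f}")
```

With these assumed pK values the model gives a count of positive charges at zero net charge close to the figure of approximately 94 read from Fig. 5. The agreement should not be over-interpreted, since the model ignores the interactions that Rice discusses.
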

One of the most exciting developments concerning 
the structure of proteins has been the evolution of 
methods which have led to the identification of the
sequence of amino-acid residues in the peptide chains
of a number of proteins. These developments have been
made possible largely through the work of Sanger and
his colleagues who blazed the way by determining the
sequence in insulin, a protein of about 6000 molecular
weight. Sanger's approach was to develop a method
for the identification and estimation of the N-terminal
residues of proteins and peptides. The reaction of the
N-terminal residue with 1,2,4-fluorodinitrobenzene
(FDNB) leads to the yellow dinitrophenyl (DNP)
compound,


    (O2N)2C6H3-F + H2N-CHR-CO-  →  (O2N)2C6H3-NH-CHR-CO- + HF


This reaction is carried out under mild (slightly alkaline) 
conditions where the peptide bonds are quite stable. 
Acid hydrolysis of the resulting DNP compound leads 
to a mixture of the various amino acids present in the 
protein and the DNP-amino acid from the N-terminal 
residue. The DNP-amino acids are reasonably stable 
under the conditions of acid hydrolysis, and can usually 
be obtained in good yield. There are certain DNP- 
amino acids, for example DNP-proline* and DNP- 
cysteine, which are considerably less stable than others, 
however, and substantial corrections must be made for 
the destruction of such DNP-amino acids during 
hydrolysis.

If the DNP-protein is only partially hydrolyzed,
DNP-peptides can be isolated, and subsequent complete
hydrolysis of these purified DNP-peptides reveals the
nature of the amino-acid residues and the N-terminal
residue; whereas a partial hydrolysis of the purified
DNP-peptides can lead to the arrangement of the
residues in the N-terminal peptide. By this method, the
[...] adjoining the N-terminal residue
[...] in sequence. In the studies on insulin,
[...] were identified in this way.
[...] method of attack has been used to
determine the residue sequence in purified
peptides obtained from either acid or enzymatic
[...]. [...] a large number of such
[...] were completely identified (about 65 in the
[...] insulin), it was found that there
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.TaBLE II. Peptides identified in h¢drolyzates of fraction B of oxidized insulin.* 


Dipeptides from acid and alkaline hydrolyzates 


H-phe-val-OH H-his-leu-OH 


H-his-leu-OH H-ala-leu-OH H-gly-glu-OH H-thr-pro-OH 
H-val-asp-OH H-leu-cySO;H-OH  § H_-leu-val-OH H-leu-val-OH H-glu-arg-OH H-lys-ala-OH 
H-asp-glu-OH H-cySO;H-gly-OH H-val-glu-OH H-val-cySO;H-OH H-arg-gly-OH 
H-glu-his-OH H-ser-his-OH H-glu-ala-OH H-cySO;H-gly-OH H-gly-phe-OH 
Pripeptides from acid and alkaline hydrolyzates 5 


H-phe-val-asp-OH H-leu-cySO;H-gly-OH 
H-val-asp-glu-OH H-ser-his-leu-OH 
H-glu-his-leu-OH H-leu-val-glu-OH 
H-his-leu-cySO;H-OH 


H-ala-leu-tyr-OH 
H-tyr-leu-val-OH 

H-leu-val-cySO;H-OH 

H-val-glu-ala-OH 


H-gly-glu-arg-OH H-pro-lys-ala-OH 


H-val-cySO3H-gly-OH ° < 


Higher peptides from acid and alkaline hydrolyzates 


H-fhe-vgl-asp-glu-OH 
H-phe-val-asp-glu-his-OH 
H-glu-his-leu-cySO;H-OH H-his-leu-val-glu-OH 


H-his-leu-cySO3H-gly-OH H-leu-val-glu-ala-OH 


H-ser-his-leu-val-OH 


H-ser-his-leu-val-glu-OH H-tyr-leu-val-cySO;H-OH 
H-ser-his-leu-val-glu-ala-OH H-leu-val-cySO3;H-gly-OH 


H-thr-pro-lys-ala-OH 


Sequences deduced from above peptides 


H-phe-val-asp-glu-his-leu-cySO3H-gly- 
-ser-his-leu-val-glu-ala- 


-tyr-leu-val-cySO3H-gly- 


-thr-pro-lys-ala 
-gly-glu-arg-gly- 


Peptides identified in peptic hydrolyzate 


H-phe-val-asp-glu-his-leu-cySO;H-gly-ser-his-leu-OH 


H-val-glu-ala-leu-OH 


H-his-leu-cySO3H-gly-ser-his-leu-OH 


H-leu-val-cySO3H-gly-glu-arg-gly-phe-OH 
H-tyr-thr-pro-lys-ala-OH 


Peptides identified in chymotryptic hydrolyzate 


H-phe-val-agp-glu-his-leu-cySO3H-gly-ser-his-leu-val-glu-ala-leu-tyr-OH 
H-leu-val-cySO;H-gly-glu-arg-gly-phe-phe-OH 


is} 


H-tyr-thr-pro-lys-ala-OH 


Peptides identified in tryptic hydrolyzate 


H-gly-phe-phe-tyr-thr-pro-lys-ala-OH 


Structure of the B-(phenylalanyl terminal) chain of insulin 


H-phe-val-asp-glu-his-leu- (cyS-) -gly-ser-his-leu-val-glu-ala-leu-tyr-leu-val- (cyS-) -gly-glu-arg-gly-phe-phe-tyr-thr-pro-lys-ala-OH 


a Taken from F. Sanger, Advances jn Protein Chem. 7, 56 (1952). 


was only one sequence which would fit all of the experi- 
mental results, assuming that there was a single poly- 
peptide chain of about 30 residues. It was further 
assumed that no rearrangements of the residues had 
occurred during the hydrolyses. 
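
The logic of fitting overlapping peptides into one chain can be illustrated with a few of the fragments listed in Table II. The sketch below is a simple greedy overlap merge and is not Sanger's actual procedure, which relied on the mutual consistency of a much larger set of peptides; the function names and the choice of fragments are illustrative assumptions.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def contains(big, small):
    """True if the residue tuple `small` occurs as a run inside `big`."""
    m = len(small)
    return any(big[i:i + m] == small for i in range(len(big) - m + 1))


def drop_contained(frags):
    """Remove duplicates and fragments lying wholly inside another one."""
    frags = list(dict.fromkeys(frags))
    return [f for f in frags
            if not any(f != g and contains(g, f) for g in frags)]


def assemble(peptides):
    """Greedily merge the pair with the longest overlap until none remains."""
    frags = drop_contained([tuple(p.split("-")) for p in peptides])
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j and overlap(a, b) > best_k:
                    best_k, best_i, best_j = overlap(a, b), i, j
        if best_k == 0:
            break  # no overlaps left; the fragments cannot be joined
        merged = frags[best_i] + frags[best_j][best_k:]
        rest = [f for n, f in enumerate(frags) if n not in (best_i, best_j)]
        frags = drop_contained(rest + [merged])
    return ["-".join(f) for f in frags]


# Fragments taken from Table II (terminal H- and -OH omitted).
fragments = [
    "phe-val-asp-glu-his-leu-cySO3H-gly-ser-his-leu-val-glu-ala-leu-tyr",
    "leu-val-cySO3H-gly-glu-arg-gly-phe-phe",
    "tyr-thr-pro-lys-ala",
    "gly-phe-phe-tyr-thr-pro-lys-ala",
    "tyr-leu-val-cySO3H",
    "ser-his-leu-val-glu-ala",
    "gly-glu-arg",
]
b_chain = assemble(fragments)
print(b_chain[0])
print(len(b_chain[0].split("-")), "residues")
```

For this particular set the merge yields a single chain of 30 residues that agrees with the B-chain structure at the foot of Table II. A greedy merge can be misled by repeated short runs such as leu-val or his-leu, which is one reason the assumptions stated in the text (a single chain, no rearrangement during hydrolysis) and a large number of independent peptides were needed.
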

The attack outlined in the foregoing is applicable 
only to single peptide chains devoid of interchain 
linkages, such as the disulfide link described earlier. 
Since insulin and most other proteins contain interchain 
and/or intrachain disulfide linkages, these must be 
broken before the sequence studies can be undertaken 
by the foregoing methods. In the case of insulin, oxidation
of all of the disulfide linkages to sulfonic-acid
groups was carried out with performic acid. This reagent
thus converts cystine to cysteic acid (H-cySO3H-OH),
a strong acid. It also converts methionine to the corresponding
sulfone, and tryptophan to unidentified
products. Since insulin contains no methionine or
tryptophan, performate oxidation was capable of converting
the insulin to two chains, called the A-(glycine
terminal) chain and the B-(phenylalanine terminal)
chain, with each half-cystine residue (-cyS-) converted
to a cysteic-acid residue (-cySO3H-). The A- and
B-chains were then purified, and the sequences of amino-acid
residues in each of the two peptide chains were
determined by the method outlined in the following.
Table II records a number of the peptides obtained
from such a study of the B-chain, and Fig. 6 shows the
entire sequence for the insulin molecule. The acid
hydrolysis used in the determination of the amino-acid
sequence in insulin caused the liberation of six moles of
ammonia, originally present as the amide groups of the
[Fig. 6, first part. Legible fragments: A-chain of beef insulin beginning H-gly-ileu-val-glu-glu-cyS-cyS-ala-ser-; B-chain fragment -glu-his-leu-cyS-gly-; arginine vasopressin beginning H-cyS-tyr-phe-glu-. Panel titles: Sheep Insulin (M. W. 5703), Oxytocin (M. W. 1007).]

asparaginyl and glutaminyl residues. Reduction of the
insulin by lithium borohydride (LiBH4) in tetrahydrofuran
converted the ω-carboxyl groups to primary
alcohol groups, and acid hydrolysis of the reduced
insulin then showed that three asparaginyl residues,
three glutaminyl residues, and four glutamyl residues
were present in insulin. In order to locate these amide
groups on the individual glutamic-acid residues, two
methods were used which depend on the fact that
enzymatic hydrolysis does not break the ω-amide
linkage. The purified peptides obtained after enzymatic
hydrolysis could then be tested for amide content,
either by studies of their electrophoretic behavior (since
the amide derivatives had one negative charge less than
the corresponding ω-acid derivatives), or by studies of
the extent of ammonia liberation during their acid
hydrolysis. In this way, it was found that residues 5,
15, 18, and 21 of the A-(glycyl terminal) chain and
residues 3 and 4 of the B-chain were present in the
amide form.

In order to find the distribution of the disulfide
[...] to completely identify the sequences and
[...] in beef insulin, and resulted in the
[...] in Fig. 6. The intrachain disulfide linkage
[...] A-(glycyl terminal) chain is of special
[...] half-cystinyl residue at position 6 is
[...] position 11 half-cystinyl residue through
[...]. Studies on horse, sheep, whale, and
[...] sequences similar to those found
[...] for the amino acids in positions
[...] A-(glycyl) chain. These
[Fig. 6, second part. Legible fragments: A-chain -val-cyS-ser-leu-tyr-glu-leu-glu-asp-tyr-cyS-asp-OH (positions 10 to 21); B-chain -ser-his-leu-val-glu-ala-leu-tyr-leu-val-cyS-gly-glu-arg-gly-phe-phe-tyr-thr-pro-lys-ala-OH (positions 9 to 30); arginine vasopressin ending -asp-cyS-pro-arg-gly-NH2. Amide groups are marked NH2. Panel titles: Beef Insulin (M. W. 5733), Horse Insulin (M. W. 5747), Whale and Pork Insulin (M. W. 5777), Arginine Vasopressin (M. W. 1084), Lysine Vasopressin (M. W. 1056).]

FIG. 6. Amino-acid sequences in insulin and some hormone peptides.



residue differences are all within the hexapeptide disul- 
fide link. Figure 6 also shows the sequences of the insulins 
of the other species studied. Whale and pork insulin are 
seen to be identical. 

As mentioned in the discussion of the intrachain
disulfide linkage, three peptide hormones isolated from
the posterior pituitary and synthesized by du Vigneaud
(oxytocin, arginine vasopressin, and lysine vasopressin)
all contain a similar hexapeptide disulfide link. The
formulas of these peptides are also shown in Fig. 6,
and the possible significance of this structure in terms
of hormone structure is discussed by Stetten (p. 563).

The methods developed by Sanger and his colleagues 
for the study of the amino-acid sequence in insulin have 
subsequently been applied by a number of laboratories 
to the elucidation of the structure of several other 
proteins. Other methods of attack involving replace- 
ment of the DNP compounds by other types of deriva- 
tives have been developed, and the stepwise freeing of 
N-terminal and C-terminal residues by enzymatic 
hydrolysis catalyzed by aminopeptidases and carboxy- 
peptidase has proved to be fruitful. Methods have also 
been developed to replace the performate oxidation 
procedure for the elimination of disulfide bonds, and 
[...] in the study of proteins containing
[...] oxidized amino acids. A [...]
sequence [...] review by Anfinsen and Redfield. [...]
large peptides. [...] study of a series of [...]
melanocyte-stimulating hormones (MSH) and adrenocorticotropic
hormones (ACTH). Figure 7 records the
results of extensive studies on these hormones in some
five or six laboratories. Recent reviews of this work
have been presented by Li, Harris, and by Anfinsen
and Redfield. The amino-acid sequences of all of the
MSH and ACTH materials studied to date have shown
almost identical sequences
...-tyr-.X.-met-glu-his-phe-arg-try-gly-.Y.-pro-... .
This sequence has been shown
in italics in Fig. 7, and is seen to occur at different
distances from the N-terminal residue in the β-MSH
preparations (residues 5-15) and in α-MSH and the
various ACTH preparations (residues 2-12). The
residues X and Y in this unique sequence are seen to
be lysyl or seryl, and seryl or lysyl, respectively (see
Stetten, p. 563). The two pork corticotropins-A, as
prepared in the Cyanamid and in the Armour Laboratories,
are known to differ with respect to the total
content of amide groups (none for the Armour product,
and one for the Cyanamid product). The sequence differences
in residues 25-28 of the two corticotropin-A
preparations have not been definitely shown to indicate
different sequences in the two preparations, since technical
difficulties may have occurred in the sequence determination.
The sheep α-corticotropin preparation of
Li differs from the corticotropin-A, in that Li's preparation
contains two amide groups (one is glutaminyl residue 33,
and one not yet located, but probably in position
27, 28, or 29), and also one more seryl and one less
leucyl residue. It may be noted that the component
amino acids in positions 25-28 are the same in all three
ACTH preparations, but differ in order. The different
location of the amide residue (position 33 in sheep
α-corticotropin and 30 in pork corticotropin-A) appears
to represent a real difference in sequence. Preliminary
structural studies by Li have indicated that the amino-acid
sequence for beef α-corticotropin is identical with
that for sheep α-corticotropin.


Beef Melanotropin (Seryl-β-MSH) (M.W. 2134)
H-asp-ser-gly-pro-tyr-lys-met-glu-his-phe-arg-try-gly-ser-pro-pro-lys-asp-OH
                  5                   10                  15          18

Pork β-Melanocyte-Stimulating Hormone (Glutamyl-β-MSH) (M.W. 2176)
H-asp-glu-gly-pro-tyr-lys-met-glu-his-phe-arg-try-gly-ser-pro-pro-lys-asp-OH
                  5                   10                  15          18

Pork α-Melanocyte-Stimulating Hormone (α-MSH) (M.W. 1665)
CH3-CO-ser-tyr-ser-met-glu-his-phe-arg-try-gly-lys-pro-val-NH2
                       5                   10          13

Sheep α-Corticotropin (α-ACTH) (M.W. 4540)
H-ser-tyr-ser-met-glu-his-phe-arg-try-gly-lys-pro-val-gly-lys-lys-arg-arg-pro-val-
                  5                   10                  15                  20
-lys-val-tyr-pro-ala-gly-glu-asp-asp-glu-ala-ser-glu-ala-phe-pro-leu-glu-phe-OH
                 25                  30                  35              39

ACTH Preparation                   Residue No.
                                   25  26  27  28  29  30  31  32  33
Sheep α-corticotropin             -ala-gly-glu-asp-asp-glu-ala-ser-glu-
Pork corticotropin-A (Cyanamid)   -asp-gly-ala-glu-asp-glu-leu-ala-glu-
Pork corticotropin-A (Armour)     -gly-ala-glu-asp-asp-glu-leu-ala-glu-

FIG. 7. Amino-acid sequences in melanocyte-stimulating and
adrenocorticotropic hormones. Sheep α-corticotropin contains one
more amide group, probably in position 27, 28, or 29. Beef
α-corticotropin appears from preliminary structural investigations to be
identical with sheep α-ACTH. All of these ACTH preparations seem to
have no loss of biological potency when residues 29 to 39 (11
residues) are removed by limited enzyme hydrolysis.
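
The common core described above can be located mechanically. The following is a minimal Python sketch, not part of the original article: the sequences are transcribed from Fig. 7 (only the first 20 residues of sheep α-ACTH are used), and the names `SEQUENCES`, `CORE`, and `find_core` are illustrative assumptions.

```python
# Sequences transcribed from Fig. 7, using the article's three-letter codes
# (note "try" for tryptophan, as printed).
SEQUENCES = {
    "beef beta-MSH": "asp-ser-gly-pro-tyr-lys-met-glu-his-phe-arg-try-gly-ser-pro-pro-lys-asp",
    "pork beta-MSH": "asp-glu-gly-pro-tyr-lys-met-glu-his-phe-arg-try-gly-ser-pro-pro-lys-asp",
    "pork alpha-MSH": "ser-tyr-ser-met-glu-his-phe-arg-try-gly-lys-pro-val",
    "sheep alpha-ACTH (1-20)": (
        "ser-tyr-ser-met-glu-his-phe-arg-try-gly-lys-pro-val-gly-lys-lys-arg-arg-pro-val"
    ),
}

# Core pattern tyr-X-met-glu-his-phe-arg-try-gly-Y-pro; None marks X and Y.
CORE = ["tyr", None, "met", "glu", "his", "phe", "arg", "try", "gly", None, "pro"]
ALLOWED_XY = {"lys", "ser"}  # the text reports X and Y as lysyl or seryl


def find_core(sequence):
    """Return (first_position, last_position, X, Y), 1-based, or None."""
    residues = sequence.split("-")
    for start in range(len(residues) - len(CORE) + 1):
        window = residues[start:start + len(CORE)]
        if all(
            (c is None and r in ALLOWED_XY) or c == r
            for c, r in zip(CORE, window)
        ):
            return start + 1, start + len(CORE), window[1], window[9]
    return None


if __name__ == "__main__":
    for name, seq in SEQUENCES.items():
        print(name, find_core(seq))
```

Run on these transcriptions, the search places the core at residues 5-15 in both β-MSH preparations and at residues 2-12 in α-MSH and α-ACTH, with X and Y exchanging between lysyl and seryl, in line with the positions quoted in the text.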

The two β-MSH preparations are seen to differ only
in the second residue, seryl for the beef hormone and
glutamyl for the pork hormone. These materials are,
therefore, often referred to as seryl-β-MSH and
glutamyl-β-MSH. The acetylated amino group of the
N-terminal seryl residue in α-MSH has only recently
been identified, and its presence is rather unusual. The
presence of this group, as well as of the amide group of
the C-terminal valyl residue in α-MSH, causes a drastic
change in the hormonal activity of this compound as
compared with that of the longer α-ACTH peptide.
This effect, as well as other aspects of the relation
between the structure of these hormones and their
physiological activities, is discussed by Stetten (p. 563).

The amino-acid sequence in glucagon, a small protein
involved in glucose metabolism, has recently been
determined and is shown in Fig. 8. This protein, like
the ACTH and MSH preparations, contains no cystinyl
residues. Unlike these hormones, however, the
sulfur-containing methionyl residue is present in glucagon.
The relationship between glucagon and insulin in glucose
metabolism is discussed by Stetten (p. 563). The
sequence arrangements in all of the unrelated proteins
so far evaluated show no more similarity than could be
expected from simple probability considerations, and
none of them indicates that repeating sequences are
important elements in the structure of these small
globular proteins. Whether or not repeating sequences
occur in the larger globular proteins is as yet unknown.
Some strong evidence of repeating elements has, however,
been found in certain fibrous-protein structures, and in
histone and protamine, proteins found associated with
nucleic acid in the nucleoproteins.


H-his-ser-glu-gly-thr-phe-thr-ser-asp-tyr-ser-lys-tyr-leu-asp-
-ser-arg-arg-ala-glu-asp-phe-val-glu-try-leu-met-asp-thr-OH

FIG. 8. Amino-acid sequence in glucagon (M. W. 3647).

Ribonuclease is the most complicated protein in
which an almost complete amino-acid sequence has
been determined. This enzyme, containing 124 amino-acid
residues and four disulfide bonds (molecular
weight 12 000), contains more than twice the number
of residues found in insulin. The presently known
sequence as shown in Fig. 9 has recently been discussed
by Hirs et al. and by Anfinsen. About 23 residues
remain to be located definitely in the sequence. No
unusual partial sequences seem to occur, with the
possible exception of a number of repeats (-ala-ala-ala-
in positions 4-6, and -met-met- in positions 29-30).
These partial sequences can be compared with the
unusual sequence -phe-phe-tyr- in the B-chain of insulin
(positions 24-26).


H-lys-glu-thr-ala-ala-ala-lys-phe-glu-arg-ser-thr-ser-ser-asp-his-met-glu-ala-ala-ser-ser-asp-ser- ...

[Fig. 9: amino-acid sequence in ribonuclease. The remainder of the
sequence diagram and most of the caption are illegible; the surviving
caption fragments refer to individual residues, Roman numerals, and
work of Anfinsen.]


Partial-residue sequences have been determined
for a large number of proteins. A few of these
sequences are reported in Figs. 10
and 11. The partial sequence shown for beef
cytochrome-c (Fig. 10) is interesting in that this sequence
contains the residues to which the porphyrin-c
prosthetic group is attached. The exact attachment of the
porphyrin-c residue may be either that shown in Fig. 10,
or that of an otherwise identical structure where the
porphyrin-c residue is rotated about the plane of the page
by 180°. The corresponding partial sequences in a series
of cytochrome-c preparations from sources other than
beef are also shown in Fig. 10. Here, alanyl residues are
sometimes replaced by seryl or glutamyl residues, lysyl
residues by arginyl residues, valyl residues by lysyl
residues, and glutaminyl residues by threonyl residues. The
italicized sequence -cyS-.A.-.B.-cyS-his-thr-val-glu- has
been found to occur in this order in each of the six
cytochrome-c molecules. This study, involving work by
Theorell, Tuppy, and others, has been reviewed by
Tuppy and by Anfinsen and Redfield. The partial
sequences shown for lysozyme and for human serum
albumin (Fig. 11) are very incomplete. Many disulfide
bonds are known to be present in these molecules, and
serum albumin is somewhat unusual in that it contains
a single cysteinyl residue. Many other partial sequences
have been found for egg-white lysozyme, but a study
of these sequences shows that certain of them must be
spurious, and it has not been possible to locate the
various partial sequences in relation one to another.
Anfinsen and Redfield suggest that the study of lysozyme
illustrates the limit of structural information that can
be obtained by acid hydrolysis alone, and that more
specific degradative methods must be applied before
the complete amino-acid sequence can be obtained.
Other species of serum albumin have been studied, and
it has been shown that bovine serum albumin, while
containing the same N-terminal residue (H-asp-), has
threonyl as the second residue and ends with a different
C-terminal sequence, possibly -(ala, leu, thr, val, ser)-
ala-OH. Porter has obtained a fragment of the bovine
serum-albumin molecule after mild chymotryptic
hydrolysis which seems to represent about one-fifth of
the entire molecule (molecular weight about 12 000).


Beef Cytochrome-c (M.W. 13 100)
H-his- ... -ala-glu-cyS-his-thr-val-glu-lys- ...

FIG. 10. Partial amino-acid sequence in cytochrome-c. The
exact attachment of the porphyrin-c residue may be either the
residue shown or the structure where the porphyrin-c residue is
rotated by 180° [from H. Tuppy in Symposium on Protein
Structure, A. Neuberger, editor (Methuen and Company, Ltd.,
London; John Wiley and Sons, Inc., New York, 1958), p. 71].


Egg-white Lysozyme (M.W. 14 900)
H-lys-val-phe-gly-arg- ... -asp-gly-ala-asp-leu-OH

Human Serum Albumin (M.W. 68 000)
H-asp-ala- ... -cysSH- ... -gly-val-ala-leu-OH

FIG. 11. Partial amino-acid sequences in lysozyme and serum
albumin [from C. B. Anfinsen and R. R. Redfield, Advances in
Protein Chem. 11, 1 (1956)].


This fragment appeared to have very nearly the same
configuration as part of the original albumin molecule,
since it effectively inhibited the reaction of a specific
antiserum to bovine serum albumin with its antigen
(see Kauzmann, p. 549). This fragment contained a
single N-terminal amino acid (H-phe-), the unique
cysteinyl residue, and a single disulfide bond.

Further structural studies of amino-acid sequences
are reported in other contributions to the Paris
Symposium on Protein Structure. Considerable sequence
data exist for many other proteins, notably papain,
growth hormone, trypsin, chymotrypsin, pepsin,
hemoglobin, and tobacco mosaic virus.

It is necessary to always bear in mind the difficulties
in the isolation of purified and homogeneous components
in their native state. Proteins occur in nature as complex
mixtures, and are often found in the presence of
highly charged polysaccharides and/or other
macromolecules. These naturally occurring colloidal mixtures
are often enclosed by membranes of varying stability.
The extraction of a particular protein from such a mixture
is always difficult, and is often impossible without
the use of procedures which may cause permanent
changes in the configuration or even in the covalent
linkages of the resulting protein preparation. A number
of useful methods for the isolation and purification of
such protein systems are currently available for the
separation of these components. These methods include
precipitation and differential extraction by high
concentrations of salts or organic solvents, adsorption and
partition between immiscible solvents, as well as such
physical methods as electrophoresis and ultracentrifugation.
A discussion of these methods cannot be undertaken
here, but it is important to remember that the
purified components isolated by these methods may
not be truly homogeneous, and artifacts may be
introduced by the isolation procedures.


Tests for the homogeneity of the purified components
must be evaluated critically. Physical methods are
most useful; they include studies of behavior in the
ultracentrifuge and in electrophoresis, measurements of
solubility (akin to the phase-rule methods of constant
melting or boiling points as applied in the case of
smaller molecules), and determination of the distribution
coefficient between various solvents. Chemical [...]
individual amino-acid residues, for example) [...]
value. Biological assay methods meas[...]


CARBOHYDRATES


Carbohydrates comprise another of the major groups
of naturally occurring organic materials. They are often
found in combination with proteins as glycoproteins
and mucoproteins, especially in the higher animals.
They also occur in combination with lipids to give
cerebrosides, and in combination with purines and
phosphoric acid to give nucleotides and nucleic acids.
They form the structural elements of plants, bacteria,
and certain animals in the forms of cellulose, chitin, and
other high molecular-weight polysaccharides. They
provide a means for the storage of chemical energy in
the forms of amylose and amylopectin in plant starches
and glycogen in animals.

Although most monosaccharides are fairly stable [...]
transformations when dissolved in water, particularly
in the presence of acids or bases. Although the chemical
formulas of the monosaccharides are often shown in a
linear form, much evidence suggests that, in the main,
they exist in the form of five- and six-membered rings
called the furanose and pyranose forms. Each of these
forms exists as α- and β-isomers, termed anomers, and
interconversions between anomers and between ring
isomers occur even under the mildest possible conditions
of acidity and temperature. One of the simplest
methods for demonstrating and studying the equilibria
between the various forms is by the measurement of
changes of optical rotation with time ("mutarotation"),
which can easily be observed with freshly prepared
sugar solutions. These various equilibria (for D-glucose)
are illustrated in Fig. 12. The linear forms are shown
in the Fischer formulation where the carbon atoms
must be thought of as tetrahedrons, bonded together
by angles projecting into the page and with the H-
and -OH groups projecting out from the page. The
ring forms shown in the Haworth formulation are to be
viewed as a perspective representation with the heavy
bonds extending out from the page. Side chains in
the Haworth formulas are written according to the
Fischer convention. The rings, shown here as planar,
are oversimplified, since the valence angles in a single
coplanar ring would be appreciably greater than those
in a "strainless" structure having valence angles of
109°. The usual conformation of the pyranose ring is
probably a chair form, but the free sugars in solution
probably can occur to some extent in any of the possible
conformations, since there is a spontaneous equilibrium
with the aldehyde form. Derivatives in which the ring
is fixed (because of substitution on C-1) probably are
stabilized in the chair form.

[Fig. 13 diagram: Haworth formulas of D-glucosamine,
N-acetyl-D-glucosamine, D-galactosamine, D-glucuronic acid,
D-ribose-3-phosphate, D-glucose-1,6-diphosphate, N-acetylneuraminic
acid, an amino-N-sulfate, and a sulfate ester; ring drawings not
recoverable here.]

FIG. 13. Configuration of some charged sugar derivatives.

The aldehydo-hexoses such as D-glucose and D-galactose
contain four asymmetric carbon atoms, while the
keto-hexoses such as D-fructose and aldehydo-pentoses
such as D-ribose contain three. Thus, many optical
isomers exist (16 for the aldehydo-hexoses and 8 for
the keto-hexoses and aldehydo-pentoses), and a number
of these are found in various natural products.
The ring structures contain an additional center of
asymmetry at carbon atom 1, and the anomers of each
ring structure are usually designated α- when the
hydroxyl group of carbon atom 1 is cis- to the hydroxyl
group at carbon atom 2, and β- when these two hydroxyl
groups are trans-. In the case of 2-deoxy-D-ribose, the
anomeric forms are designated like the D-ribose forms.
α-D-Mannose does not follow the (cis-) (trans-) rule given
in the foregoing, and the commonly used nomenclature
of Hudson is to be found in Pigman's review. The
designation L- indicates that the asymmetric carbon
atom most remote from the reference group (e.g.,
aldehyde, keto, carboxyl, etc.) has the same configuration
as L-glyceraldehyde (see Fig. 1). The L- and D-
forms differ in the configuration of all asymmetric
carbon atoms. In addition to the three monosaccharides
mentioned above, those most commonly found in
bacterial and animal sources include D-galactose (like
D-glucose with the configuration of C-4 reversed) and
D-mannose (like D-glucose with configuration of C-2
reversed). Three other monosaccharides of major
interest include L-fucose (6-deoxy-L-galactose),
L-rhamnose (6-deoxy-L-mannose), and 2-deoxy-D-ribose,
where the designation -deoxy- indicates that the
appropriate -OH is replaced by -H.

[Fig. 14 diagram: Haworth formulas of sucrose, cellobiose
(β-configuration), and lactose (α-configuration); ring drawings not
recoverable here.]

FIG. 14. Configuration of some disaccharides.
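
The isomer counts quoted above (16 for the aldehydo-hexoses, 8 for the keto-hexoses and aldehydo-pentoses) follow from the usual rule of 2 to the power n for n asymmetric carbon atoms. The short Python sketch below only restates that arithmetic; the function and dictionary names are illustrative, and the rule is assumed to apply without correction for internally symmetric (meso) forms, which do not arise for these open-chain sugars.

```python
def optical_isomer_count(asymmetric_carbons: int) -> int:
    """Number of optical isomers for n independent asymmetric carbons (2**n)."""
    if asymmetric_carbons < 0:
        raise ValueError("number of asymmetric carbons cannot be negative")
    return 2 ** asymmetric_carbons


# Asymmetric carbon counts for the open-chain forms, as given in the text.
OPEN_CHAIN_CENTERS = {
    "aldehydo-hexose": 4,
    "keto-hexose": 3,
    "aldehydo-pentose": 3,
}

if __name__ == "__main__":
    for sugar, n in OPEN_CHAIN_CENTERS.items():
        open_chain = optical_isomer_count(n)
        # Ring closure adds one asymmetric center (the anomeric carbon),
        # so each ring size has twice as many forms (alpha and beta anomers).
        ring = optical_isomer_count(n + 1)
        print(f"{sugar}: {open_chain} open-chain isomers, {ring} per ring size")
```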
Haworth formulas for most of the aforementioned
monosaccharides are shown in Fig. 12. Each of these
sugars would exist in the same sorts of conformations
as are shown for D-glucose, and the ring structures
commonly found in oligo- and poly-saccharides are
portrayed in the diagrams. Only the α-anomer is
depicted. The relative amounts of the four ring
conformations which will exist at equilibrium in aqueous
solutions of the various monosaccharides vary
considerably. In the case of neutral solutions of D-glucose,
nearly two-thirds of the sugar seem to exist as
β-glucopyranose, and just over one-third as α-glucopyranose.
The two glucofuranose forms and the aldehydo-glucose
form are thought to exist in minor amounts. In the
case of D-mannose, over two-thirds exist as
α-mannopyranose and just under one-third as β-mannopyranose.
Solutions of D-ribose probably contain considerable
quantities of the furanose and aldehydo- forms, and the
mutarotation is complex, and exhibits a minimum.
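
The anomer proportions quoted for D-glucose can be checked against mutarotation data with a two-component mixing rule. The sketch below is an illustration under stated assumptions, not the article's own calculation: the specific rotations (+112.2° for the α-anomer, +18.7° for the β-anomer, +52.7° at equilibrium) are commonly quoted literature values that do not appear in this text, and the furanose and aldehydo- forms are neglected.

```python
def anomer_fractions(rot_alpha: float, rot_beta: float, rot_equilibrium: float):
    """Solve rot_eq = f_a * rot_alpha + (1 - f_a) * rot_beta for the alpha fraction.

    Assumes only the two pyranose anomers contribute to the observed rotation.
    Returns (fraction_alpha, fraction_beta).
    """
    if rot_alpha == rot_beta:
        raise ValueError("anomers must differ in specific rotation")
    fraction_alpha = (rot_equilibrium - rot_beta) / (rot_alpha - rot_beta)
    return fraction_alpha, 1.0 - fraction_alpha


# Assumed literature values (degrees), not taken from the article.
ALPHA_D_GLUCOSE = 112.2
BETA_D_GLUCOSE = 18.7
EQUILIBRIUM = 52.7

if __name__ == "__main__":
    f_alpha, f_beta = anomer_fractions(ALPHA_D_GLUCOSE, BETA_D_GLUCOSE, EQUILIBRIUM)
    print(f"alpha-glucopyranose: {f_alpha:.2f}, beta-glucopyranose: {f_beta:.2f}")
```

With these assumed rotations the α fraction comes out a little above one-third and the β fraction a little below two-thirds, which agrees with the proportions stated in the text.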
There are a number of biologically important
derivatives of the monosaccharides, a few of which are
listed in Fig. 13. Certain of these derivatives contain
ionizable groups. Thus, glucosamine and galactosamine
provide primary amine groups with a pKa near 7.8.
Glucuronic acid and galacturonic acid provide carboxyl
groups with pKa about 3.3; N-acetyl-neuraminic acid
(sialic acid) has a stronger carboxyl group with pKa
2.7; and the phosphate esters have two potential
hydrogen ions, with pKa values from 0.9 to 1.5 and
from 5.9 to 6.3. The sulfate group of the amino-N-
sulfate and the sulfate esters have a very strongly
acidic hydrogen with pKa < 1. The phosphate esters of
the monosaccharides are products of intermediary
metabolism [...] in a number of other papers (see
[...]; Calvin, p. 147; Roberts, p. 170; [...]).
The monosaccharide units of disaccharides may be
alike as in maltose and cellobiose, or different as in
sucrose and lactose. The hydrolysis of maltose and of
cellobiose yields two molecules of D-glucose, whereas
sucrose yields D-glucose and D-fructose, and lactose
yields D-glucose and D-galactose. The monosaccharide
units are combined through an oxygen bridge of the
hemiacetal hydroxyl to a second hydroxyl from another
unit. The possible number of combinations of two
monosaccharide units is large, since there are usually
four or five free hydroxyl groups in the monosaccharide,
and the oxygen bridge can come from either the α- or
β-anomer. Figure 14 shows the
structure of four common disaccharides, and illustrates
some of the linkages possible. In sucrose, the two
hemiacetal hydroxyls are combined through an oxygen
bridge between the two anomeric carbons, with the
D-glucose being in the α-pyranose configuration, and
the D-fructose in the β-furanose form. In lactose, the
hemiacetal hydroxyl (C-1) of β-D-galactopyranose
bridges to the C-4 hydroxyl of D-glucopyranose. Both
maltose and cellobiose contain 1 → 4 oxygen bridges
between D-glucopyranose units, but in maltose there
is an α-linkage while in cellobiose there is a β-linkage.

FIG. 15. Structural elements in selected polysaccharides.

There are numerous polysaccharides of biological
importance, with monosaccharide units or their derivatives
as the polymer unit, often repeating either a single
unit, ...A-A-A-A-..., or alternating two units,
...A-B-A-B-A-B-.... The type of glycosidic linkage
between these units has the same diversity as was found
in the disaccharides. There are certain of the
polysaccharides that show a more complex structure owing to
branching of the polymer chain by means of linkages
through a third hydroxyl group of the polymer unit.
Usually, the naturally occurring polysaccharides contain
many hundreds of residues. The lability of the linkages
makes the isolation and purification of undegraded
materials difficult, and many polysaccharide preparations
thus show molecular weights and polydispersity
not characteristic of the native product.

Cellulose and amylose provide two of the simplest
polysaccharide structures (Fig. 15). Both of these
polymers are made up of D-glucose residues in the
pyranose conformation, with 1 → 4 linkages between
the units. In cellulose, the pyranose ring has the
β-configuration and each linkage is like that in the
disaccharide cellobiose; in amylose, the rings are in the
α-configuration and each glycosidic linkage is like that
in maltose. Chitin resembles cellulose, and is a polymer
of N-acetyl-D-glucosamine units linked by a 1 → 4
β-glycosidic bond. This 1 → 4 β-glycosidic linkage leads
to polymers of low solubility, whereas the 1 → 4
α-glycosidic linkage gives polymers of much higher
solubility, owing to a spiraling of the macromolecule in
a helix-like fashion induced by this type of bond.

Amylopectin and glycogen have a basic structure like
amylose, but about five percent of the residues branch
by means of 1 → 6 α-glycosidic bonds in the
amylopectins, and about nine percent in the glycogens. The
branched structure leads to solutions of much lower
viscosity than would occur with amylose
macromolecules of the same molecular weight. Detailed
structures of glycogen and amylopectin have been
established by the use of specific enzymes which
differentiate between the 1 → 4 and 1 → 6 linkages, and
Fig. 16 shows a representation of a segment of a
glycogen molecule as indicated by such studies.
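
The branching percentages quoted above translate directly into an average number of residues per branch point. The Python sketch below is a back-of-the-envelope illustration only: it assumes that every branching residue starts exactly one additional chain, and the function name is hypothetical.

```python
def residues_per_branch_point(branch_fraction: float) -> float:
    """Average number of residues per 1 -> 6 branch point.

    branch_fraction is the fraction of residues carrying a branch
    (for example 0.05 for five percent).
    """
    if not 0.0 < branch_fraction <= 1.0:
        raise ValueError("branch_fraction must lie in (0, 1]")
    return 1.0 / branch_fraction


# Branching fractions as quoted in the text.
BRANCH_FRACTION = {"amylopectin": 0.05, "glycogen": 0.09}

if __name__ == "__main__":
    for polymer, fraction in BRANCH_FRACTION.items():
        print(f"{polymer}: about {residues_per_branch_point(fraction):.0f} "
              "residues per branch point")
```

On this simple reading, amylopectin carries roughly one branch per 20 residues and glycogen roughly one per 11, which is one way to see why glycogen is the more compact, bush-like molecule.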

The aforementioned polysaccharides have been made
up of uncharged polymer units. A more reactive class of
polymers is made up from sugar derivatives. Less is
known of their detailed structure, because the high
charge density in the polymer makes the application of
physicochemical methods much more difficult, and
because the modifying groups are split from the
polymerizing unit by many of the same reagents that are
used to degrade the polysaccharide. Among the more
important materials of this type, one finds hyaluronic
acid, heparin, and chondroitin sulfate. These highly
reactive polysaccharides are found in various animal
tissues, and the structures now thought most likely are
recorded in Fig. 15. None of these structures has been
definitely established at the present time. Other
polysaccharides of this general type are to be found in
bacterial systems, but none of these has structures
that have been completely elucidated.

FIG. 16. Representation of a segment of a glycogen or amylopectin macromolecule.

In the glycoproteins and mucoproteins, one finds
large amounts of carbohydrate combined with protein.
Thus, the α1-acid glycoprotein, a homogeneous
crystallized material, contains about 42% of carbohydrate
with the following composition (expressed as moles
per mole of glycoprotein, molecular weight 45 000):
N-acetyl-D-glucosamine, 32; N-acetyl-neuraminic acid,
16; D-galactose, 18; D-mannose, 18; L-fucose, 2. Recent
studies by Eylar indicate that most of this carbohydrate
[...] can be removed in association with a small
[...] that the smaller carbohydrate moiety of serum
[...] globulin could be removed in association with a
[...] containing glutamyl, aspartyl, and
[...] residues. These interesting results indicate that
[...] linkages exist between the carbohydrate
residues and certain of the aforementioned amino-acid
residues.
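
The molar composition quoted for the acid glycoprotein can be turned into a weight fraction as a consistency check on the "about 42%" figure. The sketch below is a rough illustration, not the original authors' calculation: the residue masses are standard values for the anhydro sugar units (free sugar minus one water), which is an assumption about how the sugars are bound, and the names used are hypothetical.

```python
# Anhydro-residue masses in g/mol (free sugar minus one H2O); these are
# assumed standard values, not numbers taken from the article.
RESIDUE_MASS = {
    "N-acetyl-D-glucosamine": 203.19,
    "N-acetylneuraminic acid": 291.26,
    "D-galactose": 162.14,
    "D-mannose": 162.14,
    "L-fucose": 146.14,
}

# Moles of each sugar per mole of glycoprotein, as quoted in the text.
MOLES_PER_MOLE = {
    "N-acetyl-D-glucosamine": 32,
    "N-acetylneuraminic acid": 16,
    "D-galactose": 18,
    "D-mannose": 18,
    "L-fucose": 2,
}

GLYCOPROTEIN_MW = 45_000  # molecular weight quoted in the text


def carbohydrate_fraction(moles, masses, molecular_weight):
    """Weight fraction of carbohydrate implied by a molar composition."""
    carbohydrate_mass = sum(moles[name] * masses[name] for name in moles)
    return carbohydrate_mass / molecular_weight


if __name__ == "__main__":
    fraction = carbohydrate_fraction(MOLES_PER_MOLE, RESIDUE_MASS, GLYCOPROTEIN_MW)
    print(f"carbohydrate weight fraction: {fraction:.1%}")
```

Under these assumptions the composition accounts for a little under 40% of the molecular weight, the same order as the quoted 42%; the remaining difference is within what rounding of the molar ratios and the choice of residue masses could produce.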


LIPIDS 


The classification “lipids” is taken to embrace the
“fatty acids,” all actual or potential esters of fatty
acids, and often includes other materials soluble in
“fat solvents,” such as triterpenes, carotenoids, and
fat-soluble vitamins. This large field is only touched
upon here, since a recent short review by Lovern, as
well as the comprehensive reference book by Deuel,
provide adequate background material. The simple
fundamental lipid structures are often found to occur
in complex structures of intermediate molecular weight,
[but no] very high molecular-weight lipids have
been discovered. On the other hand, these complex
lipid molecules are often found in association with
polysaccharides and/or proteins to form high
molecular-weight materials.

Although complex lipids often contain no charged
[...] esters, etc.), there are lipid structures containing
primary amine, tertiary amines, phosphoric-acid groups,
phenolic hydroxyl groups, etc. It can easily be imagined
that the lipids owe much of their importance in
biological systems to the fact that they represent classes
of materials which can bridge the gap from
water-soluble to water-insoluble phases, without necessitating
a sharp discontinuity. They are often thought to be
concentrated in biological membranes and interfaces.
Lipids also provide the most compact form of storage
of chemical energy.

[Fig. 17 diagram: structural formulas of (a) a triglyceride,
(b) phosphatidyl serine, (c) a plasmalogen, (d) sphingomyelin,
(e) cholesterol, (f) estrone, and (g) β-carotene; drawings not
recoverable here.]

FIG. 17. Some selected lipid structures. The pKa shown indicates roughly the acidity of the charged groups.

The term “fatty acid” is not easy to define accurately,
but is usually taken to include all of the straight-chain
members of the acetic-acid series of carboxylic acids,
and a number of naturally occurring unsaturated
straight-chain acids (e.g., oleic, linoleic, arachidonic
acids, etc.). The esters of these fatty acids include
compounds with glycerol to form triglycerides, with
α-glycerylphosphoric acid to form phosphatidic acids,
with α-glyceryl phosphoryl esters to form lecithins,
phosphatidyl ethanolamines, and phosphatidyl serines;
with sphingosine to form sphingomyelin and
cerebrosides; and with inositol N-phosphate to form
phosphoinositides; etc. A few typical lipid structures
involving certain of the lipid residues mentioned above
are shown in Fig. 17. In (a) a typical triglyceride is
shown, with arachidonic acid (4 double bonds), stearic
acid (saturated), and linoleic acid (2 double bonds)
joined to glycerol. The three fatty-acid residues are
shown in their extended conformation, with the
stearic-acid residue above the other two. The double bonds in
the unsaturated fatty-acid residues are usually found
in the cis-configuration. Phosphatidyl serine (b)
represents a typical phospholipid, and is shown with one
residue of linolenic acid and one of palmitic acid
(saturated), joined through a glycerol residue to
phosphoserine. Three ionizable groups are seen here: the strongly
acidic phosphoric-acid residue (pKa about 1), the weakly
acidic carboxyl of serine (pKa about 4.5), and the weakly
basic amino residue of serine (pKa about 10). In other
phospholipids the serine residue may be replaced by
ethanolamine (HO-CH2-CH2-NH3+) or by choline
[HO-CH2-CH2-N+(CH3)3]. In such phospholipids,
no pKa near 4.5 would be expected, and the basic group
would be found to have a pKa value of about 10 in the
case of ethanolamine and above 14 with choline.
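
The approximate pKa values given above are enough to estimate the net charge carried by each phospholipid head group at a given pH. The Python sketch below applies the Henderson-Hasselbalch relation to each group independently; this independence, the treatment of choline as a permanently charged group (its pKa is given only as "above 14"), and all function and variable names are assumptions made for illustration.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch: fraction of a group present in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def net_charge(pH, acidic_pKas=(), basic_pKas=(), fixed_charge=0.0):
    """Estimate net charge from independent ionizable groups.

    Acidic groups contribute -1 when deprotonated; basic groups contribute
    +1 when protonated; fixed_charge covers permanently charged groups.
    """
    charge = fixed_charge
    for pKa in acidic_pKas:
        charge -= fraction_deprotonated(pH, pKa)
    for pKa in basic_pKas:
        charge += 1.0 - fraction_deprotonated(pH, pKa)
    return charge


if __name__ == "__main__":
    pH = 7.0
    # pKa values as quoted in the text for each head group.
    serine = net_charge(pH, acidic_pKas=(1.0, 4.5), basic_pKas=(10.0,))
    ethanolamine = net_charge(pH, acidic_pKas=(1.0,), basic_pKas=(10.0,))
    choline = net_charge(pH, acidic_pKas=(1.0,), fixed_charge=1.0)
    print(f"phosphatidyl serine:       {serine:+.2f}")
    print(f"phosphatidyl ethanolamine: {ethanolamine:+.2f}")
    print(f"lecithin (choline):        {choline:+.2f}")
```

Near neutral pH this estimate gives phosphatidyl serine a net charge close to -1, while the ethanolamine and choline phospholipids come out close to electrically neutral, since the phosphate and amine charges cancel.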
x ; ; 
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The plasmalogens [Fig. 17(c)] are glycerophosp 7 
tides containing an active group (shown here as the 
—OH on carbon-1 of the stearic aldehyde residue) 
which reacts as an aldehyde group. The structure shown 
is one of several which have been proposed, and illus- 
trates stearic-aldehyde and oleic-acid (1 double bond) 
residues joined to glycerol, which in turn is linked to 
phosphoethanolamine. This structure contains two 1on- 
izable residues—the phosphoric-acid group (pKa about 
1) and the amine group (pKa about 10.5). Palmitic 
aldehyde sometimes replaces stearic aldehyde, and 
choline often replaces ethanolamine in typical plas- 
malogens. Other common types of lipids contain 
the lipid base sphingosine, CH3-(CH2)12-CH=CH 
-CH(OH)-CH(NH2)-CH2OH. Illustrated here is a 
typical sphingomyelin [Fig. 17(d)] with a linolenic-acid 
residue joined through an amide linkage to the -NH— 
of sphingosine, and phosphocholine joined through an 
ester linkage to the terminal hydroxyl of sphingosine. 
This molecule would have ionizable groups in the phos- 
phoric-acid and the choline residues (pKa about 1 and 
above 14). The cerebrosides represent another typical 
sphingolipid type (not illustrated). Their structure is 
similar to sphingomyelin, but in place of phosphocholine 
a sugar residue, often D-galactose (as a pyranose ring), 
is joined to the terminal hydroxyl of sphingosine 
through a glycoside linkage. Such a cerebroside would 
not contain ionizable groups, but the many hydroxyl 
groups of galactose would serve as active sites for 
reaction with other molecules. 

Cholesterol [Fig. 17(e)] is frequently found in 
animal tissues, either as the free alcohol (as illustrated 
here) or as the ester, in which the cholesterol hydroxyl 
is linked to a fatty acid. Neither cholesterol nor choles- 
terol esters contain ionizable groups, and the solubility 
properties of these molecules resemble those of the 
triglycerides. Estrone [Fig. 17(f)], a typical female sex 
hormone, has a ring structure much like cholesterol ex- 
cept that the first (A) ring contains three double bonds, 
the second ring is saturated, and the C-18 methyl is 
lacking. Estrone contains a ketonic oxygen at C-17, and 
a phenolic hydroxyl residue (pKa about 9) at C-3. 

β-Carotene [Fig. 17(g)] is a highly unsaturated 
hydrocarbon, occurring in many plants and green leaves, 
and also in animal systems. It contains a highly con- 
jugated system of eleven double bonds, responsible for 
the intense color and the reactivity of the hydrocarbon. 
An alcohol (vitamin A) and an aldehyde (vitamin A 
aldehyde) are of great importance in visual pigments 
and contain a hydrocarbon chain similar to one-half 
β-carotene. Two closely related hydrocarbons, α- 
and […]-carotene, […]. The double bonds in the 
carotenoids usually are […] 
in the trans-configuration (in contrast to the situa- 
tion in the fatty-acid residues). The carotenoid hydro- 
carbons, along with other terpene-type compounds, 
are built up from repeating isoprene units 
[CH2=C(CH3)-CH=CH2]. Another hydrocarbon, 
squalene, [CH3-…=CH-CH2-CH2-…-C(CH3)=CH-CH2-…], is found 
in large amounts in shark-liver oil, and also is an inter- 
mediate for the synthesis of cholesterol in mammalian […]. 

The mode of attachment of these lipid structures to 
the peptide moiety of a lipoprotein is poorly under- 
stood. Denaturation of the protein and extraction 
with lipid solvents is usually sufficient to completely 
break the lipoprotein complex, so that it would seem 
that, if any covalent linkages are present, they must 
be very labile. A number of lipo-polysaccharides are 
known, and lipids are found to be firmly linked with 
some of the glycoproteins. It may be that the carbo- 
hydrate moiety of such complex macromolecules 
provides a covalent attachment for the lipid molecules. 
Much of the lipid material now thought to occur in cells 
as “free lipid” will, in the near future, be found to 
have intimate molecular binding to protein, glyco- 
protein, or carbohydrate cellular components. 

Molecular Configuration of Synthetic and 
Biological Polymers 

It is possible to make a rough division in biological 
systems between large molecules and small mole- 
cules. They may be regarded as having quite different 
roles. Small molecules are actively involved in metabolic 
cycles, and they provide most of the energy needed for 
the activity of a biological system. The large molecules 
often have quite a different role in that their configura- 
tion (the three-dimensional organization of the mole- 
cule) has a particularly important part in determining 
the function of the molecule. Indeed, on the molecular 
level, structure and function are closely related. 
Almost all large molecules that are found in biological 
systems are polymers; that is, they are formed by the 
linear association of certain fundamental repeating units. 
This review discusses some of the general structural 
features of these polymeric molecules, starting with 
some synthetic polymers which illustrate certain fea- 
tures of structural organization, and then continuing 
on to a group of more biologically significant synthetic 
and naturally occurring polymers. 

In recent years, there has been an increasing aware- 
ness of the widespread occurrence of helical polymer 
molecules, both from natural and synthetic sources. 
This is not surprising, since the helix is the most general 
configuration for a regular polymer. The position of ad- 
jacent residues along a helix can be specified by a trans- 
lation distance and an angular rotation. The most 
familiar example of a helix is the twofold screw rotation 
in which adjoining residues are related by an angular 
rotation of 180°; however, this is a specialized con- 
figuration and is discussed in the following. 

Most molecular structure determinations are the re- 
sults of x-ray diffraction studies. For diffraction work 
on polymers, a very useful expression has been obtained 
by Cochran et al., who derived the Fourier transform 
for helical molecules. Their analysis utilizes the helical 
symmetry in the molecule and expresses the Fourier 
transform in terms of Bessel-function contributions to 
various layer lines. In this way, the position and in- 
tensity of the x-ray scattering by helical molecules can 
be computed relatively directly. Their analysis has been 
programed into computing machines in order to facili- 
tate […]. 

[…] understanding of the factors which determine 
[…] of polymers can be gained from studies 
of synthetic polymers. It is illuminating to 
compare the configuration of polyethylene -(CH2)n- 
with that of polytetrafluoroethylene -(CF2)n- (known 
commercially as Teflon). The polyethylene molecule 
[Fig. 1(a)] consists of a series of CH2 groups organized 
in a zigzag chain in which all of the carbon atoms lie in 
one plane. The distance between alternate CH2 groups 
is 2.54 A. If, however, the hydrogen atoms are replaced 
by fluorine atoms, there is a change in configuration as 
is shown in Fig. 1(b). Instead of lying in a planar 
zigzag, the carbon atoms in the backbone are now 


FIG. 1. (a) Planar zigzag chain formed by polyethylene (side 
and end views). The hydrogen atoms are not in van der Waals 
contact so that the carbon backbone remains fully extended. 
(b) Helical-chain configuration formed by the fluorocarbon mole- 
cule (side and end views). The bulkier fluorine atoms are in van 
der Waals contact and force the backbone chain of carbon 
atoms into a helical configuration [from C. W. Bunn and E. R. 
Howells, Nature 174, 549 (1954)]. 

twisted. In the hydrocarbon chain, adjacent residues 
are related by a rotation of 180°; in the fluorocarbon 
chain, this angle has changed to 166°, and the molecule 
has a helical appearance. The fluorocarbon chain as- 
sumes this form as a result of the steric restraints im- 
posed on it by the bulkier fluorine atoms. The hydrogen 
atoms with a van der Waals radius of 1.1 A are so small 
that they are not in contact in polyethylene. However, 


FIG. 2. Configuration of the polyethylene-terephthalate mole- 
cule. The unsaturated benzene groups lie nearly in the same plane 
so that the molecule has a ribbon-like configuration [from R. P. 
Daubeny, C. W. Bunn, and L. Brown, Proc. Roy. Soc. (London) 
A226, 531 (1954)]. 


the fluorine atoms with a van der Waals radius of 1.35 A 
are too bulky to lie in a planar zigzag, and a helical 
twist is introduced into the carbon backbone in order 
to fit each fluorine atom into the groove between the 
two fluorine atoms on adjacent residues. This fairly 
small structural change brings about a considerable 
alteration in the properties of the two materials. Looking 
down the axis of the fluorocarbon molecule [Fig. 1(b)], 
one can see that the polymer is rounded in cross section. 
This rounded contour is responsible for the first-order 
phase transition which occurs near 20°C, and is un- 
doubtedly owing to the rotation of these molecules in 
the crystalline state about their molecular axes. The 
hydrocarbon molecule, with its flattened cross section, 
does not have a phase change of this sort. This is an 
example of the effect of atomic size on molecular con- 
figuration and chemical properties. 
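
The description of a helix by a translation distance and an angular rotation per residue can be made concrete with a short numerical sketch. The Python below is illustrative only: the function names and the unit helix radius are assumptions introduced here and are not part of the original analysis. It uses the 180° rotation quoted for polyethylene and the 166° rotation quoted for the fluorocarbon chain; the rise per CH2 group is taken as half of the 2.54 A distance between alternate groups, and the same rise is reused for the fluorocarbon chain purely for illustration, since the text gives no value for it.

```python
import math


def residues_per_turn(rotation_deg):
    """Number of residues needed for one full 360-degree turn of the helix."""
    return 360.0 / rotation_deg


def helix_coordinates(n_residues, rise, rotation_deg, radius=1.0):
    """Cartesian positions of equivalent atoms on a regular helix.

    rise         -- translation along the axis per residue
    rotation_deg -- angular rotation about the axis per residue
    radius       -- distance of the atom from the helix axis (assumed value)
    """
    coords = []
    for i in range(n_residues):
        angle = math.radians(i * rotation_deg)
        coords.append((radius * math.cos(angle),
                       radius * math.sin(angle),
                       i * rise))
    return coords


# Polyethylene: rotation of 180 degrees, rise of 2.54 / 2 A per CH2 group.
polyethylene = helix_coordinates(4, rise=2.54 / 2, rotation_deg=180.0)

# Fluorocarbon chain: rotation of 166 degrees. The rise is NOT given in the
# text; the polyethylene value is reused here only as a placeholder.
fluorocarbon = helix_coordinates(4, rise=2.54 / 2, rotation_deg=166.0)

print(residues_per_turn(180.0))            # twofold screw
print(round(residues_per_turn(166.0), 2))  # slightly more than two per turn
for x, y, z in fluorocarbon:
    print(f"{x:7.3f} {y:7.3f} {z:6.2f}")
```

With a rotation of exactly 180°, every second atom returns to the same position in projection, which is the planar zigzag; any other angle makes successive atoms drift around the axis, which is the twisted chain described above.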

Another feature of interest is illustrated by the poly- 
mer molecule polyethylene terephthalate (known com- 
mercially as Dacron or Terylene). The repeating mo- 
lecular unit contains a benzene ring as shown in Fig. 2. 
Succeeding residues lie almost in the same plane, hence 
the molecule has a ribbon-like configuration. Figure 3 
illustrates how these molecules are packed in the crys- 
tal. The flat, unsaturated rings lie largely on top of each 
other. In this orientation, they are stabilized consider- 
ably by the van der Waals interaction which is large for 
the highly polarizable π-electron system of the ring. 
This stabilization undoubtedly contributes to the high 
melting point (266°C) observed in the fiber. This type 
of interaction occurs in biological polymers and is most 
frequent in the nucleic acids where the large unsatu- 
rated ring systems usually pile on top of each other. 
These are discussed more thoroughly in the chapter 
devoted to the nucleic acids (p. 191). 

When polymers contain electronegative atoms such 
as nitrogen and oxygen, there is often an opportunity 
for hydrogen bonds to be important in determining the 
configuration of the molecule. An example of this type 
of polymer is seen in polyhexamethylene adipamide, 
known as 66 nylon. The structure of the α-crystal illus- 
trated in Fig. 4 is most clearly understood by noting 
that the molecules form a sheet structure in which 
adjacent units on the “a” face of the unit cell are held 
together by the horizontal hydrogen bonds which are 
shown as dotted lines. This polymer is of particular 
interest, of course, because its residues are joined by 
peptide bonds (-NH-CO-) as are the amino-acid 
residues in a polypeptide or in a protein. 

Hydrogen bonds represent fairly weak or secondary 
forces between atoms. Nonetheless, they are often of 
considerable importance in determining the configura- 
tion of polymeric systems, because the total number of 
bonds involved is of the same order of magnitude as 
the number of monomer units in the polymer molecule. 
This is a large number so that the total stabiliz- 
ing energy per covalently-linked molecule may become 
considerable. 

FIG. 3. The arrangement of polyethylene-terephthalate mole- 
cules in the crystal. (a) View perpendicular to the fiber axis. 
(b) View down the fiber axis. The larger dots represent carbon; 
smaller dots, hydrogen; open circles, oxygen [from R. P. Daubeny, 
C. W. Bunn, and L. Brown, Proc. Roy. Soc. (London) A226, 
531 (1954)]. 

FIG. 4. The molecular structure of the α-crystal of 66 nylon. 
The molecules are hydrogen-bonded into sheets in the plane of 
the page [from C. W. Bunn and E. V. Garner, Proc. Roy. Soc. 
(London) A189, 39 (1947)]. 


BIOLOGICAL POLYMERS 


In all of the polymer structures described in the fore- 
going, there is one regular repeating unit and the con- 
figuration of the molecule is the resultant of systematic 
and repeating steric effects, van der Waals stabilization, 
electrostatic forces, and hydrogen bonding. The bio- 
logical polymers, however, have an additional element 
of subtlety. In the proteins, the peptide backbone is the 
repeating unit, while the sugar phosphate groups form 
the one in the nucleic acids. In addition to these 
repeating units, however, there is also a variety of side 
groups attached periodically in some definite sequence 
to the polymer backbone. Hence, in addition to the 
systematic, repeating interactions described in the fore- 
going, the configuration of the molecule must be de- 
pendent also upon the nature of these side groups and 
the sequence in which they occur. 

In discussing the molecular configuration of biological 
polymers, one usually describes the results of structural 
investigations carried out in the solid state. This de- 
scription is useful, since the solid-state structure often 
is retained in solution or in the living cell. 

The biological polymers may be listed as follows: 

(a) The nucleic-acid polymers. These are discussed 
at length in another chapter (Rich, p. 191). However, 
it should be mentioned that, in this polymer, each re- 
peating residue has a negative charge located on the 
phosphate group. Thus, the configuration of a nucleic- 
acid molecule represents a compromise in which the 
internally repelling electrostatic forces are balanced by 
the cohesive forces in the molecule. 

(b) The carbohydrate polymers. Only a few such as 
cellulose and chitin have a regular configuration with 
the polymer chains organized systematically. The ab- 
sence of a highly ordered configuration in most carbo- 
hydrate polymers is probably an indication that their 
biological function does not require any elaborate 
three-dimensional organization. Instead, it may depend 
on the stereochemistry of the sugar-to-sugar linkage. For 
example, the amorphous particles of glycogen stored in 
the cell are apparently just a convenient depot from 
which glucose residues can be removed by enzymatic 
action when they are needed. These structures are not 
discussed here. 


(c) Proteins. Perhaps the most varied biological poly- 
mer is the polyamino-acid chain from which proteins 
are built. The repeating chemical unit is a single peptide 
bond with one α-carbon atom between each bond in 
contrast to the six carbon atoms between peptide bonds 
in 66 nylon. In the proteins, however, about twenty 
different types of side chains can be attached to the 
single α-carbon atom. The nature of these side chains 
has a profound effect in determining molecular con- 
figuration (see following). 

(d) Lipids. There are no lipid polymers, even though 
there are several lipid molecules with extremely long 
side chains which are almost polymeric in nature. 

It is interesting to note that there are apparently no 
mixed polymers in biological systems. Thus, one never 
finds a polymer chain involving both nucleotides and 
amino acids or sugar residues. If such mixed polymers 
existed, they might be unable to interact systematically 
through their backbones, and hence would lack an 
important element which gives rise to configurational 
stability. 


POLYAMINO ACIDS 


Present knowledge of the structure of the simpler 
polypeptides and proteins has its foundation in the ex- 
tensive work of Pauling and Corey and their collabo- 
rators who worked out the crystal structures of a number 
of amino acids and small polypeptides. From their work, 
several important conclusions can be drawn. 

(1) The dimensions of bond lengths and bond angles 
of the peptide group were obtained to an accuracy of 
±0.02 A and ±2°. They found a similarity in the 
dimensions of the peptide group regardless of the par- 
ticular amino acids involved in the linkage. These are 
shown in Fig. 5. The carbon-oxygen distance in the 
carbonyl group was found to be 1.24 A, even though 
the sum of the double-bond covalent radii is 1.21 A. In 
a similar way, the carbon-nitrogen bond distance in the 
amide group is 1.32 A, which is less than the sum of the 
single-bond radii (1.47 A). This points to approximately 
40% double-bond character in the C-N bond, and 
60% double-bond character in the carbonyl group owing 
to resonance between the two. 

(2) As a direct consequence of the double-bond 
character in the C—N bond, the amide group is ex- 
pected to have a planar configuration. Experimental 
results amply confirm this expectation. It has been 

FIG. 5. Fundamental dimensions of polypeptide chains as 
derived from x-ray analyses of amino acids and simple peptides 
[from R. B. Corey and L. Pauling, Rend. ist. lombardo sci. 89, 10 
(1955)]. 


estimated that the distortion energy is about 1 kcal/ 
mole for a deviation of 10° from planarity and about 
3.5 kcal/mole for a 20° distortion.6,7 In all of the peptide 
structures determined, the peptide group is found planar 
to within a few degrees. 

(3) All of the small peptides now examined show a 
trans-configuration for the amide group. This is shown 
in Fig. 5 where the α-carbon atoms are attached to the 
planar amide group on opposite sides of the C—N bond. 
This conclusion from crystal-structure work is also sup- 
ported by dipole moment studies. Using infrared ab- 
sorption, Badger and Rubalcava have shown that the 
trans-configuration is more stable than the cis- by more 
than 2 kcal/mole. 

(4) Another feature of the peptide group is the fact 
that it has a hydrogen atom attached to the nitrogen 
and is, therefore, capable of forming a hydrogen bond. 
All of the crystal structures analyzed were characterized 
by forming a maximum number of N-H···O hydrogen 
bonds. Most of the hydrogen bonds are linear to within 
a few degrees and have a length of 2.79±0.12 A.6,7 
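
The energy differences quoted in points (2) and (3) can be turned into relative populations with a Boltzmann factor. The sketch below is not from the original text: the temperature of 298.15 K and the simple two-state treatment are assumptions made here for illustration, and the 2 kcal/mole figure is used as a lower bound on the energy difference between the trans and cis forms.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mole K)


def boltzmann_ratio(delta_e_kcal, temperature_k=298.15):
    """Population of a state lying delta_e above a reference state,
    relative to the population of that reference state."""
    return math.exp(-delta_e_kcal / (R_KCAL * temperature_k))


def cis_fraction(delta_e_kcal, temperature_k=298.15):
    """Fraction of amide groups in the cis form in a two-state picture
    (an assumption of this sketch, not a result from the text)."""
    ratio = boltzmann_ratio(delta_e_kcal, temperature_k)
    return ratio / (1.0 + ratio)


# Point (3): trans more stable than cis by more than 2 kcal/mole,
# so this value is an upper bound on the cis fraction.
print(f"cis fraction below about {cis_fraction(2.0):.3f}")

# Point (2): relative weight of a 10 and a 20 degree departure from planarity.
for angle_deg, energy_kcal in ((10, 1.0), (20, 3.5)):
    print(angle_deg, round(boltzmann_ratio(energy_kcal), 4))
```

On these assumptions only a few percent of amide groups at most would be cis at room temperature, and a 20° departure from planarity is far less probable than a 10° one, which is in line with the observation that the peptide group is planar to within a few degrees.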

Thus, the polypeptide backbone consists of a series 
of fairly rigid planar groups which may or may not be 
able to rotate relative to each other, depending upon 
the nature of the side groups attached to the α-carbon 
atom. This rotational barrier, which may be as large as 
3 kcal/mole, is an additional important factor in de- 
termining the ultimate configuration of the molecules. 

Pauling and Corey first pointed out that a relatively 
small number of configurations will meet all of the four 
criteria described above. Thus, one can expect a fairly 
small number of stable polypeptide configurations. The 
following paragraphs review several of the more im- 
portant ones. 
FIG. 6. A planar sheet composed of fully-extended polyglycine 
chains. The atoms are represented as solid out to their van der 
Waals radii. The hydrogen atoms attached to nitrogen have a 
depression in them for use in illustrating hydrogen bonding with 
carbonyl groups (>C=O). (a) Single extended polypeptide chain. 
(b) Antiparallel arrangement of polyglycine chains to form a 
sheet. Note that the hydrogen atoms in adjoining chains are in 
van der Waals contact. This would prevent the formation of this 
sheet from polypeptides other than polyglycine. 


Extended Polypeptide Configurations 


The simplest configuration of a polypeptide chain is 
one in which the polypeptide chain is extended so that 
all of the planar peptide units tend to lie in a common 
plane. In this extended form, alternate amino-acid resi- 
dues have carbonyl groups pointing in the same direc- 
tion, and adjacent residues are related by a translation 
of 3.6 A and a rotation of 180°. The usual extended form 
of polyglycine has a configuration which is close to that 
described in the foregoing. When fully extended, the 
polyglycine chain should have a repeat distance of 7.2 A. 
However, the fiber-axis repeat distance, obtained experi- 
mentally, is somewhat less; hence, the chains are 
almost but not fully extended. An extended configura- 
tion is shown in Fig. 6, a photograph of atom models 
that are designed to show all of the van der Waals radii 
for the various atoms. In Fig. 6, a single extended chain 
of polyglycine is shown also. This structure is helical in 
the same sense that polyethylene [Fig. 1(a)] is helical, 
with a rotation angle of 180°. Chains of polyglycine can 
be hydrogen bonded to make a sheet structure, as shown 
in Fig. 6(b), where the chains alternate in direction. 
[…] a configuration in which the successive planar 
peptide residues are almost in the same plane could 
[…] polyglycine only and not in any other polypep- 
tide, because steric interference would result in the 
presence of a bulkier side chain. This can be seen most 
clearly by noting that the hydrogen atoms in the sheet 
structure are in contact. Thus, as illustrated in Fig. 6, 
it would be impossible to have a larger β-carbon atom 
attached in the position of the hydrogen atoms. 

It is possible, however, to make a sheet structure 
utilizing hydrogen bonds in the plane of the sheet by 
rotating the peptide units until the β-carbon atoms are 
no longer in contact as they are in the polyglycine 
configuration of Fig. 6. This configuration is important 
in the structure of silk fibroin. 


Structure of Silk Fibroin 


There is a rotational barrier between adjacent planar 
peptide units in the polypeptide chain, and, as a con- 
sequence, only a limited number of configurations are 
stable. One of the earliest to be investigated critically 
by Pauling and Corey was a semi-extended form in 
which adjacent planar peptide groups were related to 
each other by a translation of approximately 3.5 A and 
a rotation of 180°. A drawing of the configuration, 
looking down the plane of the peptide units, is illustrated 
in Fig. 7(a). In this configuration, the β-carbon atoms 
emerge from alternate sides of the polypeptide back- 
bone, and the backbone itself has a zigzag rather than 
a fully extended form. The most significant feature of 
this configuration is the orientation of the C=O and 
N-H groups which lie in one plane and are perpen- 
dicular to Fig. 7(a). These groups are oriented so that 
all hydrogen bonds can be formed by aligning a series 
of these polypeptide chains side by side. This has been 
done in Fig. 7(b) to form what is called the “antiparallel 
chain pleated sheet” structure. Alternate chains run in 
opposite directions, and in this way the successive amino 
and carbonyl groups are hydrogen-bonded to chains on 
either side of the original chain. By working carefully 
with molecular models, it is possible to show that the 
identity period along the fiber axis is 7.00 A, whereas 
the lateral distance between equivalent chains (alternate 
chains) is 9.50 A. 

There are several interesting features in the anti- 
parallel chain pleated sheet structure. The β-carbon 
atoms protrude at right angles from the plane of the 
sheet on alternate sides for successive residues along 
one chain. Furthermore, the β-carbon atoms are in phase 
on either side of the sheet; that is, they all appear at 
the same points along the fiber axis. Thus, there are 
large grooves between adjacent rows of β-carbon atoms. 

The common form of silk fibroin is obtained from the 
silkworm Bombyx mori, and it has been the subject of 
x-ray diffraction investigations for many years. Most 
recently, Marsh et al. have carried out an extensive 
and careful investigation and have arrived at a structure 
for this form of silk fibroin based on the antiparallel 
chain pleated sheet structure. Although the diffraction 
pattern is not simple, most of the reflections can be 

FIG. 7. (a) A drawing of a single polypeptide chain in the configuration found in the antiparallel chain pleated sheet. The chain 
is viewed in the direction of the hydrogen bonds and, therefore, along the plane of the sheet. Note the zigzag arrangement produced by 
alternate planar peptide residues. (b) A drawing of the antiparallel chain pleated sheet. In a given polypeptide chain, alternate β- 
carbon atoms are found on the same side of the pleated sheet [from R. E. Marsh, R. B. Corey, and L. Pauling, Biochim. et Biophys. 
Acta 16, 1 (1955)]. 


accounted for by a simple unit cell which is orthogonal 
with dimensions 9.20 and 9.40 A at right angles to the 
fiber and a fiber-axis repeat of 6.97 A. The similarity be- 
tween two of these dimensions observed experimentally 
and the pleated-sheet dimensions (7.00 A along the 
polypeptide chain and 9.50 A between equivalent alter- 
nate chains) suggested to these investigators that the 
antiparallel chain pleated sheet structure might serve 
as the structural basis of silk fibroin. Experiments 
carried out with rolled and flattened samples of silk 
fibroin produce the double orientation usually associated 
with layered structures. In this way, they were able to 
show that the 9.20-A reflection represented the repeat 
distance between molecular layers in silk fibroin. To 
deduce the structure, they had to arrange the layers of 
the sheet structure such that an identity repeat of 
9.2 A was obtained. 

One of the most remarkable features of silk fibroin 
is its unusual chemical composition. The simplest of 
the amino acids, glycine, constitutes almost one-half 
(44%) of the total number of amino acids present, while 
the next simplest amino acids (alanine and serine) com- 
prise almost 40% of the residues. This suggested to the 
investigators that a simplified model of the structure 
might contain glycine residues with the alanine and 
serine residues in between at alternate sites along the 
polypeptide chain. In this way, all of the side groups 
protruding from one side of the sheet are from glycine 
residues and, therefore, consist simply of hydrogen 
atoms. The other alternate sites are occupied by other 
side chains, mainly alanine which has one methyl group 
or serine which has CH2OH as a side chain. When the 


FIG. 8. A packing drawing of the structure of silk fibroin 
(Bombyx mori) viewed parallel to the pleated sheets. The fiber axis 
is vertical in this view [from R. E. Marsh, R. B. Corey, and L. 
Pauling, Biochim. et Biophys. Acta 16, 1 (1955)]. 

FIG. 9. A packing drawing of the structure of silk fibroin 
(Bombyx mori) viewed along the fiber axis. The hydrogen-bonding 
sheets are vertical in this view [from R. E. Marsh, R. B. Corey, 
and L. Pauling, Biochim. et Biophys. Acta 16, 1 (1955)]. 


side chains are distributed in this manner, it is possible 
to pack pairs of sheets together so that the bulky side 
chains are located between the two sheets. In this way, 
the β-carbon atoms of one sheet protrude into the space 
between the β-carbon atoms from the other sheet. The 
distance from one sheet to the next is 5.7 A when they 
are packed in this way. However, when these double- 
sheet structures are stacked together, the glycine side 
chains are in contact and the distance from sheet to 
sheet over the glycine contacts is 3.5 A. Hence, the 
identity repeat for such a head-to-head, tail-to-tail 
stacking of sheets is 5.7+3.5=9.2 A. This is, of course, 
exactly what is observed in the simple unit cell derived 
from the x-ray diffraction study of silk fibroin. 
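
The stacking argument amounts to a small piece of arithmetic, sketched here in Python. The constant and function names are hypothetical, introduced only for this illustration; the three distances are the ones quoted in the text.

```python
# Sheet-to-sheet spacings quoted in the text (in A).
SPACING_SIDE_CHAIN_FACE = 5.7  # alanine and serine side chains interleaved
SPACING_GLYCINE_FACE = 3.5     # glycine hydrogen atoms in contact
OBSERVED_LAYER_REPEAT = 9.20   # layer repeat from doubly oriented samples


def identity_repeat(spacings):
    """Repeat distance of a stack whose spacings recur in the given order."""
    return sum(spacings)


def sheet_positions(n_sheets, spacings):
    """Positions of successive sheets along the stacking direction."""
    positions = [0.0]
    for i in range(n_sheets - 1):
        positions.append(positions[-1] + spacings[i % len(spacings)])
    return positions


spacings = [SPACING_SIDE_CHAIN_FACE, SPACING_GLYCINE_FACE]
model_repeat = identity_repeat(spacings)

print(round(model_repeat, 2), OBSERVED_LAYER_REPEAT)
print([round(p, 1) for p in sheet_positions(5, spacings)])
```

Because the two kinds of contact alternate, equivalent sheets recur only at every second sheet, so the identity repeat is the sum of the two spacings and not either one alone.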

A packing drawing of the structure of silk fibroin is 
shown in Figs. 8 and 9. In Fig. 8, the view is perpen- 
dicular to the fiber axis and parallel to the pleated 
sheets. This view shows how the bulky methyl side 
chains are packed together with adjacent sheets filling 
alternate positions along the fiber axis, and how the 
small hydrogen side chains of the glycine residues fit 
together. Figure 9 is a view down the fiber axis. In this 
view, the hydrogen-bonding between the chains can be 
seen in the vertical direction, and the alternate packing 
arrangement of side chains is also clearly shown. 

The reflections calculated from a simplified structural 
model of this type agree very well with the intensities 
obtained from the x-ray diffraction study of silk fibroin. 
This model does not account for all of the amino acids 
found in silk, which include many with bulky side chains. 
These act to perturb this simplified model and produce 
some of the minor complications present in the diffrac- 
tion pattern. Most of the properties of silk, however, 
can be understood from the simplified unit of structure. 


Coiled Helical Configurations 


The extended form of polyglycine shown in Fig. 6 is 
called polyglycine I and was the first form studied 
crystallographically by x-ray methods. In 1955, Bam- 
ford and his colleagues described a second form of 
polyglycine which they called polyglycine II. The struc- 


FIG. 10. The crystal structure of 
polyglycine II as viewed along the 
axis of the helical chains. The 
dashed lines represent hydrogen 
bonds [from F. H. C. Crick and 
A. Rich, Nature 176, 780 (1955)]. 

FIG. 11. (a) Molecular model of a single polyglycine chain arranged in the helical form found in polyglycine II. There are three 
peptide residues per turn in this form. (b) Three chains hydrogen bonded together to make a planar part of the polyglycine-II lattice. 
(c) Three chains packed in triangular form as found in the polyglycine-II lattice. The hydrogen bonds connecting these three chains 
are partly obscured. 


ture of polyglycine II has been worked out by Crick and 
Rich, and it is a particularly simple one. In it, all of 
the polypeptide chains are parallel, and each one is 
organized into a threefold screw axis so that adjacent 
residues are related by a translation of 3.1 A and rota- 
tion of 120°. The chains are packed in an hexagonal 
array, each chain being hydrogen bonded to six neigh- 
bors. The hydrogen bonds lie roughly perpendicular to 
the screw axis, and run in several directions and not 
solely in one plane as in the polyglycine sheet structure 
described in the foregoing. Figure 10 is a view of this 
structure down the fiber axis. In it, seven polypeptide 
chains are hexagonally packed with dashed lines indi- 
cating the linear hydrogen bonds which hold the ad- 
joining chains together. The hydrogen-bond distance 
is 2.76 A, and the spacing between the polypeptide 
chains is 4.8 A. Since the translation along the axis 
is 3.1 A per residue, the axial repeat is 9.3 A because 
of the threefold screw symmetry. In this configura- 
tion, the polypeptide chains can fulfill all of the cri- 
teria developed by Pauling for stable configurations of 
polypeptide chains. It should be noted, however, that 
the polyglycine-II lattice could be formed only by a 
polymer of glycine. If bulkier side chains were present, 
the molecules would not be able to hydrogen bond, since 
the chains are already densely packed with only hydro- 
gen atoms attached to the α-carbon atoms. A side view 
of the polyglycine-II chain structure is seen in Fig. 11, 


using molecular models which show the atomic van der 
Waals distances. On the left [Fig. 11(a)] is a single 
polyglycine chain arranged in a threefold screw axis. 
In the center [Fig. 11(b)], three of these chains are 
hydrogen-bonded to make one plane through the poly- 
glycine-II lattice. It can be seen that only one-third of 
the hydrogen bonds is utilized in forming a sheet. The 
remaining two-thirds of the hydrogen bonds are used 
in holding the sheets together. Thus, in order to form 
all of the hydrogen bonds, a three-dimensional network 
must be built up in polyglycine II in contrast to the 
two-dimensional sheet structure seen in polyglycine I. 
In Fig. 11(c), the same three polyglycine chains are 
present as are in Fig. 11(b), but they are arranged so 
that they are no longer planar. One polyglycine chain 
is lying on top of the other two so that the three are
hydrogen bonded. In this form, they represent a group
of three polyglycine chains obtained by selecting a 
triangle of three close-packed chains (see Fig. 10). It 
should be pointed out that there are two ways of select- 
ing a group of these three polyglycine chains. This can 


be seen most readily by observing the directions of the
hydrogen bonds (N-H···O) which hold together the
group of three. In one group, they are clockwise and, in 
the other, they are counterclockwise. This difference in 
the two groups is a consequence of the space-group 
symmetry, since the chains are arranged about a threefold
screw axis and not about a sixfold screw axis.
London, 1957), p. 


Structure of Collagen 


Just as the structure of the antiparallel pleated sheet 
serves as a model for deriving the structure of silk 
fibroin, the polyglycine-II lattice can be used to derive
the structure of collagen. Collagen, found in the form
of elongate fibrils, is the major structural element of
skin and connective tissue in the animal kingdom. One
of its most unusual features is the chemical composition.
One-third of its residues is glycine, and it has,
in addition, a large number of pyrrolidine rings. Proline
plus hydroxyproline make up about 24% of the residues.
Because of the high glycine content, it is perhaps not
surprising that there is a relationship between the polyglycine-II
lattice and the collagen molecule. This relationship
can be seen most clearly in Fig. 12, which shows
a schematic development starting from the polyglycine-II
lattice and ending in the collagen molecule. In
Fig. 12(a), two parallel polyglycine chains are sketched
with the hydrogen bonds joining them (dotted lines). In
Fig. 12(b), only the α-carbon atoms of the glycine residues
are shown, with short solid lines schematically
representing the peptide units connecting them. It is seen
that these polypeptide chains coil around the two axes
and are held together by the hydrogen bonds (dashed
lines). The collagen molecule may be derived from three
polyglycine chains such as is shown in Figs. 12(c) and
(d). In Fig. 12(c), a third polypeptide chain is added
behind the two shown in Fig. 12(b), whereas in Fig.
12(d), the third chain is added in front of the original
two. In this way, two groups of three polypeptide chains
are generated which are prototypes of two models, collagen
Fig. 12. A schematic drawing which illustrates how the collagen molecule may be generated from the polyglycine-II lattice. (a) Two
molecular chains from polyglycine II. (b) The same chains showing only the α-carbon atoms (solid circles) and hydrogen bonds (dashed
lines). (c) A third chain is added behind the two chains of (b). This generates collagen I. (d) The third chain is added in front of the
two chains found in (b). This generates collagen II. (e) In the collagen molecule itself, the three molecular axes are coiled as shown
in the diagram. The numbers on (c) and (d) correspond to the three kinds of sites found on the collagen molecule. Site number 1 is
. C. Crick in Recent Advances in Gelatine and Glue Research (Pergamon Press, 


I and collagen II. These correspond to the two different 
ways of collecting groups of three molecules as shown 
in the end view of the polyglycine-II lattice (Fig. 10). 
In the collagen molecule, however, the axes around 
which the polypeptide chains coil are no longer straight 
as in Figs. 12(a)-12(d), but are coiled slightly as shown
in Fig. 12(e). Thus, the collagen molecule is regarded 
as a coiled-coil structure. 

In silk fibroin, the spacings between the sheets in the 
molecule are uneven; the closer spacing (3.5 Å) arises
because there are only the side chains of glycine between
the sheets. Similarly, in the collagen molecule, the three
polypeptide chains can come close together, because the
glycine residues are located near the center of the group
of three chains, whereas the other two-thirds of the
residues are projecting outward from the center of the 
molecule. Thus, two-thirds of the residues can have any 
side chains on them. And, what is more significant, they 
can accommodate pyrrolidine rings without producing 
any steric difficulties. 

It is generally agreed that the collagen-II model is 
the most likely form for the collagen molecule, and
in it pyrrolidine rings can be found on two of the ex- 
ternally situated sites, while the third internal site 
must have a glycine residue. This explains why all 
collagens are found with 33% glycine in their composition.
Figure 13 shows a form of the collagen-II molecule
where two-thirds of the sites are filled with pyrrolidine
rings such that a repeating sequence is used in each 
chain. In it, the sequence glycine-proline-hydroxyproline 
is used, since this is a common constituent of collagen
digests. This model of collagen was made using the polypeptide
chains of Fig. 11(c), except that the axes of the
polyglycine chains are now coiled rather than straight. 
As seen in Fig. 13, the collagen-molecule model is elon- 
gated, and has a distinct helical groove running down it 
which arises from the absence of a bulky side chain on 
the glycine site. The helical ridge on the surface of the 
molecule arises because the bulky side chains, in this 
case pyrrolidine rings, all lie closely packed. In the 
collagen molecule, however, there is a variety of amino
acids found in that ridge, including many which have 
polar side chains. These are undoubtedly important in 
stabilizing collagen by hydrogen bonding, as well as in 
forming electrostatic bonds from one molecule to its 
nearest neighbor. 

In the polyglycine-II lattice, the fundamental screw
operation used in generating the lattice is a translation
of 3.1 Å and a rotation of 120°. Because collagen is a
coiled-coil molecule formed from this lattice, the fundamental
screw operation in generating the collagen helix
is a translation of 2.86 Å and a rotation of 108°. Of
course, the asymmetric unit in the collagen molecule
now consists of a group of three amino acids in contrast
to the single amino acid in the polyglycine chain. In
collagen, the coiling of the three polypeptide chains
about one another is produced by the steric interaction
of the β-carbon atoms which are present in the collagen
molecule but which are absent in the polyglycine-II
lattice.


Fig. 13. A model of collagen II. The three polypeptide chains
of Fig. 11(c) have been modified by the addition of proline and
hydroxyproline residues to sites 2 and 3. This causes the three
chains to coil about as shown in Fig. 12(e).




Fig. 14. A drawing of the α-helix [from R. B. Corey and
L. Pauling, Rend. ist. lombardo sci. 89, 10 (1955)].


The α-Helix


An important feature of the structure of silk fibroin
is the fact that alternate residues are glycine units along
the polypeptide chain. This permits the sheets to pack
very closely. Similarly, in the collagen molecule, an important
feature is the fact that every third residue is
glycine, thereby allowing the three chains to come close
together so that they can hydrogen bond. In a sense,
both of these structures are very specialized and are a
consequence of a restricted amino-acid composition. The
best example of a polypeptide configuration which can
be formed with minimal or no compositional restrictions
is the α-helix. In the α-helix (Fig. 14), the peptide units
are oriented so that their planes are tangent to a cylinder
around the axis of the helix. In this way, hydrogen
bonds can be formed which are parallel to the axis, and 
thus serve to hold together the various turns of the 
helix. The fundamental operations for generating the
α-helix are a rotation of 100° around the axis and a
translation of 1.5 Å along the axis. Each amide group
is connected by a hydrogen bond to the third amide
group from it along the peptide chain. There are 3.6
residues per turn of the helix, and the total rise of the
helix per turn is 5.4 Å. The helix formed in this way is
packed very firmly, and there is no open space in the
center.
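
The screw operations quoted in this article fix the number of units per turn and the rise per turn of each helix. The following is a minimal sketch of that arithmetic, not part of the original lecture; the function name `helix_parameters` and the dictionary layout are illustrative assumptions, while the rotation and translation values are those given in the text.

```python
def helix_parameters(rotation_deg, rise_angstrom):
    """Return (units per turn, rise per turn in angstroms) for a screw operation.

    rotation_deg  : rotation about the helix axis per unit, in degrees
    rise_angstrom : translation along the helix axis per unit, in angstroms
    """
    units_per_turn = 360.0 / rotation_deg
    rise_per_turn = units_per_turn * rise_angstrom
    return units_per_turn, rise_per_turn


# Screw operations quoted in the text: (rotation per unit, translation per unit).
screw_operations = {
    "polyglycine II": (120.0, 3.1),
    "collagen": (108.0, 2.86),
    "alpha-helix": (100.0, 1.5),
}

for name, (rotation, rise) in screw_operations.items():
    units, per_turn = helix_parameters(rotation, rise)
    print(f"{name}: {units:.2f} units per turn, {per_turn:.2f} angstroms per turn")
```

For the α-helix this reproduces the 3.6 residues and 5.4 Å per turn stated above, and for polyglycine II the threefold screw with its 9.3 Å axial repeat.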

Of all of the configurations discussed hereto, the α-helix
is by far the most important. It is a completely
general structure in that it will accommodate any amino-acid
side chain. When proline residues are incorporated
into the structure, the helix axis changes direction, but
this incorporation probably does not destroy the organization
of the helix. This structure has been found in a
variety of synthetic polypeptides, and there is evidence
for its existence in many globular proteins as well
as hair, muscle, wool, and myosin. The α-helix should
produce a reflection at a spacing of 1.5 Å, which is the
translation distance along the fiber axis for each amino-acid
residue. Perutz has found this reflection in a
variety of proteins, and this has provided strong inferential
evidence for the α-helix.

It has been suggested that the α-keratin proteins such
as hair contain helices twisted around one another to
form coiled-coil structures. Thus, these molecules
may form by hexagonal packing into cable-like struc- 
tures involving several strands which coil around each 
other. Structures of this sort give reasonable agreement 
with the observed x-ray diffraction photograph obtained 
from hair. 

The importance of the α-helix lies in its extreme generality.
Knowledge of the detailed configuration of globular
proteins, as discussed by Kendrew (p. 94), is just
beginning to be advanced. It seems quite likely that
the α-helix will form an important feature of many of
these molecules. Significantly, the α-helix buries the
systematically repeating part of the polypeptide chain,
using it to form a supporting fabric of great stability.
In so doing, it exposes to the external environment all
of the side chains with their great diversity. Thus, the
α-helix enables a protein structure to maximize its variability
and, thereby, produce the considerable chemical
versatility of the proteins.


SUMMARY 


On considering the molecular structure of biological 
polymers, the proteins and the nucleic acids, one can 
see that they are more complex than are synthetic polymers.
Nonetheless, there are many common structural
features in all of these materials. The helix is seen as the 
most general form for organizing successive residues 
along a polymer chain. The helix is found in many forms, 
ranging from fully extended to tightly coiled molecules. 
In all of these structures, space is filled efficiently, 
thereby maximizing the amount of van der Waals stabili- 
zation. Most synthetic polymers achieve a limited degree 
of structural complexity, as for example, coiling into a 
helical form. Materials of biological origin, by contrast, 
often go beyond this in complexity. They may form 
coiled-coils, as in collagen or in the α-keratin structure.
On even a higher level, these polymers may fold in a 
very complicated fashion to produce globular molecules 
in which the interactions of the different side chains play 
a critical part in maintaining configurational stability. 
ALTHOUGH the x-ray diffraction method appears
to be the only means capable of yielding the com- 
plete three-dimensional structure of large macromole- 
cules (p. 94), an investigation of macromolecular struc- 
tures in solution is important for two reasons. In a 
number of cases, the macromolecule is either not crystal- 
lizable or insufficiently crystallizable; consequently, 
x-ray diffraction methods cannot be applied. In the
other cases, where x-ray structure determination is pos- 
sible, physicochemical examination is necessary to show 
whether or not the configuration characteristic of the 
solid state remains intact in solution. 

It is essential, therefore, that one find out from the 
physical methods as much as possible about these mole- 
cules in solution. These methods are complementary to 
the x-ray investigation which goes on in the solid state. 
In addition, they furnish unique information such as 
molecular weight and molecular-weight distribution. In 
general, the more that one can do to bridge these two 
approaches, the better. 

Nature seems to deal primarily with polymeric struc- 
tures in the production of macromolecules. Fortunately, 
there is not the chaos that might be present should all 
possible covalent bonds be joined together to make an 
infinite variety of large macromolecules. There has been, 
in fact, a great sorting out so that only three basic 
polymeric chains have been selected for wide use in 
living systems. Evolved over many millions of years 
are the polysaccharides, the polypeptide, and the nucleic- 
acid chains. These sometimes are tied together by cross- 
links, but this generally does not mar the basic sim- 
plicity of the long-chain structures which may, or may 
not, be folded in a particular way. It is these polymer 
chains, subject to many specific perturbations, that we 
wish to examine in the greatest detail that is practical. 


SPECIAL PROBLEMS IN CHARACTERIZING 
MACROMOLECULES 


One must consider at the outset three special and
constantly recurring problems which arise in the characterization
of macromolecules and have no simple counterparts
in the field of small molecules. First to be
mentioned is the fact that there may be a distribution 
of molecular weight. This does not usually occur in 
globular proteins, but it does in most other biological 
macromolecules. Consequently, the molecular weight is 
no longer a unique and simple number. One must deal 
instead with chains of varying lengths, but which are 
otherwise homogeneous. Each physical method that can 


be used for molecular-weight determination reflects a 
particular average of the molecular-weight distribution 
and as a result this average must be specified. 

There are two widely used averages. The first of these 
is the number-average molecular weight—a very demo- 
cratic form of averaging wherein each molecule is 
counted with a weight of one regardless of the actual 
weight. The second, the weight-average, corresponds 
not so much to democracy as to a more primitive ar- 
rangement wherein the importance of the molecule in- 
fluences its count in accordance with its actual physical 
weight. The ratio of one of these to the other is a rough 
measure of the breadth of the molecular-weight distri- 
bution. This quality of having a distribution of molecu- 
lar weight is sometimes referred to as “polydispersity.” 
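
The two averages can be made concrete with a short calculation. The sketch below is illustrative only: the function names and the three-species sample are hypothetical and do not come from the text. It counts each molecule once for the number average and in proportion to its mass for the weight average, then takes their ratio as the rough measure of breadth described above.

```python
def number_average(weights, counts):
    """Number-average molecular weight: every molecule counts equally."""
    total_molecules = sum(counts)
    return sum(n * m for m, n in zip(weights, counts)) / total_molecules


def weight_average(weights, counts):
    """Weight-average molecular weight: each molecule counts in proportion to its mass."""
    total_mass = sum(n * m for m, n in zip(weights, counts))
    return sum(n * m * m for m, n in zip(weights, counts)) / total_mass


# Hypothetical sample (assumed for illustration): molecular weights and
# the relative number of chains of each weight.
weights = [100_000, 200_000, 400_000]
counts = [2, 1, 1]

m_n = number_average(weights, counts)
m_w = weight_average(weights, counts)
print(f"M_n = {m_n:.0f}, M_w = {m_w:.0f}, M_w/M_n = {m_w / m_n:.3f}")
```

A ratio of exactly one would indicate a single molecular weight; the heavier chains pull the weight average upward, so the ratio grows as the distribution broadens.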

Also to be considered is the problem of the specifica- 
tion of molecular size and shape. There are a variety of 
situations which arise from the basic polymeric chains. 
At one extreme, there can be found no periodic internal 
structure to the macromolecule whatsoever. It will be- 
come a randomly coiled molecule which is a rather 
common but degenerate form, since it does not have 
any specific configuration that is generally thought to 
be required for the functioning of many biological 
macromolecules. At the opposite extreme, many species 
of the three types of biological macromolecules can form 
helices (see Rich, p. 50), most commonly made of one, 
two, or three molecular strands, in which atoms are 
grouped in a periodic, inflexible array. Now, it is clear 
that the specification of size and shape must be treated 
differently in these two cases. For the case of the ran- 
dom coil, one will wish to measure the volume of space 
occupied by the molecule or some effective radius; 
in the case of the compact helical structure, one ob- 
viously will measure the length of the molecule and its 
mass, and this ratio of length to mass will characterize 
the particular helix. Therefore, the kind of parameter 
used to characterize the size of a macromolecule depends
very much upon the form the molecule presents.


The final feature, which arises here and which has no 
counterpart with small molecules, is a space-filling concept
in solution. If one had to deal only with dilute solutions
of small compact spheres, then these particles
could be considered as spaced at random with no over- 
lapping and only rarely making contact. However, if 
these same molecules were not compact spheres, but 
random coils of the same mass, they would be expanded
to such an extent that they would be continuously overlapping.
This feature is maximal in the case of the
random-coil form, but it is significant for rigid, asymmetric
macromolecules as well. The consequence of this
is that, if physical measurements are to be used to
characterize the individual properties of these molecules, 
the system must be diluted down to the point where 
molecules are clearly separated one from another. This 
requires a very high dilution which, in turn, puts a very 
severe requirement upon the sensitivity of the physical 
method to be employed. A number of physical methods 
which otherwise might be very attractive fail on this 
account. They are not sufficiently sensitive to respond 
to the very high dilutions necessary to allow the individual
expression of the intrinsic properties of the
single macromolecule.


THERMODYNAMIC METHODS 


Keeping in mind these three special problems, a
survey is undertaken now of what appear to be the most
practical and general methods for obtaining quantitative
information about the anatomical character of biological
macromolecules. It is of interest to note that all
of these methods have their origins in the work of eminent
physicists at the turn of the century. The contributions
of Einstein and Boltzmann appear in a number
of places and our indebtedness to them is very great.


Osmotic Pressure


The thermodynamic methods, three in number, are
considered first. Perhaps the widest known is that of
osmotic pressures. It is one property of solution which,
as it happens, gives a macroscopic response to the dilute
macromolecular solutions that is sufficiently large to be
practical. The osmotic pressure is a measure of the
number of molecules of solute per unit volume in solution.
Therefore, if the weight of the material per unit
volume in solution is known, and if one measures the
number by means of osmotic pressure, a number-average
molecular weight can be obtained.

This was perceived in the latter part of the last cen- 
tury, but only recently has the exact and rigorous rela- 
tion of the concentration-dependent behavior of osmotic 
pressure to that of gas pressure been demonstrated. This 
is important, because, in each one of these methods, 
particular attention must be paid to the way in which 
the physical property depends upon concentration, and 
so, in each case, measurements must be carried out on a 

series of dilute solutions and extrapolated to an infinitely
dilute solution in order to isolate the property of the 
individual molecule. 

In all of these thermodynamic methods, the concentration
dependence bears a close analogy to that of the
pressure in a gas. Thus, the pressure-volume relation of
an imperfect gas has the following form

$$\frac{PV}{nRT} = 1 + B\,\frac{n}{V} + C\left(\frac{n}{V}\right)^{2} + \cdots, \tag{1}$$

where B and C are the second and third virial coefficients.
The analogy with the concentration dependence
of osmotic pressure is exact. Osmotic pressure, π, replaces
gas pressure, P, and concentration (mass per unit
volume) divided by molecular weight, c/M, replaces
number of moles per unit volume, n/V. In this way,
one obtains the relation

$$\frac{\pi}{cRT} = \frac{1}{M} + \frac{B}{M^{2}}\,c + \frac{C}{M^{3}}\,c^{2} + \cdots. \tag{2}$$

At sufficient dilution, the terms in c² and higher will be
negligible. Under these conditions, a plot of π/cRT
against c yields a straight line whose intercept is the
reciprocal of the number-average molecular weight
and whose slope is equal to B/M².
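
The extrapolation just described can be sketched numerically. The example below is an illustration under stated assumptions, not the procedure used for any data in this article: the helper names `linear_fit` and `analyze_osmotic_data` are hypothetical, and the data are noise-free values generated from an assumed molecular weight and an assumed second virial coefficient.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def analyze_osmotic_data(concentrations, reduced_pressures):
    """Return (M_n, B) from values of pi/(cRT) measured at several dilutions."""
    slope, intercept = linear_fit(concentrations, reduced_pressures)
    m_n = 1.0 / intercept       # intercept of the plot is 1/M
    b = slope * m_n ** 2        # slope of the plot is B/M^2
    return m_n, b


# Hypothetical, noise-free data built from assumed values (not from the text).
M_ASSUMED = 300_000.0           # g/mol
B_ASSUMED = 5.0e7               # cc/mol
concentrations = [0.001, 0.002, 0.003, 0.004]    # g/cc
reduced = [1.0 / M_ASSUMED + B_ASSUMED / M_ASSUMED ** 2 * c for c in concentrations]

m_n, b = analyze_osmotic_data(concentrations, reduced)
print(f"M_n = {m_n:.0f} g/mol, B = {b:.3g} cc/mol")
```

With real measurements the scatter of the points about the line sets the uncertainty of the intercept, which is why the method loses precision as the molecular weight rises and the intercept shrinks.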

In small molecules of gases, the virial coefficient often 
is thought of in terms of excluded volume—that is, the 
amount of space not available for the center of one 
molecule because of the volume of another molecule. In 
the domain of macromolecular solutions, this effect can 
become very much larger because of the extra-large 
volume-filling capacity of some forms of macromolecules. 
However, if the molecules have an attraction for each 
other, this over-all effect becomes smaller and even can 
become negative immediately before phase separation. 
On the other hand, if there is a large electric repulsion,
the over-all effect can become very, very large making 
it impossible to establish the curve with precision. 

The range in which the osmotic-pressure method can 
be applied is from molecular weights of about 10 000 
upward to a value which depends upon the precision 
desired—usually, a few hundred thousand. The lower 
limit is set by membrane permeability. 

As an illustration, some measurements on collagen 
solutions are shown in Fig. 1. The molecular weight
obtained from the reciprocal of the intercept at the con- 
centration axis is 310000, and the scatter shows the 
uncertainty is of the order of 15%. This is about the 
upper limit for practical use of this method in aqueous 
solutions. The second virial coefficient can be evaluated 
from the slope, and, in this case, is found to arise almost 
entirely from the excluded volume effects. 


Light Scattering 


Light scattering, the second thermodynamic method,
is considered next. Concentrating only upon its thermodynamic
aspect, compounded, of course, by the central
electrostatic features which go along with dipole scattering,
one finds again a close relation to the solution of
the corresponding problem in gases.

Fig. 1. Osmotic pressure of collagen solutions at 2°C.

Rayleigh's analysis of the problem of light scattering
from dilute gases, worked out nearly a century ago, took
the following form. The removal of light from an incident
beam by scattering may be expressed quantitatively
by means of the turbidity, τ, defined as

$$\frac{I}{I_0} = e^{-\tau l}, \tag{3}$$

where I₀ and I are the intensities of a parallel beam of
light before and after passing through a length, l, of the
gas. Alternatively, the scattering can be characterized
by measuring its intensity at any angle θ, i_θ, and a distance,
r. When θ is 90°, τ and i₉₀ are related in the
following manner:

$$\tau = \frac{16\pi}{3}\,\frac{i_{90}\,r^{2}}{I_0} = \frac{16\pi}{3}\,R_{90}, \tag{4}$$

where R₉₀ represents the reduced form of the scattering
intensity and is known as the Rayleigh ratio. This ratio
was shown to be for any angle

$$R_\theta = \frac{2\pi^{2}(n-1)^{2}}{\lambda_0^{4}}\,\frac{1}{\nu}\,(1+\cos^{2}\theta), \tag{5}$$

where n is the refractive index of the gas, λ₀ the wavelength
of light in vacuum and ν the number of molecules
per cc.

The relevance of this to scattering from a solution of
macromolecules is due to Einstein and Debye. At very
great dilution, the same results hold except that the
refractive index of the solvent, n₀, replaces unity in
Eq. (5). This then can be rearranged to give:

$$R_{90} = \frac{2\pi^{2} n_0^{2}\,[(n-n_0)/c]^{2}}{N\lambda_0^{4}}\,cM = KcM. \tag{6}$$

From this it can be seen that the scattering as measured
by R₉₀ is proportional to the molecular weight (the
weight average) and the proportionality factor is experimentally
determinable.
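
A short worked step, offered as a sketch rather than as the authors' own derivation, may make the passage from Eq. (5) to Eq. (6) easier to follow. It assumes that N denotes Avogadro's number (the symbol is not defined in the text), that the number of solute molecules per cc is ν = Nc/M, and that the factor n₀² comes from the standard approximation (n² − n₀²)² ≈ 4n₀²(n − n₀)², which reduces to 4(n − 1)² for a dilute gas.

```latex
\begin{align*}
\nu &= \frac{Nc}{M}, \\
R_{90} &= \frac{2\pi^{2} n_0^{2}\,(n-n_0)^{2}}{\lambda_0^{4}\,\nu}
        = \frac{2\pi^{2} n_0^{2}\,(n-n_0)^{2}}{\lambda_0^{4}}\cdot\frac{M}{Nc} \\
       &= \underbrace{\frac{2\pi^{2} n_0^{2}\,[(n-n_0)/c]^{2}}{N\lambda_0^{4}}}_{K}\; cM .
\end{align*}
```

Since K contains only the solvent refractive index, the refractive-index increment (n − n₀)/c, and the wavelength, each of which can be measured separately, the molecular weight follows from the scattering and the concentration alone.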

This result holds for a collection of independent scattering
centers. In real macromolecular solutions, the
space-filling property, previously referred to, produces a
concentration-dependent correlation between scattering
centers. Einstein showed this could be taken into account
in a manner that now can be expressed as follows:

$$\frac{Kc}{R_{90}} = \frac{1}{M} + \frac{2B}{M^{2}}\,c. \tag{7}$$

One sees here a complete analogy with osmotic pressure
except for a factor of 2 and the type of molecular-weight
average that is obtained. Again, scattering measurements
must be made at a number of concentrations and
upon extrapolation yield the desired intercept.

In contrast to osmotic pressure, the sensitivity of this
method increases with molecular weight. However, when 
the molecular size becomes significant with respect to 
the wavelength of the light, the angular dependence of 
the scattered light is affected in a manner discussed in 
the next section, and this must be taken into account. 
Finally, it should be noted that, since osmotic pressure 
and light scattering provide different molecular-weight 
averages, their combined results provide a measure of 
the molecular-weight distribution. 


Sedimentation Equilibrium 


The third and final thermodynamic method is the 
various exploitations of sedimentation equilibrium that 
are found to be useful in the study of macromolecules. 
It is very likely that this will turn out to be the most 
versatile of all the tools which one can have for the 
physical investigation of macromolecules. The key to 
these methods is found in the high speed of the ultra- 
centrifuge, which allows one to observe optically the 
distribution of macromolecules in a cell in which the 
gravitational field is increased to the order of 100 000 
times that of gravity. The versatility of the modern 
ultracentrifuge allows one to carry out a number of 
quite different kinds of measurements. In one case, one 
may allow the field to be exerted on the molecules in the 
solution until they have redistributed themselves at 
equilibrium. This would then correspond with the distribution
of gas molecules in the earth's atmosphere,
and the analysis of the situation can be readily adapted 
to yield an average molecular weight of the macro- 
molecules distributed in the much greater field provided 
by the ultracentrifuge. 

The basis of determining the molecular weight in this
way can be seen if one recalls the relation between the
pressure p and the height h in the earth's gravitational
field g. That is,

$$\frac{1}{p}\,dp = -\frac{Mg}{RT}\,dh. \tag{8}$$

In the ultracentrifuge cell, the pressure is replaced by
the concentration, the molecular mass M is corrected
for the buoyancy owing to the solvent by multiplying
by (1 − v̄ρ), where v̄ is the partial specific volume of the
solute and ρ the solution density, h is replaced by x,
the distance from the center of rotation, and g is replaced
by the centrifugal force ω²x. These exchanges lead to

$$\frac{1}{c}\,dc = \frac{M(1-\bar{v}\rho)\,\omega^{2}x\,dx}{RT}. \tag{9}$$

If there is only one macromolecular species, the
molecular weight can be determined if the concentration
is known at only two places in the cell. When a distribution
of molecular weights exists, its weight-average
molecular weight can be obtained most readily. Again,
the observed molecular weight is found to be dependent
upon the concentration in a manner governed by the
second virial coefficient. Consequently, the reciprocal
of the apparent value must be plotted against concentration
and the true value of M_w determined from the
reciprocal of the intercept at zero concentration.
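
For a single species, integrating the sedimentation-equilibrium relation between two radial positions x₁ and x₂ gives ln(c₂/c₁) = M(1 − v̄ρ)ω²(x₂² − x₁²)/2RT, which can be solved for M. The sketch below illustrates that two-point calculation; the function names and all numerical values are assumptions chosen for illustration, and the concentration dependence discussed above is ignored.

```python
import math

R_GAS = 8.314e7  # gas constant in erg/(mol K), cgs units


def concentration_ratio(m, vbar, rho, rpm, x1, x2, temperature):
    """Equilibrium ratio c(x2)/c(x1) for a single species of molecular weight m."""
    omega = 2.0 * math.pi * rpm / 60.0            # angular velocity, rad/s
    exponent = (m * (1.0 - vbar * rho) * omega ** 2 * (x2 ** 2 - x1 ** 2)
                / (2.0 * R_GAS * temperature))
    return math.exp(exponent)


def molecular_weight(c1, c2, vbar, rho, rpm, x1, x2, temperature):
    """Molecular weight from concentrations c1, c2 at radii x1, x2 (cm)."""
    omega = 2.0 * math.pi * rpm / 60.0
    return (2.0 * R_GAS * temperature * math.log(c2 / c1)
            / ((1.0 - vbar * rho) * omega ** 2 * (x2 ** 2 - x1 ** 2)))


# Hypothetical run (all values assumed): partial specific volume 0.73 cc/g,
# density 1.00 g/cc, 10 000 rpm, radii 6.5 and 7.0 cm, 293 K.
ratio = concentration_ratio(50_000.0, 0.73, 1.00, 10_000.0, 6.5, 7.0, 293.0)
m = molecular_weight(1.0, ratio, 0.73, 1.00, 10_000.0, 6.5, 7.0, 293.0)
print(f"c2/c1 = {ratio:.2f}, recovered M = {m:.0f}")
```

The round trip simply confirms that the two expressions are inverses of one another; with measured concentrations, the value obtained is an apparent molecular weight that must still be extrapolated to zero concentration.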

Now, in actual practice this method is not widely 
used because a number of days are required to attain 
the equilibrium distribution of the macromolecules in 
the solution in most cases. However, the smaller the 
molecular weight of the macromolecule, the sooner such 
distribution is attained; consequently, it is practical for 
molecules whose molecular weights are only a few 
thousand. Waugh has pioneered in devising a particular
kind of ultracentrifuge cell containing a retracting parti- 
tion, which greatly facilitates this kind of application. 
By determining the concentration of the two parts, 
above and below the partition, one can obtain the 
molecular weight. 

Within the last three years, an alternative version of 
the sedimentation-equilibrium method has come into use 
that promises to have wide utility. It is the Archibald
approach-to-equilibrium method. Kegeles has been
chiefly responsible for recognizing and developing its 
potential. Its basis is both novel and simple. The con- 
dition on which the general method of sedimentation 
equilibrium is based is that at equilibrium the net flux 
of solute species across any plane within the solution 
and perpendicular to the radius is zero. Archibald 
pointed out about 10 years ago that this condition of no 
net transport through a boundary perpendicular to the 
direction of sedimentation was always valid at the 
bottom and top of the solution, that is, at the meniscus 

and the cell bottom. Consequently, very soon after 
speed is attained, one can determine the initial redis- 
tribution of solute near the meniscus, or near the bottom 

of the cell; and from this one can derive the weight- 
average molecular weight by use of Eq. (9), together 
with appropriate extrapolation. Operationally, one now 
has precisely the opposite situation, a very short run 
instead of a very long run. 

This method appears to be applicable over a wide 
range of molecular weight, and fortunately it does well 
in the region of 1000 to 10 000 where most other methods 


fail. As one illustration of this, I have chosen some work


that we have done on determining the molecular weight 
of polypeptides and correlating this with intrinsic 
viscosity. 

Before the Archibald method was available, it had
been possible to show that the weight-average molecular
weights of poly-γ-benzyl-L-glutamate were related to
the intrinsic viscosity in the expected manner. That is,
in a double logarithmic plot, a linear relation was
indicated by the open circles in Fig. 2. We were
unable to carry measurements below 20 000, however,
and extrapolation was precarious. When the Archibald
results were obtained (filled circles), the proper extension
down to molecular weights of 1000 could be made
as shown.


INTERFERENCE OF SCATTERED LIGHT AND 
MOLECULAR SIZE 


If the macromolecules are quite small as compared to
the wavelength of light, and if the incident light is
vertically polarized, the scattering will be the same at
all angles; that is, R_θ = R₀. However, for larger molecules
with dimensions exceeding 200 or 300 Å, interference
arises from light scattered from different parts of the
same particle. As a consequence, the scattering is diminished,
the effect increasing with the scattering angle
θ. The effect vanishes at zero angle and, as a consequence,
measurements over the accessible angular range
(usually 30 to 135°) can be extrapolated to give R₀
which is required for molecular-weight determination.
The character of the angular dependence, however, is
of particular interest because it reflects the distribution
of matter within the scattering particle. The angular dependence
can be separated from the concentration dependence
because the following relation is generally valid

Ke 1 
—=———_+ 2B. 


MP(6) 


(10) 


That is, the angular dependence can be shown to enter 
only as a function, P (0), known as the particle-scattering 
factor equivalent to the square of the structure factor 
in x-ray diffraction. The experimental determination of 
P(@) is used to obtain information of dimensions and 
shape through the relation ; 


$$P(\theta) = 1 - \frac{\rho^2}{3}\left[\frac{4\pi \sin(\theta/2)}{\lambda}\right]^2 + \cdots, \tag{11}$$

where $\lambda$ is the wavelength of the light in the solution and $\rho$ is the
radius of gyration of the macromolecule. This radius of
gyration is related to the dimensions of simply shaped
particles. For example, the length of a rod-like molecule
is $(12)^{1/2}\rho$. The root-mean-square end-to-end distance
$\langle r^2\rangle^{1/2}$ of a randomly coiled polymer is given by $(6)^{1/2}\rho$.
For quite large macromolecules, higher terms in Eq. (11)
become significant in such a way that shape, as well as
the size, can be independently determined or at least
estimated.

Fig. 2. A double logarithmic plot of intrinsic viscosity measured
in dichloroacetic acid against weight-average molecular weight of
poly-γ-benzyl-L-glutamate.
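
The two conversions just quoted, from radius of gyration to rod length and to coil end-to-end distance, can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are illustrative, and the input of 955 Å is used here simply as a sample radius of gyration.

```python
import math


def rod_length(radius_of_gyration):
    """Length of a thin rod: L = sqrt(12) * rho."""
    return math.sqrt(12.0) * radius_of_gyration


def coil_rms_end_to_end(radius_of_gyration):
    """Root-mean-square end-to-end distance of a random coil: sqrt(6) * rho."""
    return math.sqrt(6.0) * radius_of_gyration


rho = 955.0  # angstroms; sample input chosen for illustration
print(f"rod length   : {rod_length(rho):.0f} A")
print(f"coil rms r   : {coil_rms_end_to_end(rho):.0f} A")
```

For the same radius of gyration, the rod is longer than the coil end-to-end distance by a factor of $\sqrt{2}$, which is why a shape assumption is needed before a measured $\rho$ can be turned into a dimension.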

As an illustration of these methods, one can consider
a particular fraction of cellulose nitrate. Osmotic-pressure
measurements plotted according to Eq. (1)
yielded a value of 234 000 for $M_n$. Light-scattering
measurements showed substantial angular dependence.
A double extrapolation against concentration and
$\sin^2(\theta/2)$ in accordance with Eqs. (10) and (11), known as
a Zimm plot, showed that $M_w = 400\,000$ and
$\langle r^2\rangle^{1/2} = 1500$ Å (Fig. 3). This indicates a rather broad
molecular-weight distribution for a sample that has already
been fractionated, and the size is indicative of a quite
extended random coil. This is evident when the following
is considered. Each monomer unit has a molecular
weight of 294 and a length of 5.15 Å. Hence, the number
of units (degree of polymerization) making up a chain
of 400 000 molecular weight is 1350. Completely extended,
this would be 6950 Å in length. By comparing
this with $\langle r^2\rangle^{1/2} = 1500$ Å, the extent of coiling is readily
visualized. If there were no hindrances to rotation at the
glycosidic linkages, the value of $\langle r^2\rangle^{1/2}$ would be several
times smaller. The extent to which this molecule is extended
because of steric and potential hindrances to
rotation is the highest yet found for single chains. One
can appreciate that this “stiffness” is put to good use in
the biological role of this material since this will not only
give rigidity to the crystallites of cellulose, but will make
firm the amorphous regions in between the crystallites.
Moreover, this natural tendency toward rod-like behavior
in localized regions of the chain is effective in
lowering the entropy of melting, and thereby contributes
to the high melting point and hence the negligible solubility
of cellulose.

Fig. 3. Cellulose-nitrate fraction Ab in acetone at 25°C.

Fig. 4. Reciprocal particle-scattering factor of tobacco mosaic
virus solution: sample A-4. Points, experimental; lines, theoretical
scattering curves.
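
The arithmetic behind the cellulose nitrate example can be laid out explicitly. The sketch below uses only the numbers given in the text (monomer weight 294, monomer length 5.15 Å, the two molecular-weight averages, and the measured end-to-end distance); the function and key names are illustrative. Unrounded, the quotients come out near 1360 units and 7000 Å, close to the rounded 1350 and 6950 Å quoted above.

```python
MONOMER_WEIGHT = 294.0     # molecular weight per monomer unit (from the text)
MONOMER_LENGTH_A = 5.15    # length per monomer unit in angstroms (from the text)


def chain_statistics(m_w, m_n, rms_end_to_end_a):
    """Degree of polymerization, contour length, and coiling for one fraction."""
    degree = m_w / MONOMER_WEIGHT
    contour = degree * MONOMER_LENGTH_A
    return {
        "degree_of_polymerization": degree,
        "contour_length_A": contour,
        # fraction of the fully extended length spanned by the coil
        "extension_ratio": rms_end_to_end_a / contour,
        # breadth of the distribution as the ratio of the two averages
        "mw_over_mn": m_w / m_n,
    }


stats = chain_statistics(m_w=400_000, m_n=234_000, rms_end_to_end_a=1500.0)
for name, value in stats.items():
    print(f"{name:26s} {value:10.3f}")
```

The ratio of weight-average to number-average molecular weight is the quantity that supports the remark about a broad distribution, and the extension ratio expresses the extent of coiling as a single number.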

The Zimm plot for tobacco mosaic virus is shown in
Fig. 4. It can be seen that the experimental points are
bounded by the theoretical scattering curves for rods
(having negligible diameter) of 2900 and 3200 Å. Boedtker
and Simmons were able to conclude from these data
that the length was 3000 ± 50 Å. In contrast to the
complete linearity of the plot for the randomly coiled
cellulose derivative in the previous figure, one finds here
a pronounced downward curvature characteristic of rod-shaped
scattering elements. From the molecular weight
of 39.5 million obtained from the intercept, and the
length and density, an effective diameter of 150 Å can
be computed. Thus, the shape and dimensions of this
virus are completely determined from light-scattering
measurements.
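
The step from molecular weight, length, and density to an effective diameter can be sketched as follows, treating the particle as a uniform cylinder. The density is not stated in the text, so a partial specific volume of 0.73 cc/g is assumed here purely for illustration; with that assumption the result comes out near 140 Å, the same order as the 150 Å quoted.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def rod_diameter_angstrom(mol_weight, length_a, partial_specific_volume):
    """Diameter of a uniform cylinder of given molecular weight and length.

    partial_specific_volume is in cc/g (an assumed input, not from the text).
    """
    volume_cm3 = mol_weight * partial_specific_volume / AVOGADRO
    volume_a3 = volume_cm3 * 1.0e24          # 1 cm^3 = 1e24 cubic angstroms
    cross_section_a2 = volume_a3 / length_a  # cylinder: V = A * L
    return 2.0 * math.sqrt(cross_section_a2 / math.pi)


diameter = rod_diameter_angstrom(39.5e6, 3000.0, partial_specific_volume=0.73)
print(f"effective diameter: {diameter:.0f} A")
```

Because the diameter varies only as the square root of the assumed specific volume, a moderate error in that assumption changes the answer by a few percent.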

An application of light-scattering studies to collagen
is shown in Figs. 5 and 6. In the former, the Zimm plot
for the native-collagen molecule in solution is shown.
The downward curvature indicates a rod-like shape, and
the quantitative interpretation of the results shows that
the molecule is 3000 Å long and has a molecular weight
of 360 000. The diameter in this case is only 13.5 Å, and
the molecule, as described by Rich (p. 50), consists of only three
polypeptide chains in a helical arrangement. As in other
cases which are dealt with in the paper by Rich, this
macromolecule is essentially a one-dimensional crystallite.
Hence, it should melt on raising the temperature.

Fig. 5. Collagen (ichthyocol) in citrate buffer (pH 3.7) at 15°C.
$M_w = 310\,000$; $\rho = 955$ Å; $L = 3300$ Å; $M/L = 94$.


Upon melting, the secondary bonding holding the three 
chains together disappears and the “melted” or dena- 
tured state should consist of the three polypeptide chains 
in randomly coiled configurations. This configurational 
transition can be observed directly in light scattering. 
In Fig. 6, the reciprocal envelopes are plotted before and 
after heating. The intercept is seen to be nearly three 
times higher indicating a correspondingly lower molecu- 
lar weight, and the slope is likewise greatly diminished 
indicating a much smaller spatial extent. The curvature 
is also gone, consistent with the assumption of a ran- 
domly coiled configuration. 

Light-scattering studies of deoxyribonucleic acid
(DNA) have been particularly useful in showing the
nature of these molecules in solution. These macro- 
molecules are so highly extended in space that very 
great dilution is required in their physical characteriza- 
tion, and as a consequence few methods could be 
properly applied. A typical light-scattering result is 
shown in Fig. 7. The slope relative to the intercept is 
seen to be much greater than those encountered before. 
For this case, its interpretation as a coiled molecule 
leads to an estimate of 7000 A for the average end-to-end 
length and a molecular weight of 8 000 000. The contour 
length of this molecule is about 40 000 A on the basis of 
the Watson-Crick model. Thus, it is only very modestly 
curved and the mild downward curvature reflects its 
intermediate status between a coiled and rod-like shape. 
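
The contour length quoted for DNA follows from the molecular weight once a mass and a length per nucleotide pair are chosen. The sketch below assumes a rise of 3.4 Å per pair (the Watson-Crick spacing) and a mean weight of about 660 per nucleotide pair for the sodium salt; both numbers are assumptions introduced here, not values stated in the text.

```python
RISE_PER_PAIR_A = 3.4      # angstroms per nucleotide pair (assumed)
WEIGHT_PER_PAIR = 660.0    # mean weight per nucleotide pair, sodium salt (assumed)


def dna_contour_length_angstrom(mol_weight):
    """Contour length of a double helix of the given molecular weight."""
    pairs = mol_weight / WEIGHT_PER_PAIR
    return pairs * RISE_PER_PAIR_A


contour = dna_contour_length_angstrom(8.0e6)
extension = 7000.0 / contour   # measured end-to-end length over contour length
print(f"contour length : {contour:.0f} A")
print(f"extension ratio: {extension:.2f}")
```

With these assumptions the contour length lands close to the 40 000 Å given above, and the end-to-end length is a little under one fifth of it.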


Actually, the size of this molecule is greater than present
light-scattering techniques can handle because the extrapolation
should be based on measurements down to 5°
in order to eliminate the effects of polydispersity on the
answer. Thus far, however, the errors have not been
serious because the polydispersity has been such
that a linear extrapolation from the higher angles [...].
This good fortune cannot be expected to [...]
This situation holds in the region of x-rays as well, and
as a consequence the scattering of long x-rays can lead
to the assignment of dimensions of the order of hundreds
of angstroms, just as light scattering deals with dimensions of a
few thousand angstroms. Time does not permit an examination
of this, but reference should be made to the work of
Beeman and Luzzati in this connection.


HYDRODYNAMIC METHODS 


The motions of a macromolecule in solution can be
resolved into those of translation and rotation. If the
motion is simply that of diffusion, one terms it translational
or rotatory diffusion and characterizes it by a
diffusion constant $D$ for translation and $\Theta$ for rotation.
If the motion arises from an imposed gradient, two other
situations occur. If the imposed force is that of a gravitational
gradient as in the ultracentrifuge, sedimentation
is observed and one characterizes it by a sedimentation
constant $s$. If the imposed force arises from a
hydrodynamic gradient, the molecule is caused to rotate
with a definite bias instead of at random. The dissipation
of energy that this produces gives rise to an
increase in viscosity over that of the solvent. The fractional
increase is known as the specific viscosity, and,
when divided by concentration and extrapolated to zero
concentration, it becomes known as the intrinsic viscosity
$[\eta]$.
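
The definition of the intrinsic viscosity as a zero-concentration limit translates directly into a short calculation. The sketch below fits a straight line to the reduced viscosity (specific viscosity divided by concentration) and reads off the intercept; a linear extrapolation is an assumption of this sketch, and the data are hypothetical values constructed to lie on a line with intercept 100 cc/g.

```python
def intrinsic_viscosity(concentrations, relative_viscosities):
    """Intercept at c = 0 of a least-squares line through (c, eta_sp / c)."""
    xs = list(concentrations)
    # specific viscosity = fractional increase over the solvent = eta_rel - 1
    ys = [(eta_rel - 1.0) / c for c, eta_rel in zip(xs, relative_viscosities)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x


# Hypothetical measurements: eta_sp / c = 100 + 4000 c, with c in g/cc.
conc = [0.0005, 0.0010, 0.0015, 0.0020]
eta_rel = [1.0 + c * (100.0 + 4000.0 * c) for c in conc]
eta_intrinsic = intrinsic_viscosity(conc, eta_rel)
print(f"[eta] = {eta_intrinsic:.1f} cc/g")
```

With concentrations in g/cc the result carries units of cc/g, the units used for the numerical values quoted in this section.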

These four types of motion are all the result of a
certain amount of resistance to an applied force (or
couple). This resistance is hydrodynamic in nature and
arises from the size and shape of the molecule. On very
general grounds, it can be argued that for a given molecule
the resistance can be characterized by a frictional
factor $f$, which is the same in all four cases. Consequently,
two developments are possible. Theoretical
investigations can aim at calculating the frictional factor
for various macromolecular models as a function of dimensions
and provide, in this way, a basis of obtaining
dimensions from measuring one or more of the foregoing
quantities. On the other hand, the measurement of any
two quantities provides at least the possibility of eliminating
$f$ and obtaining the molecular weight. This is
illustrated in the well-known Svedberg equation. Here,
the expressions for $D$ and $s$ are combined to eliminate
$f$ as shown.

$$D = \frac{kT}{f}, \qquad s = \frac{M(1 - \bar{v}\rho)}{N f}, \tag{12}$$

$$M = \frac{sRT}{D(1 - \bar{v}\rho)}. \tag{13}$$

Thus, if the partial specific volume $\bar{v}$ and the solvent
density $\rho$ are determined, the molecular weight can be
obtained from measurements of $s$ and $D$.

Fig. 6. Denaturation of collagen in solutions. ●, before
denaturation; ○, after denaturation.
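
The Svedberg relation is simple enough to evaluate directly. The sketch below works in cgs units; the function name is illustrative, and the sample values of $s$, $D$, and $\bar{v}$ are hypothetical numbers of the order met with for a small globular protein, not data from this article.

```python
GAS_CONSTANT = 8.314e7  # erg / (mol K), cgs units


def svedberg_molecular_weight(s, d, vbar, rho, temperature_k):
    """M = s R T / [D (1 - vbar * rho)].

    s    sedimentation constant in seconds
    d    translational diffusion constant in cm^2/sec
    vbar partial specific volume in cc/g
    rho  solvent density in g/cc
    """
    buoyancy = 1.0 - vbar * rho
    return s * GAS_CONSTANT * temperature_k / (d * buoyancy)


# Hypothetical inputs chosen only to illustrate the orders of magnitude.
m = svedberg_molecular_weight(
    s=4.4e-13, d=6.0e-7, vbar=0.73, rho=1.0, temperature_k=293.15
)
print(f"M = {m:.0f}")
```

The frictional factor does not appear anywhere in the calculation, which is the point of combining the two measurements.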

The difficult and time-consuming nature of the
measurement of $D$, particularly for chain-like and rod-like
macromolecules, has led to a search for other means
of achieving the same result by using the much more
easily measured quantity, the intrinsic viscosity $[\eta]$, in
its place. This can be done rigorously for ellipsoids,
making use of the work of Perrin and Simha. The
result is

$$M = \left[\frac{s[\eta]^{1/3}\eta_0 N}{\beta(1 - \bar{v}\rho)}\right]^{3/2}, \tag{14}$$

where $\eta_0$ is the viscosity of the solvent and the constant $\beta$ is a
slowly varying quantity dependent only upon the axial ratio. Its values range
from 2.12 for spheres to 3.50 for infinite axial ratio. The
axial ratio can always be determined from $[\eta]$ with
enough precision to permit the correct choice of $\beta$.

Fig. 7. Light scattering of DNA.
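
The combination of sedimentation constant and intrinsic viscosity can be sketched in the same way, assuming the relation has the form $M = \{s[\eta]^{1/3}\eta_0 N / [\beta(1-\bar{v}\rho)]\}^{3/2}$. In the sketch below $\beta$ is simply passed in by the caller, who must supply it on a numerical scale consistent with the units chosen for $[\eta]$; the sample inputs are hypothetical, and the example only verifies that the relation and its inverse agree.

```python
AVOGADRO = 6.022e23


def molecular_weight_from_s_and_eta(s, eta_intr, eta0, beta, vbar, rho):
    """M from the sedimentation constant and intrinsic viscosity."""
    group = s * eta_intr ** (1.0 / 3.0) * eta0 * AVOGADRO / (beta * (1.0 - vbar * rho))
    return group ** 1.5


def sedimentation_constant(m, eta_intr, eta0, beta, vbar, rho):
    """Inverse of the relation above: s for a given molecular weight."""
    return (
        beta * (1.0 - vbar * rho) * m ** (2.0 / 3.0)
        / (eta_intr ** (1.0 / 3.0) * eta0 * AVOGADRO)
    )


# Hypothetical inputs; beta and its scale are assumptions of this example.
params = dict(eta_intr=11.5, eta0=0.01, beta=3.3e6, vbar=0.70, rho=1.0)
s_model = sedimentation_constant(300_000.0, **params)
m_back = molecular_weight_from_s_and_eta(s_model, **params)
print(f"s = {s_model:.3e} sec, recovered M = {m_back:.0f}")
```

Since $\beta$ enters to the power 3/2, its slow variation with axial ratio is what keeps the resulting molecular weight insensitive to the assumed shape.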

The usefulness of intrinsic-viscosity determinations
is by no means limited to Eq. (14). Quite the contrary. 
Its application stems from a series of impressive theo- 
retical derivations that have been made relating it to 
size and shape. The first of these was due to Einstein
who showed that the intrinsic viscosity of compact 
(unpenetrated by solvent) spheres in solution was 2.5 
times their specific volume in units of cc/g. This value 
represents a lower limit: almost all values observed for 
macromolecules are much larger: for example, 4 to 10
for a number of globular proteins, 10 to 100 for a number
of proteins known to be asymmetric, and extremely high
values in the case of collagen (1200) and deoxyribonucleic
acid (7000). Later theories showed that these
higher values could all be accounted for in terms of two 
effects: asymmetry of the macromolecule, or solvent 
immobilization. The latter effect could arise from hydra- 
tion owing to the binding of water (or other solvent) at 
specific sites on the macromolecule or by loosely in- 
corporating it with the domain of the macromolecule 
through swelling. Thus, a typical randomly coiled poly- 
mer pervades a domain that is perhaps 100 times or 
more greater than it would occupy in a compact form; 
yet, the solvent in this domain moves with the polymer 
and is, therefore, hydrodynamically immobilized. As a 
consequence, the intrinsic viscosity measures this vol- 
ume of immobilized material, and, in a rough way, its 
numerical value expresses in units of cc/g the ratio of 
the volume of the domain to the mass of the polymer 
molecule. Extremely useful developments in the last few 
years have led to the recognition of the relation between
intrinsic viscosity (and sedimentation constant
as well) and the mean end-to-end length, $\langle r^2\rangle^{1/2}$, of
polymer chains. As a consequence, the size of such
polymer molecules can be determined from the intrinsic 
viscosity, if their molecular weight is known. Moreover, 
the large variations in intrinsic viscosity observed when 
the solvent or temperature is changed are seen to be the 
result of swelling and shrinking of the polymer coils.
Finally, the previously observed empirical relation be- 
tween intrinsic viscosity and molecular weight for a 
series of homologous polymers, 


$$[\eta] = K M^{a}, \tag{15}$$


becomes understandable from this point of view. The 
exponent in this relation is a constant with a value 
between 0.5 and 1.0 for any given polymer-solvent 
system. These limiting values correspond to tightly 
coiled molecules with many intramolecular contacts
(0.5) and to very highly swollen molecules (1.0), with
most actual cases distributed in between.
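
A power law between intrinsic viscosity and molecular weight appears as a straight line on a double logarithmic plot, so the exponent and prefactor can be obtained from a linear fit to the logarithms. The sketch below does this by ordinary least squares; the data are hypothetical points generated from an exponent of 0.7 and are not measurements from this article.

```python
import math


def fit_power_law(mol_weights, intrinsic_viscosities):
    """Least-squares fit of log[eta] = log K + a log M; returns (K, a)."""
    xs = [math.log(m) for m in mol_weights]
    ys = [math.log(v) for v in intrinsic_viscosities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


# Hypothetical homologous series obeying [eta] = 0.01 * M**0.7 exactly.
weights = [1.0e4, 3.0e4, 1.0e5, 3.0e5]
viscosities = [0.01 * m ** 0.7 for m in weights]
k_fit, a_fit = fit_power_law(weights, viscosities)
print(f"K = {k_fit:.4f}, exponent = {a_fit:.3f}")
```

A fitted exponent lying between 0.5 and 1.0 is what the coil picture described above leads one to expect.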

In the case of rigid particles, the exaltation of the
intrinsic viscosity arises from the increased energy dissipation
resulting from the end-over-end rotations in
solution. The analysis of this problem, particularly by
Simha, showed that the intrinsic viscosity was a function
only of the axial ratio of the particle and that this
dependence rapidly approached that of the square of the
axial ratio. In many typical proteins, one is faced with
the possibility that the intrinsic viscosity reflects both
molecular asymmetry and moderate amounts of hydration.
The resolution of this problem has been worked out
by Oncley.

Fig. 8. Weight distribution of lengths of ichthyocol macromolecules
as measured from electron micrographs compared with results
from other methods. Total number represented is 238 [from
C. E. Hall and P. Doty, J. Am. Chem. Soc. 80, 1269 (1958)].
Lengths marked on the figure (Å): number average, 2450; light
scattering, 3000; flow birefringence, 2600 to 2950; intrinsic
viscosity plus molecular weight, 2970.

The final point to make on intrinsic viscosity is about 
its ease of measurement relative to the other three 
hydrodynamic properties. 
The determination of the rotatory-diffusion constant 
from streaming birefringence of flow is a rather special- 
ized technique and is limited to rigid particles. In these 
cases, provided the length of the molecules exceeds a 
few hundred angstroms, the length and often the length 
distribution can be measured with considerable accuracy. 
As a single illustration of the methods discussed in
this section, as well as a demonstration of their self-consistency,
the results on soluble collagen are presented
in Table I. The intrinsic viscosity was 1150 cc/g, the
sedimentation constant was 2.96 Svedbergs, and the distribution
of rotatory-diffusion constants corresponded
to a range in length from 2500 to 2950 Å.
This brief discussion of the physical determination of
the characteristics of macromolecules would not be complete
without the mention of a new technique that has,
within the last two years, become practical for asymmetric,
rigid macromolecules. This is the perfection of
the direct viewing of such macromolecules in the electron
microscope by Hall. Thus, nucleic acids, some
proteins, and even simple polypeptides in the α-helical
configuration have been measured and found to be in
agreement with the results of other physical methods
applied in solution. As one illustration of this, shown
in Fig. 8 is the weight distribution of the lengths of
collagen molecules compared with the average values
shown in Table I. The agreement is seen to be quite
good. It is possible that the distribution was somewhat
broadened as a result of some damage in spraying and
shadowing the electron-microscope preparation, but the
effect has been modest. F. O. Schmitt deals with the
way in which these macromolecules are united in collagen
fibrils and the extent to which the dimensions
found here are compatible with his studies (p. 349).

TABLE I.

Method                             Mol. Wt.   Length (Å)   Diameter (Å)
Osmotic pressure (Mn)              310 000    ...          ...
Light scattering (Mw)              345 000    3100         13.0
Intrinsic viscosity and Mw         ...        2970         13.6
Sedimentation and viscosity        300 000    ...          12.8
Flow birefringence and viscosity   350 000    2900         13.5
I. INTRODUCTION

The past decade has witnessed an extensive development
in two directions of the theory of
electrolyte solutions. On the one hand, the investigations
of Mayer and of Kirkwood and Poirier finally
have established from first principles the validity of the
Debye-Hückel limiting law. These studies provide a
formal framework for extending the theory of electrolytes
to concentrations higher than the traditional
“slightly polluted water,” and they provide insight
into the nature of the approximations which make valid
the use of the Poisson-Boltzmann equation in dilute
solution. The recent works of Onsager and Kim, of
Fuoss and Onsager, of Falkenhagen and Kelbg, and
of Pitts have extended, within the framework of the
Poisson-Boltzmann equation, the theory of transport
processes to include the effects of finite ion size. While
it is doubtful if the validity of the Poisson-Boltzmann
equation permits further development, the inclusion of
finite ion size does permit the description of dissipative
processes, in terms of an ion-size parameter, to be
extended to much higher concentrations than heretofore,
and it suggests a more satisfactory interpretation of the
varying behavior of different strong electrolytes. Further
extension of the theory of dissipative processes
probably requires the more powerful tools of the Mayer
theory, combined with the general statistical theory of
transport.

On the other hand, there has been an almost explosive 
development of the theory of very highly charged 
macromolecular ions. Whereas, prior to 1940, there 
existed almost no quantitative theory of colloid solu- 
tions, and whereas the measurements of Kern repre- 
sented the only investigations of flexible polyelectro- 
lytes, there exists now a number of theories for the 
equilibrium properties of these substances, and there 
have been several attempts at the theory of transport. 
In all instances, the theory of polyelectrolytes has been
developed within the framework of the Poisson-Boltzmann
equation; in some cases, the analysis uses a
linearized form, while in others the analysis adopts
different approximations equivalent to the assumption
that the domain occupied by the polyion has zero net
charge. Although there are many difficulties associated
with the current theory, and, though qualitative or
semiquantitative agreement with experiment is all that
can be attained at present, it may be asserted safely
that the broad features of the basic physical processes 
responsible for the observed phenomena have been de- 
lineated and that the major problems are quantitative 
rather than qualitative. In this article, the theory of 
polyelectrolyte solutions is reviewed briefly, with
emphasis on the basic physical processes. The available
experimental data suitable for comparison with the
theory, and the directions which future developments
are likely to take are mentioned only cursorily, owing
to limitations of space.


II. GENERAL SURVEY AND QUALITATIVE PICTURE


The fundamental theory of rigid, impenetrable 
macroions first was published in 1948 by Verwey and 
Overbeek in their well-known monograph; it has been
refined continuously by Levine, by Booth, and by 
Kirkwood and co-workers. Briefly, the model 
adopted for the interaction between particles represents 
the macroion as a rigid sphere with a specified surface- 
charge density. The high surface charge tends to 
force the counterions to cluster close to the surface of 
the macroion, creating a thin double layer which shields 
the surface charges from similar charges on other 
colloidal particles. The net repulsive force between two 
macroions usually is obtained from a suitable solution 
of the Poisson-Boltzmann equation, although more 
sophisticated methods also have been used. The
fact that the macroion is rigid and impenetrable simpli- 
fies the problem to the extent that almost no features 
are intrinsically different from the low molecular weight, 
strong electrolyte case. There are, however, qualitative 
differences in the relative importance of the various 
contributions. 

For colloidal ions with one type of charge, one may 
imagine the free energy of building the double layer to 
consist of four contributions. First, if the charge is due 
to adsorption or desorption of ions from the solution, 
there is an intrinsic chemical potential change due to 
the dissociation reaction. To this must be added the 
sum of the chemical potentials of all of the species in 
solution, the difference in self-energy (interaction of 
central ion and atmosphere) of an adsorbed ion from a 
free ion, and the electrical free energy of charging up
and assembling the double layer. When considering the 
interaction of two macroions, it is important to note 
that the overlap of double layers causes a redistribution 
of small ions in the solution with the net result that the 
mutual interactions of the small ions are changed. Thus,
in addition to the expected screened Coulomb force
between macroions, there is a force arising from the
change in interaction energy of the small ions as they
redistribute under the influence of the overlapping
double layers.

When consideration is given to the possibility that
charges of both signs may exist on the surface of the
molecule, a new mechanism of interaction between
macroions becomes apparent. If there is a net difference
in the numbers of positive and negative charges,
at the isoelectric point there will be a number of uncharged
groups otherwise identical in chemical character
with some of the charged groups. For one of the
types of charge, there exists under these conditions a
large number of different possible arrangements of the
charges on the surface of the molecule. A number of
these arrangements is of approximately equal electrostatic
energy and one may expect the charges to fluctuate
in occupation of first one and then another set of
sites. If two such molecules are brought close to each
other, the charges on one ion tend to polarize the
charges on the other in such a manner as to separate
like charges. This is possible if there exist surface-charge
distributions of roughly equal energy and the resultant
decreased repulsion is, of course, equivalent to a net
attraction, since the net charge of each particle is zero.
It has been postulated that this fluctuation force is
operative in enzyme reactions.

Another fundamental difference between macroions
and small ions is that, in the former case, consideration
must be given to the mutual interactions of charges on
the surface of the ion. The magnitude of these interactions
depends upon the geometric arrangement of the
charges, upon the dielectric constant, and upon the salt
concentration, and readily may be ascertained experimentally
from the titration curve. Early attempts to
characterize this electrostatic energy made the crude
approximation of smearing out the discrete-charge
structure and applying the Debye-Hückel theory in
substantially its original form. Refinements, such as
allowing the ion to be partially penetrable by solvent,
alter the model only slightly. On the other hand, direct
calculation of the electrostatic energy for discrete-charge
distributions reveals a marked dependence of
the titration curve upon the details of the distribution
of charges, as was to be expected. Physically, this corresponds
to the fact that neighboring acid and base
groups tend to become zwitterions, whereas neighboring
groups of the same kind resist ionization because of the
large repulsive interactions created by such ionizations.
Consequently, the amount of work required to ionize
a group depends markedly upon the group environment,
and a smeared-out charge distribution gives an erroneous,
even if fortuitously numerically accurate, picture
of the molecule.
In the case of flexible polyelectrolytes, attention has
been focussed almost entirely on the interactions between
the charges of the polyion and on the relationship
between configurational and thermodynamic properties
[...] strength, degree of neutralization, and so
[...] interactions between polyions only now are
[...] detail.

[...] theory of linear polyelectrolytes
[...] and Katchalsky,
who described the configuration of a polyion as that of
a chain which was randomly coiled with the restriction 
that its mean end-to-end distance be such as to mini- 
mize the sum of the configurational and electrostatic 
free energies. This simple model was able to predict 
qualitatively the size changes accompanying changes in 
charge on a polymer. This treatment was approximate, 
however, in that it assumed the charges on the polymer 
to have an interaction energy determined only by the
end-to-end distance, rather than by all of the distances
between charges on the chain. Moreover, it was not
shown that a screened Coulomb potential was the proper
expression to use in deriving the interaction energy
between a pair of fixed charges in a solution containing
mobile ions. 

In the same year, Hermans and Overbeek proposed
a model in which the polymer was regarded as a porous 
spherical-charge distribution, whose radius was to be 
varied to minimize the total free energy. By using a 
linearized form of the Poisson-Boltzmann equation, they 
computed the electrostatic free energy without using 
the screened Coulomb potential. Although the Hermans- 
Overbeek model avoids explicit consideration of the 
chain configuration, it predicts the size changes of 
polyions moderately well. But neither of the models 
discussed thus far was conspicuously successful in 
interpreting the titration curves of polyelectrolytes. 

Two years later, Huizenga, Grieger, and Wall reported
experiments which, for the first time, fully
showed the importance of ion binding in polyelectrolyte 
solutions. They found that up to 60 percent of the 
sodium ions counter to a carboxylate polymer moved 
with it in electrophoresis. Since these polymers are 
thought to be approximately free draining under the 
conditions of this experiment, one concludes that the 
counterions are not merely within the volume of the 
polymer, but are relatively tightly bound to it. 

The foregoing work, and contributions by Flory,
by Kimball, Cutler, and Samelson, by Osawa, Imai,
and Kagawa, and by others, led to the proposal of a
model which was new in that (a) the interactions between
charges of the polymer influenced its local configurational
properties as well as its end-to-end distance,
(b) these interactions affected the tendency
toward ionization at each ionizable group, making the
various ionizable groups of a polymer interdependent,
rather than independent, as heretofore postulated, and
(c) the interactions were large enough to cause important
amounts of binding of counterions, even though
negligible amounts of binding would occur if the
ionizable groups did not interact. In implementing this
model, it is necessary to allow for the interrelations
among its several features. In particular, to determine
the mean size, degree of ionization, and degree of binding
of a polyion in a solution of given pH, counterion
concentrations, and ionic strength, one would determine
first the free energy as a function of all six of these
variables, and then minimize this with respect to the
first three, which are not controllable externally. The 
expression for the free energy must reflect the fact that 
not all chain configurations of the same end-to-end 
distance will have the same electrostatic interaction, 
nor will all arrangements of the same total number of 
charges be equienergetic for a given chain configuration. 
A further discussion of these and related points is 
presented after a more detailed definition of the molecular
model.

To specify the model more completely, consider a 
weak polyacid, each functional group of which can be 
un-ionized (zero charge), ionized (—1 charge), or ion 
paired (zero charge). It is assumed that an intrinsic 
“chemical” free energy change can be assigned to the
conversion, of an ionizable group, from one to another
of these charge states, so that, ignoring interactions 
among the charges, the ionization and binding would be 
describable by equilibrium constants. The binding is 
thus treated phenomenologically. The ion pairs introduced
here may be described as “site bound,” so as to
distinguish them from ions merely required to be near 
the polymer for the maintenance of electrical neutrality. 

The chain configurations available to the polymer are 
assumed to be just those accessible to an otherwise 
similar uncharged polymer, so that the effect of the 
electrostatic interactions is to alter the distribution 
among these configurations. The chain configurations 
are described in the manner first proposed by Kuhn,
by regarding the polymer as a series of rigid links con- 
nected by universal joints. The lengths of the links are 
chosen so as to allow, as well as possible, for all charge- 
independent forces restricting the short-range flexi- 
bility of the chain. The Kuhn model makes all of the 
accessible configurations of equal nonelectrostatic en- 
ergy, so that the charge interactions completely deter- 
mine the manner in which various configurations are 
weighted. Thus, the charge interactions have the two 
following effects: they influence the equilibria among 
un-ionized, ionized, and bound functional groups; they 
affect the configurational distribution. These interactions,
of course, are to be calculated, keeping in mind
that the space between the charges is occupied by an 
ionic solution, and hence the calculation depends upon 
the ionic strength and upon other properties of the 
solution. 

In accordance with the over-all approach sketched in 
the third preceding paragraph, one turns first to the 
formulation of the free energy when the mean size and 
degrees of ionization and binding are specified. It is 
convenient to begin by assigning to the polymer a 
specific chain configuration, and by specifying which 
particular functional groups are to be un-ionized, 
ionized, and ion paired. One may consider then the free 
energy of this visualizable, but unattainable, “state” 
of the polymer. On a completely microscopic scale, this 
state is really a large number of states, because the distributions
of the small mobile ions of the system, the
energy levels of the various species, etc., are not speci- 
fied. Thus, it is relevant to consider the free energy of 
this state. Each change in the configuration or in the 
status of the functional groups results in a new state,
with its associated free energy. These states, together 
with their free energies, may be taken as the starting 
point for a statistical mechanical calculation of the 
over-all free energy when only the numbers of ionized, 
un-ionized, and bound groups, and the average con- 
figuration are specified. 

The free energy of a state of the type outlined above 
consists only of the intrinsic free energies of the func- 
tional groups plus the electrostatic interaction free 
energy. The configuration of the chain does not con- 
tribute by virtue of the Kuhn chain model. The only 
problem is in the calculation of the interaction free 
energy for the charge distribution of each state. It has 
been shown that, in spite of the presence of fixed
charges, the Debye-Hückel potential can be used for
this purpose. The demonstration involves approximations
not much more restrictive than those of the ordinary
Debye-Hückel theory.
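
The screened potential referred to here can be illustrated numerically. The following minimal sketch, which is not part of the original treatment, evaluates the Debye-Hückel screening parameter and the screened Coulomb energy of two unit charges in a uniform 1:1 aqueous electrolyte. The function names, the dielectric constant of 78.5, and the temperature of 298.15 K are assumptions chosen for the example.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
N_AVOGADRO = 6.02214076e23   # 1/mol


def debye_kappa(ionic_strength_molar, dielectric=78.5, temperature=298.15):
    """Screening parameter kappa (1/m) for an ionic strength in mol/L."""
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/m^3
    kappa_sq = (2.0 * N_AVOGADRO * E_CHARGE ** 2 * ionic_strength_si
                / (EPS0 * dielectric * K_B * temperature))
    return math.sqrt(kappa_sq)


def screened_pair_energy_kt(distance_m, ionic_strength_molar,
                            dielectric=78.5, temperature=298.15):
    """Screened Coulomb energy of two unit point charges, in units of kT."""
    kappa = debye_kappa(ionic_strength_molar, dielectric, temperature)
    # Bjerrum length: separation at which the bare Coulomb energy equals kT.
    bjerrum = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * dielectric
                               * K_B * temperature)
    return bjerrum * math.exp(-kappa * distance_m) / distance_m


if __name__ == "__main__":
    for molar in (0.001, 0.01, 0.1):
        length_nm = 1e9 / debye_kappa(molar)
        energy = screened_pair_energy_kt(0.5e-9, molar)
        print(f"I = {molar} M: 1/kappa = {length_nm:.2f} nm, "
              f"u(0.5 nm) = {energy:.3f} kT")
```

Raising the ionic strength shortens the screening length and weakens the interaction at fixed separation, which is the dependence on ionic strength mentioned in the text.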

The methods outlined above provide, at least in 
principle, a way to calculate the free energy of a state 
of specified configuration and charge distribution. At 
this point, mathematical complexity forces an approxi- 
mation. We choose to calculate exactly for each in- 
dividual state only the electrostatic interaction between 
nearest neighboring functional groups along the chain, 
and to approximate the remainder of the interaction 
under the assumption that the charge is distributed 
evenly along the chain. This approximation has the 
effect of preserving the strong influence that a charged 
group has upon the ionization and upon the binding at 
nearest neighboring groups, while making tractable the 
calculation of the less sensitive dependence on the inter- 
actions which promote configurational expansion. To 
be more explicit, it has just been assumed that, if the 
net charge is specified, the distribution of that charge 
along the chain is independent of the chain configuration,
and that the free-energy difference among configurations
is to be calculated assuming the charge to
be uniformly distributed along the chain.

The remainder of the over-all free energy now sepa- 
rates into two parts: (a) the configurational free energy 
of a uniformly charged chain, and (b) the distributional 
free energy of a chain of charged and uncharged sites 
with near-neighbor interaction. The computation of (a) 
is essentially a biased random-walk problem, while (b) 
is formally equivalent to the calculation of the spin 
distribution in a one-dimensional Ising lattice. The 
details of both of these computations are described 
elsewhere. Qualitatively, it is apparent that increasing
the charge on a chain tends to restrict its configurational
freedom, or to increase its configurational free energy.
One would expect likewise that computation (b) might
result in weighting heavily only those charge distributions
in which not very many more charges than necessary
are nearest neighbors.
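
Computation (b) can be illustrated by a small transfer-matrix calculation. The sketch below treats a ring of sites, each charged or uncharged, with a repulsion w (in units of kT) between charged nearest neighbors and a parameter mu (also in kT) that favors charging a site. The ring geometry, the parameter names, and the brute-force check are assumptions of this example and are not taken from the cited calculation.

```python
import math
from itertools import product


def ring_partition_transfer(n_sites, mu, w):
    """Partition function of the ring from the 2x2 transfer matrix.

    Matrix element T[a][b] = exp(mu*(a + b)/2 - w*a*b) for occupations a, b.
    """
    t00 = 1.0
    t01 = math.exp(0.5 * mu)
    t11 = math.exp(mu - w)
    mean = 0.5 * (t00 + t11)
    disc = math.sqrt((0.5 * (t00 - t11)) ** 2 + t01 ** 2)
    # Z = trace(T**N) = sum of the N-th powers of the two eigenvalues.
    return (mean + disc) ** n_sites + (mean - disc) ** n_sites


def ring_partition_brute(n_sites, mu, w):
    """Same partition function by explicit enumeration (small rings only)."""
    total = 0.0
    for occ in product((0, 1), repeat=n_sites):
        n_charged = sum(occ)
        n_pairs = sum(occ[i] * occ[(i + 1) % n_sites] for i in range(n_sites))
        total += math.exp(mu * n_charged - w * n_pairs)
    return total


def log_partition(n_sites, mu, w):
    return math.log(ring_partition_transfer(n_sites, mu, w))


def charged_fraction(n_sites, mu, w, step=1e-5):
    """Mean fraction of charged sites, (1/N) d ln Z / d mu."""
    diff = log_partition(n_sites, mu + step, w) - log_partition(n_sites, mu - step, w)
    return diff / (2.0 * step * n_sites)


def neighbor_pair_fraction(n_sites, mu, w, step=1e-5):
    """Mean number of charged nearest-neighbor pairs per site, -(1/N) d ln Z / d w."""
    diff = log_partition(n_sites, mu, w + step) - log_partition(n_sites, mu, w - step)
    return -diff / (2.0 * step * n_sites)


if __name__ == "__main__":
    theta = charged_fraction(50, 1.0, 3.0)
    pairs = neighbor_pair_fraction(50, 1.0, 3.0)
    print(f"charged fraction = {theta:.3f}")
    print(f"neighbor pairs per site = {pairs:.4f} (random mixing: {theta**2:.4f})")
```

With a repulsive w the number of adjacent charged pairs falls below the random-mixing value, which is the qualitative expectation stated above.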

When the free energy, as computed by the methods 
outlined above, is minimized with respect to the 
number of ionized and ion-paired groups, it is found 
that the electrostatic interactions cause the net charge 
of the polymer to be much less than if no such interactions
existed. This means that the free-energy increase
associated with forming un-ionized groups or ion
pairs is more than compensated for by the reduction in
electrostatic free energy, until many more than the
“normal” number of ion pairs and un-ionized groups
have been created. The effect is much larger than one
might suppose at first, inasmuch as the charges are
frequently close enough together that the effective
dielectric constant of the region between them is far
lower than that of normal water. Moreover, as the
number of charged groups increases, the vast majority 
of the possible distributions of the charges along the 
chain becomes extremely unfavorable energetically, so 
that the free energy contains a large negative entropy 
term. The interaction free energy thus makes it pro- 
gressively more difficult to remove successive protons 
from a weak polyacid, spreading the titration curve in 
the manner experimentally observed. When the protons 
finally are removed, many of them are replaced by 
counterions having little intrinsic affinity for ion pairing.

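
The spreading of the titration curve can be seen in a much cruder model than the one described above. The sketch below assumes, purely for illustration, that the electrostatic free energy of a chain of N sites is N w α² in units of kT (a smeared-charge form, not the near-neighbor treatment), so that each successive proton costs an additional 2wα kT to remove. The names apparent_ph, pk0, and w are hypothetical.

```python
import math


def apparent_ph(alpha, pk0, w):
    """pH at which a polyacid reaches degree of ionization alpha.

    Assumes an electrostatic free energy N*w*alpha**2 (in kT), so that
    removing one more proton costs an extra 2*w*alpha kT.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ideal = pk0 + math.log10(alpha / (1.0 - alpha))
    electrostatic_shift = 2.0 * w * alpha / math.log(10.0)
    return ideal + electrostatic_shift


def titration_width(pk0, w, low=0.1, high=0.9):
    """pH interval needed to go from 10% to 90% ionization."""
    return apparent_ph(high, pk0, w) - apparent_ph(low, pk0, w)


if __name__ == "__main__":
    for w in (0.0, 1.0, 3.0):
        print(f"w = {w}: width = {titration_width(4.5, w):.2f} pH units")
```

For w = 0 the width is that of an isolated monobasic acid; any positive w stretches the curve over a wider pH interval.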
The configurational expansion accompanying neu- 
tralization of a polyacid is calculable by minimizing 
the free energy with respect to its average end-to-end 
distance. It is found that the binding and the reduced 
ionization cause the expansion as predicted for the 
polymer to be in much better agreement with experi- 
ment than if the binding or the reduced ionization were 
ignored. In addition, the amount of binding predicted 
to satisfy the titration and configuration properties is 
consistent with the amount of site binding observed in 
electrophoresis.
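
The minimization with respect to the end-to-end distance can be sketched with a toy free energy. The example below is not the biased random-walk calculation referred to above: it assumes a Gaussian-chain term plus a single screened Coulomb repulsion, with the hypothetical parameters charge_strength (repulsion strength in kT) and kappa_h0 (the screening parameter multiplied by the unperturbed end-to-end distance).

```python
import math


def chain_free_energy(x, charge_strength, kappa_h0):
    """Free energy (in kT) of a chain at expansion x = h/h0 (toy model).

    Gaussian-chain term plus one screened Coulomb repulsion acting
    over the end-to-end distance.
    """
    elastic = 1.5 * x * x - 2.0 * math.log(x)
    repulsion = charge_strength * math.exp(-kappa_h0 * x) / x
    return elastic + repulsion


def golden_minimum(func, low, high, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = low, high
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def equilibrium_expansion(charge_strength, kappa_h0):
    """Expansion factor that minimizes the toy free energy."""
    return golden_minimum(
        lambda x: chain_free_energy(x, charge_strength, kappa_h0), 0.05, 50.0)


if __name__ == "__main__":
    for strength in (0.0, 5.0, 20.0):
        x_eq = equilibrium_expansion(strength, 0.5)
        print(f"charge strength {strength}: h/h0 = {x_eq:.3f}")
```

In this sketch a larger effective charge expands the chain and stronger screening contracts it; reduced ionization or ion binding would enter only through a smaller charge_strength.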

The model as set forth in the preceding section may 
be applied to polyelectrolyte gels which may be
analogues of some of the functions of cell membranes. 
The basic concepts are the same as those already de- 
scribed, but the details are significantly affected by the 
crosslinking of the gels. The specific systems for which 
the theory was developed are strong […] ion
exchangers in equilibrium with various kinds of monatomic
counter ions. Examples of such systems are
sulfonated crosslinked polystyrene resins in equilibrium
with mixtures of alkali metal ions. There is
nothing in the theory restricting the analysis
to this case, and the general concepts are applicable to […]
pair. As before, it is assumed that the intrinsic free
energy associated with the formation of each type of 
ion pair has a value dependent upon the kind of counter 
ion and upon its concentration in the external solution. 
The ion-exchange selectivity is introduced by assuming 
different intrinsic free energies for the formation of 
different kinds of ion pairs. Exchange selectivity results 
only, however, when the conditions are such that sig- 
nificant numbers of ion pairs can be formed. 

Just as for linear polyelectrolytes, it is desired to 
compute the free energy for a given configuration and 
charge distribution of the gel. As pointed out, the 
Debye-Hückel screened Coulomb potential may be used
to calculate the electrostatic interactions. The crosslinked
systems, however, present problems not encountered
when dealing with linear polyelectrolytes.
Many exchange resins are crosslinked sufficiently
that there are reasonably well-defined regions inside and
outside the exchanger, respectively. The interior region 
is characterized by different ion concentrations and by 
the presence of enough organic matter to cause the 
solution to be quite different from that outside. The 
analysis confirming the use of the screened Coulomb 
potential shows that the screening constant to be
used must involve the proper dielectric constant for the 
interior region, and must recognize that the space 
occupied by organic matter is not accessible to the 
mobile ions. The bulk concentrations should be used, 
as the different ion concentrations within the resin are 
allowed for in the derivation of the screened Coulomb 
potential. 

The large concentration differences between regions 
internal and external to an ion-exchange resin have led 
many investigators to include an “osmotic” contribu- 
tion to the free energy when describing an ion-exchange 
resin on a phenomenological basis. The present model, 
however, is entirely molecular in character, and it 
would be incorrect to include such a contribution in 
addition to the direct calculation of the detailed inter- 
actions. The concentration differences are merely a 
consequence of the forces already considered. The 
point of view presented here removes a conceptual 
difficulty in treating resins of low crosslinking. For, if 
an osmotic description is used in preference to a detailed 
model, ambiguity arises when the degree of crosslinking 
is diminished to the point where the regions inside and 
outside the resin cease to be well defined. 


Although there is in principle a means of determining
the free energy at every specified configuration and
charge distribution, to simplify the calculation approximations
again must be made. The difference in configurational
properties of linear systems and gels suggests
that the approximations used here be somewhat
different from those of the preceding section. In a gel,
many of the spatially near-neighboring functional
groups are not near neighbors along the polymer chains,
so that it is definitely necessary to consider interactions
between groups which are not closely connected. In
fairly concentrated gels, near neighbors to a given group 
in general are distributed so that most motions at con- 
stant gel volume do not affect the electrostatic energy 
seriously, with some pairs of charges approaching more 
closely as others are carried further apart. This is a 
radically different state of affairs from that of a linear 
polymer, where at a constant end-to-end distance the 
electrostatic energy depends critically upon the local 
coiling of the chain. It is assumed, therefore, without
gross inconsistency, that all gel configurations of the 
same volume are electrostatically equivalent, and that 
the interaction between the charge state of the gel and 
its configuration depends only upon the gel volume. 
For the purpose of calculating the electrostatic inter- 
actions, the functional groups of the gel are assumed 
to be on a lattice whose scale is a function of the gel 
volume. The distributional free energy of the gel then 
can be calculated subject to the same sort of approximations
as enters into the simple cell theories of liquids.

The calculation of the gel free energy is complete 
when the configurational free energy and the electro- 
static free energy are each characterized as a function 
of gel volume. As pointed out above, the electrostatic 
interactions may be handled by the methods of the 
cell liquid theory. The analogy is exact if ion-paired 
groups are identified as the holes of the liquid theory, 
and the charges as occupied cells. The details have been 
given elsewhere. The configurational free energy is
simply that of an uncharged gel, so that the theory of 
rubber elasticity may be applied. We have found it 
convenient to use the theory in the form given by 
Flory.²⁶

From the free-energy function determined in the 
manner just described, one can relate the external solu- 
tion composition to the volume and to the ion pairing 
in a resin. The model differs from the traditional 
approaches in that the only source of selectivity enters 
through the intrinsic ion-pairing constants. It has been 
assumed that the size of the ions enters in no other way. 
Another difference is the assumption that there is an 
equilibrium between paired and free charged groups, 
with only the paired groups contributing to a selective 
effect. This concept leads to the conclusion that ex- 
change selectivity is increased when conditions are 
such that ion pairing increases. Since the ion pairing 
occurs only when necessary to reduce strong electrostatic
fields, one may understand why resins of high
exchange capacity should be more selective than resins
of lower capacity. Likewise, resins unable to reduce
their electrostatic interactions through expansion because
of high crosslinking are more selective than loosely
crosslinked resins which easily can swell.

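
The connection between ion pairing and selectivity can be made concrete with a two-ion toy model. The sketch assumes that paired sites take up counter ions A and B in proportion to hypothetical intrinsic pairing constants k_a and k_b, while unpaired sites are balanced by counter ions in the same proportion as the external solution. These assumptions, and the names used, are illustrative only and do not reproduce the free-energy calculation described above.

```python
def resin_fraction_a(solution_fraction_a, paired_fraction, k_a, k_b):
    """Fraction of resin counter ions that are species A (toy model)."""
    x = solution_fraction_a
    # Paired sites discriminate between A and B through k_a and k_b.
    bound_a = k_a * x / (k_a * x + k_b * (1.0 - x))
    # Unpaired sites are assumed non-selective.
    return paired_fraction * bound_a + (1.0 - paired_fraction) * x


def selectivity_coefficient(solution_fraction_a, paired_fraction, k_a, k_b):
    """Ratio (A/B in resin) divided by (A/B in solution)."""
    x = solution_fraction_a
    y = resin_fraction_a(x, paired_fraction, k_a, k_b)
    return (y / (1.0 - y)) / (x / (1.0 - x))


if __name__ == "__main__":
    for paired in (0.0, 0.25, 0.5, 1.0):
        coeff = selectivity_coefficient(0.5, paired, 3.0, 1.0)
        print(f"paired fraction {paired}: selectivity = {coeff:.3f}")
```

The selectivity coefficient rises from unity with no pairing toward k_a/k_b when every site is paired, in line with the statement that selectivity increases when conditions favor ion pairing.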
Flexible polyelectrolytes have been analyzed here in 
terms of ion-pair formation. There is some controversy 
on this point, though the author believes the available
evidence supports the concept of ion pairs. Further
discussion of this topic may be found elsewhere.

The preceding description of flexible polyelectrolytes
may be extended easily to polyampholytes. No new 
considerations are necessary, excepting those already 
discussed for rigid polyampholytes—i.e., fluctuation 
attraction, local charge structure and its effect on titra- 
tion behavior, and possible ion binding. However, for 
flexible polymers, it must be remembered that the 
internal molecular configurations are subject to the 
forces arising from these phenomena, a complication 
absent in the case of rigid ions. A quantitative formula- 
tion of the models as discussed in this section is con- 
sidered next. 


III. SUMMARY OF THE MATHEMATICAL
THEORY OF RIGID POLYIONS


Consider as the first and simplest case a rigid polyion 
of arbitrary shape in a volume v of electrolyte. For
simplicity, attention is restricted to the case where the 
charge on the polyion is due to the adsorption or dis- 
sociation of groups of one species of charge. The exist- 
ence of adsorbed ions also implies the existence of a 
dissociation equilibrium which may be characterized, 
of course, by the change in chemical potential per ion 
adsorbed. If all of the ions are discharged, the Helm- 
holtz free energy may be written 


$$ A_0(n) = \sum_{i=0} N_i\,\mu_i(0) + a_0(n), \qquad a_0(n) = \int [\ldots]\,\nu\,ds, \tag{1} $$


where $a_0(n)$ is the change in free energy when n discharged
ions are adsorbed on the polyion, $\mu_i(0)$ is the
chemical potential of an ion of species i when there are
no adsorbed ions, $\nu$ is the surface density of adsorbed
ions, and the integration is over the surface of a macromolecule.
Each element of volume containing ions interacts
not only with the charges on the polyion surface,
but also with other parts of the medium external to the
macromolecule. If $\mu_i$ is the chemical potential of ionic
species i,


i ‘ 
remta: f Oma, » > @) 

0 
where $\lambda$ is the fractional charge on an ion (a hypothetical
charging process is used to calculate the change
in free energy due to electrostatic forces), $\chi_i(1)$ is the
free energy arising from the interaction of the fully
charged ($\lambda = 1$) central ion and its ionic atmosphere,
and $\varphi_i(n)$ is the average electrostatic potential in the
bulk of the solution at very large distances from the
polyion. If the average electrostatic potential at the
point r is $\Psi_i(r)$, one has

$$ \Psi_i(r) = \psi_i(r) + \varphi_i(n), \tag{3} $$

with $\psi_i(r)$ the change in average potential experienced
by ion i in being brought from the bulk of the medium
to the point r. The total change in Helmholtz free
energy due to the excess charge and hence excess electrostatic
interaction of the double layer over that in
the uniform bulk medium easily is seen to be


A= f ale q: | nadota: f somas), (4) 
0. 0 v s 


where it is assumed that the surface charge is due to
species 1. Note that Eq. (4) is of the simple form,
charge multiplied by potential, with $\rho_i(r)$ the volume
density of ions of species i at r and $\rho(s)$ the corresponding
surface-charge density. The total free energy is then
(solvent is represented as species zero)


$$ A(n) = N_0\mu_0(0) + \sum_{i \ge 1} N_i\mu_i + a_0(n) - n\chi_1^{\mathrm{bulk}}(1) + n\chi_1^{\mathrm{surface}}(1) + A_e(n). \tag{5} $$


To determine the equilibrium number of adsorbed 
ions, the free energy represented in Eq. (5) must be 
minimized with respect to n, giving the result 


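$$ \chi_1^{\mathrm{bulk}}(1) - \chi_1^{\mathrm{surface}}(1) = \frac{\partial a_0}{\partial n} + \frac{\partial A_e}{\partial n} + \frac{\partial}{\partial n}\sum_{i \ge 1} N_i\mu_i, \tag{6} $$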
with the last term accounting for the change in composition
and hence mean potential of the bulk medium
when n ions are adsorbed.

Consider now the interaction of two polyions, fixed
in position a distance R apart. The free energy of this
system differs from the preceding only in that the
relevant mean potentials must be obtained as a function
of R. One may write


$$ A(n,R) = N_0\mu_0(0) + \sum_{i \ge 1} N_i\mu_i + 2a_0(n) - 2n\left[\chi_1^{\mathrm{bulk}}(1) - \chi_1^{\mathrm{surface}}(1)\right] + A_e(n,R), \tag{7} $$


where $A_e(n,R)$ is to be computed now for two double
layers, and the force between the macroions is


ðA (n,R) ð 
(5) = a e(n, R) 
OR n=Neq OR 


+E af [am —-60,0)4| > (8) 


n=Neq 


The derivatives are to be evaluated along the equilibrium-adsorption
isotherm which may be a function
of the interparticle separation. As before, the second
term on the right-hand side of Eq. (8) is the correction
arising from the change in mutual energy of the small
ions of the solution. To implement this general result,
the right-hand side of Eq. (8) must be evaluated. Solution
of the linearized Poisson-Boltzmann equation
gives the following complicated results: The average
potential of each particle may be shown to be


HC) 


exp 
1+ , (9) 
(R/a) (1-+ka) 


where a is the radius of a spherical macroion and $\kappa$ the
usual screening parameter. The interaction energy has
two terms,
For very large R,


ôWa(R)~— (13) 


30k0q17q2 C) 
D’akT (ka) \R 
and both forces are repulsive. 

Direct numerical integration of the complete Poisson- 
Boltzmann equation leads to potential distributions 
differing quantitatively but not qualitatively from that 
in Eqs. (10) and (11). Graphical and numerical results 
can be found in the paper by Hoskin and Levine.

At most distances, $\delta W_1 > \delta W_2$, so that the dominant
repulsive potential is given by Eq. (10), but at very
large distances the term of Eq. (12) is dominant. The
potential, Eq. (10), was considered by Verwey and
Overbeek and shown to lead to quantitative description
of the stability of lyophobic colloids with respect to 
changes in ionic strength, coagulation, etc. For further 
details, the reader is referred to their book. 

A rather different study of the nature of the repulsive 
interactions was undertaken by Steiner. Because of
the strength of the repulsive forces, there is considerable 
ordering of the polyions in solution. Consequently, by 
inversion of the angular distribution of scattered light 
from suitably chosen colloidal solutions (i.e., those that 
exhibit diffraction due to interparticle interference), 
the radial-distribution function may be determined 
experimentally. Steiner has carried out such a program 
for AgI sols in which the charge is due to the adsorption 
of I⁻. Figure 1 shows the radial-distribution function deduced
by Steiner (reproduced from his Ph.D. dissertation).
The structure of the solution alters with
dilution and ionic strength in the expected manner. It
is found that the strong repulsive forces prevent the
particles from approaching closer than 1500 Å at low
ionic strengths. The maximum in the distribution function
occurs at about 3000 Å, so the long range of the
repulsive force is not unreasonable. Further, the shift
in position of the maximum in the distribution function
is approximately of the magnitude expected from the
concentration change, and the addition of a quantity
of bivalent cation produces an effect roughly equivalent 
to that of one hundred times its concentration of 
monovalent cation, in accordance with the Schulze- 
Hardy rule. 

Fig. 1. A typical radial-distribution function for AgI sol.

Quantitative examination of the data reveals, however,
deviations from the theoretical predictions. From
the shape of the radial-distribution function, it is
known that the forces are repulsive. To determine the
nature of these forces, one may plot log(V/kT) against
log R (appropriate to a repulsion proportional to $R^{-n}$)
and log(RV) against R (appropriate to a repulsion of
the form $e^{-\kappa R}/R$). Neither one of these plots is at all
linear, nor does either one appear to be rectifiable by
the addition or subtraction of linear parts. It is possible,
even probable, that this deviation is due to the known
polydispersity of size among the colloidal particles.
Certainly, though the results are fragmentary, the repulsive
forces calculated by Levine and by Verwey
and Overbeek are not of the form found in experiments 
of Steiner. A discrepancy other than that due to poly- 
dispersity would be expected to show up in the radial- 
distribution function owing in part to the linearization 
of the Poisson-Boltzmann equation. Still another pos- 
sible source of error is the neglect of third-body polyions 
in the calculation of the potential of mean force, since, 
in the solutions investigated, the range of the force is 
great enough to involve simultaneous interaction of 
more than two particles. Despite the apparent sophisti- 
cation of the mathematical treatment, it is evident 
that much work remains to be done before quantitative 
agreement with experiment is attained. 
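
The two linearity tests described in this paragraph are easily sketched. The example below fits straight lines to log V against log R and to log(RV) against R for synthetic data generated from a screened Coulomb repulsion; the data, the amplitude 5, and the screening parameter 0.8 (in arbitrary reciprocal length units) are invented for illustration and are not Steiner's measurements.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line through (xs, ys): returns slope, intercept, r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy * sxy / (sxx * syy)


def power_law_test(distances, energies_kt):
    """Fit log(V/kT) against log R; linear if V is proportional to R**(-n)."""
    return linear_fit([math.log(r) for r in distances],
                      [math.log(v) for v in energies_kt])


def screened_coulomb_test(distances, energies_kt):
    """Fit log(R*V/kT) against R; linear if V is proportional to exp(-kappa*R)/R."""
    return linear_fit(list(distances),
                      [math.log(r * v) for r, v in zip(distances, energies_kt)])


if __name__ == "__main__":
    # Synthetic screened Coulomb data (hypothetical amplitude and kappa).
    radii = [1.5 + 0.25 * i for i in range(13)]
    energies = [5.0 * math.exp(-0.8 * r) / r for r in radii]
    print("power-law fit (slope, intercept, r^2):", power_law_test(radii, energies))
    print("screened fit  (slope, intercept, r^2):", screened_coulomb_test(radii, energies))
```

For data that really follow the screened form, the second fit returns a slope equal to minus the screening parameter and a coefficient of determination of essentially unity, while the first does not. The text reports that neither plot is linear for the measured sols.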


IV. RIGID POLYAMPHOLYTES


Polyampholytes contain both positive and negative 
charges on the surface of the macromolecule. It is 
fairly obvious that, far from the isoelectric point when 
charge of only one sign is present, the considerations 
presented in the previous section suffice to describe the 
interactions. At the isoelectric point where the average 
net charge is zero, conditions are rather different. 

Consider two polyampholytes with their centers of 
mass separated by the distance R. The interaction be- 
tween the two molecules may be written 


$$ V = \sum_{i=1}^{Z_1}\sum_{j=1}^{Z_2} \frac{q^2\, n_i^{(1)} n_j^{(2)} \exp[-\kappa(R_{ij}^{(12)} - a)]}{D R_{ij}^{(12)}}, \tag{14} $$

where there are $Z_1$ and $Z_2$ sites on polyions one and two,
respectively, q is the magnitude of the charge, $R_{ij}^{(12)}$ is
the separation of site i on molecule one from site j on
molecule two, and $n_i^{(1)}$ and $n_j^{(2)}$ are occupation variables
which can have the values zero for an uncharged site
and plus or minus unity for positive and negative
charges on the specified site. Now the potential of
mean force is defined by 


$$ \exp(-W/kT) = \langle \exp(-V/kT) \rangle, \tag{15} $$
where the average is taken as unweighted and with 
molecules one and two fixed. It may be shown readily 
that the potential of mean force may be expressed in 
terms of unweighted averages of the potential energy 
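V. We use the truncated form,

$$ W = \langle V \rangle - \frac{1}{2kT}\left[\langle V^2 \rangle - \langle V \rangle^2\right], \tag{16} $$

correct to terms of order $(kT)^{-1}$. At the isoelectric
point, the average net charge is zero and the first term
vanishes, as do all powers of $\langle V \rangle$. Now define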


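$$ q_i^{(1)} = q\,n_i^{(1)} = \langle q_i^{(1)} \rangle + \Delta q_i^{(1)}, \qquad \Delta q_i^{(1)} = q\,n_i^{(1)} - \langle q_i^{(1)} \rangle = q\,\Delta n_i^{(1)}. \tag{17} $$

The use of Eq. (14) in Eq. (16) gives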


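$$ W(R) = -\frac{q^4}{2kT D^2 R^2} \sum_{i=1}^{Z_1}\sum_{l=1}^{Z_1}\sum_{k=1}^{Z_2}\sum_{s=1}^{Z_2} \langle n_i^{(1)} n_l^{(1)} n_k^{(2)} n_s^{(2)} \rangle\, e^{-2\kappa(R-a)} \left\langle \frac{R^2}{R_{ik} R_{ls}} \right\rangle, \tag{18} $$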
where the last factor accounts for differences in charge-charge
separation due to distribution of charges on the
surface of the molecule. With the substitution of Eq.
(17) into Eq. (18), the potential of mean force becomes
(setting $\langle R^2/R_{ik}R_{ls} \rangle = 1$, valid at large separations)


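$$ W(R) = -\frac{q^4 e^{-2\kappa(R-a)}}{2kT D^2 R^2} \sum_{i,l=1}^{Z_1}\sum_{k,s=1}^{Z_2} \left[\langle n_i^{(1)} n_l^{(1)} \rangle - \langle n_i^{(1)} \rangle \langle n_l^{(1)} \rangle\right] \left[\langle n_k^{(2)} n_s^{(2)} \rangle - \langle n_k^{(2)} \rangle \langle n_s^{(2)} \rangle\right]. \tag{19} $$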
This term represents the interactions between the
fluctuating charges and fluctuating multipoles of both
molecules. The charge fluctuations may be obtained
easily by differentiation of the Grand Partition Function,
as is seen later. To calculate the excess chemical
potential one uses the relation
` : o 
=r] | Rieu R—-1]R 
where, in the usual notation,

$$ \mu_1 = \mu_1^0(T,p) + kT \ln c_1 + \mu_1^E, \qquad \mu_1^0(T,p) = \lim_{c_1 \to 0}\left[\mu_1 - kT \ln c_1\right], $$

$$ g_{10}(R) = \exp\left[-\frac{W^{(10)}}{kT}\right], \qquad g_{11}(R) = \exp\left[-\frac{W^{(11)}}{kT}\right], $$

with $W^{(10)}$ and $W^{(11)}$ the potentials of mean force between
a macromolecule and a solvent molecule and
between two macromolecules, respectively. The dimensions
of a solvent molecule are small relative to those
of the macromolecule in almost all cases, and the solvent
plays the role of a dielectric continuum. One finds, 
therefore, after substitution and integration, that 


(21) 
where the macroion contribution to the ionic strength
is neglected. 

At high salt concentrations, $\kappa_0$ is large and the second
term is negligible relative to the first term. Under these 
conditions, the excess chemical potential is positive, 
representing the net repulsive forces which result in an 
excluded volume. When the salt concentration is very 
low, the second term dominates and the excess chemical 
potential is negative. Careful light-scattering experi- 
ments have verified Eq. (22) quantitatively. The 
magnitude of the fluctuating charge may be obtained 
independently from the titration curve of the poly- 
ampholyte and is in complete agreement with the 
values determined from the independent light-scattering 
experiments.
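
The quality of the truncation in Eq. (16) can be checked by direct enumeration for a very small system. In the sketch below each molecule carries two sites whose occupation variables take the values -1, 0, and +1 with equal weight, and the site-site energies (in units of kT) are hypothetical numbers chosen to be small.

```python
import math
from itertools import product


def potential_of_mean_force(couplings_kt, states=(-1, 0, 1)):
    """Exact and truncated potentials of mean force, in units of kT.

    couplings_kt[i][j] is the energy/kT between site i of molecule one and
    site j of molecule two when both carry a unit charge of the same sign.
    Returns (exact, truncated, mean_energy).
    """
    z1 = len(couplings_kt)
    z2 = len(couplings_kt[0])
    energies = []
    for n1 in product(states, repeat=z1):
        for n2 in product(states, repeat=z2):
            energies.append(sum(couplings_kt[i][j] * n1[i] * n2[j]
                                for i in range(z1) for j in range(z2)))
    count = len(energies)
    mean_v = sum(energies) / count
    mean_v2 = sum(v * v for v in energies) / count
    # Exact: exp(-W/kT) is the unweighted average of exp(-V/kT).
    exact = -math.log(sum(math.exp(-v) for v in energies) / count)
    # Truncated: W = <V> - (<V^2> - <V>^2) / (2 kT).
    truncated = mean_v - 0.5 * (mean_v2 - mean_v ** 2)
    return exact, truncated, mean_v


if __name__ == "__main__":
    exact, truncated, mean_v = potential_of_mean_force([[0.1, 0.05], [0.05, 0.1]])
    print(f"<V>/kT = {mean_v:.2e}")
    print(f"exact W/kT = {exact:.6f}, truncated W/kT = {truncated:.6f}")
```

With equal weights for positive and negative charge the mean energy vanishes, and the remaining fluctuation term is negative, that is, attractive, as described for the isoelectric point.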

A very interesting application of this theory has been 
presented by Kirkwood in a discussion of enzyme
kinetics. Suppose the enzymatic reaction can be de- 
scribed by the Michaelis-Menten relation 


$$ -\frac{d[S]}{dt} = \frac{k_3 [E][S]}{K_m + [S]}, \tag{23} $$

in which [S] and [E] are the substrate and enzyme
concentrations, $K_m$ the Michaelis-Menten constant, and
$k_3$ the intrinsic rate constant for the enzymatic reaction.
Now the potential of mean force is the work required 
to bring a pair of molecules from infinite separation to 
the distance R. If the standard free energy of formation 
of a complex is $\Delta G^0$, then this is equal to the potential
of mean force at contact, i.e.,

$$ \Delta G^0 = W(a), \tag{24} $$

and, therefore,

$$ kT \ln K = -W(a), \tag{25} $$
where K is the equilibrium constant for the formation
of the complex. Let the complex now undergo reaction
through the intermediary of a transition state with free
energy of activation $\Delta G^\ddagger$. Thus,

$$ \Delta G^\ddagger = W^\ddagger(a^\ddagger) - W(a). \tag{26} $$

Suppose the substrate molecule is specifically bound to
the active site of the enzyme by local forces with potential
$W_0$, in which the protein does not participate.
In the activated state, the potential is denoted by $W_0^\ddagger$,
the subscript zero again indicating that protein forces
are omitted. Under the influence of the protein, the
local forces are different and the corresponding potentials
are denoted $W'$ and $W^\ddagger$. If $K_m$ and $K_m^0$ and $k_3$
and $k_3^0$ are the Michaelis-Menten constants and intrinsic
rates of reaction with and without protein
participation, then


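$$ kT \ln\frac{K_m}{K_m^0} = W' - W_0, \qquad kT \ln\frac{k_3}{k_3^0} = \Delta W - \Delta W^\ddagger, \tag{27} $$

where $\Delta W = W' - W_0$ and $\Delta W^\ddagger = W^\ddagger - W_0^\ddagger$, and
where the intrinsic rate of decomposition of ES into
reactants is assumed to be small relative to $k_3$.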

If the protein contains Z groups with acid dissociation
constants $K_i$, and if the substrate molecule has
dipole moments $\mu'$ and $\mu^\ddagger$ in the normal and activated
states, then neglecting all forces except those arising
from charge fluctuations leads to


$$ W' - W_0 = -\frac{1}{2kT} \sum_{i=1}^{Z} \frac{K_i[\mathrm{H}^+]}{(K_i + [\mathrm{H}^+])^2}\, \frac{q^2 \mu'^2 \cos^2\gamma_i}{D_e^2 R_i^4}, \tag{28} $$


where $\gamma_i$ is the angle between the dipole moment of
the substrate molecule and the radius vector $R_i$ from
site i, $D_e$ is the effective dielectric constant, and the
first factor arises from the evaluation of $\langle n_i^2 \rangle$, as is seen
in a subsequent section. Note that electrostatic interactions
between charges on the protein again are
neglected.

Since the fluctuation potentials fall off as $R^{-4}$, one
may simplify Eq. (28) by assuming that only nearest
neighboring groups are effective. If there are $Z_a$ nearest
neighboring groups with dissociation constants $K_a$,


Zap? Kel H+] 
From Eqs. (29) and (27),
ks Zag(u*—p?) s Kal H+] 
og—= : 
(30)
Note that, for this simplified model, $k_3$ has a maximum
when the pH is equal to pK. Further, the pH dependence
predicted by the model is a symmetric bell-shaped
curve. If the electrostatic interactions between charges
on the surface of the protein are accounted for, then the
curve may be skewed since the proton population on the
molecular surface, and hence the free energy of activation,
are not symmetric functions of the pH. Kirkwood
has made a crude comparison of the theory with some
experimental data of Bergmann and Fruton on the
effect of pH on the pepsin hydrolysis of carbobenzoxy-l-
glutamyl-l-tyrosine. The maximum rate occurs at pH 4,
thereby identifying the participating groups as carboxylates.
With an estimated dielectric constant and
an estimated dipole moment difference, the fall in rate
to one-half of its value in an interval of one pH unit is
accounted for if there are about ten carboxyl groups at
an average distance of 5 Å from the adsorption site.
This picture is qualitatively reasonable. It should be 
stressed that we have considered only one of many 
possible contributory forces operative in enzyme re- 
actions and it is not to be implied that the charge- 
fluctuation mechanism is the only or even the most 
important of the possibilities. The charge-fluctuation 
mechanism, however, does provide a physical picture 
in qualitative accord with experiment. 
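
The factor K[H+]/(K + [H+])^2 that multiplies each term of Eq. (28) is responsible for the symmetric bell shape noted above. The short sketch below evaluates that factor for a single group; the function name and the choice pK = 4 are assumptions for illustration.

```python
def charge_fluctuation(ph, pk):
    """Fluctuation factor K[H+]/(K + [H+])**2 for one ionizable group."""
    hydrogen = 10.0 ** (-ph)
    k_diss = 10.0 ** (-pk)
    return k_diss * hydrogen / (k_diss + hydrogen) ** 2


if __name__ == "__main__":
    for ph in (2.0, 3.0, 4.0, 5.0, 6.0):
        print(f"pH {ph}: factor = {charge_fluctuation(ph, 4.0):.4f}")
```

The factor reaches its maximum value of one quarter at pH = pK and takes equal values at equal distances on either side, so any quantity that depends on pH only through it is symmetric about the pK.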

It should be apparent by this time that the calcula- 
tion of the properties of an isolated polyion eventually 
requires the computation of the mutual electrostatic 
energy of the charges on the molecule. Until very re- 
cently, it has been customary to calculate this electro- 
static free energy for spherical macroions by assuming 
the charge to be uniformly distributed over the surface, 
and the macromolecule to be impenetrable by the small 
ions of the bulk medium. The result obtained is, 


$$ A_{el}^{\mathrm{sphere}} = \frac{Z^2 q^2}{2Da}\left[1 - \frac{\kappa a}{1 + \kappa a}\right], \tag{31} $$


where Zq is the net charge on the ion. Aside from the
questions of penetrability which introduce no basically
new features, the assumption of a uniform charge distribution
is grossly inaccurate. The calculation of the
electrostatic interaction energy for an arbitrary distribution
of positive and negative charges is extremely
difficult. Basically, the method consists in the direct
counting of all configurations with given total charge
and weighting each configuration with the appropriate
Boltzmann factor. Fortunately, for small numbers of
charges, there are many configurations of equal energy
and this degeneracy simplifies the calculation. With the
energy zero taken as the completely discharged protein,
the work required to charge a polyampholyte sphere
with $P_n(\cos\theta_{kl})$ as the Legendre polynomial of order n;
$r_k$, $r_l$, $r_{kl}$, and $\theta_{kl}$ are, respectively, the distances of
charges k and l from the center of the polyion, the distance
between charges k and l, and the angle between
the vectors $r_k$ and $r_l$. To this electrostatic energy must
be added the chemical-potential change due to the 
intrinsic change in free energy on ionization. This total 
free energy then represents a protein with specified 
charge distribution, and to obtain the relevant macro- 
scopic free energy, an average over the charge distribu- 
tion must be performed. The theory has been applied to 
models of proteins in which the charges were variously 
located at the vertices of a cube and dodecahedron.
The results of these laborious calculations may be sum- 
marized as follows. The titration curve depends 
markedly upon the depth of the charges below the sur- 
face and upon the distribution of charges. As expected, 
the interaction energy is smallest for a regular, uniform 
distribution, larger for a random distribution, and very 
large if charges are crowded at one end of the molecule 
and the charge distribution made nonuniform. Arguments
can be offered which suggest that the charge
[…] the surface. All deviations from the electrostatic
energy given in Eq. (31) can be accounted for in
terms of the charge distribution on the molecule. Great
[…] in interpreting variations in titra-
V. FLEXIBLE POLYELECTROLYTES


In turning to a consideration of the properties of
flexible polyelectrolytes, a complicating feature appears. 
This is the change in mean dimensions of the molecule 
when the ionic strength or charge density is altered. As 
stated earlier, the change in configuration is caused by a 
readjustment between the contractile forces due to the 
Brownian motion of the stretched chain, a force of 
entropic origin, and the repulsive forces due to the 
electric charges. Using the procedures discussed in the 
preceding sections, it may be shown that the Helmholtz 
free energy of an independent polyion immersed in a 
medium of small ions is of the following form:

$$A = A_1 + A_2 + A_3 + A_4. \qquad (34)$$
In Eqs. (35), $\alpha'$ is the degree of ionization of the
macromolecule and $f$ the fraction of dissociated sites
occupied by bound-ion pairs. The degree of neutralization
$\alpha$ then is related to the degree of ionization by the
conservation condition $\alpha' = \alpha(1 - f)$, and the fraction of
bound sites referred to the total number of sites of the
polymer will be $\alpha f$. $A_1$ describes the free energy of the
free hydrogen ions and counterions eligible to take part
in binding and neutralization phenomena. $A_2$ represents
the electrical free energy of the net charge of the polyion,
regarding the ion pairs as uncharged sites, and thus
includes the interaction energy between portions of the
net charge of the polymer and the entropy associated
with the mixture of charged and uncharged sites along
the polymer chain. $A_3$ includes the chemical free energy
associated with the state of ionization and binding of
the polymer, computed from a reference state in which
the polyion is in its completely undissociated form.
Also included in $A_3$ is the free energy of mixing of the
bound ion pairs and un-ionized groups among the
uncharged sites of the chain. Finally, the last contribution,
$A_4$, is the intrinsic free energy of the polymeric skeleton,
a function of its configuration. Of the remaining
undefined symbols, $S_m$ is the entropy of mixing uncharged
ion-paired and un-ionized groups, Ka and K? are the
intrinsic dissociation constants of the acid- and ion-
paired groups, $Z$ is the number of Kuhn statistical
elements in the randomly coiled chain, and $u(\gamma)$ is the
interaction potential between neighboring statistical
elements making an angle $\gamma$ at their junction.

From Eqs. (34) and (35), the equilibrium root-mean- 
square end-to-end separation of the chain and the ex- 
tent of counterion binding may be evaluated by straight- 
forward minimization. The agreement between experi- 
ment and theory is semiquantitative, as can be seen in 
Figs. 2 and 3 which show the amount of binding and 
the expansion of sodium polyacrylate and sodium 
carboxymethyl cellulose, respectively. Lengthy dis- 
cussion of the suitability of the ion-pairing concept can 
be found elsewhere, and space limitations prevent any 
amplification of these arguments. Suffice it to say that
this author believes the concept to be useful and at
least partially supported by experiment, whereas other
explanations (such as ion trapping in the region where
$e\psi/kT > 1$) of the intimate association of counterions
with the polyion cannot explain all the relevant data.

It is apparent that the permeability of the open net- 
work representing the expanded polymer coil permits 
the penetration of electrolyte and the consequent re- 
duction of the electrostatic interactions between poly- 
ions. Several approaches to the solution of the
Poisson-Boltzmann equation have been published. 
Despite widely differing approximations, the conclusion 
of all investigators is that the domain occupied by the 
polyion is essentially neutral and polyion-polyion inter- 
actions, therefore, are dominated by the excluded 
volume-type interactions characteristic of neutral poly- 
mers. This is in complete agreement with the data of 
Schneider and Doty.

Fig. 2. Fraction of counterions bound to sodium polyacrylate
as a function of the degree of neutralization. Solid line is
theoretical curve. The dashed curve is from the experimental data of
Huizenga, Grieger, and Wall.

Fig. 3. Expansion of sodium carboxymethyl cellulose as a
function of ionic strength. The curves are calculated assuming the
amounts of counterion binding indicated. The circles are the
experimental points of Schneider and Doty.

From the microscopic point of view, there is a very
important difference between solutions of polyions and
solutions of ordinary electrolytes. In an ordinary electrolyte
solution, the medium is essentially homogeneous,
and the work required to bring an ion from infinity to 
r does not depend upon r (always neglecting boundary 
effects). In a solution of macroions, the internal region 
of any polyion is, roughly speaking, a solution of much 
higher ionic strength than the external medium. Thus, 
the work required to bring a charge to the point r 
depends upon whether r is within or without the volume 
occupied by the macroion. The extra work required in 
this case corresponds exactly to the difference in ion- 
atmosphere interaction corresponding to the two differ- 
ent concentrations of electrolyte. Because of counter-
ion binding, the potential inside a polyion is actually
much smaller than would appear to be the case. In 
fact, marked deviations from a linearized Poisson- 
Boltzmann equation are expected only in the vicinity 
of the unit charges of the polymer, and these deviations 
are no more serious than those in solutions of small 
electrolytes. Solution of the linearized Poisson-Boltz- 
mann equation (with ion pairing) shows that the 
internal and external interactions are differently 
screened, corresponding to the crude picture of a 
drop of concentrated electrolyte suspended in dilute 
electrolyte. Of course, when the charge density is small, 
or the polymer highly extended, the different screening 
lengths approach equality. Nevertheless, there are 
systems, such as polyelectrolyte gels, for which the 
difference is of great importance. The conclusions 
sketched above also may be arrived at from the dia- 
metrically opposed assumption that, to a first approxi- 
mation, the polymeric volume is electrically neutral, 
and from calculating the small potential difference
between the interior and the bulk medium from a 
Donnan equilibrium condition. That the same conclu-
sions are reached when two such widely differing
approximations are used lends support to the conclu- 
sions as general results. 
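
The picture of a drop of concentrated electrolyte suspended in dilute electrolyte can be made concrete with the Debye screening length of the linearized Poisson-Boltzmann theory. What follows is only a minimal sketch: it assumes a 1:1 electrolyte in water at 25 °C with relative permittivity 78.4, and the two ionic strengths in the usage lines are hypothetical values chosen for illustration, not figures taken from the text.

```python
import math

# Physical constants in SI units.
E_CHARGE = 1.602176634e-19    # elementary charge, C
K_BOLTZ = 1.380649e-23        # Boltzmann constant, J/K
N_AVOGADRO = 6.02214076e23    # Avogadro constant, 1/mol
EPS_0 = 8.8541878128e-12      # vacuum permittivity, F/m


def debye_length_nm(ionic_strength_molar, temperature_k=298.15, rel_permittivity=78.4):
    """Debye screening length in nm for a given ionic strength in mol/L.

    Uses kappa^2 = 2 e^2 N_A I / (eps0 eps_r k T), with I converted to mol/m^3.
    The default temperature and permittivity are assumptions (water, 25 C).
    """
    ionic_strength_si = 1000.0 * ionic_strength_molar  # mol/L -> mol/m^3
    kappa_sq = (2.0 * E_CHARGE ** 2 * N_AVOGADRO * ionic_strength_si
                / (EPS_0 * rel_permittivity * K_BOLTZ * temperature_k))
    return 1.0e9 / math.sqrt(kappa_sq)


if __name__ == "__main__":
    # Hypothetical external medium and hypothetical polyion interior.
    external = debye_length_nm(0.01)
    internal = debye_length_nm(1.0)
    print(f"external screening length: {external:.2f} nm")
    print(f"internal screening length: {internal:.2f} nm")
```

Because the screening length varies as the inverse square root of the ionic strength, a hundredfold difference in concentration between interior and exterior changes it by a factor of ten, which is the sense in which internal and external interactions are differently screened.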


Fig. 4. Expansion as a function of pH for several ionic strengths
for a regularly alternating copolymer; $K_1^0$ = [...], $K_2^0$ = [...],
$K_3^0 = K_4^0$ = [...]; charge spacings 6 and 8 Å for nearest and next nearest
neighboring groups. For further details, see reference 20.

While the forces which are operative in polyampholyte
systems are qualitatively no different from those
in purely acidic or basic polyelectrolytes, significant 
new problems arise because of the inhomogeneity of the 
polymer skeleton. Obviously, far from the isoelectric 
point, the polyampholyte behaves as an ordinary poly- 
electrolyte. On the other hand, in the nearly isoelectric 
region where appreciable numbers of positive and 
negative charges can coexist, these complications 
dominate the situation. 

One may proceed to calculate the titration curve
and other thermodynamic properties by exactly the 
same technique as used before, taking cognizance of 
both the existence of two types of charge and the skeletal 
distribution. At the isoelectric point, it is no longer 
true that all sites capable of bearing a charge are the 
same. Even though there is no net charge on the poly- 
mer, forces acting between the elements may arise from 
the two following distinct sources: (a) fluctuations of 
the charge of each statistical element about its mean 
value; (b) correlations in charge distribution because of
the polymerization statistics.
From the properties of the Grand Partition Function, 
it may be readily shown that 
for the mean-square charge on a statistical element due
to fluctuations and for the hydrogen-ion activity at the
isoelectric point, [...]

[...] evaluation of the degree of dissociation of acid [...]
of binding to acid groups, $\alpha_1' = \alpha_1(1 - f)$, and
[...] requires specific consideration of
skeletal structure. The dissociation constants $K_1^0$, $K_2^0$,
$K_3^0$, $K_4^0$ are for the acid, base, acid-counterion, and base-
counterion pairs.

Consider the simplest possible case, when the acid and 
base groups alternate regularly along the polymer 
skeleton. It is found that: 


(a) The counterion binding does not cause a large 
change in the titration curve near the isoelectric point. 
This is due to the stabilizing influence of adjacent
opposite charges. When the charge is predominantly of 
one sign, the polyampholyte behaves essentially like a 
one-component polyion. 

(b) The expansion of the polyion is not strongly 
dependent upon the ionic strength at a given pH. This 
is due to the fact that as the ionic strength increases, 
tending to weaken the force expanding the polymer, the
charge which can be supported on it also increases, 
since the group interactions are decreased. This latter 
tends to counterbalance the first effect. 

(c) Since the binding of counterions is small, the 
effective net charge is essentially identical with the 
number of molecules of acid or base added to the 
solution. The effects of added salts are to shift the pH 
at any given degree of neutralization towards the iso- 
electric point, creating a curve characteristic of weaker 
acidic and basic groups. The primary source of this
shift in the titration curve is electrostatic interaction
among segments of the polymer, not
binding. This effect is, of course, similar to the observed
effects in proteins. 

(d) Effects of correlations due to polymerization 
statistics vanish identically, and those due to charge 
fluctuations are very small. These results are depicted 
in Figs. 4-6. 


Fig. 5. Titration curve for the same regularly alternating
polyampholyte as in Fig. 4.

As a more general case, consider an equimolar polyampholyte,
but with statistically distributed monomeric
groups. This is a much more difficult problem. To
proceed, the polymer is characterized by the number of
groups of a given type which occur in a row. It must be 
recognized that those charge-bearing groups adjacent 
to groups of the same type have a tendency to remain 
uncharged as compared to groups surrounded by op- 
positely charged sites. If the polymer is regarded as a 
succession of acidic and basic groups, it is clear that, 
near the isoelectric point, almost all of the end groups 
are charged. Thus, the interior groups of the sequences
of monomers may be regarded as independent to a good 
approximation. The problem now has been reduced to 
a combinatorial question of how many ways one can 
distribute a certain number of charges over the polymer. 

The results of this computation are as follows: 


(a) Near the isoelectric point, the pH varies less with
changes in the net charge of the polymer than it does
for the corresponding regularly alternating
polyampholyte. This means that the
slope of the titration curve at the isoelectric point is 
steeper, and that the effect is due to the considerable 
number of charges situated in the interior of sequences 
and adjacent to charges of the same sign. These are, 
therefore, relatively loosely bound and a much smaller 
change in the chemical potential of the hydrogen ion 
results from their removal than from the removal of an 
equal number of the charges of a regularly alternat- 
ing polyampholyte. The inverse statement is perhaps 
clearer; i.e., it takes a smaller change in pH to effect a 
change in charge for the randomly distributed 
monomers. 

(b) A wide range of behavior is possible due to variations
in structure. One obvious extreme is the block
copolymer with all acid groups at one end and base
groups at the other end of the molecule.

Fig. 6. Comparison of titration curves for a regularly alternating
(broken line) and a randomly distributed (solid line) copolymer.
The constants are the same as in Fig. 4.

Fig. 7. Rate of change of expansion free energy with degree of
neutralization as a function of net charge. The copolymer is the
same as in Fig. 4.

(c) An increase in ionic strength, resulting in a de- 
crease in interaction between charge groups, causes a 
shift of the titration curve in the direction of the 
behavior expected of independent acidic and basic 
groups. 

(d) The configurational properties are almost identi- 
cal with the corresponding regular copolymer of 
equivalent net segment charge. 

(e) Charge fluctuation interactions are small due to
the equimolarity of acidic and basic groups. This effect
is large only when the number of groups of one type
exceeds that of the other.

The properties of the regular and statistical poly- 
ampholytes are compared in Figs. 6 and 7. 
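
For orientation, the limiting behavior of independent acidic and basic groups, toward which the titration curve shifts as the ionic strength is raised, can be sketched with the Henderson-Hasselbalch relation. This is a minimal sketch that ignores all electrostatic interaction and ion binding, so it is not the theory summarized above; the function names and the pK values 4 and 10 are hypothetical, chosen only for illustration.

```python
def net_charge_per_pair(ph, pk_acid, pk_base):
    """Mean net charge of one acid group plus one base group, both independent.

    pk_acid is the pK of the acid group; pk_base is the pK of the conjugate
    acid of the base group. Interactions between groups are neglected.
    """
    acid_ionized = 1.0 / (1.0 + 10.0 ** (pk_acid - ph))      # fraction carrying -1
    base_protonated = 1.0 / (1.0 + 10.0 ** (ph - pk_base))   # fraction carrying +1
    return base_protonated - acid_ionized


def isoelectric_ph(pk_acid, pk_base):
    """pH of zero net charge for an equimolar mixture of independent groups."""
    return 0.5 * (pk_acid + pk_base)


if __name__ == "__main__":
    pk_a, pk_b = 4.0, 10.0  # hypothetical intrinsic pK values
    for ph in (2.0, 4.0, isoelectric_ph(pk_a, pk_b), 10.0, 12.0):
        print(f"pH {ph:5.2f}  net charge per pair {net_charge_per_pair(ph, pk_a, pk_b):+.3f}")
```

Any departure of a measured curve from this reference reflects the interactions among charged groups that the preceding results attribute to the regular or statistical arrangement of the monomers.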


VI. HELIX-COIL TRANSITIONS AS INFLUENCED
BY ELECTROSTATIC FORCES


Fig. 8. Equilibrium fraction of bonds broken in DNA as a
function of temperature. The broadening can be appreciably
reduced by different choices of the parameters. For further details,
see references 39 and 44-46.

The fact that the electrostatic energy of a molecule
depends upon the molecular configuration suggests that
electrostatic forces can be the motive source for
molecular transitions. Consider a simple model wherein
several isomeric states of a molecule may exist,
characterized by different electrostatic energies. For
simplicity, assume that the distribution of charged and
uncharged sites is random in all isomeric states and that
ion binding effects may be neglected. Further, for
simplicity, assume the reaction to proceed in an
all-or-none fashion. If the electrostatic energy is computed
for smeared-out charges, it is readily shown that
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which defines the pair-interaction parameters W,’ and 
W£, and Q! and Q,’ are the partition functions for the 
coil and helix without electrostatic interactions. The 
pair-interaction parameters are
with $l_0$ the length of a segment, $b$ the radius of the
helix, $K_0$ and $K_1$ modified Bessel functions of the second kind,
and $\rho$ the charge density on the helix surface.
Correspondingly, $h_0$ and $h$ are the root-mean-square end-to-end
separations of the coil in the absence and in the
presence of electrostatic interactions, respectively.
Equation (38) does not apply to any real helix-coil
transition because of the assumption that the reaction
does not involve intermediate states, an approximation
[...] denaturation of DNA predict equilibrium bond
[...] function of temperature as indicated in
[...] calculations are for the case when electrostatic
interactions are omitted. The effect of charge-
charge interaction is to lower the enthalpy required to
break bonds, $H_0$. In Fig. 8, this corresponds to moving
from right to left across the lines of constant $H_0$.
Figure 9 is a titration curve for DNA at two temperatures,
showing that the transition occurs titrimetrically at the
same place as it is observed to occur viscometrically,
calorimetrically, and by several other techniques. The
treatment can be made much more sophisticated by the
introduction of discrete charges, ion binding, differences
in charge between isomeric states, and other
effects previously discussed, but the qualitative results
are the same. A qualitative change does occur when
intermediate states between pure helix and pure random
coil are considered. The reader is referred elsewhere
for details.
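
The qualitative statement that lowering the bond-breaking enthalpy moves the transition to lower temperature can be illustrated with a generic two-state (all-or-none) equilibrium. This is only a sketch under stated assumptions and is not the expression of Eq. (38): the transition is described by a single enthalpy `h0_kcal` and entropy `s0_kcal_per_k`, both hypothetical parameters, and the numerical values in the usage lines are invented for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol K)


def fraction_broken(temperature_k, h0_kcal, s0_kcal_per_k):
    """Equilibrium fraction in the broken (coil) state for a two-state model.

    The free energy of breaking is h0 - T*s0; both parameters are
    hypothetical and taken as temperature independent.
    """
    delta_g = h0_kcal - temperature_k * s0_kcal_per_k
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temperature_k)))


if __name__ == "__main__":
    s0 = 0.1  # kcal/(mol K), hypothetical
    for h0 in (35.0, 33.0, 31.0):  # kcal/mol, hypothetical; midpoint at h0/s0
        row = ", ".join(f"{fraction_broken(t, h0, s0):.2f}" for t in (300, 320, 340, 360))
        print(f"H0 = {h0:4.1f}: fraction broken at 300-360 K = {row}")
```

In this sketch the midpoint lies at the temperature where the enthalpy equals the temperature times the entropy, so a smaller enthalpy at fixed entropy shifts the whole curve toward lower temperature.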


VII. SOME BRIEF REMARKS 


Nowhere in this review are transport properties
considered, a subject of great intrinsic interest. The
treatment of, say, the conductance is even more
complicated than that of an equilibrium property, owing to
the loss of elements of symmetry. Some progress has
been made by Booth, by Hermans, and by Overbeek,
and the reader is referred to their papers.

In closing, the author wishes to emphasize that
only electrostatic forces have been considered. In any
real system, many other forces are operative. To name
only two, consideration must be given to hydrogen
bonds and to steric hindrances. Thus, in any real
system, one must account simultaneously for the
electrostatic effects discussed and for all other relevant
phenomena. Separation of these from the electrostatic
part is arbitrary and not always justified. It should,
therefore, be borne in mind that the application of the
theory discussed herein to any real system must be
made with forethought and with circumspection.

Fig. 9. Titration curve of sodium deoxyribonucleate. The
arrows indicate the points at which the transition occurs
calorimetrically and viscometrically. Note that we have plotted
pH + log α/(1 − α) instead of pH − log α/(1 − α). This serves to
sharpen the appearance of the transition.
This article, early in the series devoted to the
soluble (globular) proteins, discusses briefly some
of the properties of the native proteins, the main
characteristics of protein denaturation, and a few examples
illustrating specificity and complexity in the interactions
between proteins. A model is first presented as a means
for describing and correlating, in a general way,
structure and properties.

At this point, one accepts at once the facts that
globular (soluble) proteins are condensed structures,
that they have many of the characteristics of molecules
and, from their diffraction patterns, that portions of
the main or backbone chain of atoms are arranged with
considerable regularity. Of the possible arrangements
for the latter, that of a helix has been most attractive.
For some time, the α-helix of Pauling and Corey has
provided a singular foundation for discussion. Several
elegant helices were discovered by Pauling and Corey
by assuming (1) that the planarity of the CONH grouping
around the peptide bond should be preserved, and
(2) the formation of a maximal number of
hydrogen bonds between CO and NH groups separated
by intervening peptide groups along the main chain. In
the α-helix, a compact coil of sufficiently high density
results. The direction in which one finds the first carbon
atom of the side chains for amino acids, other than
proline or hydroxyproline, is indicated on the plan of
Fig. 1 by the symbol R. A single turn of the α-helix, in
which van der Waals atomic radii are included to
describe the outer limit of the plan, is shown in Fig. 2.
For purposes of discussion, this plan is distorted into a
circle of radius equal to 3.4 Å. To this circle, one must
attach the side chains. An analysis of a variety of
proteins, undertaken a few years ago, showed that the
average side-chain extension was 5.1 Å. Thus, the
distance from the helix center to the end of the average
extended side chain would be about 8.5 Å.

Fig. 1. Plan of the 3.7-residue α-helix [from L. Pauling, R. B.
Corey, and H. R. Branson, Proc. Natl. Acad. Sci. U. S. 37, [...]

At this stage, certain properties of the side chains
should be examined since both internal structure and
protein interactions are strongly dependent on these
properties. Figure 3, taken from a review by Low,
shows the variations in size and configuration of the
amino acids themselves. The side-chain configurations
may be visualized by subtracting from each approximately
the equivalent of glycine. The nonpolar amino
acids have been grouped in the upper left, those
containing hydroxyl groups in the upper center, and the
acidic and basic side-chain groups in the upper and
the lower right, respectively. Table I gives lengths and
volumes for twenty common amino-acid side chains.
An examination of the second column, which gives
extended length, reveals that side chains vary from 1.5 Å
for glycine to 8.8 Å for arginine. The volumes vary
from 5.1 Å³ for glycine to 175.5 Å³ for tryptophan.
Figure 4 shows that, in general, there is a closely linear
relationship between side-chain length and volume
except for the larger nonpolar side chains of valine,
leucine, isoleucine, phenylalanine, tyrosine, and
tryptophan. All of these are significantly shorter for the
volumes occupied. I feel that the reason for this
situation is to prevent, under ordinary circumstances,
interactions of the larger nonpolar side chains. Thus, an
alteration in structure generally is necessary to allow
these side chains to come into contact.
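
The statement that the larger nonpolar side chains are short for their volume can be checked roughly against the legible entries of Table I. The following is a sketch, not the analysis behind Fig. 4: a straight line is fitted by least squares to the other side chains, and the line is then compared with the larger nonpolar ones. Several decimal points are lost in the printed table as reproduced here, so the values 58.4, 120.0, 3.8, and 6.9 are assumed readings; isoleucine is omitted because its length is illegible.

```python
# (volume in cubic angstroms, extended length in angstroms), read from Table I.
# Side chains outside the "larger nonpolar" group; used to fit the trend line.
REFERENCE = {
    "glycine": (5.1, 1.5),
    "alanine": (32.2, 2.8),
    "serine": (36.0, 3.8),
    "threonine": (63.1, 4.0),
    "methionine": (112.1, 6.9),
    "aspartic acid": (58.4, 5.0),   # volume printed as "584"; decimal assumed
    "asp. (amide)": (65.4, 5.1),
    "glutamic acid": (85.5, 6.3),
    "glu. (amide)": (92.5, 6.4),
    "arginine": (125.7, 8.8),
    "histidine": (89.0, 6.5),
    "lysine": (120.0, 7.7),         # volume printed as "1200"; decimal assumed
}

# Larger nonpolar side chains singled out in the text.
LARGE_NONPOLAR = {
    "valine": (86.3, 4.0),
    "leucine": (113.4, 3.8),        # length printed as "38"; decimal assumed
    "phenylalanine": (136.6, 6.9),  # length printed as "69"; decimal assumed
    "tyrosine": (138.8, 7.7),
    "tryptophan": (175.5, 8.1),
}


def fit_line(points):
    """Ordinary least-squares line y = intercept + slope * x for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def length_deficits():
    """Observed minus fitted length for each larger nonpolar side chain."""
    slope, intercept = fit_line(list(REFERENCE.values()))
    return {name: length - (intercept + slope * volume)
            for name, (volume, length) in LARGE_NONPOLAR.items()}


if __name__ == "__main__":
    for name, deficit in length_deficits().items():
        print(f"{name:14s} {deficit:+.2f} angstrom relative to the fitted line")
```

A negative value means the side chain is shorter than the trend line would predict for its volume.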


Fig. 2. Plan view of one turn of the α-helix including van der
Waals radii [from D. F. Waugh, Advances in Protein Chem. 9,
326 (1954)].
Fig. 3. Packing models of amino acids as dipolar ions
[from B. W. Low in The Proteins, H. Neurath and
K. Bailey, editors (Academic Press, Inc., New
York, 1953), Vol. I, p. 235].
The coiled main chain is viewed as a tube, flat-sided
in the case of the a-helix, having protuberances of 
varying shape and volume. The shapes and spatial 
positions of the ring-containing side chains are fixed 
except for the freedom permitted by rotation around 
the bonds of the CH2 group which, except for proline 
and hydroxy proline, joins the ring structure to the 
main chain. The positions of the protuberances will 
depend on the amino-acid sequence and the precise 
nature of the main-chain coil through that position. 

In many, if not most, proteins, the polypeptide chain
occurs in segments or folds, and in the native molecule
these segments or folds are approximated and bonded
laterally. Certain of the lateral bonds may be covalent
bonds, the primary focus of attention in this respect
being the disulfide bonds of cystine. Others will be ionic
interactions and secondary valence interactions such as
hydrogen bonds and short-range van der Waals forces.
Comments about the interactions of chain segments in
producing the protein molecule are, I believe, applicable
with some modification to the interactions between
protein molecules.

Figure 5 gives a diagrammatic representation of a
cross section through a four-chain molecule. This molecule
is constructed on the assumption that, at least for
short distances, the chains are in the form of straight
columns and that the center-to-center distance between
chains is about 10 Å, a value heretofore generally
accepted from crystallographic data. It is probable that
the axes of the helices are not linear and that the
center-to-center distance is not always 10 Å, but the
diagrammatic representation is examined for what it suggests
as to the way in which the side chains are packed in
the protein molecule.

TABLE I. Characteristics of amino-acid side chains.

Amino acid        Length max (Å)   Volume (Å³)
Aspartic acid          5.0            58.4
Asp. (amide)           5.1            65.4
Glutamic acid          6.3            85.5
Glu. (amide)           6.4            92.5
Arginine               8.8           125.7
Histidine              6.5            89.0
Lysine                 7.7           120.0
[illegible]            2.9          [illegible]
[illegible]            4.3            57.9
Tyrosine               7.7           138.8
Glycine                1.5             5.1
Alanine                2.8            32.2
Serine                 3.8            36.0
Threonine              4.0            63.1
Methionine             6.9           112.1
Valine                 4.0            86.3
Leucine                3.8           113.4
Isoleucine          [illegible]      113.4
Phenylalanine          6.9           136.6
Tryptophan             8.1           175.5

Fig. 4. Side-chain length vs volume.

Fig. 5. Schematic representation of the interaction of four
helices to produce an internal volume (enclosed by wavy lines)
in which side chains are closely packed. Heavy circles are
main-chain domains and light circles give maximum extension of average
side chain [from D. F. Waugh, Advances in Protein Chem. 9,
326 (1954)].

Figure 5 shows heavy circles which are the main-chain
domain, and light circles which give the average maximum
side-chain extension calculated from the amino-
acid composition of insulin. The wavy lines demarcate
an internal volume, where side chains are closely packed,
from surface regions where side chains are brought into
contact with other side chains usually through
interactions with other molecules. The side-chain volume
enclosed within the wavy lines is expected to be
crowded, and thus any regularity in the positioning of
the main chain would necessitate a careful selection of
the size, shape, and interaction properties of the side
chains present on each of the four contributing helices.
The precision of this selection is appreciated when an
examination is made of the amount of unoccupied space
in the protein molecule. By unoccupied space is meant
a space which is too small to accept a water molecule
or a larger space which is not accessible to the solvent.
The maximum unoccupied space may be assessed as
follows.
First, from a comparison of the measured specific
volume of a protein with that calculated by Traube's
rules from the amino-acid composition, McMeekin and
Marshall find that the two values are in excellent
agreement for a variety of proteins, as shown in
Table II. As Edsall has pointed out, however, the
measured value should be about 3.5% lower than the
calculated value because of the electrostriction of water
around charged groups. The difference one may
attribute to unoccupied space.

Second, Linderstrøm-Lang has shown for several
proteins that enzymatic cleavage of the first few peptide
bonds leads to a volume decrement in excess of that
predicted from electrostriction of water around the new
charged groups produced. The excess volume decrement
is about 3% of the molar volume.
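
The first of these arguments can be followed numerically from the entries of Table II. The sketch below computes the relative difference between observed and calculated specific volumes and then adds the 3.5% electrostriction allowance quoted from Edsall. Treating that sum as the unoccupied fraction is an assumption of this sketch, a rough reading of the argument rather than a calculation given in the text; the function names are hypothetical.

```python
# Protein: (observed, calculated) specific volume in cc/g, from Table II.
TABLE_II = {
    "ribonuclease": (0.709, 0.703),
    "lysozyme": (0.722, 0.717),
    "fibrinogen (human)": (0.725, 0.723),
    "alpha-casein": (0.728, 0.725),
    "chymotrypsinogen": (0.73, 0.734),
    "serum albumin (bovine)": (0.734, 0.734),
    "insulin": (0.735, 0.724),
    "beta-casein": (0.741, 0.743),
    "ovalbumin": (0.745, 0.738),
    "hemoglobin (horse)": (0.749, 0.741),
    "beta-lactoglobulin": (0.751, 0.746),
    "edestin": (0.744, 0.719),
    "botulinus toxin": (0.75, 0.736),
    "gelatin": (0.682, 0.707),
}


def relative_excess(observed, calculated):
    """Fractional amount by which the observed volume exceeds the calculated one."""
    return (observed - calculated) / calculated


def mean_relative_excess(table=TABLE_II):
    """Average of the fractional excess over all proteins in the table."""
    values = [relative_excess(obs, calc) for obs, calc in table.values()]
    return sum(values) / len(values)


def implied_unoccupied_fraction(electrostriction=0.035, table=TABLE_II):
    """Mean excess plus the expected electrostriction lowering (an assumption)."""
    return mean_relative_excess(table) + electrostriction


if __name__ == "__main__":
    print(f"mean observed excess over calculated: {100 * mean_relative_excess():.2f}%")
    print(f"implied unoccupied space:             {100 * implied_unoccupied_fraction():.1f}%")
```

The individual proteins scatter by a few percent about the mean, so only the order of magnitude of the result carries weight.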

The internal volume of the four-chain molecule shown 
in Fig. 5 is about one-third of the total volume; thus, 
if all of the unoccupied space occurs in the internal 
volume, about 10% of the latter will be unoccupied. 

A rough calculation, based on the average side-chain 
volume and the frequency with which side chains are 
expected to occur in the internal volume, also suggests 
that roughly 10% of the internal volume will be 
unoccupied.
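
The step from about 3% of the total volume to about 10% of the internal volume is simple proportion, sketched below. It rests on the assumptions already stated in the text, namely that the internal volume is about one-third of the total and that all of the unoccupied space lies within it; the function name is hypothetical.

```python
def internal_unoccupied_fraction(unoccupied_fraction_of_total,
                                 internal_fraction_of_total=1.0 / 3.0):
    """Unoccupied fraction of the internal volume.

    Assumes every bit of unoccupied space lies inside the internal volume,
    which itself makes up internal_fraction_of_total of the molecule.
    """
    return unoccupied_fraction_of_total / internal_fraction_of_total


if __name__ == "__main__":
    for total_fraction in (0.03, 0.035):
        inside = internal_unoccupied_fraction(total_fraction)
        print(f"{100 * total_fraction:.1f}% of total -> {100 * inside:.1f}% of internal volume")
```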

The quantity of unoccupied space calculated in the 
foregoing is expected to be a maximum value. For ex- 
ample, disruption of the protein structure may alter the 
structure of the surrounding solvent so that a decrease
in volume occurs. If such is the case, the unoccupied 
space in the internal volume would be less than the 10% 
already suggested, and as a result the side chains in the 
internal volume would necessarily be more closely 
packed. Linderstrøm-Lang has pointed out that the
protein molecule has some forced structure which makes 
it occupy more space than the unfolded elements. This 
forced structure is, I feel, the result of a compromise 
between uniformity in main-chain configuration and 
perfection in side-chain packing. 

Here are a few additional remarks about the internal 
volume. In native proteins, it is the rule—rather than 
the exception—to find that a portion of groups such as 
sulfhydryl, disulfide, phenol, imidazole, etc., appear to
be altered strikingly either in their physical or in their 
chemical properties. These have been referred to as 
hidden groups. 

Interest in hidden groups has generally been associated
with interest in the more general problems of
protein denaturation (see the following), for denaturation,
reversible or otherwise, gives at least a portion of
the hidden groups a freedom to interact in more
customary fashion. Thus, such groups in native proteins
may not react chemically as would be expected from
the behavior of the same groups in smaller molecules,
the ultraviolet absorption spectra may differ from those
determined with less complicated molecules, and the
acceptance of protons by basic groups may be retarded.
Most of these effects must be associated with that
portion of the molecular structure capable of bringing
side chains into juxtaposition and of shielding either
through secondary valence interactions or through steric
hindrance. In a four-chain molecule, approximately half
of the side chains will be involved in interactions within
the internal volume, the other half will, to a large
extent, define the configuration and surface properties
of the molecule.

TABLE II. Specific volumes of proteins [from T. L. McMeekin
and K. Marshall, Science 116, 142 (1952)].

Protein                    Observed volume   Calculated volume
                                cc/g               cc/g
Ribonuclease                    0.709              0.703
Lysozyme                        0.722              0.717
Fibrinogen (human)              0.725              0.723
α-casein                        0.728              0.725
Chymotrypsinogen                0.73               0.734
Serum albumin (bovine)          0.734              0.734
Insulin                         0.735              0.724
β-casein                        0.741              0.743
Ovalbumin                       0.745              0.738
Hemoglobin (horse)              0.749              0.741
β-lactoglobulin                 0.751              0.746
Edestin                         0.744              0.719
Botulinus toxin                 0.75               0.736
Gelatin                         0.682              0.707

TABLE III. Average side-chain nonpolarity and nonpolar side-chain frequencies.

                           Av side chain      Nonpolar class
Protein                    Equiv. CH₂     Equiv. CH₂   Occurrence frequency
Collagen                      1.33           4.06           0.21
Rabbit tropomyosin            2.15           4.32           0.25
Ribonuclease                  2.16           4.33           0.26
Rabbit myosin                 2.28           4.41           0.28
Calf-liver histone            2.29           4.13           0.31
Chymotrypsinogen              2.29           4.32           0.33
Human fibrinogen              2.31           4.53           0.32
Ovalbumin                     2.33           4.32           0.36
Bovine serum albumin          2.37           4.74           0.32
Edestin                       2.37           4.31           0.33
Horse myoglobin               2.39           4.32           0.36
β-lactoglobulin               2.41           4.24           0.37
FSH                           2.44           4.27           0.39
Horse hemoglobin              2.46           4.19           0.43
Rat-sarcoma histone           2.47           4.08           0.39
Ox insulin                    2.52           4.27           0.45
Human γ-globulin              2.54           4.32           0.40
Ox-growth hormone             2.57           4.48           0.39
Calf-thymus histone           2.59           4.21           0.40
α-casein                      2.59           4.38           0.40
Human serum albumin           2.61           4.33           0.40
β-casein                      2.77           4.08           0.49
Average                       2.36           4.30           0.35

Examination of a series of proteins reveals that the
amino acids are distributed as follows. A fraction 0.2 to
0.5 of the side chains can carry charges: arginine,
histidine, and lysine, which may carry positive charge, and
aspartic acid, glutamic acid, cysteine, and tyrosine,
which may carry negative charge. The smaller amino
acids such as glycine, alanine, threonine, and methionine
make up 0.13 to 0.5 of the side chains, and the larger
nonpolar amino acids valine, leucine, isoleucine,
proline, phenylalanine, tryptophan, and tyrosine
contribute 0.21 to 0.47 of the side chains. There appear to
be no striking correlations between the average
occurrences of various types of side chains, other than that
as the frequency of one type increases, the frequencies
of the other types decrease. However, the average side-
chain volume may be kept relatively constant. This is
suggested by the fact that the average side-chain
volumes for a group of five proteins varied over a range
of only 6% (Table V in reference 2).
The interactions of proteins with each other must, to
a considerable extent, depend upon the characteristics 
of side chains although segments or regions of exposed 
main-chain groups might also be involved. A brief 
summary of the interaction characteristics of charged 
groups, hydrogen bonds, and van der Waals forces has 
been given. Of particular interest are the following 
special possibilities. Kirkwood and Shumaker" show 
that the mobile protons on a group of particles, at pH 
values close to the isoelectric point and at low ionic
strength, will effect charge patterns which will produce
a net attractive force. At ionic strengths in the physio- 
logical range (~0.15) and at pH values removed by 
more than a pH unit from the isoelectric point, inter- 
actions of charges are expected to give rise to repulsion 
and the hydration of charged groups to a barrier which 
will modify the interactions of those groups dependent 
upon close approach, such as hydrogen-bond-forming 
groups and nonpolar groups. 

The formation of an interprotein or intraprotein 
hydrogen bond is usually an exchange reaction involving
the interacting groups and water. As shown by Pauling 
and Pressman” and by Schellman," the resulting inter- 
action energy is about 1 kcal/mole; thus, several such 
bonds, acting alone, would be required to give a stable 
protein-protein interaction. 

The interactions of a few of the larger nonpolar groups 
with the concomitant formation of new hydrogen bonds 
in the released water, would suffice to form a stable 
intermolecular linkage. This conclusion stems from the 
fact that the association of hydrocarbon chains in an 
aqueous environment liberates about the same energy 
as the condensation of the hydrocarbon from the vapor 
state, namely, about 1.2 kcal/mole of CH2 groups.
Thus, the interaction, under the proper conditions, of a
single CH2 group may be equivalent to the formation
of an interprotein hydrogen bond. Important also, from
the standpoint of internal structure and stability as
well as of interaction, is the insensitivity of the nonpolar
interaction itself to the pH or ionic strength of an
aqueous environment.

The average, large, nonpolar side chain is the equivalent
of about 4.2 CH2 groups. Table III, which lists
the frequency of occurrence of large nonpolar side chains
for a group of proteins, shows an average value of 0.35,
the minimum for corpuscular proteins being 0.25 for
highly soluble tropomyosin and the maximum 0.49 for β-casein. It is
clear that the interaction of a small fraction of the total
number of the larger nonpolar side chains (e.g., 5 to 10
side chains) would be sufficient to produce an interaction
product which would be stable over a wide range in
pH near the isoelectric point. Some proteins, like serum
albumin, are soluble at their isoelectric points and most 
are soluble to an appreciable extent at pH values one 
or two units away from the isoelectric point. It is thus 
apparent that the native-protein structure (or family 
of structures) positions most of the groups capable of 
attraction so that they are shielded; in other words, so 
that the interactions of attractive groups are opposed 
by energy-requiring processes such as (a) the approxi- 
mation of groups carrying like charges, (b) the removal 
of water of hydration from dipolar or charged groups, 
(c) the breaking of hydrogen bonds between water and 
protein groups, and (d) a distortion of the helix (secondary)
structure or side-chain interactions (tertiary structure),
etc. In this connection, it is significant that the
nonpolar side chains are shorter per unit volume than 
other side chains and that the charged side chains, 
particularly those carrying positive charges, are some- 
what longer. The addition of water of hydration to 
charged side chains would cause all of these to project 
beyond the values given in Table I. 
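
As a rough check on the estimate above that 5 to 10 large nonpolar side chains could suffice for a stable interaction product, the two figures quoted in the text (about 1.2 kcal/mole per CH2 group and about 4.2 CH2 equivalents per large nonpolar side chain) can simply be multiplied out. The sketch below is illustrative only: the function name is hypothetical, and it assumes that every CH2 equivalent of each participating side chain contributes fully, which the text does not claim.

```python
KCAL_PER_CH2 = 1.2        # kcal/mole per CH2 group, as quoted in the text
CH2_PER_SIDE_CHAIN = 4.2  # CH2 equivalents per large nonpolar side chain


def nonpolar_interaction_energy(n_side_chains):
    """Estimate (kcal/mole) assuming every CH2 equivalent contributes fully."""
    return n_side_chains * CH2_PER_SIDE_CHAIN * KCAL_PER_CH2


for n in (1, 5, 10):
    print(n, "side chains:", round(nonpolar_interaction_energy(n), 1), "kcal/mole")
```

Under these assumptions, 5 to 10 side chains correspond to roughly 25 to 50 kcal/mole, compared with about 1 kcal/mole for a single interprotein hydrogen bond.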
The native protein is extracted by using as mild as 
possible a set of physical and chemical conditions, ex- 
perience being required to define the sets of conditions 
which can be considered mild for a given protein. The 
protein is characterized, so far as is possible, also using 
mild conditions, the end result being a description of 
the protein in terms of its physical, chemical, and bio- 
logical properties. If severe conditions of treatment are 
chosen (such as pH, temperature, the presence of urea, 
guanidine, salicylate, etc.), the properties of the native 
protein change. One of the most striking alterations is 
in solubility, and initially the term denaturation was 
based on the fact that almost all proteins become in- 
soluble after a variety of severe treatments. In general, 
denaturation should not involve hydrolysis of protein 
covalent bonds. The importance of and interest in 
denaturation are indicated by the sequence of reviews 
which have treated this subject.8—10.4.15 
It is instructive to examine a typical experiment, 
involving denaturation by alkali, in which case experi- 
ence dictates an appropriate choice of pH, ionic strength, 
and temperature. 

Figure 6, in the left-hand column, indicates that the 
native protein under ordinary conditions has a low net 
charge and that establishing denaturing conditions first 
produces a native protein of high net charge (upper left). 
With time, the properties of the protein population alter 
according to first-order kinetics to give reversibly denatured
molecules also of high net charge. Under the
conditions of denaturation, the protein usually remains 
in solution, aggregation and precipitation being prevented
by the electrostatic energy barrier to close approach.
That alterations in structure have occurred is
demonstrated by taking an aliquot of the protein solution
under denaturing conditions and, by diluting with
[...], rapidly returning to a pH near
[...] where the native protein is soluble.
The fraction of the protein which has been denatured
will precipitate. Once a precipitate is allowed to form, 
the precipitate is insoluble over a broad pH range in- 
cluding alkaline pH values where the denatured protein 
remains in solution on careful downward adjustment of 
the denaturing pH. Not only is this general, but if the 
precipitate is tested with respect to time, the limits of 
pH over which insolubility is observed are broadened, 
either indicating an increasing number of participating 
bonding groups or indicating that slow rearrangements 
take place which make existing group participations 
more effective. The insolubility of denatured proteins is 
attributed by many primarily to nonpolar interactions, 
for, since the driving energy is actually the formation of 
new water-hydrogen bonds, these are the group inter- 
actions which are particularly insensitive to changes in 
pH and ionic strength. Clearly, the conformations of 
the denatured molecules do not preserve the balance 
between attractive group interactions and energy- 
requiring processes discussed in the foregoing in con- 
nection with the native protein, alterations in the 
structures of the latter generally producing local patches 
of attractive groups which are no longer shielded. 

That the denatured protein has a structure different 
from that of the native protein may also be shown under 
the denaturing conditions—for example, by changes in 
ultraviolet absorption, optical rotation, chemical reac- 
tivity of particular groups, and by applying techniques 
designed to examine size, shape, and hydration. Occa- 
sionally, determinations of biological activity may also 
be made under the conditions of denaturation. 

If the protein is denatured as described—i.e., when 
the molecules are prevented from interacting by hydra- 
tion and charge repulsion—a slow or stepwise return to 
conditions where the protein is native may lead to a 
reversal of denaturation, that is, to a return of solu- 
bility, biological activity, normal optical activity, etc. 


[Figure 6: diagram with states labeled Native (High Charge), Reversibly Denatured (High Charge), Intermediate (Moderate Charge), Native, Reversibly Denatured, and Insoluble.]

FIG. 6. Scheme showing the manipulations involved in
demonstrating reversible denaturation by alkali.

A situation frequently encountered is illustrated by the
central diagram of Fig. 6. An aliquot of the solution 
under denaturing conditions (reversibly denatured, high 
charge) is brought to a critical pH intermediate between 
the denaturing pH and a lower pH where the denatured 
protein will precipitate. At this intermediate pH, the 
molecules are somewhat expanded, the behavior of the 
system suggesting that a compromise has been reached: 
progressively below this narrowly defined pH, activation 
energy barriers to recovery increase (and thus effectively 
freeze the conformations of the denatured states), while 
above this pH, conformation changes corresponding to 
denaturation are increasingly favored. Clearly, the set 
of conformations which are established with time at 
the intermediate pH will, on reducing the pH further, 
tend to produce conformations corresponding to the 
native state. For the protein fibrinogen, denaturation
is carried out at pH 12.4, ionic strength 0.15, and 0°C. 
Denaturation by a slow reaction is largely complete in 
60 min. The critical intermediate pH is 10.8 and a 
treatment time of 16 hr at 0°C is necessary to effect a 
recovery of native properties. As is so often the case, 
however, the recovered protein resembles native protein 
in many gross aspects—for example, clottability after 
treatment with thrombin—but differs in detail—for 
example, in optical activity or solubility at pH values 
close to the isoelectric point. After reversal of denatura- 
tion, few proteins have been shown to be identical in 
all respects with the original native protein. 

For many proteins, conditions may be chosen so that 
an equilibrium exists between the native and denatured 
states. For example, the enzyme trypsin!*1617 at a pH 
near pH 11 and 50°C shows a distribution between 
native and denatured states, for, on adding salt and 
rapidly altering the pH to pH 7, a fraction of the protein 
precipitates while a fraction remains soluble and ex- 
hibits enzymatic activity. The distribution is sensitive 
to temperature, the properties of the system suggesting 
a true equilibrium, a fact which allows a determination 
of the free energy, enthalpy, and entropy of the reaction. 
The data!® for soybean trypsin inhibitor are shown in 
Table IV (see also reference 14). It is quite clear that 
a relatively small change in free energy is the result of 
a balance between large positive changes in enthalpy 
and entropy. Essentially the same result is obtained 
when other proteins showing temperature-sensitive denaturation
equilibria are examined (trypsin, chymotrypsinogen,
pepsin, luciferase; see reference 14). The


TABLE IV. Soybean trypsin inhibitor, pH=3, T=40°C [from
M. Kunitz, J. Gen. Phys. 32, 241 (1948)].

                                     Rate parameters
                    Thermodynamic    Forward reaction    Back reaction
ΔF, kcal/mole       1                25.4                24.4
ΔH, kcal/mole       57.3             55.3                -1.9
ΔS, cal/mole/deg    180.0            99.5                -85

balance between large entropy and enthalpy changes
(of either sign) to yield small free-energy changes is 
quite apparent also in the rate-defining parameters of 
the forward- and backward-activation reactions. So far, 
denaturation reactions have been too complicated to be 
analyzed in precise structure terms. The present con- 
cepts involve mechanisms in which there are: 


(1) Alterations in proton association and attending 
water of hydration. 

(2) Expansion (contraction) effects with concomitant 
(a) increasing (decreasing) entropies, and (b) increasing 
(decreasing) heats. 

(3) Alterations in tertiary (side-chain) and secondary 
(main-chain coil) structure leading to an altered availa- 
bility of surface side chains. 
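
The compensation between enthalpy and entropy noted for Table IV can be checked with the standard relation ΔF = ΔH - TΔS. The short sketch below uses only the thermodynamic ΔH and ΔS values from Table IV at 40°C; the function name is hypothetical, and the conversion of 40°C to 313.15 K is the only added assumption.

```python
def free_energy_change(delta_h_kcal, delta_s_cal_per_deg, temp_celsius):
    """Return dF = dH - T*dS in kcal/mole (dS supplied in cal/mole/deg)."""
    temp_kelvin = temp_celsius + 273.15
    return delta_h_kcal - temp_kelvin * delta_s_cal_per_deg / 1000.0


# Thermodynamic column of Table IV: dH = 57.3 kcal/mole, dS = 180.0 cal/mole/deg.
entropy_term = (40.0 + 273.15) * 180.0 / 1000.0   # T*dS in kcal/mole
delta_f = free_energy_change(57.3, 180.0, 40.0)
print("T*dS =", round(entropy_term, 1), "kcal/mole")
print("dF   =", round(delta_f, 2), "kcal/mole")
```

The entropy term of about 56 kcal/mole nearly cancels the enthalpy of 57.3 kcal/mole, leaving a free-energy change of the order of 1 kcal/mole, in line with the tabulated value.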


In the foregoing, reference has been made to the fact 
that interaction generally stabilizes the denatured state, 
the precipitate of denatured protein becoming more 
insoluble with time. Precipitation, however, does not 
necessarily lead to irreversible changes. R. H. Hartley 
and I have denatured fibrinogen by alkali and precipi- 
tated the denatured protein by rapid alteration in pH. 
The precipitate was recovered by centrifugation and 
redissolved in alkali at pH 12.4. It was then brought 
to pH 10.8 and allowed to stand for several hours at 
0°C after which the pH was returned to pH 7. The 
majority of the protein was then found to be soluble 
and clottable with thrombin, although it differed some- 
what from native fibrinogen in its precipitation at the 
isoelectric point. 

So far, those interactions have been considered which, 
I feel, are generally accepted as involving, relatively, a 
disorganization of structure and thus the distribution of 
side chains. A well-organized site of interaction is clearly 
indicated in interactions between antibody molecules 
and hapten groups as antigens, a subject which has been 
extensively studied by Pauling, Campbell, Pressman, 
and their associates. For example, a protein such as 
serum albumin is coupled with the diazonium salt of a 
molecule to give substitutions such as azobenzenearso- 
nates, (hydroxyphenyl azo) benzoates, azobenzoates, 
etc. Antibodies, found in the γ-globulin fraction of the
plasma proteins, appear when these modified proteins
are injected into animals. The antibodies are found to
combine relatively specifically with the coupled azo
group.

A variety of studies has been made of the combination 
of hapten with antibody and of the way in which various 
types of molecules inhibit the formation of an antigen- 
antibody precipitate.*”! The inhibiting action is most 
effective when the structure of the hapten is duplicated,
and becomes less effective as the inhibiting molecule
differs in the positions of the groups attached to the
benzene ring (the substituent groups), in the sizes of
these groups (assuming similar chemical affinity), and
in chemical affinity. The conclusion has been reached
that the main-chain coils of the antibody molecule are
arranged so that they enclose a cavity or slot which
accommodates the hapten group and follows the con- 
tours of the latter within a few tenths of an Ångström 
unit. Interaction has been shown to involve whatever 
forces are permissible from the properties of the hapten: 
nonpolar forces if the hapten has no polar or dipolar 
groups, hydrogen bonds where donor (acceptor) groups
are present on the hapten, and charge interactions 
when such can be made. 
What data are available suggest that the free-energy 
change accompanying a reaction such as 


Ag+AgAbeAgeAb 
Ab (ppt) «Ab (sol.) 


is small and about 5 to 10 kcal/mole. There are uncertainties
concerning changes in entropy and heat.
Further considerations are given to antigen-antibody
interactions in a later article by Kauzmann (p. 549).

The combination of a hapten-antigen and an anti- 
body can be considered a “point” combination, unless 
the antibody is directed against the protein surface 
surrounding the hapten group. In other words, the 
interacting molecules may have some freedom of rota- 
tion around a single axis going through the locus of 
combination. 

Interactions having a specificity which leads to the 
formation of complexes whose molecular subunits are 
held in particular positions and orientations with respect 
to each other are known to occur and some to have 
particular biological significance. Among these one 
would place at once the association reactions of in- 
sulin” and a-chymotrypsin,”® the remarkable series 
of interactions leading to the high molecular-weight 
hemocyanins,”*?” and the interactions of the caseins 
which lead to the spontaneous formation of stable col- 
loidal micelles.?8:” The interactions just mentioned lead 
to symmetrical complexes of limited size, the last two 
presumably so that they may contribute a relatively 
small increment in viscosity to the fluid media in which 
they are transported. 

Other interactions lead essentially to extended struc- 
tures of unlimited size, thus to fibrils. They are com- 
monly referred to as globule-fibril (GF) transformations. 
Examples of proteins in this group are insulin,” tropo- 
myosin,” actin,” fibrin,” and collagen, the latter being 
treated extensively by F. O. Schmitt (p. 349). These 
materials are arranged roughly in an order which reflects 

the increasing difficulty experienced in obtaining the 
globular (soluble) form of the protein. In addition, 
tissues such as collagen may contain nonprotein mole- 
cules which aid, somewhere in the hierarchy of aggregation,
in producing the tissue structure: the tendon or
fascia, etc.

The interactions of insulin serve as a specific example
which illustrates a more general experience. As described
by Oncley (p. 30), the insulin monomer is a unit of
M=5733 crosslinked covalently by two disulfide linkages.
The maximum, net positive charge at low pH
values is 4 units/monomer, a value which should be
obtained at pH~1, and can be obtained since insulin is 
stable in acid solution. The maximum, net negative 
charge would be 8 charges/monomer, although the in- 
stability of insulin in alkali and the high pH at which 
the two arginine side chains lose their protons would 
reduce this value to 14 for most experimental conditions. 
The association reactions of insulin have been carried 
out at pH 2 where each monomer carries an average, 
net positive charge of 3.5 units. The monomers of 
M=5733 form dimers of M=11 466 so readily that the
5733 monomer is observed only in dilute solutions, after 
chemical modification, or in solvents such as acetic acid 
and pyridine (see Yphantis and Waugh* for references). 
Thus, the dimer of M≈12 000 is usually considered the
interacting unit. These will associate readily in pairs or
in groups of three. Summarizing statements concerning
reversible association would be (1) the electrostatic
work involved in bringing molecules together is
balanced against short-range attractive forces (hydrogen
bonds, nonpolar interactions); (2) at pH=2, for
pairing, ΔF2 = -4.93 kcal/mole, ΔH2 = -8.1 kcal/mole,
and ΔS2 = -12.1 cal/mole/deg; and (3) the entropy
decrease of -12 eu is far less than the 122 eu which
would be expected. Doty and Myers point out that
the freeing of water molecules from an association with 
polar groups would provide the requisite positive en- 
tropy change and suggest that 24 water molecules would 
be sufficient. 

Insulin is also one of the smaller proteins which ex- 
hibits the GF transformation. It is chosen for discussion 
here since its kinetic and mechanistic complications may 
well foreshadow similar complications in other systems, 
just as the GF transformation with insulin was one of 
the first to be described. A recent summary has been 
given.” 

The insulin fibril forms under conditions where the 
dimer is the prevalent form, i.e., at pH 2. Heating at 
80 to 100°C causes a spontaneous transformation into 
a population of fibrils, the most numerous, and largest, 
of which are about 200 A in diameter and many thou- 
sands of Angstroms long. The reaction goes essentially 
to completion but is reversible in the sense that, under 
alkaline conditions, the association product, the fibril, 
disaggregates to yield insulin. Recovery of insulin after 
disaggregation is evidence that the insulin molecule does 
not undergo extensive unfolding in the process of form- 
ing fibrils. Stronger evidence, and evidence which gives 
a clue to the mechanism of fibril formation, comes from 
seeding experiments. In these, preformed fibrils or fibril 
segments are seeded into insulin solutions at pH 2.0. 
While such solutions alone are stable for long periods 
of time at temperatures of 20°C or below, the seeded 
fibrils or segments recruit insulin from solution and
grow in the process. Structurally, the new portions of the 
fibrils obtained after fibril growth at lower temperatures 
appear to be identical with those which are formed at 
80 to 100°C. 
When a solution of insulin is heated and the trans-
formation of insulin into fibrils is plotted as a first-order 
reaction (Fig. 7), the resulting curves typically have a 
lag period which is followed by an essentially linear rise. 
The extent of the lag period is determined markedly by 
the initial insulin concentration, as is the slope of the 
near-linear portion of the curve. 

FIG. 7. Kinetics of the transformation of insulin into insulin
fibrils. The percent protein has been indicated for each curve.
θ represents the fraction of insulin remaining at time t [from D.
F. Waugh, J. Cellular Comp. Physiol. 49, Suppl. 1, 145 (1957)].

A comparison of the reaction kinetics observed (a)
with the growth of seeded fibrils and (b) in the absence
of seeding suggests that the fibril is first initiated by a
nucleation reaction which involves the cooperative
effects of p interacting units (dimers) according to
Eq. (1), where the differential is the rate of nucleation,
k1 is a rate constant, and C is the insulin concentration.

$$\frac{dn}{dt} = k_1 C^{p} \tag{1}$$

The value of p is near 3. After initiation,
when the fibril has achieved a reasonable size, the fibril
grows as a function of its surface area and the free
insulin concentration [Eq. (2)]. The cooperative effect
established during nucleation is perpetuated, for the 
surfaces (particularly the ends) of the fibril present to
the entering interaction unit the correct cooperative 
configuration necessary for bonding. 


$$-\frac{dC}{dt} = k_2 \cdot (\text{area}) \cdot C \tag{2}$$


One particularly interesting consequence of this type 
of mechanism is that the majority of the fibrils are
initiated during the lag period and those which are 
initiated during the first few minutes of the lag period 
dominate the reaction in the sense that they are re- 
sponsible for removing the majority of the insulin. 
Thus, as shown in Fig. 8 which shows the numbers of 
fibrils in successive groups plotted against the radius of 
each group, the fibril population at the end of the re- 
action appears to be relatively homogeneous. 
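
The interplay of Eqs. (1) and (2) can be explored numerically. The following is a minimal sketch, not the author's calculation: it integrates nucleation and surface growth by simple Euler steps, tracks each cohort of fibrils by the time step in which it was nucleated, and assumes (purely for illustration) that the surface area of a fibril scales as its mass to the 2/3 power. The function name, the nucleus mass `m0`, and all rate constants are hypothetical values in arbitrary units.

```python
def simulate_fibril_formation(c0=1.0, k1=1e-3, k2=1.0, p=3, m0=1e-6,
                              dt=0.02, steps=1500):
    """Euler integration of nucleation, Eq. (1), and surface growth, Eq. (2)."""
    c = c0
    counts = []             # number of fibrils nucleated in each time step
    masses = []             # current mass of one fibril from each cohort
    remaining = [c / c0]    # fraction of insulin left in solution
    for _ in range(steps):
        # Eq. (1): dn/dt = k1 * C**p
        new_nuclei = k1 * c ** p * dt
        # Eq. (2): uptake is proportional to fibril area and to C.
        # ASSUMPTION: area of one fibril scales as mass**(2/3).
        gains = [k2 * m ** (2.0 / 3.0) * c * dt for m in masses]
        uptake = sum(n * g for n, g in zip(counts, gains)) + new_nuclei * m0
        # Never remove more insulin than is left in solution.
        scale = min(1.0, c / uptake) if uptake > 0 else 0.0
        masses = [m + scale * g for m, g in zip(masses, gains)]
        counts.append(scale * new_nuclei)
        masses.append(m0)
        c = max(c - scale * uptake, 0.0)
        remaining.append(c / c0)
    return remaining, counts, masses


remaining, counts, masses = simulate_fibril_formation()
half = len(counts) // 2
early_mass = sum(n * m for n, m in zip(counts[:half], masses[:half]))
late_mass = sum(n * m for n, m in zip(counts[half:], masses[half:]))
print("fraction remaining at end:", round(remaining[-1], 3))
print("insulin held by early vs late fibrils:", early_mass > late_mass)
```

With these illustrative parameters the sketch shows a slow initial phase followed by faster consumption of insulin, with the fibrils nucleated early holding most of the final mass. The quantitative shape of the curve depends entirely on the assumed area law and rate constants.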

It is clear that, at least over a portion of its development,
the fibril axial ratio increases with fibril mass.

FIG. 8. Distribution of insulin-fibril diameters (G) near the end
of a reaction involving 2% insulin when θ=0.1 [from D. F.
Waugh, J. Cellular Comp. Physiol. 49, Suppl. 1, 145 (1957)].


That an asymmetric aggregate should form when 
charged molecules link, without specification of surface 
structure, has been pointed out by Rees.*® This is owing 
to the fact that the electrostatic-potential barrier is 
lowest at the ends of the dimer or other asymmetric 
unit. However, the cooperative effect itself will exert a 
strong directional influence. This is illustrated in Fig. 9 
which is a possible arrangement for the insulin-fibril 
nucleus. The stable structure is formed when any fourth 
soluble unit is added to the correct previous group of 
three. Thereafter, units are expected to add most frequently
in a manner which perpetuates this stable
structure; consequently, the aggregate elongates in the
direction of the cylinder axis of Fig. 9.

[Figure 9: diagram with a marked dimension of 48.5 Å.]

FIG. 9. Diagram of a fibril nucleus illustrating a cooperative
interaction [from D. F. Waugh, J. Cellular Comp. Physiol. 49,
Suppl. 1, 145 (1957)].

A variety of chemically modified insulins will also 
form insulin fibrils**: esterified and acetylated insulins, 
and those having nonpolar or polar groups added 
through coupling of insulin with substituted benzene- 
diazonium chlorides. Several different solvents also were 
used including concentrated urea, organic acids, and 
ethanol. The results of such studies led to the conclusion 
that the associations of nonpolar groups were primarily 
responsible for linking molecules. During the course of 
these investigations, it was observed also that the introduction
of essentially nonpolar groups, for example by
reacting benzene- or tolyl-diazonium chloride with the
protein, led to insolubility. As the number of groups is
increased, the progression is towards larger and more
stable polymers until the insulin itself becomes insoluble.
A series of fibrils with increasing stability to disaggregation
by alkali is formed also from these insulins.

We have been studying another system in which there
occur both specificity of interaction and cooperative
effects, the end result being a complex in which secondary
valence and charge interactions arrive at a remarkable
mutual satisfaction. Reference is made to the interactions
of αs- and κ-caseins.28,29 Here, αs-casein is a
phosphoprotein of M=23 000 and has a phosphorus
content of 1%. Physically, this protein appears to be a
single coil 210 Å long and 16 Å in diameter. In the absence
of calcium, the protein is a highly soluble polymer.
The polymer becomes quite insoluble in the presence
of calcium. κ-casein is also apparently a single coil about
150 Å long, which contains sulfur but little phosphorus.
It forms soluble condition-insensitive polymers of 13.5S
in the presence or absence of calcium. When mixtures
are made containing molecular ratios of 3 αs to 1 κ, the
original polymers disappear and a stoichiometric αs-κ-casein
complex forms. This complex can form in the
absence of calcium and under conditions where both
of the monomers carry a high, net negative charge. One
must assume that here, as in other cases, secondary
valence forces are responsible for the association. In
this connection, an examination of the amino-acid
compositions of all of the caseins reveals a high occurrence
frequency for the larger nonpolar amino acids
mentioned at the start.

The αs-κ-complex, although it exists at all temperatures
between 0 and 37°C, is unstable at the lower
temperatures. This is apparent from the results obtained
on adding calcium (to give 0.05 M). At the lower
temperature, a Ca-αs-caseinate precipitate forms immediately,
leaving κ-casein in solution. At the higher
temperature, the complex is stabilized and will now
engage in further aggregation to form stable colloidal
[...], namely, micelles typical of milk. Once formed,
the Ca-αs-κ-casein complex is unusually stable to heat,
changes in pH, etc. We propose for it a structure in
which the αs-casein molecules are oriented axially
around a κ-casein molecule. At higher temperatures,
secondary valence interactions hold the αs-casein molecules
in positions such that pairs of phosphorus groups
attached to adjacent αs-casein monomers are in juxtaposition
and can be linked through the introduction of
a calcium ion. The precision with which the phosphate
groups are positioned is probably dependent upon the
conformations of the interactants, for, at lower temperatures,
the instability of the complex suggests that the
phosphorus groups cannot be crosslinked by calcium.
If each phosphorus is able to accept a calcium ion,
stabilization through Ca-phosphate crosslinking would
be absent, and, since Ca-αs-caseinate is relatively insoluble
at low temperatures, it would precipitate and
thus lead to a dissociation of complexes.

Another association which apparently does not involve
covalent crosslinks, but does involve specificity
of interaction and units of a higher asymmetry than 
any mentioned so far, is the collagen fibril. According 
to Boedtker and Doty,*’ the molecule is about 3000 A 
long and 13.6 A in diameter (hydrated), and is unusual 
in that it is made up of three polypeptide chains inter- 
coiled to produce a coiled coil. The molecule contains 
unusually high percentages of hydroxyproline and gly- 
cine, and the occurrence frequency for larger nonpolar 
amino acids, including proline and hydroxyproline, is 
low (0.31). The collagen molecule denatures in the un- 
usually low temperature range of 25 to 35°C during 
which process the molecule falls apart irreversibly into 
its constituent chains, which then fold into more sym- 
metrical structures. A cooperative effect occurs when 
collagen molecules are linked together to form proto- 
fibrils, filaments, etc. (F. O. Schmitt, p. 349). That the 
cooperative effect exists is shown by the striking differ- 
ence between the temperature of molecular denaturation 
and the temperature of thermal shrinkage, the uniform 
cross-striations, and the diffraction patterns of collagen, 
etc. Another result of the cooperative effect in collagen 
is the remarkably high tensile strength of wet collagen 
fibers. The latter is about 60 kg/mm² for intact collagen
and may be about 80% of this value for fibers recon- 
stituted from collagen gels. These high tensile strengths, 
in the absence of intermolecular covalent bonds, must 
be the result of molecular overlapping which allows a 
maximum of interaction in the formation of secondary 
valence attractions. 
PRECEDING articles (Oncley, p. 30; Rich, p. 50;
Waugh, p. 84) have presented protein molecules 
in various ways. With an eye to the properties they 
wished to describe, the authors have used conceptual 
schemes ranging from the straight polypeptide chain to 
the helical configurations, and finally to the unextended, 
bundled-up shape of the globular proteins. These differ- 
ent configurational aspects must be studied in different 
ways. Chemical methods are used to study the sequence 
of amino-acid residues in the polypeptide chain. Indi- 
cations of the helical nature of the chain can be ob- 
tained, for example, by measuring optical rotations. 
Finally, the three-dimensional configuration of a globu- 
lar protein, the detailed twisting and folding of a poly- 
peptide chain to form a bundle of characteristic shape, 
can be determined only by x-ray diffraction. 

Proteins are, of course, much more complicated than 
the molecules usually studied by the x-ray method. Use 
of this method began in 1913 with Bragg’s determina- 
tion of the structure of sodium chloride, which has just 
two atoms in its molecule. The most complicated complete
structure so far determined (vitamin B12, studied
by D. Hodgkin) has 90 some atoms in its molecule
(exclusive of hydrogens). Even the smallest proteins
are much larger than vitamin B12. Their great size and
complexity would, in fact, soon discourage the crystal- 
lographer, if the importance of the problem were not so 
great; for it is not difficult to see that in such highly 
complicated molecules a knowledge of the three-dimen- 
sional configuration is vitally important for an under- 
standing of physicochemical properties or biological 
function. 

X-rays are scattered by electrons, and the amount of 
scattering of x-rays at any point in a crystal is propor- 
tional to the electron density at that point. A crystal is 
essentially a three-dimensional periodically repeating 

structure. The electron density plotted as a function of 
distance along a crystal axis is, therefore, periodic and 
can be analyzed into a series of harmonics, just as a 
musical note can be analyzed into its fundamental and 
overtone components. To carry the analogy further, a 
complicated sound can be reconstituted by a so-called 
Fourier synthesis or putting together of its component 
waves (Fourier components). In order to synthesize a
musical note, one needs certain items of information
about each of the harmonics: their frequencies, their
amplitudes, and also their phases, that is, how much
they are out of step with one another.
In x-ray diffraction, one must generalize this sort of
argument into three dimensions. The complete diffrac-
tion pattern of a crystalline protein consists of a regular
three-dimensional array of spots, any plane of which can
be sampled by means of an x-ray photograph (Fig. 1).
The great usefulness of the diffraction pattern lies in 
the fact that, for any crystal, each of the spots corre- 
sponds to a separate Fourier component of the periodi- 
cally repeating electron-density distribution of that 
crystal. The complete distribution is the sum of all such 
waves, each having a characteristic amplitude, wave- 
length and direction relative to the crystal axes, and 
phase. The length and direction of the wave can be 
determined directly from the position of the spot on the 
photograph and the geometry of the x-ray camera, and 
the amplitude can easily be inferred from the blackness 
of the spot on the photograph. It is the determination 
of the remaining parameter, the relative phase of the 
spot, which presents the biggest difficulties in x-ray 
crystallography, for no direct physical method of phase 
determination is available. 
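
The one-dimensional analogy can be made concrete with a short numerical sketch. The Python fragment below uses NumPy; the function name `fourier_synthesis` and the amplitudes and phases are illustrative assumptions, not data from any crystal. It builds a periodic density from its Fourier components and shows that the same amplitudes combined with different phases give a different density, which is why the measured spot intensities alone do not fix the structure.

```python
import numpy as np

def fourier_synthesis(amplitudes, phases, n_points=200):
    """Sum cosine waves of harmonic order h = 1, 2, ... over one repeat.

    amplitudes[h-1] and phases[h-1] (radians) describe harmonic h.
    Returns the fractional coordinate x in [0, 1) and the synthesized density.
    """
    x = np.arange(n_points) / n_points
    density = np.zeros(n_points)
    for h, (amp, phi) in enumerate(zip(amplitudes, phases), start=1):
        density += amp * np.cos(2 * np.pi * h * x - phi)
    return x, density

# Hypothetical amplitudes (the quantity the spot blackness would supply) ...
amps = [1.0, 0.6, 0.3]
# ... combined with two different, equally hypothetical, sets of phases.
x, rho_a = fourier_synthesis(amps, [0.0, 0.0, 0.0])
_, rho_b = fourier_synthesis(amps, [0.0, np.pi / 2, np.pi])

# Same amplitudes, different phases: the two curves differ appreciably.
print(np.max(np.abs(rho_a - rho_b)))
```

Analyzing either curve back into harmonics returns the same three amplitudes, so an experiment that records amplitudes only cannot tell the two densities apart.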

In the study of simpler structures, phases are usually 
found by trial-and-error methods. The crystallographer 
guesses the atomic positions before he begins the analy- 
sis. If his guess is good enough, he can predict the phase 
relationships and then begin a mathematical refinement 
process in which he gradually distorts the proposed 
model until the diffraction pattern predicted from it 
matches that obtained experimentally. He does this by 
minimizing certain error functions which express the 
discrepancies between the observed and predicted patterns.
In general, the method can yield any one of a
number of solutions, and the correct one will be found
only if the initial guess was a fairly good one. For com-
plex molecules, such as proteins, the method has been
found to be useless, for the number of possible solutions
in such cases is so great that there is negligible proba-
bility that any initial guess as to the positions of all of
the atoms is sufficiently near the truth to be refined into
the correct solution.

FIG. 1. X-ray photograph of sperm-whale myoglobin crystal,

Fortunately, there are a few tricks that the crystal- 
lographer can use for determining phases in more com- 
plicated structures. One such trick, the method of iso- 
morphous replacement, is to replace an atom or small 
group in the molecule by a very heavy atom or group. 
If the replacement causes an appreciable change in the
diffraction pattern; if the positions of the heavy groups
in the unit cell are known; and if the replacement causes
no distortion of the original crystal structure, it is then 
possible to determine phases by comparing diffraction 
patterns before and after replacement. 

The isomorphous-replacement method was first used
successfully in protein crystallography in 1953, when 
M. F. Perutz applied it to horse hemoglobin. This pro- 
tein molecule contains sulfhydryl groups which can 
react specifically with reagents such as p-chloromercuri- 
benzoic acid (PCMB), and Perutz found a definite 
change in the diffraction pattern when the reaction 
took place. For the method to work, it is essential that 
only a small number of heavy groups are attached to 
each molecule and that they attach at specific sites. 
When these conditions are met, one can establish the 
positions of the heavy atoms in the unit cell by straight- 
forward methods. From the weights of the heavy groups 
and from their positions, one can calculate their vec- 
torial contribution (i.e., both phase and amplitude) to 
the harmonic wave represented by each of the spots in 
the diffraction pattern. Corresponding waves for the 
protein alone, and for protein plus heavy atom, are 
represented by vectors of known (experimentally de-
termined) amplitudes, but unknown phase angles; how-
ever, since the vectorial difference between them (i.e.,
the contribution of the heavy groups) is completely 
determined, one can solve for their phase angles geo- 
metrically by finding an arrangement in which the two 
vectors of specified length form a triangle with the 
difference vector. None of these operations requires 
any guesswork. 
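
The vector triangle can be sketched numerically. In the Python fragment below, the function name `candidate_phases` and all numerical values are assumptions made for illustration; the waves are represented as complex numbers, and the law of cosines is applied to the triangle formed by the protein wave, the heavy-atom wave, and their sum. For a general reflection the construction admits two mirror-image triangles, so the function returns both candidate phase angles.

```python
import cmath
import math

def candidate_phases(f_p, f_ph, f_h):
    """Phase angles (radians) of the protein wave consistent with the triangle.

    f_p  : measured amplitude |F_P| of the protein alone
    f_ph : measured amplitude |F_PH| of protein plus heavy atom
    f_h  : complex heavy-atom contribution F_H (amplitude and phase known)
    Uses |F_PH|^2 = |F_P|^2 + |F_H|^2 + 2 |F_P| |F_H| cos(alpha_P - alpha_H).
    """
    h_amp, h_phase = abs(f_h), cmath.phase(f_h)
    cos_term = (f_ph**2 - f_p**2 - h_amp**2) / (2.0 * f_p * h_amp)
    cos_term = max(-1.0, min(1.0, cos_term))  # guard against measurement error
    delta = math.acos(cos_term)
    return (h_phase + delta, h_phase - delta)

# Hypothetical "true" waves, used only to generate consistent amplitudes.
true_p = cmath.rect(10.0, math.radians(40.0))
heavy = cmath.rect(3.0, math.radians(110.0))
amp_p, amp_ph = abs(true_p), abs(true_p + heavy)

for alpha in candidate_phases(amp_p, amp_ph, heavy):
    print(round(math.degrees(alpha) % 360.0, 1))
```

With these invented numbers the two candidates are 180° and 40°; the second is the phase that was used to generate the amplitudes.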

Perutz determined some phase angles for hemoglobin
in this way and proceeded to calculate a two-dimen- 
sional Fourier synthesis. He confined his attention to 
a small class of reflections present in the diffraction 
pattern of hemoglobin (and most other crystals) where 
the symmetry of the crystal restricts the phase angles 
to values of either 0 or π radians. When the Fourier
synthesis is carried out with just these reflections, a 
plane projection of the three-dimensional electron- 
density distribution is obtained. By working in two
dimensions instead of three, he achieved a twofold
simplification: only a few of the reflections had to be
considered, and the phases of these were relatively easy
to determine. It is obviously easier to determine a vari-
able with just two possible values than it is to determine
one which may have any value.

FIG. 2. Difference Fourier projection of the complex of mercury
diammine with myoglobin. The unit cell contains two protein
molecules; the two peaks indicate that one mercury atom is
attached to each molecule.

By putting other heavy atoms in different positions 
in the molecule, Perutz was able to redetermine the 
same phases by an independent set of calculations. The 
results were the same, showing that his projection of 
hemoglobin was quite certainly correct; unfortunately, 
however, it proved impossible to interpret in terms of 
chemical structure. The unit cell is about 40 atoms 
thick, and in the projection all of these atoms came on 
top of one another, so that the features of the molecule 
obscured one another in a hopeless confusion. Obviously, 
the only way of getting useful information was to ex- 
tend this analysis into three dimensions. 

I now turn to our own studies of myoglobin, which 
like those of hemoglobin began with a two-dimensional 
analysis. We chose myoglobin for our study because it 
is a small protein, having a molecular weight of only 
18 000. The molecule consists of 153 amino-acid resi- 
dues, all arranged, so far as we know, in a single poly- 
peptide chain to which is attached one heme group. 
This group is common to hemoglobin and myoglobin, 
and has the property of reversible combination with 
oxygen. While hemoglobin is used for transporting 
oxygen in the blood stream, myoglobin is intracellular 
and its function is to act as a temporary store of oxygen. 
Though myoglobin is present in all mammals, it is 
especially abundant in animals such as whales which 
spend a great amount of time under water and must 
store oxygen for long periods.

There are several heavy groups which form good iso-
morphous-replacement compounds with myoglobin. In
order to find out whether a heavy group has attached
to specific sites in the myoglobin crystal, one takes an
x-ray diffraction picture of the protein with heavy group
added and another of the protein alone. One then, as it
were, subtracts one diffraction pattern from the other
and carries out a Fourier synthesis in such a way that
only the contribution of the heavy group appears in the
projected electron-density contour map. If a good iso-
morphous replacement has been achieved, this two-
dimensional difference Fourier synthesis of the unit cell
will show a few high peaks and very little else (Fig. 2).
A nonspecific absorption of the heavy group or a de-
formation of the protein would cause the appearance of
significant “background” features in the map.

FIG. 3. Sites at which heavy atoms have been attached to the
myoglobin molecule. In each case, there are two sites per unit
cell, corresponding to one site per molecule.

We find, by the foregoing criterion, that myoglobin 
forms good isomorphous-replacement compounds with 
mercuri-iodide ion, gold-tetrachloride ion, silver ion, 
p-chloromercuribenzene sulfonate, and mercury diam- 
mine. The gold-tetrachloride and silver ions appear to 
attach at the same site in the molecule. The other 
groups all go to different sites (Fig. 3). In most of these 
replacements, we have no idea as to the chemical nature 
of the attachment, and indeed there are several un- 
expected features of our results. Thus p-chloromercuri- 
benzene sulfonate is known as a reagent for sulfhydryl 
groups, yet it combines quite specifically to myoglobin 
which has no free sulfhydryl groups. We do not under- 
stand this, nor do we understand why the positively 
charged silver ion and the negatively charged gold- 
tetrachloride ion go to the same site in the molecule. 
There is a great need for some good chemical work on 
the rationale of combining heavy atoms with proteins. 
Our approach, however, has simply been to try various 
reagents and judge our success from the x-ray pictures. 
In two-dimensional work, only one replacement com-
pound is really needed, but it is useful to have others
because a double check is then possible. For our two-
dimensional Fourier projection, we have checked the
phases of all of the reflections out to a resolution of
about 2 Å. We get the same answer with each of four
isomorphous compounds, except for a few reflections
where the heavy atom contribution is so small that we
cannot be sure of its direction. Our two-dimensional
Fourier projection, cross-checked four times, was there- 
fore as reliable as Perutz’s hemoglobin projection but 
just as unintelligible (Fig. 4). We were unable to deduce 
anything about the structure of the myoglobin molecule 
by looking at it, although now that we have a three- 
dimensional synthesis we are able to understand in 
retrospect all of the features of the projection. 

We have already noted some of the factors that make 
it more difficult to work in three dimensions than in 
two. Besides these difficulties, it turns out that in three 
dimensions the phases cannot be determined unambigu- 
ously if there is only one replacement compound avail- 
able. On attempting to draw the vector triangle, one 
finds that there are two possible phase angles consistent 
with the data for each reflection. However, if the calcu- 
lations can be done for each of two different replacement 
compounds, the method yields two pairs of answers of 
which one from each pair should agree, and these give 
the correct value of the phase angle. In practice, because 
of experimental errors and other difficulties, two replace- 
ment compounds are hardly sufficient for determining 
all of the phases. Three compounds or even more are 
highly desirable. 
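
The way in which a second replacement compound removes the twofold ambiguity can be sketched as follows. All names and numbers in this Python fragment are hypothetical, chosen only for illustration: each compound yields a pair of candidate phases from its own vector triangle, and the phase on which the two pairs most nearly agree is accepted. With real data, experimental error makes this agreement inexact, which is one reason further compounds are desirable.

```python
import cmath
import math

def phase_pair(f_p, f_ph, f_h):
    """Two protein phases (radians) allowed by one compound's vector triangle."""
    h_amp, h_phase = abs(f_h), cmath.phase(f_h)
    c = (f_ph**2 - f_p**2 - h_amp**2) / (2.0 * f_p * h_amp)
    delta = math.acos(max(-1.0, min(1.0, c)))
    return (h_phase + delta, h_phase - delta)

def angular_gap(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs((a - b + math.pi) % (2.0 * math.pi) - math.pi)

def resolve_phase(f_p, derivatives):
    """Pick the phase on which two replacement compounds agree best.

    derivatives: two (|F_PH|, F_H) tuples, one per replacement compound.
    Returns the circular mean of the closest pair of candidates.
    """
    (ph1, h1), (ph2, h2) = derivatives
    best = min(
        ((a, b) for a in phase_pair(f_p, ph1, h1) for b in phase_pair(f_p, ph2, h2)),
        key=lambda ab: angular_gap(*ab),
    )
    return cmath.phase(cmath.rect(1.0, best[0]) + cmath.rect(1.0, best[1]))

# Hypothetical protein wave and two hypothetical heavy-atom contributions.
true_p = cmath.rect(10.0, math.radians(40.0))
heavy_1 = cmath.rect(3.0, math.radians(110.0))
heavy_2 = cmath.rect(2.5, math.radians(-20.0))
data = [(abs(true_p + heavy_1), heavy_1), (abs(true_p + heavy_2), heavy_2)]

print(round(math.degrees(resolve_phase(abs(true_p), data)), 1))
```

In this error-free example the first compound allows 180° or 40° and the second allows 40° or −80°, so the common value of 40° is recovered.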

FIG. 4. Two-dimensional Fourier projection of the
unit cell. The highest peaks in the projection are owing to
atoms of the heme group.

FIG. 5. Three-dimensional Fourier synthesis of the myoglobin unit cell. Some of the rod-like polypeptide chains can be seen.

Before embarking on our three-dimensional synthesis,
we had some difficult decisions to make. The amount of
work required for such a project goes up very steeply
with the degree of resolution sought in the final Fourier
synthesis. For myoglobin, we estimated that in order
to get a resolution of 6 Å one would have to solve the
phases of about 400 reflections, while for a resolution
of 2 Å one would have to treat some 10 000 reflections;
to discern the individual atoms in myoglobin, one would
need a resolution of at least 1.5 Å, and that corresponds
to about 20 000 reflections. At this point, the diffraction
pattern of a myoglobin crystal begins to fade away
altogether. In an ordinary crystal, the diffraction pat-
tern goes on further than this, but a protein crystal is
more disordered, and it will probably be touch-and-go
as to whether or not we can resolve directly the indi-
vidual atoms in protein crystals by any method. At the
present time, however, these theoretical limits of reso-
lution are of academic interest only, and in fact, because
the amount of labor goes up so rapidly with increasing
resolution, we decided in the first instance to try out the
isomorphous-replacement method at the lowest resolu-
tion that was likely to give us useful and interesting
structural information.
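
The steep rise in labor with resolution can be checked by a rough scaling argument. The sketch below assumes, as a simplification not stated in the text, that the number of reflections grows as the volume of a sphere of radius 1/d in reciprocal space, that is, as the inverse cube of the resolution d. The figure of 400 reflections at 6 Å is taken from the text; the function name is hypothetical.

```python
def reflections_at(d_target, d_ref=6.0, n_ref=400):
    """Scale a reflection count from resolution d_ref to d_target (angstroms).

    Assumes the count grows as the volume of a sphere of radius 1/d in
    reciprocal space, i.e. in proportion to (d_ref / d_target)**3.
    """
    return n_ref * (d_ref / d_target) ** 3

for d in (6.0, 2.0, 1.5):
    print(d, round(reflections_at(d)))
```

On this assumption the counts come to 10 800 at 2 Å and 25 600 at 1.5 Å, of the same order as the estimates of 10 000 and 20 000 quoted above.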

How were we to decide what resolution would be
useful? Here we had to bring in preconceived notions
about protein structure, and we adopted a working
hypothesis which would nowadays be accepted by most
workers as being at least plausible: that α-helices are
somehow involved in the protein molecule. There is
some experimental evidence that this is the case, though
it is not of an absolutely conclusive kind. An α-helix is
a very dense object, and at low resolution it should
appear as a rod of electron density about 1 electron/Å³.
The side chains sticking out from the helix are more
open structures, and they should appear as regions of
density about 0.29 electron/Å³, surrounding the central
rod. α-helices would be expected to pack together so
that each is about 9 to 10 Å from its neighbors. It was
decided, therefore, to calculate the first Fourier syn-
thesis at 6 Å resolution, when the α-helices, if present,
should be clearly distinguishable.

The process by which we finally arrived at the Fourier
synthesis is not detailed here. Much of the calculation
was done with a high-speed computer. The method is 
analogous to that used in two dimensions; it requires 
no guesswork, and the phases of all reflections can be 
cross-checked. We are, therefore, confident that our 
three-dimensional electron-density distribution is es- 
sentially correct, although it is somewhat blurred by 
its low resolution, and must contain a small background 
because of experimental error.

When the Fourier synthesis was finished, we had the 
problem of making a suitable geometrical representation 
of this three-dimensional object. We first plotted a series 
of electron-density contour maps for 16 parallel planes 
at different levels in the unit cell, and then traced the 
contours of these onto transparent sheets of Lucite, so 
that by stacking two or more of them together one could 
see how the contour surfaces were arranged in three 
dimensions (Fig. 5). At this point, we could see the two
differently oriented myoglobin molecules which com-
prise the unit cell of the crystal, their positions being
related to one another by the symmetry of the crystal
(a screw dyad axis). We could also see rod-like features
of high density within the molecules, and indeed except
for a few regions of uncertainty we could follow a con-
tinuous path of high electron density throughout the
entire molecule. Thus, we were able to make a model
to represent the protein molecule consisting of one long
rod of modeling clay or thermo-setting plastic, molded
to the shape of the rod of high electron density. The
rod is irregularly coiled to form a bundle measuring
about 45 Å by 35 Å by 25 Å. On looking at the model,
one is impressed by its total lack of symmetry, and by
the absence of parallel lengths of rod (Fig. 6).

FIG. 6. Model of the myoglobin molecule. The grey disk seen nearly
edge-on is the heme group; the small spheres are heavy atoms
attached to the molecule. The marks on the scale are 1 Å apart.

FIG. 7. Section parallel to [201] at x = 0 through the three-dimensional
Fourier synthesis; on the right, straight rods 40 Å long.

The rod has segments over which it is nearly straight
for distances of 20, 30, or even 40 Å (Fig. 7). Adjacent
segments of rods are always at least 9 to 10 Å apart. The
electron density at the rod axis is about 1 electron/Å³.
Thus, in all respects, the rod seems to fit the specifica-
tions of an α-helix, and we have little doubt that it is
in fact the polypeptide chain. We cannot yet be sure,
however, whether the configuration of the chain is that
of an α-helix, though it must be something fairly closely
resembling this in gross dimensions.

From chemical evidence, the molecule is thought to
consist of just one chain, and that is the way we repre-
sent it in our solid models. But, in following the high-
density streak through the molecule, one finds regions
where its path cannot be traced unambiguously. These
are the places where we think the rod is sharply bent.
The presence of a bend would imply an interruption in
the very dense helical structure, and would explain our
difficulty in tracing it through such a region. A simple
calculation shows that, in any case, the entire chain
cannot be coiled as tightly as an α-helix, for the total
length of rod appearing in our model is 300 Å, whereas
if the molecule consisted entirely of α-helix, the rod
would be about 230 Å long. If we make the obviously
oversimplified assumption that part of the molecule has
the α-helix form, and that the rest of it is fully extended,
then the measured length of the rod requires that about
70% of it be in the helical form. This figure agrees with
other estimates made by the optical-rotation method
and by the deuterium-exchange method.
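
The simple calculation can be reproduced under stated assumptions. In the sketch below, the 153 residues and the 300 Å rod length come from the text, and a rise of 1.5 Å per residue for the α-helix is consistent with the quoted 230 Å for a wholly helical chain. The rise per residue of the fully extended chain is not given in the text; the values 3.3 and 3.6 Å are assumptions of this sketch, and the function name is hypothetical.

```python
def helical_fraction(total_length, n_residues, rise_helix, rise_extended):
    """Fraction f of residues in helix, from
    n_residues * (f * rise_helix + (1 - f) * rise_extended) = total_length.
    """
    mean_rise = total_length / n_residues
    return (rise_extended - mean_rise) / (rise_extended - rise_helix)

N_RESIDUES = 153      # from the text
ROD_LENGTH = 300.0    # angstroms, from the text
RISE_HELIX = 1.5      # angstroms per residue; 153 * 1.5 is about 230

# The extended-chain rise is an assumption; two plausible values are tried.
for rise_extended in (3.3, 3.6):
    f = helical_fraction(ROD_LENGTH, N_RESIDUES, RISE_HELIX, rise_extended)
    print(rise_extended, round(f, 2))
```

The result is a helical fraction of roughly 0.74 to 0.78, depending on the assumed extended rise, in reasonable accord with the figure of about 70%.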

The heme group of myoglobin can also be seen in our 
Fourier synthesis. This group was easy to identify be- 
cause of its iron atom, by far the heaviest atom in the 
protein molecule. It produced a peak of electron density 
50% higher than any other feature in the molecule. 
Though we cannot resolve the porphyrin-ring system, 
the heme group appears flattened in one direction, and
this direction corresponds very well with what we al- 
ready knew of the orientation of the heme group from 
optical and magnetic studies of myoglobin crystals. In 
order to make our identification of this group more 
certain, we treated the protein with a reagent, p-iodo- 
phenyl hydroxylamine, which is known to react with 
the myoglobin heme group, and it turned out that the 
electron density was raised at the appropriate place in 
the molecule (Fig. 8). 

We are now collecting further data in order to extend 
the resolution of our model to 2 Å. At this resolution,
our rod, if it is truly a helix, should show up as a hollow 
tube with spiral walls like a spring. We might even be 
able to recognize some of the larger side chains, such 
as those containing benzene rings. The resolution of 
individual atoms is still a long way off, if indeed it is 
attainable. 

FIG. 8. Difference Fourier projection of the p-iodophenyl hy-
droxylamine derivative of myoglobin. The peaks are caused by
the iodine atoms, and they lie near the heme groups in projection.

When considering the future of protein crystal-
lography, one must remember that a full three-dimen-
sional analysis of a protein crystal is a very long and
tedious process. In that respect, it is like the determina-
tion of the amino-acid sequence of a protein. Neither is
likely to be embarked on lightly, yet it is only by means
of these now difficult and tedious methods that we will
ever be able to put into concrete form the vague con-
cepts concerning configurational specificity in proteins.
Edmundson at the Rockefeller Institute has recently
begun an amino-acid sequence analysis of myoglobin,
and his work taken together with the x-ray results at
higher resolution may eventually lead to a detailed
picture of the configuration of this one protein. There
are, however, very many other interesting proteins for
which the same sort of dual approach would be highly
warranted, and it is very important to try to develop
short-cut methods, but as yet nobody has much idea
what form they will take.

The author desires to express his appreciation of the 
work done by Dr. J. Kraut and Dr. R. G. Hart in 
preparing this manuscript for publication. 
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The Hydrogen Bond 
IT is an experimental fact that a hydrogen atom at-
tached to an electronegative atom in one molecule 
can cause an interaction with another molecule which 
also contains an electronegative group, or with a differ- 
ent electronegative group of the same molecule. This 
interaction is referred to as hydrogen bonding.¹ The
order of hydrogen-bond strength is F > O > N > Cl, C.
There is no clear-cut lower limit to the strength of hy- 
drogen bonds, those to chlorine and carbon atoms in 
particular giving rise to very little energy of interaction. 


THERMOCHEMISTRY 


The heat of formation of the strongest hydrogen 
bonds is about 10 kcal, but this value is achieved only 
in polymers of HF. The strongest hydrogen bonds be- 
tween groups of biological interest are between oxygen 
atoms, and the heat of formation of these is rarely 
greatly in excess of 5 kcal. Hydrogen bonds between 
nitrogen and oxygen are usually somewhat weaker and 
bonds between pairs of nitrogen atoms somewhat weaker 
again. These relations apply only to general trends; it 
is not suggested that every O—O hydrogen bond is 
stronger than every N—O or N—N bond. 


STEREOCHEMISTRY²


The lengths of hydrogen bonds vary considerably, 
but there is a general rule that the stronger the bond, 
the shorter its length. Oxygen bonds may have a length
as short as 2.5 Å, although those occurring in biochemi-
cal systems are usually somewhat longer than this,
perhaps 2.7 Å. There is no sharp upper limit to the
length of a hydrogen bond, but if the O—O distance is
much greater than 3.0 Å, there is very little interaction.
Similar considerations apply to N—O and N—N hy- 
drogen bonds. 

In almost all hydrogen-bonded systems, the proton 
does not lie symmetrically between the atoms bonded. 
The exceptions to this rule are the [HF₂]⁻ ion and the
hydrogen-maleate ion, but in systems of biochemical
interest the hydrogen will always be associated definitely
with one or the other of the atoms bonded.

The position of the hydrogen atom relative to the
atom to which it is most strongly bound is always close [...].
On the other hand, its position relative to the [...].
It is interesting that x-ray studies show that, when the [...] impos-
sible for the hydrogen atom of a hydrogen bond both to
lie on the line joining the atoms bonded and to occupy
its normal position relative to its nearest neighbor, it is
the second requirement which is satisfied. Thus, in sali-
cylic acid the hydrogen atom lies well off the line joining
the two oxygen atoms.³


SPECTROSCOPIC PROPERTIES 


Hydrogen bonding has a profound influence, both on 
the infrared and the nuclear-resonance spectrum of a 
hydrogen-bonded system. Here it is mentioned only that 
these techniques are of great importance in studies of 
hydrogen bonding. 


EFFECT OF HYDROGEN BONDING ON 
CHEMICAL PROPERTIES 


In general, the effect of hydrogen bonding on the 
donor group is in the same direction as the effect of ioni- 
zation, but less extreme. In a similar way, the ac- 
ceptor group behaves in the same manner as it does 
when a proton is added, but again the effect is less 
marked. An example of this is salicylic acid which is a 
much stronger acid than one would have anticipated if 
no hydrogen bond were formed. The effect of the hy- 
droxyl proton on the carboxylic-acid group is like that 
which would be produced by the addition of an extra 
proton and, therefore, leads to an easy loss of the car- 
boxylic-acid proton. 


THEORIES ABOUT THE HYDROGEN BOND 


The earliest theories of hydrogen bonding postulated
either an electrostatic or a covalent interaction, usually
the former.⁴ In the electrostatic theories, it is argued
that, since the fluorine, oxygen, and nitrogen atoms are
very electronegative, a proton attached to them must
carry a positive charge. Thus, if a second molecule con-
taining an electronegative and consequently negatively
charged atom is available, the hydrogen atom will be
attracted to it as shown below.


    R
     \ +   -       +   -
      C == O · · · H - O
     /
    R

Naturally, the strength of the bond will be greater if the
positive charge on the hydrogen atom is large and if the
negative charge on the acceptor is also large. This ex-
plains why the strongest bonds are formed by fluorine,
the most electronegative element, and the next strongest
by oxygen, etc. Rough calculations show that the ener-
gies of formation of hydrogen bonds are not incompatible
with a simple electrostatic theory, while the covalent
theory could not be tested quantitatively.

More recent detailed calculations⁵ show that the situ-
ation is intermediate between those suggested by the
electrostatic and covalent theories. It seems that weak
hydrogen bonds are always entirely electrostatic in ori-
gin, but that, as the hydrogen bond becomes stronger
and shorter, the covalent contribution to bonding in-
creases. In typical hydrogen bonds occurring in biologi-
cal systems, it seems unlikely that the covalent charac-
ter of the bonds will ever exceed about 10%.

FIG. 1. Hydrogen-bond arrangement in (a) α-helix
and (b) sheet structure.


PARTICULAR HYDROGEN-BONDING CONFIGURATIONS
OF BIOLOGICAL INTEREST


There is one arrangement of atoms which leads to
particularly stable associations between pairs of molecules.
It is the one present in carboxylic acids, amides, guani-
dine, etc.


    -C(=O)-OH        -C(=O)-NH₂        -C(=NH)-NH₂


This arrangement is particularly advantageous since it 
allows for the formation of two bonds rather than one 
between molecules, e.g., 


          O · · · H-O
         //          \
    R - C             C - R
         \          //
          O-H · · · O


We believe that it must be one of the most important 
contributors to the hydrogen-bond stabilization by side 
chains in proteins, etc., and it is also the basis of the 
Watson-Crick⁶ base-pairing scheme, e.g., in the base
pair adenine-thymine. 


ELECTROSTATIC AND HYDROGEN-BOND INTERACTIONS


In proteins, one important interaction is that between 
charged amino groups and charged carboxylate ions. If 
the two ions are sufficiently close to be in contact, it is 
not possible to distinguish between electrostatic and 
hydrogen-bond contributions to the energy of inter- 
action. Thus, it is only in the case of interaction between 
widely separated charged groups that a clear case of 
electrostatic (without hydrogen bond) interaction can 
be recognized.


HYDROGEN BONDING IN AQUEOUS SOLUTION 


The heats of formation of hydrogen bonds in the gas 
phase give one little information about their stability in 
aqueous solution, because in this case one is concerned
with the difference in the hydrogen-bonding energy of
the solute and solvent, firstly when the solute molecules 
are bound together and secondly when they are sepa- 
rated. Thus, the heat obtained on forming hydrogen 
bonds in solution is usually less than that obtained in 
the gas phase. Most simple single hydrogen bonds are 
split almost completely in water except when the solute 
is present at very high concentration. The amide group, 
however, because of its great hydrogen-bonding power, 
can form moderately stable dimers in aqueous solution. 

Schellman⁷ has made a detailed study of the dimeriza-
tion of urea in aqueous solution. Although his assump-
tions are open to a good deal of doubt, the final value
for ΔH, the heat of formation of an O—N hydrogen
bond in aqueous solution, of −1500 cal is probably
roughly right. In the author’s view, this figure is prob-
ably a little too high. Schellman then went on to calculate
the degree of stability of the α-helix in aqueous solution.
He was obliged to make a number of questionable as-
sumptions and so, instead of obtaining a firm value for
the free energy of formation, he could give only a rather
wide range of possible values. His work is of great in-
terest because it shows that in aqueous solution the
α-helix must be on the border of stability so that side-
chain interactions may be critical in determining whether
a particular protein exists in the α-helix form or not.
Even though the numerical values which he obtains are 
probably not very reliable, this conclusion—which was 
reached before much of the evidence which is described 
by Doty (p. 61) was obtained—is of very great 
importance. 

Schellman discussed end effects in some detail and
showed that the α-helix should be stable only provided
the chain size exceeds a certain lower limit. Using rea-
sonable values for the parameters involved, the critical
length came out to between 8 and 15 units. Experi-
mentally, there does seem to be some evidence for this
conclusion.


PROTEINS IN AQUEOUS SOLUTION 


The available experimental evidence discussed in de-
tail in a later paper [Doty (p. 107)] shows that most
proteins are only partially present in the α-helix con-
figuration, the rest presumably being in some less regu-
lar structure. This raises a point of great importance to
all discussions of the physical properties of proteins in 
aqueous solution, namely, the question of whether or not 
proteins are present in equilibrium configurations. It 
seems at least possible that, in fact, proteins are present 
in frozen configurations formed during the peeling-off of 
the molecule from its template. If so, many of the argu-
ments from the structure of simple synthetic polypep-
tides may not apply directly to proteins. If this is true,
any structural information about proteins also should
give useful information about their mode of formation.
In particular, it may be that certain sections of proteins
which can be removed without affecting the enzymatic
activity are present only in order to allow the protein 
to fold up in the right way. Of course, there are also 
many other possible explanations of the same effects. 
Experiments with synthetic copolymers containing, for 
example, appreciable quantities of glutamine or aspara- 
gine might help to solve this problem. 
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Forces between Macromolecules 
IN earlier articles, special attention has been paid to
electrostatic forces (see Rice, p. 69), steric forces
(see Waugh, p. 84), and hydrogen bonds (see Orgel,
p. 100) in biological systems. Here, the attempt is made
to round out the discussion of various types of inter-
molecular action¹ and to take up several special
applications.

One recalls first the nature of the mutual potential 
energy of two simple and chemically inert molecules, 
such as two argon atoms. Instantaneously, this energy 
is the sum of the Coulombic interactions involving all 
of the electrons and nuclei. However, one cannot follow 
the electronic motion in detail; all that is required is 
simply the average force between the nuclei, which 
move much more slowly. In fact, little error is made² by
assuming the nuclei to be stationary while quantum- 
mechanical averages over the electronic motions are 
computed. The results of such theoretical calculations, 
reinforced by experimental evidence from properties of 
gases and molecular crystals, lead to a potential energy 
given approximately by relations such as: 

    \phi(r) = A r^{-n} - C r^{-6},    (1a)

or

    \phi(r) = B \exp(-\alpha r) - C r^{-6},    (1b)


where r is the internuclear distance and the other
symbols are constants.* The positive (repulsive) term
in these expressions is the steric energy resulting from
direct overlap of the two atomic electron clouds, and
the negative term is the dipole-dipole dispersion or
London energy. The familiar curve for the function
φ(r) is shown in Fig. 1(a).
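
As a short worked step for the inverse-power form, Eq. (1a), the position and depth of the minimum follow from setting the derivative of the potential to zero. The symbol r_e for the equilibrium separation is introduced here for convenience and is not used in the text:

```latex
\frac{d\phi}{dr} = -nA\,r^{-n-1} + 6C\,r^{-7} = 0
\quad\Longrightarrow\quad
r_e = \left(\frac{nA}{6C}\right)^{1/(n-6)},
\qquad
\phi(r_e) = -\,C\,r_e^{-6}\left(1-\frac{6}{n}\right).
```

For n = 12 the well depth is therefore half of the London term evaluated at r_e, so neither term alone fixes the minimum.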

Two further points concerning these intermolecular 
potential energies should be mentioned.

(a) In a system of many molecules, the total poten- 
tial energy is to a good approximation the sum of those 
for all possible pairs; thus, useful predictions of the 
behavior of bulk matter can be made in principle with 
the aid of appropriate statistical-mechanical and kinetic 
theories, from a knowledge of the potential energy 
between just two isolated molecules. Without this
fortunate result, progress would be difficult indeed. 

(b) The most important features of the φ(r) curve, namely
the equilibrium distance and the potential energy
corresponding thereto (the minimum in φ), are not given
either by steric or by London forces alone, but result from
a balance between them. This truism warns that, in general,
one should not hope to single out one unique type of
interaction (and the classification into “types” is really
somewhat arbitrary) as solely responsible for a particular
effect.

* For pairs of simple molecules, n is about 12, or α about
3×10^8 cm^-1.

One of the simplest applications of a knowledge of
φ(r) is found in the theory of slightly imperfect gases*
for which the equation of state can be written


    PV/RT = 1 + B(T)/V + \cdots,    (2)


where P is the pressure and V the molar volume. The 
so-called second virial coefficient, B(T), is given for 
monatomic gases (except for the lightest molecules at 
very low temperatures) by 


    B(T) = 2\pi N_0 \int_0^\infty \left[ 1 - \exp(-\phi(r)/kT) \right] r^2 \, dr,    (3)


where N_0 is Avogadro’s number and k Boltzmann’s
constant. At temperatures low enough so that the depth 
of the potential well is at least comparable to kT, the 
quantity B(T) can be thought of as the negative of an 
equilibrium constant describing binary clustering of the 
molecules. 
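
The integral in Eq. (3) is easy to evaluate numerically once a form for φ(r) is chosen. The sketch below assumes the inverse-power form of Eq. (1a) with n = 12, written in the common well-depth and diameter parametrization (eps, sigma). The argon-like values eps/k = 120 K and sigma = 3.4 Å, the function names, and the midpoint integration scheme are illustrative assumptions rather than anything taken from the text.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro's number, 1/mol


def pair_potential(r, eps, sigma):
    """Eq. (1a) with n = 12, written as 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    x = (sigma / r) ** 6
    return 4.0 * eps * (x * x - x)


def second_virial(temp, eps, sigma, r_max_factor=30.0, steps=60000):
    """Eq. (3) by the midpoint rule; returns B(T) in m^3/mol.

    The integral is cut off at r_max_factor * sigma, beyond which the
    integrand is negligible for this short-ranged potential.
    """
    h = r_max_factor * sigma / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h                      # midpoints avoid r = 0
        arg = -pair_potential(r, eps, sigma) / (K_B * temp)
        boltz = math.exp(arg) if arg > -700.0 else 0.0
        total += (1.0 - boltz) * r * r * h
    return 2.0 * math.pi * N_A * total


if __name__ == "__main__":
    eps = 120.0 * K_B      # assumed well depth (argon-like)
    sigma = 3.4e-10        # assumed diameter, m
    for t in (120.0, 300.0, 1000.0):
        b_cm3 = second_virial(t, eps, sigma) * 1e6
        print(f"T = {t:6.1f} K   B(T) = {b_cm3:8.1f} cm^3/mol")
```

With these assumed parameters B(T) is negative at low temperature, where the attractive well dominates and binary clustering is favored, and becomes positive at high temperature, where the repulsive core dominates.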

On turning to the interactions of two molecules in 
solution, as one inevitably must in most biological con- 
siderations, it is extremely important to recognize that 
the space intervening between the molecules is no 
longer empty, but is filled with solvent and perhaps 
other solutes. Detailed treatments of the interactions 
are now much more difficult, but Eqs. (2) and (3) may 
be retained if P now is defined to be the osmotic pressure, 
V the volume of solution per mole of solute, and if φ(r)
in Eq. (3) is replaced by a quantity W(r,P,T), called
the potential of mean force.’ The physical meaning of the
latter quantity is easily stated: It is the work that would
be required to pull apart the two solute molecules from
a separation r to an infinite separation when these are
immersed not in a vacuum (as for the potential energy φ),
but in a very large quantity of the solvent, the
molecules of which are permitted to perform all motions
consistent with equilibrium at the stated temperature
and pressure. Thus, W depends (through the properties
of the solvent) upon pressure and temperature, and its
dependence upon the separation r reflects the structure
of the solvent.

The simplest and most familiar example of such a 
potential of mean force is the Coulombic energy 


    W(r,P,T) = \frac{e_i e_j}{r \, D(P,T)}    (4)


between two point charges, e_i and e_j, separated by a
distance r in a medium of dielectric constant D.
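
To give a feeling for the magnitudes in Eq. (4), the following sketch compares the Coulombic potential of mean force with the thermal energy kT. It uses the SI form of the same expression, and assumes D = 78.5 for water near 25°C; that value, the chosen separation, and the function names are illustrative assumptions.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K


def coulomb_mean_force(z_i, z_j, r_m, dielectric):
    """Eq. (4) in SI form: W = z_i z_j e^2 / (4 pi eps0 D r), in joules."""
    return z_i * z_j * E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * dielectric * r_m)


def in_units_of_kt(energy_j, temp_k):
    """Express an energy as a multiple of kT at the given temperature."""
    return energy_j / (K_B * temp_k)


if __name__ == "__main__":
    r = 7.0e-10          # 7 Å separation (illustrative)
    t = 298.15           # K
    for name, d in (("vacuum", 1.0), ("water (assumed D = 78.5)", 78.5)):
        w = coulomb_mean_force(+1, +1, r, d)
        print(f"{name:26s} W/kT = {in_units_of_kt(w, t):7.2f}")
```

Under these assumptions two like unit charges about 7 Å apart repel with an energy close to kT in water, but with an energy nearly eighty times larger in vacuum, which is why the solvent cannot be ignored.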


† An apparent exception is given by the Coulombic interactions
in very dilute ionic solutions, but even here a short-range steric
energy is essential to insure stability.
FIG. 1. Potentials of mean force W(r) between two nonpolar
spherical molecules as a function of their separation r: (a) in
vacuo; (b) in good solvent; (c) in poor solvent.
Outside the domain of very dilute electrolyte solutions,
such simple and general expressions for W as
Eq. (4) are no longer found. Possible curves of W(r) for
two uncharged nonpolar solute molecules in a good and
in a poor solvent, respectively, are drawn in Figs. 1(b)
and 1(c). The oscillatory nature of these curves is
imposed by the structure of the solvent, but it is seen that
the damping is quite large. Thus, many approximate
theories of liquids and solids can be constructed in which
molecules are considered to interact only if they are
nearest neighbors.‘

The important role of the solvent has already been
emphasized in connection with hydrogen bonding (see
Orgel, p. 100). Perhaps unfortunately for students of
both inorganic and biophysical chemistry, the structure
of liquid water is probably uniquely complicated. As
discussed in detail by Bernal and Fowler, a large
residuum of the characteristic, tetrahedrally coordinated
structure of ice may be considered to persist in
water at ordinary temperatures. The orientations
of the molecules are far from random, and they [...]
with rather small amplitudes, [...]
molecules or ions into water produces large perturbations
in the structure surrounding each solute particle,
roughly out to about the third shell of neighbors.
Despite considerable effort,” a complete quantitative
theory of these effects still is lacking. However, the
student of macromolecular behavior rarely needs an
absolute theory, since his task is to infer the behavior
of large molecules, given all necessary information
about small ones.

Steric effects between macromolecules, in general, 
may be considered in terms of the well-known covalent 
and van der Waals radii of the constituent atoms.‘ 
Naive expectations may sometimes be misleading, as
illustrated by the measurements by Chinai and
co-workers? of the random-coil dimensions in dilute
solution of a series of synthetic polymethacrylates

          CH3
          |
    (-CH2-C-)n
          |
          COOR

in suitably ideal (“theta”) solvents.” The effective bond
length, instead of increasing monotonically as R is
varied from methyl to n-octyl, passes through a
minimum at the n-butyl ester. Clearly, a detailed
examination of the structure of the polymer-chain skeleton
and its surrounding substituent groups and solvent
molecules is necessary to the understanding of such behavior.

Turning now to a more detailed consideration of the
London dispersion forces,¹⁰,¹¹ one first recalls that they
may be considered as arising out of the interaction
between instantaneously unsymmetrical electrical-charge
distributions in the molecules concerned. Although a
noble-gas atom such as argon has, on the average, a
spherically symmetric charge cloud, an instantaneous
snapshot of the atom, if such were possible, would
reveal almost always an unsymmetrical array, with a
dipole moment, and also, in general, higher moments of
the charge distribution. If this unsymmetrically
distributed set of charges is brought close to another atom,
its field polarizes the latter into an unsymmetrical
condition. The result of this mutual perturbation of the
charge distribution is an attractive potential energy
whose principal part comes from the interaction of the
instantaneous dipoles. London’s general formula,
obtained by second-order perturbation theory, is (for the
case of two identical molecules)


    \phi_{dis}(r) = -\frac{3 e^4 \hbar^4}{4 m^2 r^6} \sum_k \frac{f_{0k}^2}{(E_k - E_0)^3},    (5)


where e and m are electronic charge and mass, ħ is the
Planck constant divided by 2π, r is the internuclear
distance, E_0 and E_k are the electronic energies of the
isolated molecule in the ground state and the kth
excited state, respectively, and f_0k is the so-called
oscillator strength of the 0→k transition, closely related
to the intensity of absorption or emission of the
corresponding spectral line.
It is seen that the dispersion energy is always attractive,
and that it varies inversely as the sixth power of
the intermolecular separation, in contrast to a dependence
upon r^{-3} for two dipoles in fixed specified orientations.
This indicates the second-order nature of the
effect, as does the presence in Eq. (5) of properties
related to the excited states of the molecules.

The quantities required for directly computing the
sum in Eq. (5) are rarely available. However, useful
estimates of dispersion forces can be made by appealing
to a similar expression for the (low-frequency) molecular
polarizability α, which contains a similar sum:


    \alpha = \frac{e^2 \hbar^2}{m} \sum_k \frac{f_{0k}}{(E_k - E_0)^2}.    (6)


If it is assumed (sometimes apparently without too great
error) that the largest part of each sum is contributed
by a single excited state, elimination between Eqs. (5)
and (6) leads to the much simpler result


    \phi_{dis}(r) = -\frac{3 \alpha^2}{4 r^6} (E_k - E_0),    (7)


in which the energy difference (E_k - E_0) is of the order
of magnitude of an ionization potential (say, 10 to 20 v).
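
A rough numerical illustration of Eq. (7) can be sketched as follows. The inputs are assumed, argon-like values: a polarizability of 1.64 Å^3, an energy difference set equal to the ionization energy of 15.76 eV, and a separation of 3.8 Å. The function name and unit conventions (polarizability as a volume, as in Gaussian units) are choices made here for illustration.

```python
EV_TO_KJ_PER_MOL = 96.485  # 1 eV per molecule expressed in kJ/mol


def london_energy_ev(alpha_a3, delta_e_ev, r_a):
    """Eq. (7): phi_dis = -(3/4) * alpha^2 * (E_k - E_0) / r^6.

    alpha_a3   : polarizability volume in cubic angstroms (Gaussian units)
    delta_e_ev : effective excitation energy in eV
    r_a        : separation in angstroms
    Returns the dispersion energy in eV.
    """
    return -0.75 * alpha_a3 ** 2 * delta_e_ev / r_a ** 6


if __name__ == "__main__":
    phi = london_energy_ev(1.64, 15.76, 3.8)   # assumed argon-like inputs
    print(f"phi_dis = {phi:.4f} eV = {phi * EV_TO_KJ_PER_MOL:.2f} kJ/mol")
    # Doubling the separation weakens the attraction 64-fold (r^-6 law).
    print(f"ratio at 2r: {london_energy_ev(1.64, 15.76, 7.6) / phi:.5f}")
```

With these inputs the estimate comes out at roughly 1 kJ/mol of attraction, the same order as the well depths inferred for simple nonpolar molecules from gas properties.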

The corresponding expression for the London dipole 
energy between two unlike molecules! shows that no 
great specificity is to be sought in these interactions. 
A rough rule, often embodied in approximate theories 
of solutions, is that the force between a pair of unlike 
molecules is the geometric mean of those between the 
related pairs of similar molecules, at the same value of r. 

Equation (5) suggests that highly colored molecules, 
which have at least one large oscillator strength at a 
rather small excitation energy, should exert especially 
strong dispersion forces. This expectation seems to be 
supported” by the fact that many dyestuffs are exten- 
sively dimerized in solution at concentrations as low as 
10— molar, often in spite of Coulombic repulsion be- 
tween ionic charges (cf. Fig. 2). The interaction between 
lyophobic colloid particles (see Rice, p. 69, and
Verwey and Overbeek") is qualitatively similar, Cou- 
lombic repulsion at large separation giving way to 
London attraction at smaller separations until overlap 
repulsion produces the stable minimum. 

Unless extensive arrays of conjugated double bonds 
or condensed aromatic rings are found in large
molecules, their mutual London interactions can be regarded
as the sum of contributions from all possible pairs of 
localized centers, which may be taken as the individual 
atoms or the bonds. This additivity is well substanti- 
ated in the properties of the normal paraffins, for ex- 
ample, and [recall Eq. (6) ] also is found in refractivity. 
When nonlocalized electron orbitals are permitted by 
the molecular structure, the simple London formula, 
Eq. (5), is inadequate and unusual orientational de- 
pendence of the forces is found,™»!® but the magnitudes 
are not unusually great (e.g., compare benzene with 
cyclohexane, or hexatriene with n-hexane).
FIG. 2. Potential of mean force between two similarly charged
molecules or lyophobic colloid particles. At large separations,
Coulombic repulsion is dominant. As r decreases, this gives way
in turn to London attraction and steric repulsion.


It has been suggested'® that very strong and highly 
specific attractive forces can operate between identical 
macromolecules. Consider two identical and widely 
separated molecules, one in its ground state and one in 
some excited state. If these approach each other, the 
fundamental indistinguishability of the molecules pre- 
vents the identification of one of the two as the excited 
one. In familiar language, there is resonance® between 
two identical structures, with resultant (first-order) 
perturbation of the energy levels. The resonance energy 
(dipole term) varies as r^{-3} and can be either attractive
or repulsive!; but, statistically, the former is favored at 
any finite temperature. 

Resonance forces are highly specific indeed, for, if the 
excitation energies of the two molecules do not match 
rather closely, the resonance energy is negligible; but 
clearly, they cannot contribute much to the stability of 
a system unless appreciable numbers of the molecules 
are in excited states to begin with. This, in turn, requires 
low values of the excitation energies. The excited elec- 
tronic states of all known macromolecules do not satisfy 
this requirement. Excited vibrational states of heavy 
molecules, on the other hand, are more generously 
populated but they do not make very large contribu- 
tions to the forces (compare the magnitudes of the 
so-called “atomic” and “electronic” polarizabilities of 
molecules!”), because only small changes of charge 
distribution attend the vibrational motions. It is con- 
cluded, therefore, that, although specific dipole reso- 
nance forces between identical molecules are certainly 
capable of existence, their magnitudes must be negli- 
gible in all practical cases. However, their role in energy- 
transfer processes involving excited states is very 
important.!§ 

Another class of attractive force operates between 
electron acceptors and donors (Lewis acids and bases). 
These so-called “charge transfer” forces"® contribute to 
the stability of many colored complexes (e.g., benzene- 
iodine) and, thus, have interesting consequences for 
spectroscopy, but they are probably too rare to be
accorded major attention in biology.

These views regarding the forces between biological 
macromolecules are summarized by stating that, in 
general, they do not differ qualitatively from those 
between simple molecules, and that 
specific interactions are steric and Coulombic. 
Turning now to applications, consider first the stability
of micelles formed by aggregation of soap ions,
which have long hydrocarbon tails and ionic heads.
Whether the micelles are spherical [compare casein (see
Waugh, p. 84)] or lamellar (as lipid membranes),
the ions arrange themselves so that their charges remain
at the micelle surface, where counterions from
the surrounding medium congregate preferentially to
complete an electrical double layer. The potential of
mean force between uncharged hydrocarbon molecules
in water is strongly attractive, so that the micelles
would grow indefinitely large (as do droplets of
uncharged hydrocarbon in water), except for the
counteracting mutual repulsion between the ionic heads of
the ions. More specifically, the standard free energy
of formation ΔF° of a micelle containing n ions is
the sum of a negative (favorable) bulk term directly
proportional to n (arising from the net attraction
between the hydrocarbon tails), and of a positive
(unfavorable) surface term varying more rapidly† than
n (arising from the electrostatic repulsion between the
ionic heads). The resultant curve of ΔF° against n
displays a minimum defining the most stable micelle
size. Quantitative development of these ideas²⁰,²¹
accounts satisfactorily for the sizes and size distributions
of soap micelles.
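
The balance just described can be put in a minimal sketch. It assumes the simplest functional form consistent with the text: a bulk term -a*n and a surface term +b*n**p with p > 1. The exponent p = 5/3, which is what the electrostatic self-energy of a uniformly charged sphere would give, and the coefficients a and b (in units of kT) are illustrative assumptions and are not fitted to any soap.

```python
def micelle_free_energy(n, a, b, p=5.0 / 3.0):
    """Assumed model: Delta F(n) = -a*n + b*n**p, in units of kT."""
    return -a * n + b * n ** p


def most_stable_size(a, b, p=5.0 / 3.0):
    """Location of the minimum, from d(Delta F)/dn = -a + p*b*n**(p-1) = 0."""
    return (a / (p * b)) ** (1.0 / (p - 1.0))


if __name__ == "__main__":
    a, b = 10.0, 0.5                     # hypothetical coefficients
    n_star = most_stable_size(a, b)
    print(f"most stable aggregation number ~ {n_star:.1f}")
    for n in (10, 20, 40, 60, 80):
        print(f"n = {n:3d}   Delta F = {micelle_free_energy(n, a, b):8.1f} kT")
```

Because the unfavorable term grows faster than n, the curve always turns upward eventually; raising the repulsion coefficient b shifts the minimum to smaller aggregates.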

In systems of more than two components, electro- 
static repulsion is not the only possible size-regulating 
mechanism. For example, the size of stable oil-in-water 
emulsion droplets is controlled by the available quantity 
of a third component, the surface-active agent, which 
need not be an electrolyte. Casein micelles (Waugh, p. 
84) are characterized by large size and unusual sta- 
bility toward added electrolytes, and may well involve 
the simultaneous action of several size-regulating effects. 

Overbeek” suggests that lipid molecules naturally
form flat membranes rather than spherical micelles
because of the bulkiness of their hydrophobic parts
(two hydrocarbon chains per glyceride unit), and that
there is no need, therefore, to postulate a more specific
biological mechanism of membrane synthesis. He also
accounts for the order of magnitude of the observed
spacing (50 to 100 Å) between multiple layers of such
membranes (as in a myelin sheath) as a consequence of
balance between London attraction and electrostatic
repulsion.

The formation of gels or gelatinous precipitates
involves a somewhat different balance of forces. Here, the
interactions between the major portions of the primary
particles, frequently chain- or rod-like macromolecules,
are insufficient to cause aggregation and even may be
weakly repulsive, but they are overshadowed by strong
attractions between specific local sites. The resulting
aggregate has a network structure and usually is
extensively swollen by the solvent medium.


† The precise dependence of the electrostatic term upon n is
complicated, and depends on the charge and concentration of the
counterions forming the [...] layer. If the counterion
concentra- [...] 5/3 for spherical micelles or [...]
The mathematical theory of gelation? is similar to
that of explosions and of certain classical gambling
problems. It may seem obvious perhaps that networks
of unlimited size can form from primary molecules only
if each carries an average of more than two reactive
sites. The nature of the attraction between these sites
need not be of any special type. Thus, in synthetic
thermosetting resins,” the bonds are covalent and
essentially permanent, while in gelatin™ they are presumed
to be the hydrogen bonds which stabilize the
triple-stranded helical collagen structure. In antigen-antibody
reactions,” a variety of possibilities exists.

If each of the primary molecules has a large number of 
reactive sites, only a very small attraction suffices to 
produce giant network structures. For example, if each 
primary molecule had 1000 reactive groups, a solution 
of about one percent concentration would gel even if the
association equilibrium constant between two groups 
were only about 10-* liter/mole. One can see that this 
mechanism affords the possibility of producing large 
effects (e.g., gelation greatly reduces the mobility of 
large particles) at small cost of free energy. 
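
The order of magnitude involved can be sketched with the classical branching criterion for gelation, usually associated with Flory and Stockmayer: a network of unlimited size first appears when the fraction p of reacted sites reaches 1/(f - 1) for primary molecules carrying f sites each. The text does not name this criterion, so its use here is an assumption, as are the weak-association approximation p ≈ K c, the figure of 100 g of polymer per mole of reactive groups, and the function names.

```python
def critical_bond_fraction(f):
    """Branching criterion (assumed): gel point at p_c = 1 / (f - 1)."""
    if f <= 2:
        raise ValueError("no infinite network unless f exceeds 2")
    return 1.0 / (f - 1.0)


def association_constant_at_gel_point(f, group_molarity):
    """Weak-association limit, p ~ K * c, so K at the gel point ~ p_c / c.

    group_molarity is the concentration of reactive groups in mol/liter;
    the result is in liter/mole.
    """
    return critical_bond_fraction(f) / group_molarity


if __name__ == "__main__":
    grams_per_liter = 10.0          # a one percent solution
    grams_per_mole_of_groups = 100  # hypothetical equivalent weight
    c_groups = grams_per_liter / grams_per_mole_of_groups
    k_gel = association_constant_at_gel_point(1000, c_groups)
    print(f"p_c = {critical_bond_fraction(1000):.2e}")
    print(f"K needed for gelation ~ {k_gel:.2e} liter/mole")
```

Under these assumptions only about one site in a thousand needs to be bonded at any instant, which is why a very weak pairwise attraction suffices when f is large.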
Almost all biologically important macromolecules
are particular chemical and structural modifications
of three basic, polymeric-chain structures. These
polymeric chains are the polysaccharides, the polypeptides,
and the polynucleotides. They are better known
in their more particular forms; for example, cellulose
and amylose in the first instance, fibrous and globular
proteins in the second, and deoxyribonucleic and
ribonucleic acid in the third.

A polysaccharide chain such as cellulose exists as a 
completely extended chain in the crystalline, fibrous 
form and as such has a unique configuration. In solu- 
tion, however, the limited rotational freedom permitted 
at the juncture of each repeating unit results in the 
chain being flexible and, because of internal Brownian 
motion, it undergoes continuous, worm-like changes in 
configuration. This state is referred to as a randomly 
coiled configuration. Such a configuration is devoid of 
any fixed relationship between pairs of residues, and 
hence is said to have no secondary structure. Since the 
fixed configuration in the crystalline form and the ran- 
domly coiled configuration in solution are typical of the 
situation found in most synthetic polymers as well, these 
have been widely studied and are well understood. 
Consequently, they are not further dealt with here. 

The polypeptides and polynucleotides offer a sharp 
contrast in configurational properties to the polysac- 
charides and all other known polymeric chains. This 
uniqueness lies in the ability of these two types of chains 
to form hydrogen-bonded helical configurations consisting
of one, two, or three chains which are stable in
aqueous solution and hence in the cellular environment.
Each of these helical configurations is unique and is 
equivalent to one-dimensional crystallites in that they 
consist of a periodic arrangement of the repeating-chain 
units along the helical axis. In this way unique, stereo- 
specific relations in the individual macromolecule can 
be maintained while the macromolecule itself moves 
about in solution. The randomly coiled configuration 
exhibited in solution by all other macromolecules does 
not have this property. Consequently, it is evident that 
the preservation of unique configurations by polypep- 
tides and polynucleotides in solution offers a basis of 
biological specificity. 

These unique configurations of individual macro- 
molecules are similar to the crystalline state in several 
respects, in addition to their having a one-dimensional 
periodic order. Most important is the implication that 
they will have a melting point; that is, a temperature 
will exist at which the supporting hydrogen-bonded
structure will undergo a transition to the equivalent of
the liquid state. For the macromolecule, the liquid state
is simply the randomly coiled configuration already
described. Thus, it is not surprising that transitions are
found in these macromolecules that can have a sharpness
approaching a phase transition. These are generally
known as helix-coil transitions, and their study in the 
present case is of interest because their location reveals 
the relative stability of the unique configurations upon 
which one’s attention becomes focused. 

In the last few years, the study of these helical struc- 
tures has greatly benefited from the possibility of making 
pure polypeptides and pure polynucleotides—pure in 
the sense that the repeating units are identical rather 
than differentiated as they are in the naturally occurring 
counterparts, proteins and nucleic acids. As a conse- 
quence, the properties of the purely helical forms could 
be carefully studied and the means of detecting such 
forms in the naturally occurring materials were thereby 
greatly sharpened. 

It is against this background! that some studies of the 
configurations of polypeptides and their relation to pro- 
tein structure are examined briefly, and following that a 
similar look is taken at the corresponding situation in 
polynucleotides and nucleic acids. The configurations 
have been established in solution by the use of the 
methods outlined earlier (Doty, p. 61). In addition, 
optical-rotatory dispersion and ultraviolet spectroscopy 
have been found widely useful in detecting the con- 
figuration established by the physical methods. 


POLYPEPTIDES AND PROTEINS


In 1951, shortly after the proposal of the α-helix by
Pauling and Corey,”* Perutz! showed by means of x-ray 
diffraction that a few fibrous proteins in crystalline form 
did contain this configuration in unspecified amounts. 
However, it was not possible to extend this method to 
globular proteins. 

In 1953, E. R. Blout and the author initiated a program
of synthesis and characterization of synthetic
polypeptides; this had as one of its aims the direct testing
of whether or not polypeptides could take up unique
configurations such as the α-helix in solution. In 1954
we found that poly-γ-benzyl-L-glutamate could exist in
two configurations, the α-helix and the solvated,
randomly coiled chain, depending on the solvent. Moreover,
we found that the two forms showed a substantial
difference in specific rotation similar in sign and magnitude
to the difference between native and denatured
proteins. This fitted nicely with the suggestion made in
1955 by Cohen’ that a specific main-chain configuration
may be the cause of the difference in optical rotation
between native and denatured states of proteins.

FIG. 1. The helix-coil transition in poly-L-glutamic acid.
[Plot not recoverable; axes labeled [α] and [η].]

In 1956, more-detailed evidence was presented in 
support of the assignment of configuration in solution,’ 
and it was shown that a sharp transition could be ob- 
served between the two configurations.’ More to the 
point, however, was the investigation of the rotatory 
dispersion (i.e., the wavelength dependence of specific 
rotation), because this showed that, whereas the dis- 
persion for the randomly coiled form was of the simple 
(or Drude) type found in most organic compounds con- 
taining asymmetric carbon, the dispersion of the a- 
helical form was anomalous.® At the same time, poly-1- 
glutamic acid was investigated in aqueous solution, and 
found to be helical in acid solution and randomly coiled 
in neutral and alkaline solutions.? This provided addi- 
tional data on a system that more closely resembled 
proteins in aqueous solution. 

It is perhaps useful to examine the case of
poly-L-glutamic acid in some detail. Measurement of its
intrinsic viscosity, sedimentation constant, and molecular
weight! at pH values below pH 5 (where the carboxyl
group was largely non-ionized) showed that it had the
rod-like form and mass-to-length ratio of the α-helix.
As the pH was raised, however, substantial changes in
all of the physical properties of this polypeptide were
observed? (Fig. 1), and the interpretation of these
changes indicated that the polypeptide had gone over
to a random-coil form. This can be considered as a
melting process brought about by [...] repulsion.
That is, owing to the ionization of the carboxyl
groups, resulting in turn from the increase in pH, there
[...] the original helical structure an outer [...]
One can observe this transition by measuring changes in the
viscosity, for example, corresponding to the change in
the shape of the molecule. However, the most generally
useful of all of the changes accompanying this transition
is the change in optical rotation. In this case, one
observes a change of optical rotation from +4° to -80°
accompanying the helix-coil transition. With this nearly
instantaneous means of detecting the configuration, it
is easy to show that this is a reversible transition.

It is an old observation that the specific rotation of
proteins falls* upon denaturation.¹¹ During the last
decade, much specific support has been given to this
generalization by quantitative studies, particularly by
Kauzmann.” These studies show that the specific rotation,
[α]D, for the majority of proteins lies between
-30° and -60° and that these values fall to approximately
-100° upon denaturation. Now, if the values
of [α]D observed for the helix and coil forms of
poly-L-glutamic acid are adjusted to the mean residue weight
of proteins (151 for sodium glutamate to 115 for
proteins), the result is that a value of about +5° would
be expected for a protein in the completely helical form
and about -105° for a protein in the completely
random-coil form. Thus, the observed specific rotation
of proteins is compatible with 40 to 70% of their
residues being in the α-helical configuration, and the specific
rotation of denatured proteins corresponds to that of the
randomly coiled configuration as previously surmised.”
Naturally, this proposal requires extensive testing, and
much of this is under way or has been done.
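
The arithmetic behind the 40 to 70% estimate can be retraced in a few lines. The sketch assumes, as the paragraph does implicitly, that the specific rotation varies linearly with the fraction of residues in the helical form, and it uses only the numbers quoted above; the function names are chosen here for illustration.

```python
def rescale_rotation(rotation_deg, residue_weight_from=151.0,
                     residue_weight_to=115.0):
    """Adjust a specific rotation from one mean residue weight to another.

    Specific rotation is per gram, so the same rotation per residue gives
    a larger specific rotation when the residue is lighter.
    """
    return rotation_deg * residue_weight_from / residue_weight_to


def helix_fraction(rotation_deg, helix_deg, coil_deg):
    """Linear interpolation between the pure-coil and pure-helix values."""
    return (rotation_deg - coil_deg) / (helix_deg - coil_deg)


if __name__ == "__main__":
    helix = rescale_rotation(4.0)     # poly-L-glutamic acid, helical form
    coil = rescale_rotation(-80.0)    # poly-L-glutamic acid, coiled form
    print(f"helix reference {helix:+.1f} deg, coil reference {coil:+.1f} deg")
    for observed in (-30.0, -60.0, -100.0):
        frac = helix_fraction(observed, helix, coil)
        print(f"[alpha]D = {observed:+6.1f} deg -> helix fraction {frac:.2f}")
```

The rescaled reference values come out near +5° and -105°, and native-protein rotations of -30° and -60° map to helix fractions of roughly 0.7 and 0.4, in line with the range quoted in the text.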

Since it is well known that proteins denature upon 
heating, one would expect the helical form to break up 
as the temperature is raised. The data in Fig. 2 show 
that this does occur if one uses the optical-rotation 


* In this article, changes in specific rotation are always
described with reference to the algebraic value, not the magnitude
as is customary. This is necessary since we now have both positive
and negative values of specific rotation to discuss, in contrast to
earlier times when the only measured values of proteins were
negative.
method to gain a measure of helix content.’ At pH 4.1
and temperatures below 40°C, the optical rotation is 
nearly constant. Above 40°C, the optical rotation begins 
to decrease, showing the start of the helix-coil transition. 
It melts out nearly all of the way at pH 4.6. One can 
pick up the rest of the transition by going to pH 5.0 
where it melts out completely. 

It is possible that the stability of the α-helix in
poly-L-glutamic acid resides not so much in the
hydrogen-bonding framework of the peptide units as in the
pairwise interaction of carboxyl groups. This is the kind of
hydrogen bonding that is always pointed out as being
quite strong, and, if one makes models of the α-helix
configuration of polyglutamic acid, one sees that
uncharged glutamic residues can pair very nicely on the
surface of the helix. Therefore, it is of great interest to
see if other polypeptides that do not have this
possibility also show the same phenomena.

Poly-L-lysine has an amino group on the side chain.
This amino group is in the charged form (-NH3+) at
pH values below 9.5, and becomes uncharged (-NH2)
above pH 10.5. Thus, one would expect that, if the helix
is only marginally stable, it would exist only at high
pH values and would melt out as one lowered the pH.
Current experiments show just this.“ Again, there is
about a 90° change in optical rotation upon passing
through the helix-coil transition. Physicochemical
investigations were used to show that it exists as a pure
α-helix at high pH.

If one examines in a quantitative fashion the state of
charge of the poly-L-lysine molecule when the helix begins
to melt out, one sees that this helix is considerably
weaker than the poly-L-glutamic-acid helix. However,
the difference is not enough to prevent the
poly-L-lysine helix from being stable in aqueous solution.
From this and a few other examples, it can be concluded
that the α-helical configuration is stable in aqueous
solution in the absence of pronounced electrostatic
repulsions or other disrupting influences.

FIG. 2. Optical rotation of poly-L-glutamic acid as
a function of temperature at several values of pH.

An interesting point associated with the nature of 
these helix-coil transitions is the question of whether or 
not they have an “all-or-none” character. That is, at the 
midpoint of the transition are half of the molecules in 
each configuration or are the residues in each molecule 
partitioned between these two forms? One of several 
answers indicating that the latter was the case was ob- 
tained in the following way. The intrinsic viscosity in- 
creases much more strongly with molecular weight in 
the helical form than in the random-coiled form. Thus, 
there is one molecular weight for which the viscosity is 
the same in both configurations. By choosing a sample 
of this molecular weight and carrying it through the 
transition, the intrinsic viscosity would be expected to 
remain unchanged if the transition were of the all-or- 
none type. Such constancy was not observed in several
cases (refs. 10, 14, 15). Consequently, the transition must
be viewed as one that proceeds via intermediate states in
which individual molecules have interspersed helical and
nonhelical regions.
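
The equal-viscosity argument above can be made concrete with a small numerical sketch. It assumes that the intrinsic viscosity of each form follows a power law in molecular weight, [η] = K·M^a. The constants below are hypothetical placeholders, not values from the measurements cited; only the qualitative point (a steeper law for the helix implies a single crossover molecular weight) is taken from the text.

```python
# Hypothetical power-law constants: [eta] = K * M**a.
# These numbers are illustrative placeholders, not measured values.
K_HELIX, A_HELIX = 1.0e-9, 1.7   # rodlike helix: steep dependence on M
K_COIL, A_COIL = 1.0e-3, 0.6     # random coil: weak dependence on M


def intrinsic_viscosity(k: float, a: float, mol_weight: float) -> float:
    """Power-law intrinsic viscosity for a given molecular weight."""
    return k * mol_weight ** a


def crossover_weight(k1: float, a1: float, k2: float, a2: float) -> float:
    """Molecular weight at which two power laws give the same viscosity.

    Solves k1 * M**a1 == k2 * M**a2 for M (requires a1 != a2).
    """
    return (k2 / k1) ** (1.0 / (a1 - a2))


m_star = crossover_weight(K_HELIX, A_HELIX, K_COIL, A_COIL)
eta_helix = intrinsic_viscosity(K_HELIX, A_HELIX, m_star)
eta_coil = intrinsic_viscosity(K_COIL, A_COIL, m_star)
print(f"crossover molecular weight: {m_star:.3g}")
print(f"[eta] helix = {eta_helix:.4g}, [eta] coil = {eta_coil:.4g}")
```

For a sample of this molecular weight, an all-or-none transition would mix two populations of equal viscosity, so any change in viscosity during the transition points to intermediate, partly helical molecules.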
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It is of some interest to consider the behavior of a 
copolymer composed of equal amounts of L-glutamic
acid and L-lysine. The optical rotation, which again is
taken as a measure of helix content for this copolymer,
is shown in Fig. 3 (ref. 16). These results can be compared
with those previously reported for poly-L-glutamic acid
and poly-L-lysine, which are also shown. The copolymer
exhibits an intermediate behavior here; that is, it is about
50% helical at acid pH, and not at all helical at
alkaline pH. This is interpreted as resulting from the
difference in intrinsic stability of the glutamic-acid and
lysine residues in supporting the helical structure. When
the lysine residues are uncharged, there appears to be
no helix, because the lysine residues cannot compensate
for the electrical repulsion arising from the 50% negative
charges that exist here owing to the ionized glutamic-acid
residues. At neutral pH, one finds about 15 or 20% helix.


ROTATORY DISPERSION AND HELICAL CONTENT OF 
POLYPEPTIDES AND PROTEINS 


Although one could proceed to estimate the fraction
of residues in the helical configuration (the helical
content) from the specific rotation of proteins in the
manner indicated in the foregoing, one would be failing
to take advantage of another aspect of the situation that,
upon proper analysis, offers an independent means of
estimating the helical content and in addition offers a
much more perceptive view of the problem. Soon after the
first observations of the helix-coil transition in
polypeptides, the rotatory dispersion (wavelength
dependence of the specific rotation) of the two forms was
measured and found to be simple Drude dispersion for the
random-coil form and a complex dispersion for the helical
form. The complex dispersion could be fitted in a manner
expected for coupled oscillators and originally suggested
by Moffitt. This gave rise to a very considerable
theoretical interest in the problem of the rotatory
dispersion of helical macromolecules, but it has not been
possible to take advantage of much of this because of the
limited number of parameters that can be uniquely
determined from the experimental data.

From the practical point of view, it is sufficient to say
that the following equation was found adequate to express
the observed data:

$$[m'] = (a_0^R + a_0^H)\left(\frac{\lambda_0^2}{\lambda^2 - \lambda_0^2}\right) + b_0\left(\frac{\lambda_0^2}{\lambda^2 - \lambda_0^2}\right)^2 \qquad (1)$$

… coefficient, one depending on the intrinsic residue rotation
(asymmetric carbon), $a_0^R$, and the second owing to
the effect of the helical configuration, $a_0^H$. The second
term is the anomalous one, being the square of a Drude
term, and its coefficient is wholly owing to the effects of
the helical configuration on the rotatory dispersion.

This equation is of the same form as that proposed
by Moffitt and shown to be applicable to the first
dispersion data obtained on helical polypeptides, but
it now seems proper to consider it an empirical proposal
for two reasons. First, the equation itself is of the form
expected for coupled oscillators. Secondly, Moffitt
concentrated his attention on the second term, recognizing
the complexities that may be associated with the first
since, in any rigorous view, the first term must instead
be replaced by several Drude terms, each having a
different $\lambda_0$. But if this last point is admitted, then the
analysis of any available dispersion data is impossible,
because there is not enough precision to allow the
independent assignment of values to more than the three
constants contained in Eq. (1), i.e., $(a_0^R + a_0^H)$, $b_0$, and
$\lambda_0$. Consequently, in order to proceed, it was necessary
to assume that $\lambda_0$ had the same value in both terms.
This compromise, added to the later finding that the
absolute value of $b_0$ cannot be computed with enough
accuracy to be useful, forces us to accept Eq. (1) as
empirical. It is important to make this point because,
when it is recognized that Eq. (1) is not firmly supported
by theory, it is more likely that it will be used with the
caution that an empirical relation deserves.

The Fitts-Kirkwood theory cannot be expressed
in simple functional form, but since it consists of four
different contributions, it is clear that these cannot
possibly be resolved from experimental dispersion data,
since again too many parameters would have to be
independently evaluated. However, if one takes the
computed results of Fitts and Kirkwood, one finds that
they can be quite well fitted by Eq. (1). This observation
emphasizes that the procedure that we have
adopted in using Eq. (1) is not in conflict with the
Fitts-Kirkwood theory.

We now examine the adequacy of Eq. (1) in representing
the dispersion data of polypeptides in the helical form. By
dividing Eq. (1) by $\lambda_0^2/(\lambda^2 - \lambda_0^2)$, it is clear that
dispersion data obeying this relation should yield linear
plots when $[m'](\lambda^2 - \lambda_0^2)/\lambda_0^2$ is plotted against
$\lambda_0^2/(\lambda^2 - \lambda_0^2)$, provided the appropriate value of
$\lambda_0$ has been selected. In this plot, the intercept of the
straight line would then provide the value of
$(a_0^R + a_0^H)$, which may be denoted $a_0$ for convenience,
and the slope would provide the value of $b_0$. Since in this
type of plot an empirical relation is being tested, it will be
necessary to have specific rotation measurements at a
number of wavelengths, because the absence of curvature
in the plot must be demonstrated in each application.
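
The linearized plot described above can be sketched in a few lines. This is only an illustration under stated assumptions: the "measurements" are synthetic values generated from Eq. (1) with $a_0 = 650$, $b_0 = -630$, and $\lambda_0 = 212$ mμ, the wavelengths are arbitrary, and the function names are hypothetical.

```python
def moffitt_rotation(wavelength, a0, b0, lambda0=212.0):
    """Residue rotation [m'] from Eq. (1); wavelengths in millimicrons."""
    x = lambda0 ** 2 / (wavelength ** 2 - lambda0 ** 2)
    return a0 * x + b0 * x ** 2


def moffitt_plot(wavelengths, rotations, lambda0=212.0):
    """Return the coordinates of the linearized ("coupled oscillator") plot."""
    xs = [lambda0 ** 2 / (w ** 2 - lambda0 ** 2) for w in wavelengths]
    ys = [m / x for m, x in zip(rotations, xs)]  # [m'](l^2 - l0^2)/l0^2
    return xs, ys


def linear_fit(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data (assumption): a fully helical polypeptide obeying Eq. (1).
wavelengths = [350.0, 400.0, 450.0, 500.0, 550.0, 589.0]
rotations = [moffitt_rotation(w, 650.0, -630.0) for w in wavelengths]

xs, ys = moffitt_plot(wavelengths, rotations)
b0_fit, a0_fit = linear_fit(xs, ys)  # slope is b0, intercept is a0
print(f"a0 = {a0_fit:.1f}, b0 = {b0_fit:.1f}")
```

With real data the residuals of the fit, not just the slope and intercept, should be inspected, since the absence of curvature is itself part of the test.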

In selecting the appropriate value of $\lambda_0$, a process of
trial and error must be followed until that particular
value which linearizes the dispersion data is found. In
the fitting of the first dispersion data, a value of 212 mμ
was found for poly-γ-benzyl-L-glutamate and poly-L-glutamic
acid. Since that time, the value has been confirmed
on both of these polypeptides and found to
apply to at least four other helical polypeptides, as well
as to several copolypeptides. Dispersion data plotted
in this manner are shown in Fig. 4 for three samples of
poly-γ-benzyl-L-glutamate. One sample had an average
chain length of 4 units and gave a horizontal plot
indicative of no helical content. Another had an average
chain length of 10 residues and gave an intermediate
slope and intercept, which, in comparison with the
high-molecular-weight sample, indicates about 60% helical
content.
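
The trial-and-error search for $\lambda_0$ can be sketched as a scan over candidate values, keeping the one whose linearized plot deviates least from a straight line. This is a minimal sketch, not the procedure actually used by the authors: the data are synthetic (generated from Eq. (1) with $\lambda_0 = 212$ mμ), and the candidate grid and function names are hypothetical.

```python
def moffitt_rotation(wavelength, a0, b0, lambda0):
    """Residue rotation [m'] from Eq. (1); wavelengths in millimicrons."""
    x = lambda0 ** 2 / (wavelength ** 2 - lambda0 ** 2)
    return a0 * x + b0 * x ** 2


def linearity_error(wavelengths, rotations, lambda0):
    """Sum of squared residuals of a straight-line fit to the linearized plot."""
    xs = [lambda0 ** 2 / (w ** 2 - lambda0 ** 2) for w in wavelengths]
    ys = [m / x for m, x in zip(rotations, xs)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def best_lambda0(wavelengths, rotations, candidates):
    """Candidate lambda0 that makes the plot most nearly linear."""
    return min(candidates,
               key=lambda c: linearity_error(wavelengths, rotations, c))


# Synthetic data (assumption): generated with lambda0 = 212 millimicrons.
wavelengths = [350.0, 400.0, 450.0, 500.0, 550.0, 589.0]
rotations = [moffitt_rotation(w, 650.0, -630.0, 212.0) for w in wavelengths]

# Candidate grid (assumption): 180 to 250 millimicrons in steps of 2.
candidates = [float(v) for v in range(180, 251, 2)]
best = best_lambda0(wavelengths, rotations, candidates)
print(f"lambda0 giving the straightest plot: {best:.0f}")
```

With noisy experimental data the minimum is much shallower than in this idealized case, which is one reason the text insists on measurements at many wavelengths.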

In all of the cases referred to in the foregoing
(poly-γ-benzyl-L-glutamate, poly-L-glutamic acid,
poly-carbobenzoxy-L-lysine, poly-L-lysine, and the
copoly-L-lysine-L-glutamic acid), the values of $b_0$
obtained from the slopes of what one might call the
"coupled oscillator" plot have clustered about −630. Thus,
there is good reason to accept this value with an
uncertainty of no more than ±10% as a constant
characterizing in an empirical way the anomalous dispersion
of helical polypeptides of L-residues. In some cases, we
have found somewhat lower values, but these have in each
case been with polypeptides that had not been proved to
be in the completely helical form.

Turning finally to the coefficient $a_0$, we do not expect
to find constant values for different polypeptides,
because $a_0$ consists of two components, one of which
($a_0^R$) depends on the intrinsic rotatory power of the
individual amino-acid residue. It is sufficient to say that,
in the cases of the five polypeptides mentioned in the
previous paragraph, the values of $a_0^R$ gave an average of
zero. Since in the helical form $a_0$ is found to equal 650,
this must be approximately the value of $a_0^H$ itself.

Fig. 4. Coupled oscillator plot of rotatory-dispersion data for
poly-γ-benzyl-L-glutamate of various molecular weights. [A]/[I]
indicates degree of polymerization (number average).

One is now in a better position to examine the extent 
to which the work on polypeptides enables one to under- 
stand the optical activity and rotatory dispersion of 
proteins. Actually, one is forced to proceed in a very 
simple manner. The polypeptide studies have given 
quantitative information about the rotatory dispersion 
of two configurations of polypeptide chains. One can, 
therefore, only ask if the rotatory dispersion of proteins 
can be accounted for by a linear combination of these 
two characteristic rotatory dispersions. It is important
to recognize that, if this is the case, the residues not
occurring in the α-helical configuration need not be in a
randomly coiled configuration. It is necessary only that
they not be in any kind of periodic arrangement,
because this is the condition that the rotatory dispersion
be of the kind observed for the randomly coiled
configuration, subject of course to "solvent effects" on the
intrinsic residue rotation, i.e., $a_0^R$.

The data summarized in terms of $a_0^H$ and $b_0$ values
can be translated directly into the contribution that the
α-helical configuration would make to $[\alpha]_D$. It is +117°
for the values $a_0^H = 650$ and $b_0 = -630$ if the mean
residue weight is taken as 115. The value of $a_0^H$
contributes 84% to this result and, hence, it is the
uncertainty of this term that affects most the predicted value.
Taking the limits of $a_0^H$ to be 550 and 750 would make
the helix contribution to $[\alpha]_D$ 17° lower or higher,
respectively. Thus, the prediction is made that for proteins
the purely helical configuration and the completely
denatured form should be separated by 100° to 135° in
specific rotation. The polypeptide studies, adjusted to
mean residue weights of 115 and aqueous solutions,
establish the lower end of this increment (that is, the
denatured protein) at about −110° and the high end
(the pure helix) at about +5°. This is nearly the same
as that obtained earlier from the measurements on
poly-L-glutamic acid.

In terms of the model proposed here, the rotatory
dispersion of proteins in aqueous solution should be
given by converting Eq. (1) into the following form:

$$[\alpha] = 1.39\left[\left(\sum a_0^R + f a_0^H\right)\left(\frac{\lambda_0^2}{\lambda^2 - \lambda_0^2}\right) + f b_0\left(\frac{\lambda_0^2}{\lambda^2 - \lambda_0^2}\right)^2\right]$$

where $f$ denotes the fraction of residues in helical form
and $\sum a_0^R$ the sum over the $a_0^R$ values characteristic of
each residue in the protein. The model is now subject to
a considerably more rigorous test than in the comparison
dealing only with $[\alpha]_D$. This can be carried out in the
following way. Rotatory-dispersion measurements are
made on the protein both in aqueous solution and in the
completely denatured state. For the former set of data,
$([\alpha]/1.39)(\lambda^2 - \lambda_0^2)/\lambda_0^2$ is plotted against
$\lambda_0^2/(\lambda^2 - \lambda_0^2)$. The result should be a straight line
yielding a slope equal to $f b_0$ and an intercept equal to
$(\sum a_0^R + f a_0^H)$. With the value of $b_0$ set at −630, the
value of $f$, the fraction of residues in the helical
configuration, is obtained at once.

TABLE I. Excess right-handed helical contents (f)
of various proteins in water.

Protein                  $-b_0/630$   $a_0^H/650$
Tropomyosin                 0.88         0.87
Insulin                     0.38         0.57
Bovine serum albumin        0.46         0.58
Ovalbumin                   0.31         0.50
Lysozyme                    0.29         0.39
Pepsin                      0.31         0.26
Histone                     0.20         0.30
Ribonuclease                0.16         0.17
Globin (H)                  0.15         0.09

The rotatory dispersion in the denatured state
should yield at once the value of $\sum a_0^R$, since $f = 0$ in
this case. Indeed, only a measurement of the specific
rotation at one wavelength is needed if it is known that
$\lambda_0 = 212$ mμ, as appears to be the case in 8 M urea. With
$\sum a_0^R$ known, the value of $f a_0^H$ can be obtained from the
intercept evaluated in the plot described above. By taking
the value of $a_0^H$ as +650, an independent estimate of $f$
can be made.

This kind of analysis has been applied to a number of
proteins. In each case, a linear plot was obtained. The
test of the model consists then in seeing if the two
different estimates of $f$ are reasonable, that is, lie between
0 and 1, and are in agreement. A selection of these data
is shown in Table I. It is seen that the values of $f$ are
indeed reasonable ones and that the agreement is fairly
good.
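
The two independent estimates of $f$ follow directly from the slope and intercept of the protein plot. The sketch below uses the constants quoted in the text ($b_0 = -630$, $a_0^H = +650$); the slope, intercept, and denatured-state value fed into it are hypothetical numbers chosen only to show the arithmetic, and the function names are assumptions.

```python
B0_HELIX = -630.0   # slope constant for a fully helical chain (from the text)
A0_HELIX = 650.0    # helix contribution to the intercept (from the text)


def helix_fraction_from_slope(slope):
    """Estimate f from the slope of the linearized plot, slope = f * b0."""
    return slope / B0_HELIX


def helix_fraction_from_intercept(intercept, a0_residue_sum):
    """Estimate f from the intercept, intercept = sum(a0_R) + f * a0_H.

    a0_residue_sum is the value found for the completely denatured
    protein, where f = 0.
    """
    return (intercept - a0_residue_sum) / A0_HELIX


def estimates_are_reasonable(f_slope, f_intercept, tolerance=0.2):
    """Both estimates must lie in [0, 1] and agree within a tolerance.

    The tolerance is an arbitrary illustrative choice.
    """
    in_range = 0.0 <= f_slope <= 1.0 and 0.0 <= f_intercept <= 1.0
    return in_range and abs(f_slope - f_intercept) <= tolerance


# Hypothetical plot results for one protein (not data from Table I).
slope = -252.0
intercept = -40.0
a0_residue_sum = -300.0   # from the denatured-state dispersion

f_slope = helix_fraction_from_slope(slope)
f_intercept = helix_fraction_from_intercept(intercept, a0_residue_sum)
print(f"f from slope = {f_slope:.2f}, f from intercept = {f_intercept:.2f}")
print("consistent:", estimates_are_reasonable(f_slope, f_intercept))
```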

The reasonably good performance of our model in this
test is not, of course, proof that the model is a faithful
description of the secondary structure of the proteins
to which it has been applied. However, it does appear
to represent an advance which deserves further and
more incisive testing. At present, it greatly increases
our confidence in the existence of regions having the
α-helical structure in proteins, it sharpens our views on
protein denaturation, and it provides a framework of
reference in which studies relating protein structure and
function can be at least tentatively interpreted.

Having observed the striking dependence of the
configuration of polypeptides on solvent, we were led to
wonder if the intermediate development of the helical
structure of proteins suggested by the foregoing
experiments could not be increased by altering the solvent.
For this, a solvent that was miscible with water, of
comparable polarity and cohesive energy density, but
with less hydrogen-bonding capacity, was needed. Our
search indicated that 2-chloroethanol was well suited,
and it was found that the addition of this to aqueous
solutions increased the helical content as measured
by rotatory dispersion in nearly every case. Some
examples are shown in Table II for proteins dissolved
in chloroethanol.

Thus, the expectation that the helical content of
proteins can be increased by lowering the hydrogen- 
bonding capacity of the solvent is confirmed. This be- 
havior is precisely the opposite of denaturation. In the 
few cases that we have studied in detail (e.g., ribo- 
nuclease and insulin), the configurational changes on 
going from water to chloroethanol and back to water 
have been completely reversible. Thus, it would appear 
that the helical content of proteins is a result of balance 
between the intramolecular hydrogen bonding of the 
protein and the hydrogen-bonding capacity of the sol- 
vent. When water is the solvent, the point of balance 
is in many cases near 50%, and can be shifted in either 
direction by proper alteration of the solvent. 

It is of interest to note that insulin was the only
protein in the list whose helical content was not
substantially increased in chloroethanol over water. Since
the model of insulin proposed by Lindley has about 20% 
of the residues in the left-handed helical configuration, 
owing to restrictions imposed by the cystine bridges, it 
appears quite possible that insulin has approximately 
this configuration both in aqueous and in chloroethanol 
solutions. The residues in the left-handed helix cancel 
the effect of an equal number in the right and thus, even 
when the helical content is nearly 100%, only about half 
of this amount registers with the rotatory-dispersion 
method. This anomaly is removed when it is recognized 
that the method measures only the excess right-handed 
helical content, as indicated in the titles of the tables. 
Nevertheless, this is the sort of detail that emerges only 
on making additional measurements. Obviously, there 
is now a great need of a completely independent method 
to measure total helical content so that the rotatory- 
dispersion methods can be checked in regularly behaving 
cases, and the amount of left-handed helical configura- 
tion evaluated in those proteins where cystine bridges 
may prevent the normal development of right-handed 
helical configurations. 

As a final illustration of the relation between rotatory
dispersion and protein structure, no better example can
be quoted than the current work of Kay and Bailey on
Pinna tropomyosin. Kay's study of this molecule by
physical methods shows it to have the character of thin,
rigid rods having the dimensions of an α-helix of the
proper molecular weight. Rotatory-dispersion measurements
have now shown it to behave precisely like an
α-helix with values of $a_0^H$ and $b_0$ equal to those used
here. Consequently, one has here the perfect example
of the extreme of the scale of protein structure provided
by our model. However, the work that is needed to
establish this model as a framework of reference for
protein structure in general remains to be done for the
most part. It is certainly to be expected that numerous
exceptions will be found, but there is even now reason
to hope that the concept of describing the secondary
structure of proteins by a partitioning of the residues
between helical and nonhelical regions will play a useful
role in the advance of knowledge of protein structure.

TABLE II. Excess right-handed helical contents
of various proteins in chloroethanol.

Protein                  Helical content (%)
Tropomyosin                 110
Insulin                      45
Bovine serum albumin         75
Ovalbumin                    85
Lysozyme                     63
Pepsin                       44
Histone                      72
Ribonuclease                 67
Globin (H)                   74


POLYNUCLEOTIDES AND NUCLEIC ACIDS


From the viewpoint of chemical structure, nucleic
acids are known to consist of chains of repeating units
such as are shown in Fig. 5. The backbone is a
phosphoester polymer with six chain atoms per repeating unit.
The chain shown in Fig. 5 is of the ribonucleic-acid type.
That of deoxyribonucleic acid differs in that the oxygen
at the 2′-position on the ribose ring is absent. Generally,
four different monomeric units, differing in the
heterocyclic rings attached to the ribose group, are found in a
given nucleic acid. In ribonucleic acid (RNA), these four
groups are adenine, uracil, guanine, and cytosine. In
deoxyribonucleic acid (DNA), thymine (5-methyl uracil)
replaces uracil. DNA is found in cell nuclei as the
principal component of chromosomes, and plays the central
role in carrying and passing on the genetic endowment
of the cell. RNA occurs principally at the sites of protein
synthesis in the cytoplasm, and is intimately involved
with that process.


Deoxyribonucleic Acid


The nucleic-acid chain is a very flexible one and, in
the absence of secondary structure, it would have the
configuration of a random coil. About 10 years ago, it
became evident that this could certainly not be the case
for DNA. With increasing refinement in the study of
DNA in solution, it became clear that this molecule was
the most extended ever examined. Light-scattering
studies and, more recently, viscosity and sedimentation
measurements, show that typical DNA samples have
average molecular weights in the range of 5 to 10 million
and occupy volumes in solution about one-half micron
in diameter. Ordinarily, polyelectrolytes are
characterized by their molecular size being very dependent on
ionic strength. This was not the case for DNA. Thus,
it appeared to be not only very greatly extended, but
stiff rather than flexible, as well.

These observations, as well as a number of others,
became understandable with the structural proposal
made by Watson and Crick in 1953 on the basis of the
x-ray diffraction studies made by Wilkins and his
collaborators. This structure consisted of two antiparallel
DNA chains united through hydrogen bonds connecting
the heterocyclic rings (usually called bases). Only two
types of pairings were permitted, those between adenine
and thymine and those between guanine and cytosine,
as shown in Fig. 6. The resulting structure is a
two-stranded helix about 20 Å in diameter. With two
residues every 3.4 Å, a DNA molecule with a molecular
weight of 10 million would have a length of 50 000 Å
(5 μ). Such a long, thin structure would have a slight
flexibility, presumably enough to account for several
gentle folds that would reduce its maximum extent by
about tenfold to agree with the molecular size found in
solution. (Technically, it is correct to speak of this
molecule as randomly coiled, but the degree of coiling
is minute as compared with the single-chain coils to
which this description is usually applied.)
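
The length quoted above can be checked with simple arithmetic. The sketch assumes a mean nucleotide residue weight of about 330, a figure not given in the text and used here only for illustration; the 3.4 Å spacing per pair of residues, the molecular weight of 10 million, and the tenfold folding factor are taken from the paragraph above.

```python
RISE_PER_PAIR_ANGSTROM = 3.4     # two residues (one base pair) every 3.4 A
MEAN_NUCLEOTIDE_WEIGHT = 330.0   # assumption: approximate residue weight


def duplex_length_angstrom(mol_weight,
                           residue_weight=MEAN_NUCLEOTIDE_WEIGHT,
                           rise=RISE_PER_PAIR_ANGSTROM):
    """Contour length of a two-stranded helix of the given molecular weight."""
    n_residues = mol_weight / residue_weight
    n_pairs = n_residues / 2.0   # two chains share the same length
    return n_pairs * rise


length = duplex_length_angstrom(1.0e7)
length_micron = length / 1.0e4   # 1 micron = 10 000 A
folded_extent_micron = length_micron / 10.0   # "about tenfold" reduction

print(f"contour length: {length:.0f} A ({length_micron:.1f} micron)")
print(f"extent after tenfold folding: {folded_extent_micron:.2f} micron")
```

Under these assumptions the contour length comes out near 50 000 Å, and a tenfold reduction gives roughly the half-micron size reported for DNA in solution.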
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perfect arrangement in the original DNA cannot be re- 
covered. This helix-coil transition can be induced by 
other means, for example by raising or lowering the pH, 
and it can be followed by other means, such as changes 
in optical rotation. However, the most sensitive indi- 
cator of structure has turned out to be the ultraviolet 
spectrum. DNA exhibits a broad maximum at about
2600 Å, and it has been widely observed that the
maximum of this absorption is substantially depressed
relative to that found for the corresponding monomeric
units (nucleotides). Thus, hydrolysis which brings about
such a conversion from DNA in the helical form to
individual nucleotides is accompanied by approximately
a 40% increase in the optical density or extinction at
2600 Å. This suppression of absorbance in the DNA is
known as hypochromicity. Now, it is found also that, if
the helical form is converted to the randomly coiled, 
denatured form by raising the temperature or by lower- 
ing the pH, a rise of nearly 40% in the extinction co- 
efficient also occurs. Thus, electronic states of the base 
groups are substantially different, depending on whether 
or not they are held in a hydrogen-bonded, helically ar- 
ranged configuration. It is not yet clear if the hypo- 
chromicity is a result of the hydrogen bonding or of the 
stacking of the base groups one on top of another. 


Polynucleotides 


In 1955, Ochoa and Manago discovered a new enzyme 
which could bring about the polymerization of nucleo- 
side diphosphates to form polymers having the chemical 
structure of ribonucleic acid. The study of pure poly- 
nucleotides which this has made possible has greatly 
increased knowledge of the basic properties which are
combined in naturally occurring RNA itself. The problem
that can be directly attacked with these new
polyribonucleotides is the following. In 1956, Donohue
showed that the two specific base pairs used in the
Watson-Crick structure of DNA are only two out of a
large number of possible pairs. Thus, the availability
of these polymers has made possible the investigation
of which pairs are actually stable in aqueous solution.
Warner and Rich quickly showed that polyadenylic
acid (Poly A) and polyuridylic acid (Poly U) combined
to form a double-stranded helix when their solutions
were mixed. A number of other pairs have now been
found, and in addition several triple-stranded helical
complexes have been demonstrated. These are discussed
elsewhere by Rich (p. 191).

Fig. 7. The intrinsic viscosity (measured at 25°C) of DNA as
a function of the temperature to which it has been heated for
one hour.

Fresco, Klemperer, and the author have been concerned
with self-pairing, as typically exhibited by Poly A
(refs. 38, 39). We first investigated the sedimentation and
intrinsic viscosity of a series of such polymers of different
molecular weight at neutral pH. The molecular-weight 
dependence of these properties was found to be given 
by the molecular weight to the 0.45 and 0.65 power, 
respectively. These are self-consistent values for random 
coils. The absence of a significant birefringence of flow 
and the rise in viscosity that was observed when the 
ionic strength was lowered forced us to conclude that 
the configuration was that of randomly coiled, single 
chains. However, upon lowering the pH, we noticed a 
sharp transition from one type of ultraviolet absorption
to another. The titration curve showed a similar abrupt- 
ness that could occur only if some cooperative transition 
were taking place. At pH’s below this transition, the 
solutions showed very marked negative birefringence of 
flow. Moreover, the molecular weights were found to 
be greatly increased. 

These observations clearly indicated that a cooperative
association was taking place upon lowering the pH
through a critical value. The nature of this was further
clarified in the following way. Neutral solutions of a 
single polymer were made up at a series of different 
concentrations. When these were acidified and the sedi- 
mentation and viscosities determined, the results showed 
the regular behavior that can be seen on the right-hand 
side of Fig. 8. It is seen that the molecular weights, or 
rather particle weights, span a twentyfold range and 
that the sedimentation constants and viscosities vary 
in the manner expected for homologous polymers. The 
weight and size increase with the concentration at which 
they were formed. The respective slopes are again self- 
consistent and their numerical values, 0.36 and 0.92, 
are indicative of a more extended chain structure than
was present in the neutral solutions. The results already
mentioned at neutral pH are seen at the left of Fig. 8.
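
The statement that the sedimentation and viscosity exponents are "self-consistent" can be illustrated with a standard hydrodynamic scaling argument, which the text does not spell out and which is therefore an assumption here: if the frictional coefficient scales with a hydrodynamic radius proportional to $([\eta]M)^{1/3}$, then a viscosity exponent $a$ implies a sedimentation exponent of $(2 - a)/3$. The exponents checked below are the ones quoted in the text; the tolerance is arbitrary.

```python
def sedimentation_exponent(viscosity_exponent):
    """Sedimentation exponent implied by a viscosity exponent.

    Assumes [eta] ~ M**a and a frictional radius ~ ([eta] * M)**(1/3),
    so that s ~ M / radius ~ M**((2 - a) / 3).
    """
    return (2.0 - viscosity_exponent) / 3.0


# Exponent pairs (viscosity, sedimentation) quoted in the text.
reported = {
    "neutral pH, random coil": (0.65, 0.45),
    "acid pH, associated form": (0.92, 0.36),
}

consistency = {}
for label, (a_visc, b_sed) in reported.items():
    predicted = sedimentation_exponent(a_visc)
    consistency[label] = abs(predicted - b_sed) < 0.01  # arbitrary tolerance
    print(f"{label}: predicted {predicted:.2f}, reported {b_sed:.2f}")
```

Both reported pairs satisfy the relation within rounding, which is the sense in which each pair of slopes is mutually consistent.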

This evidence on the acid-stable form indicates fairly
clearly that the polyadenylic-acid molecules have joined
together in a regular structure. We were then able to
show that the amino group of adenine is unavailable for
reaction with formaldehyde in the acid-stable complex,
but that it is highly reactive in the randomly coiled
state. This showed that the pairing of the adenine
groups was responsible for the association. When a
model is built to satisfy this type of pairing, it is seen
that it must resemble the double-helical configuration
of DNA with the adenine bases nearly perpendicular to
the axis. The structure requires that bases be 3.8 Å
apart. To test this point, J. Fresco has taken x-ray
photographs of solutions of Poly A (1 to 10%) above
and below the transition. At low pH, he finds several
very sharp rings, including one with a spacing of 3.8 Å.
At neutral pH, there are none. From all of this
information, we conclude, therefore, that polyadenylic-acid
molecules react as shown in Fig. 9. For the helix to form,
the negative charge owing to the phosphate groups must
be at least half-neutralized by the uptake of protons.
This behavior of Poly A is similar in many ways to
other pairwise interactions that have been studied.
Thus, we have nearly reached the point where all
possible pairwise interactions have been cataloged. At
present, it appears that, of the ten possible combinations
between the four pure nucleotide polymers, six or seven
actually exist. The next problem is to list these pairwise
interactions in order of their strength in analogy with
the tables of bond energies one has in the case of
covalent bonds. This information can be obtained by a
careful study of the temperature at which the respective
helix-coil transitions take place.

Fig. 8. The molecular-weight dependence of sedimentation
constant and intrinsic viscosity for the two forms of polyadenylic acid.


Ribonucleic Acid 


Although DNA and RNA are so very similar in their
chemical structure, their configurations are strikingly
different. This can be said even though the configuration
of RNA is only now beginning to become clear. I …
summarizing some investigations that have …
in this laboratory on RNA from calf-liver …
particles by Hall and on RNA from …

All examinations of RNA solutions have indicated
that the intrinsic viscosity of such solutions is
surprisingly low when compared with the molecular weight.
This is quite the opposite of the case of DNA. Further- 
more, RNA shows only a small birefringence of flow, 
and this is of the opposite sign of DNA. Thus, it is 
certain that the configuration of RNA in solution is not 
that of the extended double-stranded helix found in 
DNA. 

An examination of the dependence of sedimentation 
and intrinsic viscosity on molecular weight shows that 
it is very close to the 0.5 power in both cases. This can 
be unambiguously interpreted as showing that the mole- 
cules are highly coiled and that the coils are contracted 
somewhat in comparison with the usual polymer chain. 
This result is surprising because the RNA chain carries 
negative charges on each repeating unit and, as a con- 
sequence, at the ionic strength (0.01 M) where these 
observations were made, it would be expected that the 
electrostatic repulsions would result in a very highly 
swollen polymer coil. If this had been the case, the in- 
trinsic viscosities would have been much larger and the 
molecular-weight dependences far different. These con- 
siderations clearly suggest that there are very sub- 
stantial intramolecular attractions which hold the mole- 
cules in contracted configurations. 

The next problem lies in the identification of these 
points of internal attraction. That they indeed could be 
broken, and reversibly formed again, was shown in two 
ways. By removing the salt from the solution, the
viscosity was found to rise, indicating that, by increasing
the electrostatic repulsion sufficiently, the intramolecu- 
lar bonds could be broken, allowing the coil to expand 
in the manner expected for a polyelectrolyte. 

Similarly, it was found that the viscosity would rise 
with the temperature of the RNA solution. This again 
indicated the breakup of the intramolecular bonding. 

It is known that RNA is hypochromic relative to its 
hydrolysate. Consequently, it was of interest to see if 
the optical density increased upon removing the salt or 
upon heating. Such was indeed found to be the case. A
quantitative study showed that the increase in extinction
was parallel with the increase in viscosity. Thus,
base pairing through hydrogen bonds appears to be
the cause of the contracted state of the RNA.

Fig. 9. Schematic illustration of the helix-coil
transition in polyadenylic acid.

By referring to the extinction-coefficient changes ac- 
companying helix-coil transitions in the study of poly- 
nucleotides, it is possible to estimate the fraction of base 
pairs that exist in RNA solutions at moderate ionic 
strength and at room temperature. This comes out to 
be the surprisingly large figure of about 50%. At present, 
this conclusion can only be tentative, but it does raise a 
very interesting question. The studies of polynucleotide 
interactions reveal the large number of different types 
of base pairing that are possible among the bases oc- 
curring in RNA. We now have the strong indication that 
such pairing does occur to a great extent. Obviously, 
the next problem to face is whether or not such exten- 
sive base pairing occurs in a random fashion, or whether 
there is sufficient organization in the RNA structure to 
justify describing it as secondary structure. 
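
The estimate of the paired fraction from extinction changes can be sketched as a linear interpolation between the absorbance of the fully paired and fully unpaired states. The linear two-state form is an assumption of this sketch, not a method stated in the text, and the numerical absorbances are hypothetical (the 40% rise is borrowed from the figure quoted earlier for DNA).

```python
def paired_fraction(a_observed, a_unpaired, a_paired):
    """Fraction of bases paired, by linear interpolation in absorbance.

    Assumes the absorbance varies linearly with the fraction of bases
    that are paired (a two-state simplification).
    """
    if a_unpaired == a_paired:
        raise ValueError("the two reference absorbances must differ")
    return (a_unpaired - a_observed) / (a_unpaired - a_paired)


# Hypothetical relative absorbances at the ultraviolet maximum.
A_PAIRED = 1.00      # fully helical reference
A_UNPAIRED = 1.40    # fully unpaired, taking a 40% rise as for DNA
a_observed = 1.20    # hypothetical RNA solution at room temperature

fraction = paired_fraction(a_observed, A_UNPAIRED, A_PAIRED)
print(f"estimated paired fraction: {fraction:.2f}")
```

A reading midway between the two reference states corresponds to about half the bases being paired, which is the order of magnitude the text reports for RNA.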

Finally, it is important to recognize that RNA occurs 
in the cytoplasm, not as a discrete substance, but in 
intimate association with protein in the microsomal 
particles. Unlike DNA, which can of itself be biologi- 
cally active (as revealed in bacterial transformation), 
RNA performs its function as a complex with protein, 
and it is to the analysis of the configuration of both 
components in these particles that much future effort 
is certain to be directed. 
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Problems 
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INTRODUCTION 


PROGRESS in biology at the molecular level is becoming
increasingly dependent on physical and
physicochemical techniques such as x-ray analysis, 
radioactive tracers, chromatography, light scattering, 
and spectroscopy. The purpose of this paper is to give 
a broad review of the actual and potential contributions 
of infrared spectroscopy to biology. Since several excel- 
lent articles and papers have already been written giving 
detailed accounts of the practical application of infrared 
methods to selected problems in biology, detail is 
avoided as much as possible in this presentation. In the 
writer’s opinion, infrared methods have not been suffi- 
ciently exploited in the biological field and, since this 
may well be because many biologists find it hard to 
assess their value, it is hoped that a rather general dis- 
cussion of the advantages and disadvantages of infrared 
analysis will be of value to biologists. 

The infrared region of the spectrum is conventionally 
defined as extending from the limit of the visible (about 
7500 Å or 0.75 μ) to the present lower wavelength limit
of the microwave region (about 1000 μ), but only a
relatively small range of these wavelengths is of much
interest to biologists, viz., 2.5 to 15 μ. There are two
reasons for this: the first is that within these limits are
found the vast majority of spectra which throw any
light on molecular structure; the second is that this
region can be conveniently scanned and recorded in
about ten minutes by a commercially available instrument
with a rocksalt prism as the dispersing element.
The exclusion of infrared wavelengths beyond 15 μ does
not seem so strange when expressed in frequency* rather
than wavelength units (i.e., 670 to 10 cm⁻¹), since the
majority of the fundamental vibration frequencies of
molecules which can be readily assigned to highly localized
motions of one or both of the atoms in a chemical
bond lie between 4000 cm⁻¹ (2.5 μ) and 600 cm⁻¹. The
limits of 670 and 4000 cm⁻¹ given in the foregoing are
set by the fact that beyond 670 cm⁻¹ absorption by the
rocksalt prism rapidly becomes complete, while above
4000 cm⁻¹ the dispersion is rather low. However, it is
becoming easy to replace prisms by gratings, and so
these limits need no longer be set by the experimental
technique. The first reason given is the dominant one,
viz., most of the infrared absorption bands which can
be interpreted with certainty lie within these limits.

* The conventional frequency unit in infrared spectroscopy is
the wave number, expressed in cm⁻¹.

The general interpretation of the infrared spectra of
molecules is very straightforward; even the detailed
interpretation is not difficult, provided the objective is,
for instance, the identification of a compound in a mixture,
or the establishment of the presence of a characteristic
chemical grouping in a molecule. But to determine
the configuration of a biological polymer such as a
protein is a difficult and complex problem which requires
the same degree of expertise as that needed in
the x-ray analysis of protein structure.
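
The wavelength and frequency limits quoted in this introduction are related by a simple reciprocal rule, and a minimal sketch makes the correspondence easy to check. The function names are illustrative and do not come from the paper; wavelengths are assumed to be in microns (μ) and frequencies in wave numbers (cm⁻¹).

```python
# Conversion between wavelength (microns) and wave number (cm^-1).
# Since 1 cm = 10^4 microns, wave number = 10^4 / wavelength.


def wavenumber_cm1(wavelength_um):
    """Return the wave number in cm^-1 for a wavelength given in microns."""
    if wavelength_um <= 0:
        raise ValueError("wavelength must be positive")
    return 1.0e4 / wavelength_um


def wavelength_microns(wavenumber):
    """Return the wavelength in microns for a wave number given in cm^-1."""
    if wavenumber <= 0:
        raise ValueError("wave number must be positive")
    return 1.0e4 / wavenumber
```

On this rule 2.5 μ corresponds to 4000 cm⁻¹ and 1000 μ to 10 cm⁻¹, while 15 μ comes out near 667 cm⁻¹, which the text rounds to 670 cm⁻¹.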


GENERAL INTERPRETATION OF INFRARED 
SPECTRA OF MOLECULES 

The average polyatomic molecule of biological interest 
has an absorption spectrum in the region between 600 
and 4000 cm⁻¹ consisting of 20 to 30 discrete absorption
bands of which about 5 to 10 will usually be much more 
intense than the remainder. This is because certain of 
the modes of vibration of such a molecule cause a 
periodic change in the dipole moment of the molecule. 
Thus, radiation of the appropriate frequency incident 
on the molecule can be absorbed and excite the corre- 
sponding mode of vibration. Some bands are more in- 
tense than others because the change in dipole moment 
is greater for some of the modes of vibration. Theoreti- 
cally, a polyatomic molecule (of n atoms) has an infinite 
number of modes of vibration, all of which can be built 
up from the 3n−6 fundamental modes of vibration. However,
the change in dipole moment associated with overtone
and combination frequencies is usually an order of
magnitude less than that associated with the funda- 
mental modes, and in the region of the spectrum under 
discussion the majority of the observed bands can be 
assigned to fundamental modes. 
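
The count of fundamental modes can be sketched in a few lines. The 3n−6 rule is the one quoted in the text; the 3n−5 case for linear molecules is a standard addition that the paper does not mention, and the function name is purely illustrative.

```python
def fundamental_modes(n_atoms, linear=False):
    """Number of fundamental vibrational modes of a molecule with n_atoms atoms.

    Non-linear molecules: 3n - 6 (the rule quoted in the text).
    Linear molecules: 3n - 5 (standard result, not stated in the paper).
    """
    if n_atoms < 2:
        raise ValueError("a molecule needs at least two atoms")
    if n_atoms == 2:
        linear = True  # a diatomic molecule is necessarily linear
    return 3 * n_atoms - (5 if linear else 6)
```

A molecule of ten to twelve atoms thus has 24 to 30 fundamentals, the same order as the 20 to 30 discrete bands mentioned above, although not every fundamental need fall in the region or be strong enough to observe.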

How can these fundamental modes be visualized?
Although, strictly speaking, each fundamental mode
involves some motion of every atom in the molecule
and would, therefore, seem to be hard to predict and
describe, it turns out that an appreciable fraction of the
fundamental modes of any molecule can be assigned to
easily visualized vibrations of individual chemical bonds,
or of small structural units in the molecule. This is
especially true of the fundamental modes which are associated
with the motions of hydrogen atoms. For instance,
any molecule containing an NH group will be
found to have a fundamental vibration near 3300 cm⁻¹
owing to the stretching and contraction of the NH bond
and two other fundamentals between 1600 and 600 cm⁻¹
in which the motion of the hydrogen atom is almost
perpendicular to the NH bond. More generally, in the
region between 3700 and 2500 cm⁻¹ are found fundamental
modes owing to such “hydrogenic-stretching
frequencies” in the following order:

    OH  3650 to 3150 cm⁻¹,
    NH  3500 to 3150 cm⁻¹,
    CH  3150 to 2850 cm⁻¹,
    SH  2650 to 2550 cm⁻¹,
    PH  2450 to 2300 cm⁻¹.

No other fundamental modes appear in this region of
the spectrum, and so identification of any of these bonds
in a compound is generally easy. There are occasional
limitations, e.g., difficulty of distinguishing OH from
NH, and the weakness of the SH fundamental; these
are not discussed here.

The corresponding “hydrogenic-deformation frequencies”
for OH, NH, and CH occur between 1650 and
600 cm⁻¹, but cannot be characterized so simply. Moreover,
many other fundamental modes occur within this
range whose assignment and identification present special
problems. However, several groups (such as CH₂
and CH₃ and the peptide link) can readily be identified
by characteristic fundamentals of this type.

Before leaving the hydrogenic frequencies, mention
should be made of the ease with which hydrogen bonding
can be detected through its effect on these frequencies.
The stretching frequencies are lowered by a few
hundred cm⁻¹, and the corresponding absorption bands
become quite broad and diffuse. The deformation frequencies
are increased to a much smaller degree and
the broadening is not always so marked.

In the region between the hydrogenic-stretching and
hydrogenic-deformation frequencies (viz., 2300 to 1650
cm⁻¹) occur two other separable classes of fundamentals:
firstly, those owing to the stretching of triple bonds such
as C≡N and C≡C occurring between 2250 and 2100
cm⁻¹, and, secondly, those largely localized in the
stretching vibrations of double bonds such as C=O,
C=C, C=N. The latter start about 1800 cm⁻¹ and
overlap a little the high-frequency end of the
hydrogenic-deformation frequencies at 1650 cm⁻¹. This means that
it is occasionally difficult to decide whether a band
occurring between 1650 and 1500 cm⁻¹ is the result of
the stretching of a double bond or of the deformation
of a hydrogen atom. If the hydrogen atom can be substituted
by deuterium, this uncertainty is often resolved,
as a hydrogenic frequency will be reduced by
almost a factor of √2, whereas the double-bond frequency
will be unaffected. It should be added that occasionally
a hydrogenic-deformation frequency interacts
with a double-bond frequency, and the two fundamentals
resulting can only be described as a superposition
of these two motions.

A very large number of the chemical bonds in any
polyatomic molecule are of the type C-C, C-O, C-N.
Such bonds do not give rise to localized fundamental
vibrations. The reason is that the masses of these atoms
are not very different, nor are the restoring forces between
them. This means that their characteristic stretching
frequencies (which can be observed in small molecules
such as ethane or methyl alcohol, where they occur
in isolation) are all about the same magnitude (approximately
900 to 1100 cm⁻¹). Consequently, in a polyatomic
molecule, strong coupling occurs and the resulting
fundamental modes involve simultaneous motions
of all of the bonds of this type in the molecule. Such
fundamentals are usually referred to as “skeletal modes”
and have a range between about 800 and 1250 cm⁻¹.
The pattern of these skeletal frequencies is often the
most characteristic physical property of a molecule. Two
hydrocarbons or two steroids, which may be very hard
to differentiate by chemical means or by other physical
properties (e.g., refractive index or melting point), can
usually be recognized instantly by comparing their
infrared spectra between 700 and 1300 cm⁻¹.
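
The frequency regions described in this section can be collected into a small lookup, together with a rough estimate of the deuterium shift. This is only a sketch: the ranges are the approximate ones quoted in the text (the double-bond range of 1800 to 1500 cm⁻¹ is inferred from the discussion of overlap), the names are illustrative, and the shift uses a harmonic diatomic approximation that the paper does not itself write down.

```python
import math

# Approximate ranges in cm^-1 as quoted in the text; real bands may stray outside.
HYDROGENIC_STRETCH = {
    "OH": (3150, 3650),
    "NH": (3150, 3500),
    "CH": (2850, 3150),
    "SH": (2550, 2650),
    "PH": (2300, 2450),
}
OTHER_CLASSES = {
    "triple-bond stretch": (2100, 2250),
    "double-bond stretch": (1500, 1800),
    "hydrogenic deformation": (600, 1650),
    "skeletal mode": (800, 1250),
}


def candidate_assignments(wavenumber):
    """List every class of fundamental whose quoted range contains the band."""
    hits = [
        bond + " stretch"
        for bond, (low, high) in HYDROGENIC_STRETCH.items()
        if low <= wavenumber <= high
    ]
    hits += [
        name
        for name, (low, high) in OTHER_CLASSES.items()
        if low <= wavenumber <= high
    ]
    return hits


def deuterated_frequency(nu_h, heavy_atom_mass):
    """Estimate the X-D frequency from the X-H frequency.

    Assumes an isolated harmonic X-H oscillator with unchanged force constant,
    so the frequency scales as 1/sqrt(reduced mass). Masses are in atomic
    mass units, with H = 1 and D = 2.
    """
    mu_h = heavy_atom_mass * 1.0 / (heavy_atom_mass + 1.0)
    mu_d = heavy_atom_mass * 2.0 / (heavy_atom_mass + 2.0)
    return nu_h * math.sqrt(mu_h / mu_d)
```

A band at 3300 cm⁻¹ returns both OH and NH stretching as candidates, which reflects the difficulty of distinguishing the two noted above, and a band at 1600 cm⁻¹ returns both a double-bond stretch and a hydrogenic deformation. For an NH stretch the estimated ratio of the frequencies is about 1.37, that is, almost the factor of √2 stated in the text.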


APPLICATION TO BIOLOGICAL PROBLEMS 


There are three principal ways in which infrared 
spectroscopy can be of help in molecular biology, viz.:

(a) as an analytical tool, (b) as a means of establish- 
ing the structural formula of a biologically important 
compound, and (c) as a means of determining the spatial 
configuration of biological polymers. 


(a) Analysis 


Infrared spectroscopy is now a well-established and 
widely used analytical tool in any up-to-date biochemi- 
cal laboratory. The principal advantage of infrared over 
visible or ultraviolet spectroscopy is that the spectrum 
is much richer, since it is the result of vibrations in every 
part of the molecule. Spectra in the visible and ultra- 
violet arise from the excitation of an electron in one part 
of the molecule, usually in a double bond such as that of a CO
group. In analytical work, one therefore has to be cer- 
tain that this group is not present in any of the other 
compounds in the mixture. On the other hand, infrared 
methods will usually reveal every one of the compounds 
in a mixture. The only serious disadvantage of infrared 
analysis is that the use of water as a solvent is generally 
impossible because of its intense absorption over a good 
deal of the working range. This can frequently be over- 
come either by the use of other solvents, or (in the case 
of a solid) by making a pressed disk of the compound 
ground very finely in an excess of KBr. One other point 
should be mentioned, since it is frequently very impor- 
tant in much biological work, viz., the minimum quant- 
ity which is necessary in order to make an identification. 
It is impossible to give a precise figure here, because the
intrinsic intensity of absorption in the infrared varies
over such a wide range between compounds [...]. As a
general guide, one may [...] designed cells it is [...]
spectrum from about [...] microscope attachment, this
limit can be reduced to below one microgram.


(b) Structural Formulas 


One of the earliest applications of infrared spectros- 
copy to the structure of a biologically important mole- 
cule occurred in the case of penicillin. Here, the chemists 
were unable to agree on which of the three possible 
formulas was the correct one. Although x-ray analysis 
gave the first and most unequivocal proof that the
β-lactam formula was the correct one, infrared methods
gave an independent proof and demonstrated that no 
structural changes took place in going from the solid 
state (required for x-ray work) into solution, in which 
penicillin is known to be a rather labile molecule. This 
work depended on differences in the type and in the 
environment of the double bonds in the three possible 
structures. By investigating the spectra of model com- 
pounds containing double bonds of various types in 
appropriate environments, a decision could finally be 
made in favor of the lactam structure. 

The biological molecules which have been most ex- 
tensively investigated by this method are the steroids. 
In such compounds, it is now possible to answer quite a 
variety of important structural questions concerning the 
position of CO, OH, CH₂, and C=C groups. On the
other hand, it is still not possible to recognize the class 
of steroids by any common feature running through all 
of their spectra, although certain closely related groups 
of steroids do show common features. We find the op- 
posite extreme in the proteins which, as a class, are 
easily recognized and differentiated from nucleic acids 
and lipids, but show remarkably little variation among 
one another. The reason is that the spectrum of any 
protein is dominated by several intense bands arising 
from the identical peptide links. Only in the few cases 
where two or three of the constituent amino acids are 
in great excess is it possible to identify individual amino- 
acid residues (e.g., glycine and alanine in silk). 

The general line of attack is to compare the spectrum 
of the compound whose structure is unknown with the 
spectra of compounds of known structure containing 
groups of atoms identical with and in a similar environment
to those suspected of occurring in the unknown.
Extensive collections of infrared spectra have now been 
compiled by various laboratories and organizations, and 
the comparisons can be made relatively rapidly by vari- 
ous mechanized sorting devices. At first, it is usually 
advisable to call in an experienced spectroscopist for 
final confirmation, but any laboratory which takes up 
infrared analysis in a particular chemical field is not 
long in developing its own expert. The logic behind the
whole process is not unlike that behind the chemical
approach to the same problem. The chemist identifies
certain groups in an unknown structure by their reactive
properties, and he degrades the molecules into simpler
ones which he can identify from their well-known chemical
properties. The spectroscopist identifies
individual chemical bonds in an unknown structure from
their known spectroscopic earmarks and tries to supplement
this by identifying groups or constituent units in
a structure (e.g., peptide link, benzene ring) in a similar
way. The most efficient approach is, of course, to combine
the two methods.


(c) Determination of Spatial Configuration of 
Biological Polymers 


In order to understand many key biological phenom- 
ena, such as the mode of action of genes, protein syn- 
thesis, or the production of antibodies, at the molecular 
level, it will be necessary to establish in detail the molec- 
ular configuration of proteins and nucleic acids. This 
must be done by physical methods, of which the most 
successful so far has been x-ray analysis. However, in 
spite of the intensive efforts of several groups of very 
able workers in various countries over the past twenty 
years, the spatial configuration of any globular protein 
is still unknown. While it is true that one now seems to 
be very close to this goal in the case of myoglobin, the 
extension of the methods used for that protein to other 
proteins presents formidable problems. At present, infra- 
red analysis seems to be the second most powerful phys- 
ical method. Although it has not yielded such precise re- 
sults, it has often given independent confirmation of the 
main features of a structure. More important is the fact 
that it can often give a lead at an early stage, either in
laying down certain conditions, which the correct model 
structure must fulfill, or in ruling out certain models 
which might appear to fit a preliminary analysis of the 
x-ray data. Just as the x-ray method requires the poly- 
mers to be arranged in an orderly pattern (preferably 
in a single crystal), so the infrared method can be more 
precise the closer the arrangement of the polymers ap- 
proaches that found in a single crystal. This is because 
the infrared method is based on observing the dichroism 
associated with key absorption bands when the spec- 
trum is obtained using polarized radiation. 

An oversimplified example will make this clearer. In 
the infrared spectrum of deoxyribonucleic acid is an
absorption band which can be assigned to vibrations 
localized in the planes of the purine and pyrimidine 
bases. In the Watson-Crick model of DNA, these bases 
are nearly perpendicular to the axis of the double helix. 
If a highly oriented specimen of DNA is prepared in 
which the axes of the helices are roughly parallel to the 
direction of orientation, then this absorption band is 
very intense when the incident infrared radiation is 
polarized with the electric vector perpendicular to the 
direction of orientation, and very weak when the direc- 
tion of polarization is rotated through a right angle 
(Fig. 1). This gives a general confirmation of one aspect 
of the Watson-Crick model. If it were possible to produce
a single crystal of DNA of suitable dimensions in which
all of the bases were parallel to each other, then the
dichroism would become perfect (i.e., no absorption for
one polarization direction) and a precise confirmation
of this aspect of the model could be given.

FIG. 1. The infrared absorption spectrum of NaDNA at various relative humidities.
One curve indicates the electric vector parallel to the orientation direction;
the other indicates perpendicular to the same direction. The bands referred to
in the text are those between 1600 and 1700 cm⁻¹ [from G. B. B. M. Sutherland
and M. Tsuboi, Proc. Roy. Soc. (London) A239, 446 (1957)]. Abscissa: frequency
in cm⁻¹, from 1500 to 3800.
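
The dependence of the dichroism on the direction of the vibrating dipole can be illustrated with a minimal sketch. It assumes perfect uniaxial (fiber) orientation, with the transition moments distributed uniformly around the fiber axis at a fixed angle to it; this idealized model and the function name are assumptions made for illustration and are not taken from the paper.

```python
import math


def dichroic_ratio(theta_deg):
    """Ratio A_parallel / A_perpendicular for perfect uniaxial orientation.

    theta_deg is the angle between the transition moment and the fiber axis.
    Under the stated assumption, absorption with the electric vector parallel
    to the axis goes as cos^2(theta) and perpendicular as (1/2) sin^2(theta).
    """
    theta = math.radians(theta_deg)
    a_parallel = math.cos(theta) ** 2
    a_perpendicular = 0.5 * math.sin(theta) ** 2
    if a_perpendicular == 0.0:
        return math.inf  # moment along the axis: no perpendicular absorption
    return a_parallel / a_perpendicular
```

For a transition moment lying in the plane of bases that are perpendicular to the helix axis, the angle is 90° and the ratio falls to essentially zero, which is the perfect perpendicular dichroism described for the hypothetical single crystal. Near 54.7° the ratio is unity and the band shows no dichroism at all.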

From the foregoing, it might appear that the infrared 
methods should, therefore, be applied to single crystals 
of biological polymers. Unfortunately, there are severe 
practical difficulties about such an approach. The first 
is that the “suitable dimensions” referred to in the fore- 
going hardly ever can be realized, viz., a flat crystal 
about 10 μ thick and 2×5 mm in area. A second difficulty
is that many biological polymers crystallize only
in association with a high proportion of water. This
limits observation to the bands which are not obscured
by the spectrum of H₂O or D₂O. The third is that the
molecular configuration is unknown and assumptions 
have, therefore, to be made about the orientations of 
recurring chemical groups or bonds in the polymer. The 
number of reasonable assumptions from which a choice 
has to be made can be quite large, and infrared analysis 
may not prove to be a sensitive enough discriminant 
between the various possibilities. In cases where the 
number of ways of fitting a long polymer into the unit
cell is severely restricted, this method should give a
precise answer, and it may be that the first experimental
difficulty referred to can be overcome by investigating
with a microspectrometer the reflection spectrum, instead
of the absorption spectrum. Another line of attack
is to measure the dichroism of the overtone and combination
bands above 4000 cm⁻¹, since these are much
weaker than the fundamentals and can be conveniently 
studied in a reasonably thick crystal. It should be added 
that there are still some theoretical problems to be over- 
come in the interpretation of such spectra. 

Because of all of these difficulties, most of the infrared 
investigations to date have been made on thin sections 
of fibrous proteins such as porcupine quill, silk, etc. In
such cases, the orientation is far from perfect and obviously
the degree of disorientation must be estimated
quantitatively before any reliable deductions can be
made from the partial dichroism. A method of doing
this has been recently worked out and applied to a
number of proteins, viz., porcupine quill, elephant hair,
horse hair, horn, feather quill, silkworm gut, and collagen.
For each of these proteins, seven different structures
were tested, and it was possible to reject all but
one or two of the model structures in most cases.
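
How imperfect orientation weakens the observed dichroism can be sketched with a simple two-component model. The model treats a fraction f of the material as perfectly oriented along the fiber axis and the remainder as randomly oriented; it is a common textbook idealization and is not necessarily the method referred to in the text. The function name and parameters are illustrative.

```python
import math


def partial_dichroic_ratio(theta_deg, oriented_fraction):
    """Dichroic ratio A_parallel / A_perpendicular for a partly oriented fiber.

    theta_deg: angle between the transition moment and the chain axis.
    oriented_fraction: fraction f (0 to 1) of perfectly oriented material;
    the rest is assumed random and contributes 1/3 to each polarization.
    """
    if not 0.0 <= oriented_fraction <= 1.0:
        raise ValueError("oriented_fraction must lie between 0 and 1")
    theta = math.radians(theta_deg)
    random_part = (1.0 - oriented_fraction) / 3.0
    a_parallel = oriented_fraction * math.cos(theta) ** 2 + random_part
    a_perpendicular = 0.5 * oriented_fraction * math.sin(theta) ** 2 + random_part
    if a_perpendicular == 0.0:
        return math.inf
    return a_parallel / a_perpendicular
```

With a moment at 90° to the axis, the ratio is essentially zero for perfect orientation, 0.4 when half the material is oriented, and 1 for a wholly random specimen. The same observed ratio can therefore arise from different combinations of angle and disorder, which is why the degree of disorientation must be known before angles are deduced.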

It must be emphasized that this method of attack 
depends on certain assumptions which may not always 
be correct. These assumptions are as follows: 

(1) The direction of change of electric moment 
(associated with a particular absorption band) bears the
same relation to the repeat unit of the polymer as that 
determined from infrared studies on single crystals of 
small molecules which contain the repeat unit; (2) the 
spatial distribution of the polymer chains has a certain 
degree of symmetry; (3) the specimen studied consists 
of one structural species. 

The first assumption is necessary, because it is found 
from work on single crystals of small molecules contain- 
ing a peptide link that the change of dipole moment in a 
bond stretching fundamental vibration (e.g., NH or
CO) is not exactly along the bond in question, but may
make an angle of as much as 20° with it. Thus, the direc- 
tions of these bonds (and consequently the molecular 
configuration) cannot be determined unless the corre- 
sponding angles are known for a peptide link in a pro- 
tein. Up to the present, single crystals of only two 
different molecules have been investigated. The angles 
in these two crystals did not differ by more than five 
degrees. However, more work on single crystals of a few 
other molecules is required in order to estimate accu- 
rately the degree of uncertainty arising from this as- 
sumption. The second assumption can be shown to be a 
reasonable one if the sample is prepared in a certain 
way. At first sight, the third assumption seems reason- 
able and is made by the x-ray analysts. The reason for 
doubting it is that the structure of many of the infrared 
bands of proteins is complex. Whether this arises from 
the presence of two different configurations, or from 
differences in frequency between a paracrystalline and 
less-ordered arrangements of the polymer molecules, 
cannot easily be determined. In simple synthetic poly- 
mers, such as polyethylene, differences in frequency are 
found between crystalline and amorphous forms of the 
polymer. Structure can also occur in the absorption 
band of a single crystal because of interaction between 
neighboring molecules. 

It appeared at one time that infrared analysis might 
be able to distinguish between various configurations 
of a protein molecule (specifically between the a-helix 
and the fully extended form of the polypeptide chain), 
even when the polypeptide chains were not arranged in
any ordered pattern. This method depended on a change
in the frequency of the CO absorption band in synthetic
polypeptides which could be prepared in either of these
two forms by precipitation from the appropriate solvent.
However, there are so many possible reasons for
a small change in the frequency of an absorption band
[...] application of any such rule to proteins (even
on an empirical basis) leads to serious difficulties and
inconsistencies. Until the reasons for these changes are
[...] compounds, it seems very unwise to use them as a
guide to protein structure.

The general situation with the infrared analysis of 
protein structure is similar to that which existed at an 
earlier stage in the x-ray analysis of the same problem. 
Before the interatomic distances in the peptide link were 
established by careful work on small molecules, it was
impossible to establish the correctness of any protein 
structure by x-ray methods. One could only say that 
such and such a structure was generally consistent with 
the diffraction pattern. Now that these distances are 
known and that single crystals of proteins can be pre- 
pared containing a heavy atom, it is becoming feasible 
to locate the individual atoms. Similarly, before the 
infrared method can become precise, more work is re- 
quired on single crystals of simple compounds contain- 
ing the peptide link and on the experimental problems 
of obtaining satisfactory spectra from single crystals 
of proteins.

DILUTE solutions were covered in the papers of
Rice (p. 69), Stockmayer (p. 103), and Doty
(p. 61). This paper first discusses concentrated solu- 
tions in contrast to dilute solutions. Many properties 
in concentrated solutions are determined by the fact 
that the interactions between the molecules are of pri- 
mary importance, whereas they are of secondary im- 
portance when the dilution is high. 

In most macromolecular solutions, one may never 
neglect completely the interactions between the large 
molecules. The nearest exception to this is probably 
solutions of the globular proteins at high ionic strength 
and at reasonable concentrations—i.e., concentrations 
in which it is quite easy to do physical experiments, such 
as the determination of intrinsic viscosity, and so on. 
In any solution, interactions can be ignored fairly well, 
providing highly precise measurements are not desired. 
As soon as extended particles—like fibrinogen, or es- 
pecially the nucleic acids, or the randomly coiled poly- 
mers such as the synthetic ones, or starch or other 
polysaccharides—become involved, it is practically 
impossible to make them sufficiently dilute to eliminate 
the interactions between the molecules, and, at the same 
time, to have enough material remaining in solution to 
measure by the standard techniques. In these cases, it 
might be said that one is dealing always with concen- 
trated solutions. 

For practical purposes, the interaction can be charac- 
terized by the following. The osmotic pressure versus 
the concentration is plotted. The ideal-solution laws 
predict that the osmotic pressure divided by the con- 
centration should be practically constant. The van’t 
Hoff law says that the osmotic pressure is proportional 
to the concentration. For an ideal solution, one obtains, 
then, the curve a in Fig. 1. Most of the time, however, 
curve b is obtained. As soon as the curve has deviated 
from the ideal law by about as much as the ideal law 
itself predicts, one has a concentrated solution. 
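
The criterion just stated can be made concrete with a sketch based on the usual two-term virial form of the osmotic pressure, π/c = RT(1/M + A₂c). The paper does not write this expansion out, so it is an assumption here, as are the function names and the SI units chosen (c in kg m⁻³, M in kg mol⁻¹, A₂ in mol m³ kg⁻²).

```python
R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def reduced_osmotic_pressure(c, molar_mass, a2, temperature=298.15):
    """Return pi/c for the two-term virial form pi/c = RT(1/M + A2*c).

    The first term is the ideal (van't Hoff) value; the second is the
    deviation caused by interactions between the macromolecules.
    """
    return R_GAS * temperature * (1.0 / molar_mass + a2 * c)


def crossover_concentration(molar_mass, a2):
    """Concentration at which the deviation equals the ideal term itself.

    Setting A2*c equal to 1/M gives c = 1/(A2*M); above this the solution
    counts as concentrated in the sense used in the text.
    """
    if a2 <= 0:
        raise ValueError("a positive second virial coefficient is required")
    return 1.0 / (a2 * molar_mass)
```

At the crossover concentration π/c is exactly twice its ideal value. Because the crossover varies as 1/(A₂M), large molecules with strong mutual repulsion reach it at low concentrations, which is the sense in which extended macromolecules are nearly always in concentrated solution.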

This upward, or positive, deviation usually results 
from repulsion between the molecules, and the fact that 
the molecules are large means that there are unavoid- 
ably a lot of repulsions just from their space-filling 
properties. This is the common type of behavior in a 
macromolecular solution; in effect, the molecules elbow 
each other apart. There is the same type of behavior in 
the theory of imperfect gas where one must consider 
the “‘co-volume”’ of the molecules. The pressure is raised 
in consequence. 

A similar effect is observed on the scattering of light. 
If there are these repulsive interactions, the fluctuations 
of concentrations cannot take place as easily or as independently
of each other, and with less fluctuation
there is less scattering of light.

These are manifestations of the intermolecular forces. 
There is one trap into which the uninitiated easily fall, 
and that is to consider these intermolecular forces as 
acting as though the molecules were in a vacuum. The
molecules are in a solvent, and the forces between the
macromolecules and the solvent molecules, and between
the solvent molecules themselves, are as important as the
direct interaction between the macromolecules. What is seen here actually
is a balance of forces. 

Figure 2 shows some data obtained by Doty and 
Yang¹ on polybenzylglutamate. The fraction of internal
hydrogen bonds in a molecule is shown. When they
are all intact, the helical form results, and when none of 
them are intact, the random-coil form results. The tran- 
sition region between these two forms depends upon 
the solvent composition, and this particular solvent is 
a mixture of dichloroacetic acid and dichloroethane. 
Taking the transition temperature as the reference 
point, it is seen that, when the temperature is reduced, 
one goes to the unbonded form. If the temperature is 
raised, one goes to the bonded form. It is incorrect to 
consider solely the interactions between parts of the 
macromolecule itself because, in that case, an increase 
in temperature should melt it— that is, break the bonds. 
In fact, just the opposite is done in raising the tempera- 
ture; the bonds are formed. Obviously, there is compe- 
tition between the bonding to the solvent and the in- 
ternal bonding of the macromolecule. The solvent here 
actually is the determining factor. In fact, the entropy 
of this transition has the wrong sign. Instead of the 
entropy of the random coil being the greater and that 
of the helix the lesser, the reverse is true. The helical
form is the one that is stable at high temperatures and
the entropy of the helix is greater and that of the random
coil is less. This means that the solvent is participating
and that there is considerable entropy change in the
solvent. This illustrates the pitfalls possible when the
solvent is overlooked.

FIG. 1. Osmotic pressure π as a function of concentration c
for real and ideal macromolecular solutions. See text.

FIG. 2. Fraction of intramolecular hydrogen bonds in poly-γ-
benzyl-L-glutamate; points calculated from optical-rotation data
of Doty and Yang¹; the curves represent an unpublished theory
developed by J. K. Bragg and the author. Ordinate: fraction
bonded; abscissa: ΔT.
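
The sign argument can be checked with a minimal two-state sketch, in which the free energy of forming the helix from the coil is ΔG = ΔH − TΔS and vanishes at the transition temperature. This is an assumption made for illustration only: it is not the unpublished theory of Fig. 2, it ignores the cooperative character of the real transition, and the enthalpy passed in is an arbitrary effective value, not one fitted to the data of Doty and Yang.

```python
import math

R_GAS = 8.314  # gas constant, J mol^-1 K^-1


def fraction_bonded(temperature, delta_h, t_transition):
    """Two-state estimate of the helical (bonded) fraction at a temperature.

    delta_h: effective enthalpy change for coil -> helix in J/mol, taken
    as the apparent value in the given solvent (solvent effects included).
    t_transition: temperature in K at which half the bonds are formed.
    The entropy change is fixed by delta_s = delta_h / t_transition.
    """
    delta_g = delta_h * (1.0 - temperature / t_transition)
    equilibrium_constant = math.exp(-delta_g / (R_GAS * temperature))
    return equilibrium_constant / (1.0 + equilibrium_constant)
```

With a positive ΔH, and hence a positive ΔS, for coil to helix, the bonded fraction rises with temperature, as observed in this solvent mixture. A negative ΔH gives the ordinary melting behavior expected if only the internal bonds of the macromolecule mattered.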

Consider highly concentrated solutions, such as 10 to
50% (or possibly more) dissolved material in solvent. 
What can be said about them? The kinetics of such 
systems are treated in the paper by Ferry (p. 130). The 
concern here is with statics. One of the first items of interest
is, “What is the absorptive capacity of such solutions?”
They can be called gels, for convenience. What
then is the absorptive capacity of gels for foreign solvents,
such as ions, etc.? For gels composed of long-chain
macromolecules, the thermodynamic behavior with re- 
spect to the absorption of solvent is described by a 
theory advanced by Flory, Huggins, and Guggenheim.?’ 
The derivation contains numerous obviously wrong as- 
sumptions, over-simplifications, and so on; however, 
since it is a classical theory, one need say little about it 
here. If one simply wants to calculate an absorption 
isotherm, it is useful. It has an adjustable parameter
that allows one to make it work fairly well in many
[...] relative vapor pressure, that is, the vapor pressure of
the solvent in the mixture over that of the pure solvent.
To be very careful, gas-law corrections and such things
must be included.

An ideal solution should give the perfectly straight 
diagonal of curve c. This shows the activity of the small 
molecule component proportional to the volume frac- 
tion. Raoult’s law actually was stated in terms of mole 
fractions, but if it is modified slightly, it states that the 
ideal solution follows curve c. However, a real macro- 
molecular solution usually gives something of the type 
of curve d. Sometimes the curve is more complicated, 
as is shown later. 

Two characteristics are to be noted. At the upper 
right-hand corner, the slope of curve d is, as is usual, 
very much smaller than the ideal one. This is merely a 
manifestation of the repulsive interaction mentioned
before, i.e., a positive deviation. The osmotic pressure 
is higher; the vapor pressure, therefore, also is higher 
in dilute solution than in the ideal case. At the lower 
left-hand corner, curve d usually rises much more 
steeply than the ideal. This slope, in cases where there 
are no specific interactions, is of the order of three to 
five times that of the ideal. 

There is a very simple interpretation for this marked 
steepness in slope. Malcolm Dole probably was the first 
to apply it to any extent. It is based upon the proposed
mechanisms for rigid absorbing systems, like clays, in-
organic catalysts, carbon black, etc. It says that, in the
absorbing material—in this case, the dry polymer— 
there are a number of sites which can take up the solvent, 
and that the rate with which the activity rises is an 
inverse function of the number of sites. If there is a 
large number of sites, then much absorbed material can 
be taken up without raising the activity very much. If 
there are a very few sites, however, they will fill up very 
fast, and the activity will rise much more rapidly than 
the volume fraction. 

A mixture of two pure liquids, toluene and benzene,
gives the straight line, curve c in Fig. 3. The number of
sites in the mixture apparently increases in proportion
to the amount of toluene put in, with one site available
for every molecule added. The whole volume remains
accessible to these molecules and curve c results.

FIG. 3. Activity $a_1$ against volume fraction $\phi_1$ of solvent
sorbed in a high polymer. See text.

The macromolecular system is different. There are 
not nearly as many available sites in which to put sol- 
vent molecules; there appear to be about one-fourth as 
many. This suggests a very simple and reasonable 
hypothesis. Figure 4(a) shows the absorption of toluene 
in benzene. The benzene molecules can be elbowed out 
of the way, presenting little resistance to the added 
toluene, since the whole volume is accessible to its 
admission. 

For macromolecules, the volume initially consists of 
intertwined chains [ Fig. 4(b) ]. If a toluene molecule is 
introduced at position a, a long cavity will have to be 
made with considerable wasted space at its ends, since 
the macromolecular chains are quite rigid. Therefore, 
the toluene molecule enters position b preferentially
where it can cause with greater ease the necessary re- 
arrangements in the neighboring macromolecules. It 
appears obvious that material of this kind is relatively 
unreceptive to small molecules; it can fit them into 
certain specific places only. 

One cannot predict the number of specific places, but 
the experiments quoted in the following indicate that 
about one-fourth of the volume is occupied by sites into 
which a small molecule easily can be introduced. This 
discussion refers to unactivated absorption where no 
strong specific forces are involved. An example is toluene 
in polystyrene where the forces are fairly well in balance. 

In systems permitting hydrogen bonding, specific in-
teractions are likely, with pronounced deviations result-
ing (curve e, Fig. 3). This curve, which is typical of
water absorbed in protein, starts off with a very
low, rather than a high, slope, indicating that some
of the sites are very receptive to the water molecules.
After these specific places have been filled up, the curve
rises rapidly. It even may come up and hit the axis,
which means that the protein is saturated and cannot
take up any more water.

FIG. 4. Sorption of a small molecule, toluene, (a) in an
ordinary liquid and (b) in a high polymer.

In this case, there are a few very specific active sites 
as compared with the previous case in which there are 
many inactive sites. But, when one attempts to apply 
the simple site hypothesis quantitatively, it is not very 
successful, because the manner in which the absorbed 
molecules affect the protein has been ignored. Appar- 
ently, the protein rearranges its state considerably, and 
it is difficult to describe the situation in simple terms. 
The whole concept of an inert absorptive site loses its 
applicability. 

There is, however, another approach to this problem.
Define a function that measures the tendency of the
absorbing molecules to cluster. This function, the “clus-
tering function,” has an exact molecular definition. It
is given by an integral,

$$G_{11} = \frac{1}{V} \iint \left[ F_2(i,j) - 1 \right] \, d(i)\, d(j),$$

where $i$ and $j$ are molecules of component 1, the sol-
vent; $V$ is the total volume; and $F_2(i,j)$, the distri-
bution function, is defined by the statement that
$(1/V^2)\,F_2(i,j)\,d(i)\,d(j)$ is the probability that the mole-
cules $i$ and $j$ are each at the positions specified by the
coordinates $(i,j)$ in the range of these coordinates $d(i)$
and $d(j)$. This distribution function is familiar in the
case of spherical molecules where it is called the radial
distribution function. It is simply the probability of
finding one molecule in a given position with respect to
another.

The quantity $\phi_1 G_{11}/v_1$, where $v_1$ is the molar volume
of the solvent, is the mean number of solvent molecules
in excess of the random expectation in the neighborhood
of a given solvent molecule. Thus, it measures the
clustering tendency of solvent molecules, and, for this
reason, $G_{11}$ is called the clustering function. If $G_{11}$ is
positive, the solvent molecules cluster; if it is negative,
as frequently happens, they do not.

It should be noted that this clustering function has a
simple and exact relation to the isotherm. There are no
assumptions except for a very minor approximation of
neglecting the compressibility of the system. The rela-
tion is as follows:

$$\frac{G_{11}}{v_1} = -(1-\phi_1) \left[ \frac{\partial (a_1/\phi_1)}{\partial a_1} \right] - 1.$$

Similar relations were known to Willard Gibbs, but
they were neglected until interest was revived recently
by Mayer and by Kirkwood and their collaborators.
The clustering function can be determined from the
isotherms, and it immediately reveals whether the
absorbing molecules tend to crowd into a few sites, or
whether they tend to occupy isolated sites and prevent
other molecules from entering. This would depend upon
whether $G_{11}$ is positive or negative, and upon how
much so.
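As an illustration of how a clustering function can be read off an isotherm, the following minimal sketch evaluates $G_{11}/v_1 = -(1-\phi_1)\,\partial(a_1/\phi_1)/\partial a_1 - 1$ by finite differences. It is not part of the original lecture. The test isotherm (a Flory-Huggins activity for very long chains with an interaction parameter `chi`) and all function names are assumptions chosen for the example.

```python
import math


def fh_activity(phi1, chi):
    """Assumed test isotherm: Flory-Huggins solvent activity in the
    limit of very long chains, a1 = phi1 * exp(phi2 + chi * phi2**2)."""
    phi2 = 1.0 - phi1
    return phi1 * math.exp(phi2 + chi * phi2 ** 2)


def clustering_function(activity, phi1, h=1e-6):
    """Return G11/v1 at volume fraction phi1 for an isotherm given as a
    callable activity(phi1).  The derivative d(a1/phi1)/d(a1) is formed
    as a ratio of central differences taken in phi1."""
    a_lo, a_hi = activity(phi1 - h), activity(phi1 + h)
    ratio_lo = a_lo / (phi1 - h)
    ratio_hi = a_hi / (phi1 + h)
    d_ratio_d_a = (ratio_hi - ratio_lo) / (a_hi - a_lo)
    return -(1.0 - phi1) * d_ratio_d_a - 1.0


if __name__ == "__main__":
    # Ideal curve (activity equal to volume fraction): no clustering.
    print(clustering_function(lambda p: p, 0.3))
    # Assumed Flory-Huggins isotherm with chi = 0.4.
    for phi1 in (0.05, 0.2, 0.5):
        print(phi1, clustering_function(lambda p: fh_activity(p, 0.4), phi1))
```

For the ideal diagonal the routine returns -1, and for the assumed Flory-Huggins form it reproduces the closed-form value $2\chi/(1-2\chi\phi_1)$ that follows from differentiating that isotherm. The finite-difference step fails where the isotherm is flat, since the denominator then vanishes.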


Figure 5 shows the data for three systems, two of 
which are simple nonpolar systems. The heavy solid 
curve represents the activity of the solvent in the 
systems benzene-rubber and toluene-polystyrene, in 
which the intermolecular forces are neither specific nor 
strong. The two curves lie almost on top of one another.
The initial slope is quite high, which indicates that the
number of sites available for the absorption of the in-
coming solvent molecules is much smaller than it would
be in a mixture of two ordinary liquids. A water-collagen
system exhibits another type of behavior. First, there
is a region of very strong attraction for the first few
molecules of water absorbed, with the water hardly
showing any vapor pressure. Then, after the initial sites
are filled, the activity rises very rapidly until it is seen
to come up to that of liquid water. Dole and McLaren
found that the first region has a strong negative heat of
absorption; that is, heat was evolved to the amount of
about 6 kcal/mole of water in the region of $\phi_1$ less
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cules. For the hydrocarbons, a positive clustering 
tendency appears. $G_{11}$ is fairly uniformly +1 for all
values of $\phi_1$. This means that the molecules of the liquid
tend to cluster in the polymer, which is probably a
manifestation of the same geometrical problems de-
scribed in Fig. 4(b). It is not easy for the polymer to
make room for the liquid molecules, but when one has 
been admitted and a cavity is opened, it is apparently 
easier to put in other liquid molecules by extending the 
same cavity than by opening up new cavities. Therefore, 
there is this mild clustering tendency which is not a 
manifestation of any special attractive properties be- 
tween the benzene and rubber or between the toluene 
and polystyrene. Actually, in the former case, the heat 
of absorption is mildly negative, and, in the latter case, 
it is mildly positive. This contrast leads to the tentative 
conclusion that $G_{11}$ is affected more by the geometry of
the packing than by the heat of sorption. 

The Flory-Huggins-Guggenheim theory predicts
values of $G_{11}$ of less than unity for the systems pre-
viously referred to. In general, it underestimates the
amount of clustering. In other words, it does not take 
sufficiently into account the intrinsic heterogeneity of 
the mixture. This, in fact, is very difficult to do. 

The water-collagen system presents a different pic-
ture. Initially, the clustering function is markedly nega-
tive. The first experimental point seems to be at about
−8, and the curve is very steep. This means that the
first molecule of water that is adsorbed excludes at least
eight times its own volume to other molecules. One can
interpret this as meaning that the first molecule occupies
a special site, and that there are no other special sites
within at least the cube root of eight molecular diam-
eters around. Other molecules that enter will have to
find a site elsewhere. After these special sites are filled,
the system behaves more normally, and just before the
collagen is saturated with water, it acts like an ordinary
unspecific high polymer in which the water molecules
show a mild tendency to cluster.

Effect of elongation on stress relaxation of cast rubber
latex at 0°C.

From the curve of the clustering function, it is ap- 
parent that the initial excluded volume for water on 
collagen is at least 10 water volumes, or 180 ml, which 
amounts to less than 0.8 mole of absorption sites per 
100 g of collagen. By use of the Langmuir isotherm, 
Dole and McLaren estimated about 0.6 mole, which
seems to be reasonable agreement in view of the fact 
that the curve is still climbing steeply at the lowest 
point. If the Langmuir isotherm is used, one makes 
special assumptions, whereas, with the clustering func- 
tion, none are made. 

Consider now the elasticity of macromolecular sys- 
tems. Most concentrated macromolecular solutions are 
not simple liquids. They show some kind of structural 
elasticity or structural viscosity. The following discus- 
sion is restricted to the equilibrium manifestations. [For 
further discussion, see Ferry (p. 130). ] Rubber is classi- 
cally the best-known example. When a rubber band is 
stretched, it warms up; the work of stretching is con- 
verted into heat. In contrast, when a steel spring is 
stretched, there is practically no thermal effect; in steel, 
the work of stretching is converted into internal energy. 
The obvious explanation is that in rubber the coiled 
macromolecular chains are straightened and their en- 
tropy is reduced. If this is done reversibly, an output 
of heat must result. Alternatively, the retractive force 
of the stretched rubber band is caused by the thermal 
vibration of the macromolecular chains which try to
return to a state of increased entropy.
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This is the entropic theory of rubber elasticity, and 
it works very well at elongations of less than 200 to 
300%. At small elongations, the change of entropy ac- 
counts for nearly all of the retractive force. This is 
discussed by Flory.
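The entropic picture can be made concrete with the standard result for an ideal Gaussian network, which is not written out in the lecture and is assumed here: the nominal retractive stress is $G(\lambda - \lambda^{-2})$ for stretch ratio $\lambda$ and modulus $G$. In such an ideal rubber all of the isothermal work of stretching appears as a decrease in entropy and is given off as heat. The function names below are illustrative only.

```python
def entropic_stress(stretch, modulus):
    """Nominal retractive stress of an ideal Gaussian network
    (assumed model): G * (lambda - 1/lambda**2)."""
    return modulus * (stretch - 1.0 / stretch ** 2)


def work_per_volume(stretch, modulus):
    """Isothermal work of stretching per unit volume for the same
    model: (G/2) * (lambda**2 + 2/lambda - 3)."""
    return 0.5 * modulus * (stretch ** 2 + 2.0 / stretch - 3.0)


def entropy_change(stretch, modulus, temperature):
    """Entropy change per unit volume of an ideal rubber.  With no
    internal-energy contribution, T * dS equals minus the work done,
    so the entropy falls and an equal amount of heat is evolved."""
    return -work_per_volume(stretch, modulus) / temperature


if __name__ == "__main__":
    G, T = 4.0e6, 298.0  # dyne/cm^2 and K; illustrative values only
    for lam in (1.0, 1.5, 2.0, 3.0):
        print(lam, entropic_stress(lam, G), entropy_change(lam, G, T))
```

The sketch applies only to the moderate elongations where the text says the entropic theory works well. It contains nothing that would reproduce the stiffening and the energy decrease described for larger extensions.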

Beyond about 300% elongation, the material stiffens 
rapidly and the internal energy contributes heavily to 
the retractive force. The energy of extension becomes 
negative, which means that there is a decrease in in- 
ternal energy on extension. With further extension, the 
entropy also becomes more negative than theory pre- 
dicts. This marked decrease in entropy and energy is 
suggestive of the decreases that occur during crystal- 
lization where the molecules are ordered with the result 
that the entropy decreases; at the same time, they are 
placed in energetically favorable configurations so that 
the energy decreases also. 

Direct evidence for this is shown in Fig. 7 from work 
by Tobolsky and Brown. It is known that rubber does
crystallize, since x-ray patterns are obtainable. If rubber 
is stretched at 0°C, which is well below the freezing 
point for the initially amorphous rubber, the retractive 
force decreases with time; the curve approaches the 
axis. (The apparatus is incapable of measuring a nega-
tive retractive force. In many cases, the rubber band
extends spontaneously.) The macromolecular chains, 
which initially are lined up somewhat by the stretching, 
are crystallizing now into an even better linear array, 
and the band extends. 

At present, polyethylene thread is available which,
when stretched, oriented, and then cross-linked by
irradiation, exhibits a striking phenomenon. If heated,
it melts and contracts about 10 to 20%. When cooled
and frozen at about 125°C, it extends again spontane-
ously. This phenomenon is the result of crystallization
of oriented chains. A similar process occurs in biological
materials, for example in collagen. Figure 8 illustrates
some interesting data obtained by Dumitru, and dis-
cussed by Flory. Collagen, normally, is in crystalline
form, but, if heated to its melting point, it suddenly
contracts. Unfortunately, when annealed below its melt-
ing temperature, collagen does not re-extend very
markedly on cooling, although the density does go up
somewhat. This striking change of length accompanying
the melting of the oriented crystal is similar to the
phenomenon observed in rubber.

FIG. 8. Length as a function of temperature for formaldehyde-
tanned rat-tail tendon under a constant small load [from P. J.
Flory, J. Cellular Comp. Physiol. 49, Suppl. 1, 175 (1957)].

A similar interesting calculation can be made for α-
helices. Instead of plotting the usual stress-strain dia-
gram with length as a function of force, two convenient
dimensionless variables can be introduced. In place of
length $L$, substitute $L/n\delta$, where $n$ is the number of
residues in the chain and $\delta$ is the length along the helix
axis of one residue. Similarly, in place of force $f$, sub-
stitute $fb^2/3\delta kT$, where $b^2$ is the square of the effective
length of a residue in the random form; that is, $b^2$ is the
mean-square end-to-end length divided by $n$. These
variables are plotted in Fig. 9.
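A short numerical sketch may help show why these reduced variables are convenient. It assumes the usual Gaussian-coil force law $L = nb^2 f/3kT$ for the random form (an assumption, though one consistent with reading the reduced force as $fb^2/3\delta kT$). With that law the reduced length equals the reduced force, and a fully helical chain of length $n\delta$ has reduced length 1. The numerical values of `b` and `delta` are illustrative and not taken from the lecture.

```python
BOLTZMANN = 1.380649e-23  # J/K


def coil_length(force, n_residues, b, temperature):
    """Mean extension of a Gaussian random coil under a small force
    (assumed model): L = n * b**2 * f / (3 k T)."""
    return n_residues * b ** 2 * force / (3.0 * BOLTZMANN * temperature)


def reduced_length(length, n_residues, delta):
    """Dimensionless length L / (n * delta)."""
    return length / (n_residues * delta)


def reduced_force(force, b, delta, temperature):
    """Dimensionless force f * b**2 / (3 * delta * k * T)."""
    return force * b ** 2 / (3.0 * delta * BOLTZMANN * temperature)


if __name__ == "__main__":
    n, b, delta, T = 1000, 1.0e-9, 1.5e-10, 298.0  # illustrative values
    for f in (1e-14, 5e-14, 1e-13):  # newtons
        L = coil_length(f, n, b, T)
        print(reduced_force(f, b, delta, T), reduced_length(L, n, delta))
```

Each printed pair is identical, which is the unit initial slope of the random-coil branch in these coordinates.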

Suppose that the material initially is in the random
form. The stress-strain curve is linear and somewhat
simpler than the curve for rubber, because only one
molecule is considered rather than an array of molecules
with different orientations. Initially, the slope of the
curve is unity. Upon stretching, the helical form is
favored because the helix is oriented in the direction
of stretch. The coil-to-helix transition commences with
an essentially two-phase situation. In this situation,
considerable change in length results from a small
change in force, since the proportion of helical to random
form rapidly increases. Upon complete conversion into
an α-helix, the structure becomes very rigid, and large
changes in the force produce little change in length. As
illustrated in Fig. 9, the 45° line is crossed, and the
random form represented by the dashed diagonal line
would extend more readily than the helical form in the
direction of stretching. Thus, the random form again
becomes more favorable, and a second transition from
the helical to the random coil occurs. As shown in the
diagram, both of the horizontal regions of the curve
(the transition regions) represent a mixture, with a
segment of random coil followed by one of helix. There
is probably always a random segment at the end, be-
cause of the disordering effect of the end.

FIG. 10. Two spherulites of an anisotropic solution. (a) Large
regular spherulite, showing radial fault. R1 in methylene chloride;
spacing about 0.015 mm; ×154. (b) Large regular spherulite, fault
not visible. Phase contrast; R1 in methylene chloride; spacing
about 0.015 mm; ×131 [from C. Robinson, Trans. Faraday Soc.
52, 571 (1956)].

An additional subject of interest is that of liquid
crystals. Many macromolecular solutions contain a con-
siderable amount of order. Tobacco mosaic virus parti-
cles are random in dilute solution, but, in a more con-
centrated form, the solution is intrinsically anisotropic
and exhibits colors between crossed polaroids. Its x-ray
diffraction pattern consists of arcs of circles. This
pattern originally was investigated by Bernal and
Fankuchen. There are also many mixtures of soaps
and water and of various lipids in water which have
similar properties. There is, in fact, a large and rather
neglected field of what are generally called liquid crys-
tals, or crystalline liquids, or mesophases, all different
names for the same phenomenon.

Figure 10, a recent photograph taken by Conmar
Robinson, shows a liquid crystal, a drop of a solution
of polybenzylglutamate in methylene chloride. The
concentration is roughly 20% by weight. In this solvent,
the molecules are in the α-helix form, and the presence
of the long extended structures promotes formation of
liquid crystals. In a concentrated solution, the liquid
crystals separate and form small drops of a separate
phase. Figure 10 illustrates one of the drops surrounded
by the ordinary isotropic solution. The diameter of the
drop is approximately 300 μ. Each illustrated layer is
about 10 to 15 μ thick. This onion-like structure is re-
markable. In Fig. 10(a), one sees a single spiral with
something of the nature of a stem, or radial fault, on one
side. Looked at from a different angle, Fig. 10(b), the
stem radius is not apparent; instead, a double spiral is
seen. The structure is very striking. For further refer-
ence, the reader is referred to Robinson and to papers
by Robinson and Frank. Onsager and Flory have
proposed quantitative theories which try to explain the
formations of the liquid crystal as resulting from the
anisotropic interaction between long rod-like molecules.
These theories indicate very nicely the essential feature
that, if too many long rods are packed in a given space,
they pack very poorly unless they are aligned; if aligned,
they can be packed more economically. Hence, a con-
centrated solution of rod-like molecules tends to form
liquid crystal structures, but, if diluted, isotropic struc-
tures occur. In fact, phase boundaries can result be-
tween two solutions of the same materials at different
concentrations.
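The packing argument can be given a rough quantitative form. The sketch below uses the approximate expression from Flory's lattice treatment of rigid rods, quoted here as an outside assumption and not taken from this lecture: an ordered phase is expected above a critical rod volume fraction of about $(8/x)(1 - 2/x)$, where $x$ is the axial ratio of the rod. The function name is illustrative.

```python
def flory_critical_volume_fraction(axial_ratio):
    """Approximate minimum volume fraction of rods at which an ordered
    phase can appear, from Flory's lattice estimate (assumed formula):
    phi* ~ (8 / x) * (1 - 2 / x), with x the length-to-diameter ratio."""
    x = float(axial_ratio)
    return (8.0 / x) * (1.0 - 2.0 / x)


if __name__ == "__main__":
    for x in (20, 50, 100, 200):
        print(x, flory_critical_volume_fraction(x))
```

Longer rods give a smaller critical concentration, in line with the qualitative statement that long rods pack poorly unless aligned. The numbers are order-of-magnitude guides only.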
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Rheology of Macromolecular Systems 


JOHN D. FERRY


I. INTRODUCTION


As the paper by Zimm (p. 123) shows, mechanical
energy can be stored in a flexible macromolecular
network at elastic equilibrium because of the configura-
tional restrictions imposed on it by an external de-
formation. The resulting rubber-like elasticity and its
characteristic dependence on temperature and cross-
linking density are familiar and intuitively reasonable
phenomena. Less familiar is the fact that elastic energy
can be stored in a macromolecular system without cross-
links when it is subjected to steady-state flow; this
elasticity is manifested in solutions of either flexible or
rigid molecules and even at high dilution. In flow, of
course, mechanical energy is also continuously dissi-
pated as heat. In some cases, the energy storage is
masked by the dissipation and is difficult to detect; in
others, including numerous systems of biological inter-
est, the stored energy is obvious as seen, for example, in
elastic recoil after cessation of flow.

The presence of stored elastic energy is usually as-
sociated with non-Newtonian character of the flow; the
apparent steady-flow viscosity falls with increasing
shear rate. Moreover, the elastic strain (i.e., that part of
the deformation which is recoverable after removal of
stress) may not be directly proportional to shear stress,
so that the elasticity may be non-Hookean. The province
of rheology should include the description and expla-
nation of these nonlinear relationships, as well as the
simultaneous appearance of viscosity and elasticity.
Nevertheless, for sufficiently small stresses, the devia-
tions from Newtonian flow and Hookean elasticity can
be made negligible, and most of the present discussion is
restricted to such conditions, corresponding to so-called
linear viscoelastic behavior. Although nonlinear phe-
nomena are certainly important and potentially valu-
able sources of information, they are less well understood
and are omitted for the sake of brevity.
With this simplification, a rheological description of
the concomitant viscous and elastic behavior of a
[...] steady-flow viscosity, $\eta = \mathcal{T}/\dot{\gamma}$, where $\mathcal{T}$ is the
shear stress and $\dot{\gamma}$ the rate of shear. The energy dissipated
[...] $J = \gamma_e/\mathcal{T}$, where $\gamma_e$
[...] stored per cc is $J\mathcal{T}^2/2$.
[...] presenting characteristic
[...] energy stored to energy
[...] molecular solution, $\tau_1$, is usually also of the order of $J\eta$.
The time required to attain steady-state flow under
constant stress is approximately $\tau_1$; after cessation of
flow, the stored energy is dissipated by relaxation
processes within an interval which is also of the order of
$\tau_1$. For a complete description of the approach to steady-
state flow and of the course of relaxation, as well as of
other time-dependent rheological processes, a spectrum
of relaxation times is usually required.



From experimental measurements of stored energy 
and of the rate of approach to steady-state flow, or from 
equivalent information about response to oscillating 
stresses, conclusions can be drawn regarding the ease of 
motion of macromolecular segments through their sur- 
roundings and the nature of long-range coupling be- 
tween molecules, as illustrated by the examples which 
follow. 


II. ENERGY STORAGE IN STEADY-STATE FLOW 


The storage of energy in very dilute solutions during
flow can be attributed qualitatively to the departure
from random orientation, familiar from discussions of
streaming birefringence, which leads to a decrease in the
entropy content in the steady state. The complete
rheological behavior can be predicted by considering the
interactions between the frictional forces imposed on
solute molecules by the flowing solvent and their own
Brownian motions, as calculated by Kirkwood and
Auer¹ for thin rigid rods and by Rouse² and Zimm³ for
flexible coils. The results correspond mathematically to
the mechanical models in Fig. 1, where the solvent
viscosity $\eta_s$ is shown added to various viscosity con-
tributions from the solute. For rods, one of the solute
contributions behaves as though in series with an elastic
spring; for coils, there is an infinite series of solute
contributions, each in series with a spring. The magni-
tudes of some of the model components are given in
Table I in terms of the solution viscosity, $\eta$, the molecu-


TABLE I. Rheological characteristics of macromolecular solutions. 


Flexible free-draining 


Thin rigid rods 
coils (Rouse>) 


(Kirkwood and Auer®) 


GC 3cRT/SM oRT/M. 
oS (n— mu 4 

3(n—n5)/4 6(n— aft 
7 (very dilute) „ 3M En Pe/20RT 2M nfc/SRT 
J (concentrated) c 2M, SORT 


a Reference 1.

b Reference 2.

c Rigid rods will not form a concentrated disordered solution, preferring
thermodynamically to distribute themselves between a dilute disordered
and a very concentrated ordered phase.⁴,⁵
FIG. 1. Mechanical models corre-
sponding to theoretical viscoelastic 
behavior of dilute rigid rods and flex- 
ible coils (below); schematic hydro- 
dynamics (above). 


lar weight, $M$, and the concentration, $c$ (in g/cc). The
values for rigid ellipsoids are quite similar to those for
rigid rods, and those for coils with internal hydro-
dynamic interaction³ are somewhat similar to those for
the free-draining type. Of course, the portrayal of a
mechanical model is not an essential feature, but it aids
visualization. The terminal relaxation time (for rods,
the only relaxation time) is $\tau_1 = \eta_1/G_1$.

Table I also gives the steady-state compliance for 
these systems. In very dilute solutions, it is proportional 
to the square of the intrinsic viscosity $[\eta]$ (expressed
here in cc/g) and to the molecular weight. It is inter-
esting that the values for rods and coils differ only by a
minor numerical factor. In concentrated solutions (de-
fined by the condition that $\eta \gg \eta_s$), $J$ depends on the
molecular weight only; if there are different macro-
molecular components, it is weighted strongly by those
of highest molecular weight.

Under the low stresses which would be necessary to 
avoid extreme non-Newtonian and non-Hookean effects, 
the magnitude of the stored elastic energy is quite 
small. For example, a dilute solution of tobacco mosaic 
virus flowing under a shear stress of 10 dyne/cm² would
store 25 cal/mole of solute; one of hyaluronic acid
(molecular weight 500 000), 0.25 cal/mole. At a concen- 
tration of 1%, the terminal relaxation times for these 
systems are calculated to be of the order of 10-* and 10+ 
sec, respectively, which are so short as to preclude 
direct experimental observation of a transient elasticity ; 
macroscopic elastic recoil would be vitiated by the 
inertia of the solvent. The elasticity could be measured 
by dynamic experiments, however, as described in the 
following section. By contrast, the behavior of a concen- 
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trated solution of flexible coils may be illustrated by a 
2% solution of the sodium salt of DNA, which under a
shear stress of 100 dyne/cm² would store about 200
cal/mole. (Sodium DNA molecules are of course not 
very flexible, but in concentrated solution exhibit some 
flexibility which may arise from hinges where the helix 
is unraveled for short distances.) The elastic shear would 
amount to about 50%, and the terminal relaxation time 
is calculated to be of the order of 1000 sec, so a substan- 
tial slow elastic recoil is readily observable following 
cessation of flow. It should be emphasized that to 
obtain such elastic effects, familiar in various biological 
systems (mucous and gelatinous secretions, slime molds), 
it is not necessary to have any linkages between the 
macromolecules or a netted gel structure. 
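The order of magnitude of these figures can be reproduced with a short calculation. The sketch assumes the free-draining (Rouse) expression for the steady-state compliance of a concentrated solution, $J = 2M/5cRT$, together with the energy stored per cc, $J\mathcal{T}^2/2$, and the elastic shear, $J\mathcal{T}$. The molecular weight used for sodium DNA ($6.5 \times 10^6$) and the temperature are not given in the text. They are assumptions chosen only to illustrate the arithmetic, and the function names are hypothetical.

```python
GAS_CONSTANT = 8.314e7   # erg / (mol K)
ERG_PER_CAL = 4.184e7


def rouse_compliance(molecular_weight, conc, temperature):
    """Steady-state compliance of concentrated free-draining coils
    (assumed Rouse form): J = 2 M / (5 c R T), in cm^2/dyne."""
    return 2.0 * molecular_weight / (5.0 * conc * GAS_CONSTANT * temperature)


def elastic_shear(compliance, stress):
    """Recoverable shear strain, J times the shear stress."""
    return compliance * stress


def stored_energy_per_mole(compliance, stress, molecular_weight, conc):
    """Energy stored per cc is J * stress**2 / 2; dividing by the moles
    of solute per cc (c / M) gives erg/mole, converted here to cal/mole."""
    per_cc = 0.5 * compliance * stress ** 2
    return per_cc * molecular_weight / conc / ERG_PER_CAL


if __name__ == "__main__":
    M = 6.5e6          # assumed molecular weight of sodium DNA
    c = 0.02           # g/cc, the 2% solution of the text
    stress = 100.0     # dyne/cm^2, as in the text
    J = rouse_compliance(M, c, 298.0)
    print(elastic_shear(J, stress))
    print(stored_energy_per_mole(J, stress, M, c))
```

With these assumed inputs the elastic shear comes out near one-half and the stored energy near 200 cal/mole, the magnitudes quoted above. Both scale steeply with the assumed molecular weight (the stored energy as $M^2$), so the agreement should not be read as a determination of $M$.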

Under higher stresses, substantial amounts of energy 
may be stored, but the complications of nonlinear effects 
prevent making numerical estimates. With a high degree 
of orientation, the structure of the system may be com-
pletely changed by the appearance of ordered phases,
and the changes may be irreversible as in the secretion
of silk. These phenomena may give some hint of the
nature of rheological processes in protoplasm.


III. ENERGY STORAGE AND DISSIPATION
IN OSCILLATING MOTIONS 


When a stress is applied to a macromolecular system,
a finite time of the order of $\tau_1$ is required for the stored
energy to build up to its steady-state value. The changes
during this transient period are determined by the
characteristic relaxation times, or relaxation spectrum
mentioned in Sec. I. Equivalent information about the
relaxation times can be obtained, often more easily, by
subjecting the system to a sinusoidally oscillating shear
stress. Here some of the energy is stored and released
each cycle, and some is dissipated. The respective values
(for a given peak strain) are proportional to the storage
modulus $G'$ and the real part of the dynamic viscosity $\eta'$,
which are defined as the stress divided by those com-
ponents of strain and rate of strain, respectively, in
phase with the stress. Although it is doubtful whether
any rhythmic biological processes can be approximated
by this pattern, sinusoidal experiments are very useful
for deducing the rates of molecular rearrangements.

At very low frequencies, $\eta'$ approaches the ordinary
steady-flow viscosity $\eta$, and $G'$ approaches zero. How-
ever, the limiting value of $G'/(\omega\eta')^2$, where $\omega$ is the
radian frequency, remains finite at vanishing $\omega$ and is in
[...] the mechanical models of Fig. 1 that with increasing
frequency the motions will occur more in the springs and
[...] the dashpots. On a molecular scale, they mean
[...] the period of oscillation is shortened, there is
[...] opportunity for Brownian motion (rigid rotatory
[...] rods, configurational rearrangements for
[...] imposed by the external
[...]

[...] for dilute rigid rods and free-draining flexible coils, at concentrations such that $\eta = 2\eta_s$.

[...] solutions at a concentration such that $\eta = 2\eta_s$. For rigid
rods,¹ there is a single relaxation time; $G'$ should ap-
proach a limiting high-frequency value of $G_1$, and $\eta'$
should approach 7,+np (cf. Fig. 1 and Table I). 
Measurements of transverse wave propagation in solu-
tions of rod-like intermediate polymers of fibrinogen are
in rough agreement with this prediction. For free-
draining flexible coils,² there is a spectrum of relaxation
times reflecting various degrees of cooperation between
neighboring segments in configurational rearrange-
ments; $\eta'$ should decrease and $G'$ increase more gradually
with increasing frequency, following slopes of −0.5 and
0.5, respectively, on a doubly logarithmic plot, as the
motions responding within the period of oscillation be-
come restricted to shorter range cooperation. Eventually
$\eta'$ should approach $\eta_s$, and $G'$ should attain a very high
value of the order of the rigidity of a hard glass multi-
plied by the volume fraction of solute (this limit, which
is not taken into account in the theories quoted,
corresponds to complete lack of configurational rear-
rangement within the period of oscillation, and is
probably of no relevance to biological conditions). At
present there are no data of this sort on polymers of
biological origin; measurements on dilute solutions of
synthetic polymers follow the general course shown in
Fig. 2, although a somewhat different slope would be
expected because of hydrodynamic interaction.³
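The two kinds of frequency dependence described above can be sketched numerically from the spring-and-dashpot picture. The code is a minimal illustration under stated assumptions, not the calculation of the cited theories. For rods it takes a single spring $G_1$ in series with a dashpot $\eta_1$, in parallel with a lumped viscous term (`eta_rest`, a hypothetical name covering the solvent and any non-relaxing solute contribution). For free-draining coils it assumes the Rouse spectrum $\tau_p = \tau_1/p^2$ with equal modulus per mode.

```python
import math


def rod_response(omega, g1, eta1, eta_rest):
    """Storage modulus G' and dynamic viscosity eta' for one relaxation
    time tau1 = eta1 / g1 (spring in series with a dashpot), with a
    purely viscous term eta_rest in parallel."""
    tau1 = eta1 / g1
    x2 = (omega * tau1) ** 2
    storage_modulus = g1 * x2 / (1.0 + x2)
    dynamic_viscosity = eta_rest + eta1 / (1.0 + x2)
    return storage_modulus, dynamic_viscosity


def rouse_response(omega, g_mode, tau1, eta_s, n_modes=5000):
    """G' and eta' for an assumed Rouse spectrum: modes p = 1..n_modes,
    each with modulus g_mode and relaxation time tau1 / p**2."""
    storage_modulus = 0.0
    dynamic_viscosity = eta_s
    for p in range(1, n_modes + 1):
        tau_p = tau1 / p ** 2
        x2 = (omega * tau_p) ** 2
        storage_modulus += g_mode * x2 / (1.0 + x2)
        dynamic_viscosity += g_mode * tau_p / (1.0 + x2)
    return storage_modulus, dynamic_viscosity


def loglog_slope(y_lo, y_hi, x_lo, x_hi):
    """Slope between two points on a doubly logarithmic plot."""
    return math.log(y_hi / y_lo) / math.log(x_hi / x_lo)


if __name__ == "__main__":
    # Rods: the low-frequency limit of G'/(omega*eta')**2 stays finite.
    g, e = rod_response(1e-4, 2.0, 3.0, 1.0)
    print(g / (1e-4 * e) ** 2)
    # Coils: slopes of G' and of the solute part of eta' (eta_s = 0).
    g_lo, e_lo = rouse_response(1e3, 1.0, 1.0, 0.0)
    g_hi, e_hi = rouse_response(1e4, 1.0, 1.0, 0.0)
    print(loglog_slope(g_lo, g_hi, 1e3, 1e4), loglog_slope(e_lo, e_hi, 1e3, 1e4))
```

In this model the low-frequency limit of $G'/(\omega\eta')^2$ for the rod case equals $\eta_1^2/G_1\eta^2$, with $\eta$ the total steady-flow viscosity. For the coil case the slopes of $G'$ and of $\eta' - \eta_s$ come out close to 0.5 and −0.5 in the intermediate range. The glass-like high-frequency limit mentioned in the text is outside this sketch.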




Concentrated, rather than dilute, solutions are of
primary interest in biological systems. Here the fre-
quency dependence of dynamic mechanical behavior is
more complicated, as shown in Fig. 3 for a 1.2% solution
of sodium DNA. Although the coils of this polymer
have perhaps only moderate flexibility, the behavior
follows a well-established pattern common to many
flexibly coiling polymers of high molecular weight in
concentrated solution. There is an intermediate fre-
quency range in which $\eta'$ falls steeply and $G'$ changes
relatively little. This is associated with long-range
coupling phenomena which cause the motions of one 
molecule to influence those of others separated by con- 
siderable distances but connected by some kind of chain 
of entanglements or coupling points. 

The intermediate region—often called the plateau 
zone—lies between two extreme regions where the me- 
chanical behavior is dominated by quite different 
physical processes. At high frequencies (corresponding 
to short times in transient experiments), the relaxation 
times are determined by the local friction of a short coil 
segment moving through its surroundings, oblivious of 
long-range coupling. The frequency dependence of G’ 
and η′ follows the Rouse theory for free-draining flexible
coils, and from it the magnitude of the local friction can 
be calculated (Sec. IV). At low frequencies (corre- 
sponding to long transient times), the relaxation times 
are determined by the long-range coupling, which in 
turn is reflected in the steady-flow viscosity (Sec. V). 

All the oscillatory phenomena described here relate to 
macroscopic deformation in shear. High-frequency measurements
of G′ and η′ can be made by wave-propagation
experiments provided shear waves are employed. In 
most sonic and ultrasonic wave-propagation experi- 
ments, however, the waves are longitudinal and the 
deformation is a combination of shear plus bulk com- 
pression. The energy storage is almost entirely due to 
the compression, in which, for a polymeric solution, the 
solute would play a minor role. 
Fig. 3. Observed frequency dependence of G′ (ascending curve)
and η′ (descending curve) in a 1.2% solution of sodium DNA.

Fig. 4. Schematic representation of translational friction for a
unit of a macromolecular chain (left) and a small foreign molecule
in a macromolecular matrix (right).


IV. LOCAL MOLECULAR FRICTION 


The frictional forces encountered by a short segment 
of a flexible coil as it pushes its way through its sur- 
roundings in a concentrated solution are very compli- 
cated but can, on the average, be described by a friction 
coefficient per monomer unit, ζ₀, which is force per unit
velocity. This represents the same sort of translatory 
friction which a foreign molecule encounters in diffusion 
(Fig. 4). For certain soft synthetic polymers (with no 
diluent present), it has been possible to compare the 
friction coefficient per monomer unit of the wriggling 
chain, calculated from rheological measurements such as 
those in Fig. 3, with the friction coefficient of a foreign 
molecule similar in size to the monomer unit, obtained 
from diffusion measurements where only a trace of the 
foreign component is present.'® They are actually of 
similar magnitude—for example, in polyisobutylene at 
25°C, 4.5×10⁻⁵ dyne sec/cm for the chain unit, and
3.8×10⁻⁵ for a pentane molecule.

The friction coefficient reflects a sort of local viscosity 
which involves pushing aside solvent molecules together 
with short segments of other polymer coils in the immediate
vicinity. A very crude estimate of the effective
local viscosity ηₑ can be obtained from Stokes' law
(ζ₀ = 6πηₑr), taking r as the radius of a sphere with the
volume of a monomer unit. When the solution is not
highly concentrated, the limited information available
indicates that ηₑ is not far from the solvent viscosity,"
and is thus much smaller than the macroscopic steady-
flow viscosity. With increasing polymer concentration,
ζ₀ and ηₑ increase rapidly, but not nearly as rapidly as
does the macroscopic viscosity η. The most extreme
divergence is reached in an undiluted soft polymer; for
the polyisobutylene quoted above, for example, ηₑ is
about 70 poises and 7 is 10. For sufficiently high 
molecular weights, ζ₀ and ηₑ are independent of molecular
weight, since the local motions are oblivious of the
distant ends of the macromolecules. But η increases very
rapidly with molecular weight, often with the 3.4
power.¹⁸
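
The Stokes'-law estimate of the local viscosity can be checked with a few lines of arithmetic. The sketch below is illustrative only: the monomer molar mass (56.1 g/mole for the isobutylene unit) and the bulk density (0.92 g/cm³) are assumed values not given in the text, and the friction coefficient is taken as 4.5×10⁻⁵ dyne sec/cm, an exponent assumed here because it reproduces a local viscosity of the order of the 70 poises quoted.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def monomer_radius_cm(molar_mass_g, density_g_cm3):
    """Radius of a sphere having the volume of one monomer unit."""
    volume_cm3 = molar_mass_g / (density_g_cm3 * AVOGADRO)
    return (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)


def local_viscosity_poise(zeta0_dyne_sec_cm, radius_cm):
    """Invert Stokes' law, zeta0 = 6 * pi * eta_e * r, for eta_e."""
    return zeta0_dyne_sec_cm / (6.0 * math.pi * radius_cm)


# Assumed inputs for polyisobutylene at 25 C (see the lead-in above).
radius = monomer_radius_cm(molar_mass_g=56.1, density_g_cm3=0.92)
eta_e = local_viscosity_poise(zeta0_dyne_sec_cm=4.5e-5, radius_cm=radius)

print(f"monomer radius  ~ {radius:.2e} cm")
print(f"local viscosity ~ {eta_e:.0f} poises")
```

With these assumed inputs the radius comes out near 3×10⁻⁸ cm and the local viscosity at several tens of poises, the same order as the figure quoted for polyisobutylene; the estimate is crude by construction.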


It is evident that the effective local viscosity encountered
in translatory friction of a foreign molecule
Fig. 5. Local effective viscosities of small organic molecules in
soft rubber (black points) and in water (open circles), plotted
against their molecular weights on a doubly logarithmic scale
(after F. Grün²²).


should increase continuously with molecular size as its 
motion involves increasingly long-range cooperation of 
the coil segments which rearrange around it. When the 
size reaches a magnitude such that long-range coupling 
affects its motion, there should be a large increase in ηₑ,
and eventually for macroscopic dimensions of the
moving particle ηₑ must approach η.

There are some fragmentary examples of a striking 
dependence of local effective viscosity on the size of a 
moving particle, derived from measurements of centrifu- 
gation and diffusion in matrices of flexible coils. Bushy 
stunt virus molecules sedimented in an approximately
0.1% solution of sodium DNA encountered an effective
viscosity only 20% higher than that of the solvent, in
studies by Schachman and Harrington, while much
larger polystyrene latex particles encountered a viscosity
higher by a factor of 30. A size dependence for the
effective viscosity of latex particles of different diameters
sedimenting through concentrated sodium-polyacrylate
solutions has been found by Ferry and Morton.²⁰
Diffusion of sucrose in concentrated aqueous solutions
of polyvinylpyrrolidone, studied by Nishijima and
Oster,²¹ showed that the local effective viscosity en-
countered by the sucrose was independent of the mo-


c compounds, with molecular 
macroscopic viscosity of water over the whole molecular-weight
range.)


The magnitudes of all relaxation times reflecting
configurational changes of the sort described by the
Rouse theory are proportional to ζ₀. With increasing
temperature, ζ₀ falls and the relaxation spectrum shifts
to shorter times without changing its shape; curves such
as those in Fig. 3 shift to higher frequencies without
changing their shapes. For biological systems, temperature
changes are probably much less important than
concentration changes. With a decrease in polymer
concentration, which might arise from osmotic flow, ζ₀
could drop sharply, speeding up all molecular rearrangements
and conceivably allowing stored mechanical
energy to be dissipated or to accomplish macroscopic 
movements. 
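
The statement that every relaxation time scales with the friction coefficient, so that the spectrum shifts without changing shape, can be illustrated numerically. The sketch below is a hedged illustration rather than a fitted model: it assumes relative relaxation times of the form τ_p ∝ ζ₀/p² (the 1/p² spacing is the usual long-mode limit of the Rouse theory and is an assumption here; the text states only the proportionality to ζ₀), and the 100-fold drop in ζ₀ is a hypothetical value.

```python
import math


def rouse_relaxation_times(zeta0, n_modes=4, prefactor=1.0):
    """Relative relaxation times tau_p = prefactor * zeta0 / p**2, p = 1..n_modes."""
    return [prefactor * zeta0 / p ** 2 for p in range(1, n_modes + 1)]


def log_frequency_shift(zeta0_before, zeta0_after):
    """Decades by which curves move toward higher frequency when zeta0 falls."""
    return math.log10(zeta0_before / zeta0_after)


before = rouse_relaxation_times(zeta0=1.0)
after = rouse_relaxation_times(zeta0=0.01)  # hypothetical 100-fold drop in friction

# Every mode is shortened by the same factor, so the spectrum keeps its shape.
ratios = [a / b for a, b in zip(after, before)]
shift = log_frequency_shift(1.0, 0.01)

print("tau(after)/tau(before) per mode:", ratios)
print("shift on the log-frequency axis:", shift, "decades")
```

Because each ratio is the same, a doubly logarithmic plot such as Fig. 3 simply slides along the frequency axis, here by two decades.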


V. LONG-RANGE COUPLING PHENOMENA 


The plateau zone in Fig. 3, observed in all high mo- 
lecular-weight polymers in concentrated solution or the 
undiluted state, represents qualitatively the behavior to 
be expected if the macromolecules were coupled together 
quite strongly, though slipping to some extent, at 
widely separated points.*4 The same concept of long- 
range coupling explains another widely observed phe- 
nomenon in the dependence of the steady-flow viscosity 
on molecular weight.*® A doubly logarithmic plot of 
viscosity against molecular weight or degree of poly- 
merization shows a sharply increased slope above a 
critical value of the abscissa, which in typical cases 
corresponds to M= 10 000 to 50 000 (Fig. 6). The long- 
range coupling cannot, in general, be attributed to 
secondary bonding at loci of attraction, because it 
appears in nonpolar polymers which are quite homo- 
geneous in chemical composition. It is apparently re- 
Fig. 6. Doubly logarithmic plots of viscosity against degree of
polymerization for several types of synthetic polymers, showing
proportionality to the 3.4 power of degree of polymerization or
molecular weight above a critical value (after Fox and Loshaek¹⁸).

lated to the expectation that a molecule, in its long-range
contour, will form a complete loop enclosing
similar loops of other molecules,’ but the topological 
requirements are not clear. Its influence increases with 
increasing molecular stiffness, as evidenced by com- 
paring synthetic vinyl polymers, cellulose derivatives, 
and DNA." 
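
To give a feeling for how steep the 3.4-power dependence is, the following sketch evaluates the viscosity ratio for two molecular weights that both lie above the critical value. It assumes only the power law itself; the factor-of-two and factor-of-ten comparisons are arbitrary illustrations, and no behavior below the critical molecular weight is modeled.

```python
def viscosity_ratio(m_low, m_high, exponent=3.4):
    """Ratio eta(m_high)/eta(m_low), assuming eta is proportional to M**exponent."""
    return (m_high / m_low) ** exponent


# Both molecular weights are taken to lie above the critical value.
doubling = viscosity_ratio(100_000, 200_000)
tenfold = viscosity_ratio(100_000, 1_000_000)

print(f"doubling M raises the viscosity about {doubling:.1f}-fold")
print(f"a tenfold increase in M raises it about {tenfold:.0f}-fold")
```

Doubling the molecular weight thus raises the steady-flow viscosity by roughly an order of magnitude, while the local friction coefficient is unaffected.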

The terminal relaxation time in a concentrated solu- 
tion is determined by the long-range coupling, and the 
extent to which motions of one molecule can influence 
others, through chains of mutual entanglement. Since 
the steady-flow viscosity is also dominated by the 
coupling, however, the order of magnitude of τ₁ can still
be estimated by the product Jη.

Quite different from the above topological kind of 
coupling are the stronger linkages which cause some 
concentrated polymer solutions to gel. Here bonds of 
considerable permanence are formed, and the resulting 
network has no finite viscosity. It has no equilibrium 
elasticity either; under constant stress it continues to 
deform indefinitely at a very small and steadily de- 
creasing rate. The spectrum of relaxation times extends 
indefinitely to long times, probably reflecting a mutual 
influence between molecular motions which extends 
throughout the system, communicated through the 
linkages. This is manifested, for low-frequency oscillating 
motions, by a very slight decrease in G’ with decreasing 
frequency together with an increase in η′ apparently
without limit, as illustrated in Fig. 7 for a gel of 
polyvinyl chloride.” In some gels, the linkages are 
probably very small crystalline regions, tying together 
handfuls of flexible strands,” growth of the crystallites 
being limited by some kind of structural heterogeneity 
along the macromolecular chain. In others, there may be 
local secondary bonding of specific chemical groups.?§ 
As an extreme case, primary chemical bonds may join 
the macromolecules into a network, as proposed for gels 
of denatured serum albumin”. and fibrin clots formed 
with fibrin-stabilizing factor.*'** Thus, while a netted 
structure is not essential to the storage of elastic energy 


Fig. 7. Frequency dependence of G′ and η′ for a 10% gel of
polyvinyl chloride in dimethyl thianthrene.¹⁷ (Ordinate: log G′ in dyne/cm².)
under stress, it plays an important role when it is
present. 

The nature of molecular responses to very slow ex- 
ternal motions in such networks, involving cooperation 
among widely separated network strands, is poorly 
understood. But if the strands between the linkage 
points are flexible macromolecular chains, the responses 
to rapid motions will be oblivious of the linkages and 
their rates will be governed by the local friction coeffi- 
cient just as in the examples discussed in Sec. IV. 

Changes in the consistency of protoplasm¹³ suggest
that within it network linkages are sometimes formed 
and sometimes broken. Such structural modifications 
should not affect responses to rapid motions but would 
enormously alter the relative storage and dissipation of 
energy in slow motions. Correspondingly, they should 
markedly influence frictional resistance to translation of 
large particles while leaving the resistance to small 
molecules essentially unchanged. 


BIBLIOGRAPHY 


1 J. G. Kirkwood and P. L. Auer, J. Chem. Phys. 19, 281 (1951).

2 P. E. Rouse, Jr., J. Chem. Phys. 21, 1272 (1953).

3 B. H. Zimm, J. Chem. Phys. 24, 269 (1956).

4 G. Oster, J. Gen. Physiol. 33, 445 (1950).

5 P. J. Flory, Proc. Roy. Soc. (London) A234, 73 (1956).

6 R. Cerf, Compt. rend. 234, 1549 (1952).

7 H. A. Scheraga, J. Chem. Phys. 23, 1526 (1955).

8 J. D. Ferry, M. L. Williams, and D. M. Stern, J. Phys. Chem.
58, 987 (1954).

9 M. A. Lauffer, J. Am. Chem. Soc. 66, 1188 (1944).

10 T. C. Laurent, J. Biol. Chem. 216, 263 (1955).

11 F. E. Helders, J. D. Ferry, H. Markovitz, and L. J. Zapas,
J. Phys. Chem. 60, 1575 (1956).

12 K. H. Meyer and J. Jeannerat, Helv. Chim. Acta 22, 22
(1939).

13 L. V. Heilbrunn, The Dynamics of Living Protoplasm (Academic
Press, Inc., New York, 1956).

14 J. D. Ferry and F. E. Helders, Biochim. et Biophys. Acta 23,
569 (1957).

15 J. D. Ferry in Die Physik der Hochpolymeren, H. A. Stuart,
editor (Springer-Verlag, Berlin, 1956), Vol. IV, Chap. VI.

16 J. D. Ferry and R. F. Landel, Kolloid-Z. 148, 1 (1956).

17 J. D. Ferry, D. J. Plazek, and G. E. Heckler, J. chim. phys.
55, 152 (1958).

18 T. G. Fox and S. Loshaek, J. Appl. Phys. 26, 1080 (1955).

19 H. K. Schachman and W. F. Harrington, J. Am. Chem. Soc.
74, 3965 (1952).

20 J. D. Ferry and S. D. Morton, unpublished experiments.

21 Y. Nishijima and G. Oster, J. Polymer Sci. 19, 337 (1956).

22 F. Grün, Experientia 3, 490 (1947).

23 W. Kuhn, Makromol. Chem. 6, 224 (1951).

24 J. R. McLoughlin and A. V. Tobolsky, J. Colloid Sci. 7, 555
(1952).

25 F. Bueche, J. Chem. Phys. 20, 1959 (1952).

26 […] J. Polymer Sci. 25, 243 (1957).

27 T. Alfrey, Jr., N. Wiederhorn, R. Stein, and A. V. Tobolsky,
J. Colloid Sci. 4, 211 (1949).

28 G. Stainsby, editor, Recent Advances in Gelatin and Glue
Research (Pergamon Press, London, 1958).

29 C. Huggins, D. R. Tapley, and E. V. Jensen, Nature 167, 592
(1951).

30 V. D. Hospelhorn, B. Cross, and E. V. Jensen, J. Am. Chem.
Soc. 76, 2827 (1954).

31 A. Loewy and J. T. Edsall, J. Biol. Chem. 211, 829 (1954).

32 L. Lorand and A. Jacobsen, J. Biol. Chem. 230, 421 (1958).


REVIEWS OF MODERN PHYSICS    VOLUME 31, NUMBER 1    JANUARY, 1959


Respiratory-Energy Transformation


ALBERT L. LEHNINGER 
Department of Physiological Chemistry, The Johns Hopkins School of Medicine, Baltimore 5, Maryland 


INTRODUCTION 


This article summarizes the broad outlines of
current knowledge on some enzymatic and bio- 
physical aspects of the conversion of respiratory energy 
into phosphate-bond energy. There are many problems 
for the biophysicist in this area which not only are 
pertinent to electron transport and respiratory-chain 
phosphorylation but also have wide applicability to 
more-general problems of ultrastructure and arrange- 
ment of enzymes in biologically organized ‘‘solid-state”’ 
arrays. More-detailed treatments of the subject can be 
found in review articles.¹⁻⁵


ENERGY CYCLE IN AEROBIC CELLS 


In all cells, the energy-requiring or endergonic func- 
tions are driven by chemical energy liberated during 
the energy-yielding or exergonic degradation of food- 
stuff molecules. The major energy-requiring functions 
or activities of cells include (a) biosynthesis of complex 
molecules from simple ones when such reactions proceed 
in an environment thermodynamically unfavorable for 
synthesis, as in the formation of glycogen from glucose 
in an aqueous system, (b) mechanical work, as in muscular
contraction, (c) active transport or accumulation of
substances against gradients of chemical potential,
(d) transmission or conduction phenomena, and (e)
bioluminescence. In aerobic cells, oxidative degradation
of foodstuffs is the source of chemical energy. In
strictly anaerobic cells, molecular oxygen does not
intervene but internal oxido-reductions of fermentative
or anaerobic metabolic cycles (such as alcoholic fermentation
of glucose) are the primary energy source to
drive endergonic functions.

The energy cycle in nature (as well as the carbon
cycle) is then completed by the conversion of radiant
energy of sunlight into chemical energy during photo- 
synthesis, making possible reduction of carbon dioxide 
back to the oxidation level of carbohydrate, in reactions 
discussed in the companion paper by Calvin (p. 147). 
Respiratory-energy transformation and photosynthesis 
thus constitute the two complementary mainsprings of
biological energy.


CONCEPT OF PHOSPHATE-BOND ENERGY


The general medium of energy exchange between
certain exergonic reactions of oxidative catabolism and 
the endergonic functions is the so-called “high-energy
phosphate bond” of ATP.

Phosphate esters of cells can be classed into two 
major groups on the basis of the magnitude of the 
standard free energy of hydrolysis of the phosphate 
bond.*-§ Simple phosphate esters, such as 3-phospho- 
glyceric acid, glucose-6-phosphoric acid, and a-glycero- 
phosphoric acid, etc., undergo hydrolysis enzymatically 
or by acid or base catalysis, often at widely different 
rates, with free energies of hydrolysis (ΔF°) of −0.8 to
−3.0 kcal/mole at 25° and pH 7. These are the so-called
“low-energy phosphate compounds.” On the
other hand, enol phosphate esters (phosphopyruvic 
acid), guanidine phosphates (phosphocreatine), acyl 
phosphates (1,3-diphosphoglyceric acid), and pyro- 
phosphates (ATP) are in the high-energy class and 
have a ΔF° for hydrolysis of from 6 to 15 kcal/mole
under standard conditions. This division is arbitrary
and exact values for these esters are now in a state of
flux because ΔF° values for hydrolysis of ATP are
currently undergoing substantial revision.’ 

The free energy of hydrolysis represents the difference 
in energy content of reactants and hydrolysis products 
under standard conditions. It is not equivalent, of
course, to the true bond energy or the zero-point energy; 
actually, input of energy is required to break chemical 
bonds. It must be made clear also that there is no 
particular biological magic involved in the difference 
between high- and low-energy compounds. Actually, 
the major reason for the relatively high free energy of 
hydrolysis of ATP and other high-energy compounds 
is that the immediate hydrolysis products of these 
compounds undergo secondary, spontaneous reactions
with high equilibrium constants, leading to thermodynamically
more-stable forms, either through resonance
stabilization or tautomeric rearrangements. A
second contribution to the high free energy of hydroly- 
sis, in the case of ATP and phosphopyruvate, for ex- 
ample, is the separation and randomization of the 
neighboring like charges in these molecules at physio- 
logical pH. The structural, resonance, and charge con- 
tributions to the high-energy nature of these compounds 
have been discussed by Hill and Morales. Classically, 
the free energy of hydrolysis of ATP has been assumed 
to have the value 12 kcal/mole, on the basis of a calorimetrically
observed ΔH of approximately this figure and
the assumption of only a small entropy change. How- 
ever, recent work on the enzymatic equilibria of ATP 
and glutamate and the highly refined heat measure- 
ments by the important new method of Benzinger™ 
have revealed that the standard free-energy change at 
1M reactants corrected for ionization changes at pH
7.5 is about −8.4 kcal, and that the true ΔH for hydrolysis
is about −4.8 kcal (see Burton for most recent
evaluation). The value of the free energy of the hydrolysis
of the ATP bond (−8.4 kcal) now seems to be
a valid one; it is thus substantially lower than the
classical value of 12 kcal. It must be pointed out, however,
that, under physiological conditions of pH, and intracellular
concentration ranges of phosphate, ADP and
ATP, the recalculated ΔF′ of ATP is about −12.5
kcal/mole or approximately the classical textbook
value.’ There is room for a great deal of careful work 
on the measurement of biochemically useful thermo- 
dynamic factors through refined heat measurements, 
enzymatic equilibria, and calculation of free energies of 
formation)?" The values given above for ATP may be 
altered further when the hydrolysis is considered in a 
physiological milieu because of the lack of information 
on activity coefficients of the various ionic species of 
the reaction components—in particular, the Mg com- 
plexes of ATP and ADP in the presence of the high 
ionic strength background of intracellular fluid. Also, 
it must be pointed out that free-energy data for ATP 
obviously are reckoned on a macroscopic, statistical 
basis. In the intact cell, ATP-linked reactions occur with 
reactants bound to proteins (as in actomyosin) under 
nonequilibrium conditions where energy changes and 
transfers at the microscopic level dictate direction and 
rate of reactions. 
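
The difference between the standard value of about −8.4 kcal and the value of about −12.5 kcal/mole under intracellular conditions follows from the concentration term in the free-energy expression, ΔF′ = ΔF° + RT ln([ADP][P]/[ATP]). The sketch below illustrates this under loudly stated assumptions: the concentrations (1 mM ADP, 10 mM phosphate, 10 mM ATP) and the temperature (37°) are hypothetical round numbers chosen for illustration, not values from the text, and activity coefficients and Mg complexes are ignored.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal per mole per degree


def free_energy_of_hydrolysis(delta_f_std_kcal, adp_molar, p_molar, atp_molar,
                              temp_k=310.15):
    """Delta F' = Delta F0 + R*T*ln([ADP][P]/[ATP]) for ATP + H2O -> ADP + P."""
    mass_action_ratio = adp_molar * p_molar / atp_molar
    return delta_f_std_kcal + R_KCAL * temp_k * math.log(mass_action_ratio)


# Hypothetical intracellular concentrations (assumptions, see lead-in).
delta_f_cell = free_energy_of_hydrolysis(
    delta_f_std_kcal=-8.4, adp_molar=1.0e-3, p_molar=1.0e-2, atp_molar=1.0e-2
)

print(f"Delta F' under the assumed conditions: {delta_f_cell:.1f} kcal/mole")
```

With these assumed concentrations the concentration term contributes roughly −4 kcal, which brings the result close to the classical figure; other plausible concentration sets would shift it by a kilocalorie or so.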

There may be good reasons for the evolutionary 
choice of phosphoric acid, rather than sulfuric acid, 
HCl, or a carboxylic acid, as the vehicle for biological 
energy transfer through enzymatic group-transfer re- 
actions. In the first place, phosphate anion has itself a 
rather high resonance energy; however, this property 
is not unique to phosphate. Perhaps it is more important 
that its anhydrides and amides are kinetically (as 
opposed to thermodynamically) much more stable than 
esters or amides of HCl, H₂SO₄, or carboxylic acids.
Thus, acetic anhydride in H₂O at pH 7.0 decomposes
rapidly, whereas the anhydride pyrophosphoric acid at 
the same pH has a much greater half-life despite the 
fact that the free energy of hydrolysis of the two com- 
pounds is of the same order of magnitude. Phosphate 
anhydrides and amides may have been selected bio- 
logically not only for their resonance characteristics but 
also for their property of remaining relatively unreac- 
tive in an aqueous solution in the absence of suitable 
directing enzymes; less-stable derivatives would tend 
to react spontaneously and thus not be subject to 
enzymatic control. 


COMMON INTERMEDIATE PRINCIPLE IN
UTILIZATION OF ATP ENERGY


Apart from certain photochemical and fluorescence
phenomena, there is, in general, only one way in which
energy can be transferred from one chemical reaction to
another. The two reactions must be consecutive and
must have a common intermediate.

The synthesis of sucrose provides a classical biochem- 
ical example.“ The reaction 


glucose + fructose ⇌ sucrose + H₂O   (ΔF = +5000 cal)   (1)


is strongly endergonic. Sugar-cane juice is 0.5M in 
sucrose. The concentrations of free glucose and fructose 
would have to be enormously greater than that of 
sucrose, and the water concentration low, in order for 
the reaction to proceed spontaneously in the sugar cane. 
This is not the case and sucrose is not made in this way. 
Enzyme studies show that it is formed at the expense 
of the hydrolysis of ATP in a coupled reaction. The 
driving reaction is the strongly exergonic hydrolysis of
ATP:


ATP + H₂O → ADP + P   (ΔF = −12 000 cal).   (2)


Reactions (1) and (2) are the partial thermodynamic 
reactions in sucrose synthesis. The actual enzymatic 
mechanism is the following: 


ATP + glucose → → → glucose-1-P + ADP   (ΔF = −7000 cal)   (3)

glucose-1-P + fructose → sucrose + H₃PO₄   (ΔF ≈ 0 cal).   (4)


The over-all ΔF is −7000 cal and synthesis of sucrose
proceeds spontaneously in the presence of the enzymes.
In this sequence, glucose has been raised to the energy
level of a glycoside by phosphorylation; glucose-1-P
is now the common intermediate of two consecutive 
reactions which together have a negative net free- 
energy change, but in which a product of the first, 
exergonic, reaction has sufficient “group potential” in 
the form of the glycosyl phosphate to drive the second, 
endergonic, reaction to form sucrose. 
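
The energetics of the sucrose example can be made concrete with a short calculation. The sketch below assumes 25° and the common convention that water is at unit activity (both assumptions, not statements from the text); it converts the free-energy changes quoted for reactions (1) and (2) into equilibrium constants and estimates how concentrated glucose and fructose would have to be for direct synthesis to hold sucrose at 0.5M.

```python
import math

R_CAL = 1.987  # gas constant in cal per mole per degree


def equilibrium_constant(delta_f_cal, temp_k=298.15):
    """K = exp(-Delta F / RT) for a standard free-energy change in calories."""
    return math.exp(-delta_f_cal / (R_CAL * temp_k))


# Reaction (1): direct condensation, Delta F = +5000 cal.
k_direct = equilibrium_constant(5000.0)

# Product [glucose][fructose] needed to hold sucrose at 0.5 M (water at unit activity),
# and the concentration of each sugar if the two were equal.
needed_product = 0.5 / k_direct
needed_each = math.sqrt(needed_product)

# Coupling to ATP hydrolysis, reaction (2): the free-energy changes add.
net_delta_f = 5000.0 + (-12000.0)
k_coupled = equilibrium_constant(net_delta_f)

print(f"K for direct synthesis      ~ {k_direct:.1e}")
print(f"glucose = fructose required ~ {needed_each:.0f} M each")
print(f"net Delta F with ATP        = {net_delta_f:.0f} cal")
print(f"K for the coupled sequence  ~ {k_coupled:.1e}")
```

The required sugar concentrations are far beyond anything physically attainable, which is the quantitative content of the remark that sucrose is not made in this way, whereas the coupled sequence has a large equilibrium constant in favor of synthesis.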

In principle, this pattern underlies all endergonic 
biosynthetic reactions, such as formation of glycogen 
from glucose, proteins from amino acids, and complex 
lipids from fatty acids. ATP and an intervening phos- 
phate-group transfer reaction represent the mode of 
energy transfer. 


COMMON INTERMEDIATE PRINCIPLE IN BIOLOGICAL
CONVERSION OF OXIDATIVE ENERGY INTO
PHOSPHATE-BOND ENERGY


From the foregoing, the breakdown of ATP to ADP 
and P is necessary to drive endergonic functions. The 
ATP then is regenerated continuously from ADP and 
P at the expense of catabolic reactions, primarily those
of oxidation-reduction, in order to complete
the energy cycle. 

The regeneration of ATP from ADP and P is very 
largely a matter of the transfer of the energy liberated 
in certain highly exergonic oxidations of metabolism to
a reaction resynthesizing ATP from ADP and P. Such
respiratory energy conversion also occurs by the principle
of the common intermediate.

The best-understood example for this discussion is 
the enzymatic oxidation of glyceraldehyde-3-phosphate
to 3-phosphoglyceric acid by DPN which occurs
in the anaerobic breakdown of glucose to lactate, to 
which is coupled the synthesis of ATP from ADP and 
phosphate. The over-all reaction first can be broken 
down into two partial thermodynamic reactions. The 
first, the oxidation of an aldehyde to a carboxylic acid, 
is highly exergonic, and the second, formation of ATP, 
highly endergonic. 


R–CHO + DPN_ox ⇌ R–COO⁻ + DPN_red + H⁺   (ΔF = −16 kcal)   (5)


P + ADP → ATP + H₂O   (ΔF = +12 kcal).   (6)


The energy liberated in the oxidation of an aldehyde to 
a carboxylic acid is recovered as ATP, with energy left 
to spare, the net driving force being some 4 kcal, as is 
shown in the following over-all reaction: 


glyceraldehyde-3-phosphate + Pᵢ + DPN_ox ⇌ 1,3-diphosphoglycerate + DPN_red + H⁺,   (7)

1,3-diphosphoglycerate + ADP ⇌ 3-phosphoglycerate + ATP.   (8)

It is seen that the oxido-reduction of reaction (7) results
in formation of 1,3-diphosphoglycerate, a high-energy
mixed anhydride of a carboxylic acid and
phosphoric acid. This is the intermediate which is

[Fig. 1. Scheme of anaerobic glycolysis. Recoverable labels: glycogen ⇌ glucose-1-PO₄; ATP → glucose-6-PO₄; fructose-6-PO₄; ATP → fructose-1,6-di-PO₄; 3-glyceraldehyde-PO₄ + dihydroxyacetone-PO₄; 1,3-diphosphoglycerate.]

common in the two reactions and acts as a high-energy
phosphate donor in the formation of ATP [reaction
(8)]. Although this reaction is well understood, it
accounts for only a very small fraction of ATP synthesis 
in aerobic cells, but it is an extremely important model 
of how the great bulk of ATP may be formed during 
respiration. 


RELATIVE CONTRIBUTION OF FERMENTATIVE 
AND OXIDATIVE REACTIONS TO 
ATP SYNTHESIS 


Figures 1 and 2 show the outlines of the mechanism 
of anaerobic glycolysis and the Krebs citric-acid cycle, 
the main aerobic phase of carbohydrate oxidation. 
However, the amount of energy liberated from a glucose
molecule during glycolysis to lactate (ΔF = −49 700
cal) represents a few percent only of the total energy
liberated by complete oxidation of glucose to carbon
dioxide and water (ΔF = −686 000 cal/mole). However,
this small yield of energy in the anaerobic phase is 
quite efficiently recovered by the two phosphorylations 
occurring in the cycle, and from the following equations 
it is seen that some 50% of the energy liberated when 
2 moles of lactate are formed from 1 of glucose is 
recovered in the form of the phosphate-bond energy of 
two moles of ATP. 


Over-all reaction: 


glucose + 2ADP + 2P → 2 lactate + 2ATP   (ΔF = −25 700 cal).

Partial reactions:


(a) glucose → 2 lactate   (ΔF = −49 700 cal)
(b) 2ADP + 2P → 2ATP   (ΔF = +24 000 cal).
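
The bookkeeping behind the figures "a few percent" and "some 50%" is simple arithmetic on the free-energy values just quoted. A minimal sketch, using only those values and the 12 000 cal per mole of ATP assumed throughout this section:

```python
# Free-energy values quoted in the text, in calories per mole of glucose.
GLYCOLYSIS_RELEASED = 49_700.0       # glucose -> 2 lactate
COMPLETE_OXIDATION = 686_000.0       # glucose -> carbon dioxide and water
ATP_BOND = 12_000.0                  # per mole of ATP formed

conserved = 2 * ATP_BOND                             # two moles of ATP per glucose
net_delta_f = -GLYCOLYSIS_RELEASED + conserved       # over-all reaction
efficiency = conserved / GLYCOLYSIS_RELEASED         # fraction recovered as ATP
share_of_total = GLYCOLYSIS_RELEASED / COMPLETE_OXIDATION

print(f"net Delta F of the over-all reaction: {net_delta_f:.0f} cal")
print(f"recovery as ATP during glycolysis:    {100 * efficiency:.0f}%")
print(f"glycolysis as a share of oxidation:   {100 * share_of_total:.1f}%")
```

The recovery works out to about 48%, and glycolysis accounts for roughly 7% of the energy available from complete oxidation.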


However, the real energetic mainspring of aerobic 
cells, regenerating 90% or more of the ATP required, 
is the oxidative phase of catabolism. 

Despite the absence of phosphorylated intermediates 
in the Krebs cycle as it is depicted in Fig. 2, it is known 
now that a very large number of phosphorylations of 
ADP accompany such aerobic oxidations, approxi- 
mately 36 per mole of glucose oxidized. In addition, 
the efficiency of energy recovery during this aerobic 
phase is even higher than in the anaerobic phase, 
approaching some 70%. 


OXIDATIVE PHOSPHORYLATION (SYNONYMS: 
AEROBIC PHOSPHORYLATION, RESPIRA- 
TORY-CHAIN PHOSPHORYLATION) 


This term denotes the coupled phosphorylations of 
ADP which occur simultaneously with certain organ- 
ized oxido-reduction reactions in the aerobic phase of 
metabolism. The biological significance of phosphoryla- 
tions causally coupled to oxidations with molecular 
oxygen first was recognized by Kalckar in 1937, but 


the quantitative importance of these in cellular metab- 
Fig. 2. The Krebs tricarboxylic-acid cycle.


followed during and after World War II by a number of 
independent and confirmatory studies by Ochoa, Cori, 
Hunter, and others (cf. references 4, 2, 3). These 
studies clearly demonstrated that, for the passage of 
every pair of electrons from a substrate of the Krebs 
citric-acid cycle to molecular oxygen during aerobic 
metabolism, an average of three moles of ATP were 
synthesized from ADP and Pᵢ; i.e., the P:O ratio
(moles Pᵢ taken up per atom oxygen used) was 3.0.
Thus, the over-all equation for the oxidation of pyruvate 
to carbon dioxide and water as it occurs in the Krebs 
cycle was found to be 


CH₃COCOOH + 15P + 15ADP + 5O → 3CO₂ + 15ATP + 2H₂O.   (9)

This reaction may be broken down as follows:

CH₃COCOOH + 5O → 3CO₂ + 2H₂O   (ΔF = −273 kcal),   (10)

15P + 15ADP → 15ATP   (ΔF = +12 × 15 = +180 kcal),   (11)

efficiency = 180 × 100/273 = 67%.
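
The stoichiometry of reaction (9) follows directly from the P:O ratio, and the efficiency from the two partial reactions. The short sketch below retraces that arithmetic with the values given in the text (five oxygen atoms per pyruvate, a P:O ratio of 3.0, 12 kcal per mole of ATP, and 273 kcal for the oxidation); nothing else is assumed.

```python
OXYGEN_ATOMS_PER_PYRUVATE = 5     # from reaction (10)
P_TO_O_RATIO = 3.0                # moles of phosphate taken up per atom of oxygen
ATP_BOND_KCAL = 12.0              # free energy stored per mole of ATP
OXIDATION_KCAL = 273.0            # free energy released by oxidation of pyruvate

atp_formed = OXYGEN_ATOMS_PER_PYRUVATE * P_TO_O_RATIO
energy_stored = atp_formed * ATP_BOND_KCAL
efficiency = energy_stored / OXIDATION_KCAL

print(f"ATP formed per pyruvate: {atp_formed:.0f}")
print(f"energy stored:           {energy_stored:.0f} kcal")
print(f"efficiency:              {100 * efficiency:.1f}%")
```

Evaluated without rounding, the ratio 180/273 is close to 66%, in line with the figure of about two-thirds given above.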


GENERAL PROPERTIES OF OXIDATIVE 
PHOSPHORYLATION 


Oxidative phosphorylation classically has been rec- 
ognized as an unstable and evanescent phenomenon 
which occurs only with relatively “native” and un- 
fractionated extracts or homogenates of tissues. Aging 
of such extracts, or application of some of the common 
procedures for separation or purification of enzymes, 
quickly inactivated the coupled phosphorylation of 
ADP, whereas the purely oxidative activity of Krebs-cycle
reactions remained rather stable. The findings
thus suggested that coupled phosphorylation of ADP
is not an obligatory accompaniment of oxidation, that 
it may be inactivated or uncoupled. The great lability 
of the coupling mechanisms to even relatively mild 
separation procedures has been a major technical ob- 
stacle to a fuller understanding of the mechanism of 
oxidative phosphorylation. 

The great lability just described is explained in part 
by the fact that oxidative phosphorylation, along with 
the organized Krebs-cycle reactions and fatty-acid 
oxidation cycle, has been found to occur in rather 
highly organized subcellular structures, the mitochon- 
dria, as was first shown by Kennedy and Lehninger, 
and independently by Schneider (cf. Ernster and Lind- 
berg!®), a finding which very quickly was confirmed and 
extended in many other laboratories." Mitochondria 
isolated by the well-known sucrose procedure have 
become the standard study object in this field. It is 
important that mitochondria from many different cell 
types—plant, animal and microbial—show the respira- 
tory-chain phosphorylation, thus providing a sound 
enzymatic basis for the suggestion by Claude that 
mitochondria are the power plants of the cell.

The mitochondria are almost completely self-con- 
tained units capable of carrying out both respiration and 
phosphorylation. It is necessary only to supplement a
suspension of mitochondria with a substrate such as
pyruvate, a trace amount of a 4-carbon catalyst such as
fumarate or oxalacetate as substrates or fuel for the cycle,
inorganic phosphate and ADP, and also Mg⁺⁺, which
helps to preserve mitochondrial structure. All of the
necessary coenzymes and other co-factors are contained
in the mitochondrial structure in such an organized
manner that the complex cycle oxidations are smooth
and complete, with no obvious or conspicuous bottlenecks,
indicating the high degree of enzymatic and
morphological organization in the mitochondria.

The studies on oxidative phosphorylation have re- 
vealed, however, that relatively intact mitochondrial 
structure is necessary; simple procedures, such as 
freezing and thawing, treatment with detergents, ex- 
posure to hypotonic or strongly hypertonic agents, or 
solution, all cause uncoupling of the phosphorylation.‘ 

Another important property of oxidative phosphory- 
lation is that certain inhibitors in relatively minute 
concentrations are capable of uncoupling phosphoryla- 
tion; that is, in their presence, no phosphorylation of 
ADP takes place, but oxidation proceeds nearly nor- 
mally. These uncoupling agents include 2,4-dinitro- 
phenol, as well as other nitro- and halo-phenols, the 
antibiotic gramicidin, and the anticoagulant Dicumarol, 
in addition to a number of oxidation-reduction dyes 
such as methylene blue. These agents have become 
extremely important tools for the elucidation of the 
character and sequence of the intermediate reactions 
of oxidative phosphorylation.!® 


LOCALIZATION OF PHOSPHORYLATION SITES 
IN RESPIRATORY CHAIN! 


To explain the occurrence of these oxidative phosphorylations, it was
suggested at first that the intermediates of the Krebs cycle actually
might occur in the form of phosphorylated derivatives rather than the
free acids, analogous to the glycolytic reactions. Such phosphorylated
intermediates never have been found. In 1940, however, Belitser
postulated that the aerobic phosphorylations did not occur actually in
the Krebs-cycle reactions proper, but rather during the phenomenon of
electron transport from primary dehydrogenase to molecular oxygen via
the known electron carriers making up the respiratory chain, such as
diphosphopyridine nucleotide, flavoprotein, and the cytochromes.
Belitser pointed out that the ΔF for the transfer of a pair of electrons
from reduced diphosphopyridine nucleotide to molecular oxygen is
approximately −55 000 cal/mole. Since formation of one mole of ATP
requires the input of 12 000 cal, it is evident that rather more than
4 moles of ATP theoretically could be generated during transport of each
pair of electrons to oxygen, provided suitable enzymatic coupling
mechanisms existed. As logical as the Belitser hypothesis appeared,
experimental proof for this pattern was difficult to muster, and it was
not until 1949-1951 that direct proof was adduced. Highly purified,
chemically reduced diphosphopyridine nucleotide was used as the actual
substrate for the oxidation, in suspensions of rat-liver mitochondria
which had been pretreated in a hypotonic [...] to produce sufficient
alteration in permeability of the mitochondrial membrane so as to admit
the reduced nucleotide to the oxidative centers of the mitochondria.
Oxidation of reduced DPN [...] components of the respiratory chain,
[...] cytochromes, culminated in oxygen uptake and coupled uptake of
inorganic phosphate and the formation of ATP, with P:O ratios
approaching the value 3.0. Oxidative phosphorylation thus occurs along
the respiratory chain and may proceed in the absence of Krebs-cycle
intermediates.
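
The arithmetic behind Belitser's argument, and the way a P:O ratio is obtained from paired measurements, can be sketched in a few lines. This is only an illustration: the two energy figures are those quoted in the text, whereas the function names and the sample readings (14.5 micromoles of phosphate, 5.0 microatoms of oxygen) are hypothetical and are not data from the experiments described.

```python
# Belitser's thermodynamic ceiling and an illustrative P:O ratio.
DELTA_F_OXIDATION_CAL = 55_000  # cal released per electron pair, reduced DPN -> O2 (text value)
DELTA_F_ATP_CAL = 12_000        # cal required per mole of ATP formed (text value)


def max_atp_per_electron_pair(released_cal: float, cost_cal: float) -> float:
    """Upper bound on moles of ATP per pair of electrons, from energy alone."""
    return released_cal / cost_cal


def p_to_o_ratio(phosphate_uptake_umoles: float, oxygen_uptake_uatoms: float) -> float:
    """Micromoles of inorganic phosphate taken up per microatom of oxygen consumed."""
    if oxygen_uptake_uatoms <= 0:
        raise ValueError("oxygen uptake must be positive")
    return phosphate_uptake_umoles / oxygen_uptake_uatoms


ceiling = max_atp_per_electron_pair(DELTA_F_OXIDATION_CAL, DELTA_F_ATP_CAL)

# Hypothetical readings, chosen only to illustrate the calculation.
observed = p_to_o_ratio(phosphate_uptake_umoles=14.5, oxygen_uptake_uatoms=5.0)

print(f"thermodynamic ceiling: {ceiling:.2f} ATP per electron pair")
print(f"illustrative P:O ratio: {observed:.2f}")
```

The ceiling of about 4.6 is the "rather more than 4 moles" of the text; an observed ratio near 3 lies below it, as any real coupling mechanism must.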

Similar experiments with pure cytochrome-c, used as 
either electron donor or acceptor, make possible further 
localization of the phosphorylations along the respira- 
tory chain. Thus, passage of electrons from reduced 
DPN to cytochrome-c caused the formation of two 
moles of ATP.* Also, despite earlier experiments of 
Slater to the contrary, it finally was possible to demon- 
strate conclusively that the passage of a pair of elec- 
trons from cytochrome-c to oxygen yields a single 
phosphorylation. More recent spectroscopic investigations of Chance
have confirmed these gross locations of phosphorylation sites.1
Thermodynamic considerations of the known oxidation-reduction
potentials of the carriers (Fig. 3), together with enzymatic and
spectroscopic methods, suggest that one of the phosphorylations occurs
between DPN and flavoprotein, the second between cytochrome-b and
cytochrome-c. The site of the third phosphorylation is less certain:
Chance suggests it lies between cytochrome-c and -a,1 but other
considerations indicate that it occurs between cytochrome-a and -a3.
These findings are summarized as follows:


Phosphorylation sites:

        1                      2           3
    DPN → flavoprotein → cyt b → cyt c → cyt a → cyt a3 → O2
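
The thermodynamic reasoning used to place the sites can be illustrated by computing the free-energy drop over each span of the chain from ΔF = −nfΔE, with n = 2 electrons and f taken as 23 060 cal per volt. The potentials below are rounded values assumed here purely for illustration; they are not read from Fig. 3, and cytochrome-a3 is folded into the last span because no separate potential is assumed for it.

```python
# Free-energy drop per span of the respiratory chain, from dF = -n * f * dE.
FARADAY_CAL_PER_VOLT = 23_060  # cal per volt per mole of electrons
N_ELECTRONS = 2                # electrons transferred per pair
ATP_COST_CAL = 12_000          # cal per mole of ATP (value used in the text)

# ASSUMED, rounded oxidation-reduction potentials in volts at pH 7.
# Illustrative only; not taken from Fig. 3.
POTENTIALS = [
    ("DPN", -0.32),
    ("flavoprotein", -0.05),
    ("cyt b", -0.04),
    ("cyt c", 0.26),
    ("cyt a", 0.29),
    ("O2", 0.82),
]


def span_free_energy(e_donor: float, e_acceptor: float) -> float:
    """Free-energy change (cal) for passing one electron pair down a span."""
    return -N_ELECTRONS * FARADAY_CAL_PER_VOLT * (e_acceptor - e_donor)


spans = []
for (donor, e_d), (acceptor, e_a) in zip(POTENTIALS, POTENTIALS[1:]):
    delta_f = span_free_energy(e_d, e_a)
    # A span can support one phosphorylation only if it releases at least 12 000 cal.
    spans.append((donor, acceptor, delta_f, -delta_f >= ATP_COST_CAL))

total = sum(delta_f for _, _, delta_f, _ in spans)

for donor, acceptor, delta_f, enough in spans:
    label = "sufficient for one ATP" if enough else "too small for one ATP"
    print(f"{donor:>12} -> {acceptor:<12} {delta_f:10.0f} cal  ({label})")
print(f"{'whole chain':>28} {total:10.0f} cal")
```

With these assumed potentials, only three spans release enough energy for a phosphorylation, and the step from cytochrome-c to cytochrome-a is not one of them. The whole-chain total comes out somewhat below the 55 000 cal quoted in the text, which reflects only the rounding of the assumed potentials.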


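The over-all reaction of the respiratory chain is thus

$$\mathrm{DPN_{red} + O + 3P + 3ADP \rightarrow DPN_{ox} + H_2O + 3ATP}, \tag{12}$$

and may be considered as the partial reactions

$$\mathrm{DPN_{red} + O \rightarrow DPN_{ox} + H_2O} \qquad (\Delta F = -55\,000\ \mathrm{cal}) \tag{13}$$

$$\mathrm{3P + 3ADP \rightarrow 3ATP + H_2O} \qquad (\Delta F = +36\,000\ \mathrm{cal}). \tag{14}$$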


From these results, it can be seen at once that the 
multimembered respiratory chain must represent a 
device for breaking up the large chunks of oxidative 
energy into smaller, biologically useful packets of ap- 
proximately 12 kcal to accommodate the dimensions 
of the energetic currency of the cell—namely, the free 
energy of formation of ATP from ADP and phosphate. 
If all 55 kcal were released in one chemical-reaction
step, even if coupled to a phosphorylation mechanism, 
only one ATP could be generated in this fashion. 
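
The advantage of releasing the energy in packets can be put in numbers. The sketch below uses only the figures given in the text (55 000 cal released, 12 000 cal per ATP, three ATP formed); the function name is introduced here for illustration.

```python
# Fraction of the oxidative free energy conserved as ATP.
OXIDATION_CAL = 55_000  # cal released per pair of electrons (text value)
ATP_CAL = 12_000        # cal conserved per mole of ATP formed (text value)


def trapping_efficiency(n_atp: int, atp_cal: float, released_cal: float) -> float:
    """Fraction of the released free energy recovered as ATP."""
    return n_atp * atp_cal / released_cal


# Respiratory chain as observed: three phosphorylations per electron pair.
stepwise = trapping_efficiency(3, ATP_CAL, OXIDATION_CAL)

# Hypothetical single-step oxidation coupled to one phosphorylation.
single_step = trapping_efficiency(1, ATP_CAL, OXIDATION_CAL)

print(f"three packets: {stepwise:.0%} of the energy conserved")
print(f"single step:   {single_step:.0%} of the energy conserved")
```

On these figures the stepwise chain conserves roughly two-thirds of the available energy, whereas a single-step release would conserve about one-fifth.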


MECHANISM OF RESPIRATORY-CHAIN 
PHOSPHORYLATION 


FIG. 3. The thermodynamic relationships in the respiratory chain.
Electrons normally flow from substrates at the left to oxygen at the
right. The approximate oxidation-reduction potentials (pH=7.0) of the
carriers are shown on the scale. The free energy of ... as is shown in
the yardstick, from the relation ΔF=nfΔE [from A. L. Lehninger in Harvey
Lectures (Academic Press, Inc., New ...

The mechanism of oxidative phosphorylation, in enzymatic and chemical
terms, remains one of the most conspicuous and challenging unsolved
problems of contemporary biochemistry. The immediate situation is that
the classical approach of separation and isolation of the enzymes making
up a complex metabolic sequence, followed by in vitro reconstruction of
the enzyme system, simply has not been possible to date, not only
because the process is very labile but also because there is a
requirement for geometrical organization of the many enzymes concerned
in the characteristic morphology of the mitochondria. In addition, there
is the fundamental dilemma of how a set of respiratory carriers of even
moderate substrate specificity can catalyze both nonphosphorylating and
phosphorylating electron transport, particularly since highly purified
respiratory carriers, such as cytochrome-c, DPN, and cytochrome-c
reductase, show absolutely no ability to phosphorylate when tested in
the test tube [cf. reference 3].
To facilitate discussion of the problem of the mechanism of oxidative
phosphorylation and some recent successes in unraveling part of the
coupling mechanism, it is best to consider first the general mechanisms
which have been proposed to account for energy coupling in the
respiratory chain, which again make use of the principle of the common
intermediate described below. The following mechanism is one we have
proposed and for which considerable evidence has accumulated. It
incorporates ideas originally proposed by Lipmann. Similar mechanisms
have been discussed since by Lardy, Hunter, Slater, Chance, and others
(see references 2, 1, 23). In skeleton form, it is


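$$\mathrm{carrier_{red} + X + oxidant \rightarrow carrier_{ox}{\sim}X + reductant}, \tag{15}$$

$$\mathrm{carrier_{ox}{\sim}X + P_i \rightleftharpoons carrier + P{\sim}X}, \tag{16}$$

$$\mathrm{P{\sim}X + ADP \rightleftharpoons ATP + X}. \tag{17}$$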


In this scheme, the substance X is the vehicle of energy coupling and
forms a high-energy compound with the carrier during the course of
transfer of electrons from the reduced carrier to the next in the chain
(designated as “oxidant”). Carrier(ox)~X then undergoes phosphorolysis
to form P~X (a high-energy compound, presumably a phosphoenzyme) which
in turn can donate its phosphate to ADP. Thus, carrier~X is a common
intermediate shared by the exergonic oxido-reduction of reaction (15)
and by the endergonic reaction leading to formation of ATP in reactions
(16) and (17). The energy liberated in the oxido-reduction thus is
utilized to cause the formation of ATP. It is possible that phosphate
interacts with the carrier directly to form a high-energy
carrier~phosphate compound; however, for a number of reasons, it appears
more likely that some compound other than phosphate is the first
reactant with the carrier in the coupling mechanism. This hypothesis is
written in skeleton form to indicate in the simplest way the principle
of the common intermediate. It is possible and likely, however, that
there are additional intermediate reactions between reactions (15) and
(16) and also between (16) and (17).
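
The principle of the common intermediate can be checked by simple bookkeeping: when the three skeleton reactions numbered (15) to (17) are added, X and the two intermediates cancel, leaving only the oxido-reduction and the phosphorylation of ADP. The sketch below does this sum. The species labels are introduced here for illustration, and the plain "carrier" produced in reaction (16) is assumed to be the oxidized carrier.

```python
from collections import defaultdict

# Stoichiometry of the skeleton reactions: negative = consumed, positive = formed.
# ASSUMPTION: the "carrier" released in reaction (16) is the oxidized carrier.
REACTIONS = {
    15: {"carrier_red": -1, "X": -1, "oxidant": -1, "carrier_ox~X": +1, "reductant": +1},
    16: {"carrier_ox~X": -1, "Pi": -1, "carrier_ox": +1, "P~X": +1},
    17: {"P~X": -1, "ADP": -1, "ATP": +1, "X": +1},
}


def net_reaction(reactions):
    """Add the reactions and drop every species whose coefficients cancel."""
    totals = defaultdict(int)
    for stoichiometry in reactions:
        for species, coefficient in stoichiometry.items():
            totals[species] += coefficient
    return {species: c for species, c in totals.items() if c != 0}


net = net_reaction(REACTIONS.values())

consumed = sorted(s for s, c in net.items() if c < 0)
formed = sorted(s for s, c in net.items() if c > 0)
print(" + ".join(consumed), "->", " + ".join(formed))
```

X, carrier(ox)~X, and P~X do not appear in the sum; they act only as carriers of the energy from the oxido-reduction to ATP.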

This sequence now may be visualized as occurring with three different
carrier pairs in the respiratory chain, as is shown in the diagram
(Fig. 4), to account for the three different phosphorylations occurring
along the chain.

FIG. 4. A diagrammatic conception of the phosphorylation-coupled
respiratory chain. This diagram is based on the coupling mechanism
described in the text and assumes that the coupled carriers are DPN,
cytochrome-b, and cytochrome-a [from A. L. Lehninger, C. L. Wadkins,
C. Cooper, T. M. Devlin, and J. L. Gamble, Jr., Science 128, 450
(1958)].

The postulated skeleton-reaction mechanism indicates two possible
approaches to unraveling the mechanism of energy coupling in oxidative
phosphorylation. The first and seemingly more direct approach is to
carry out investigations to isolate and to identify the molecular
species of the respiratory carrier which exists in the energy-charged
form; in short, the structure and enzymatic reactions of carrier~X or
its analogs. However, not all of the known respiratory carriers have
been obtained so far in purified soluble form suitable for examination
of possible complexes of this kind, and those which have been isolated
to date have shown no propensity to phosphorylate in the test tube in
simple systems. Furthermore, there now is increasing evidence that
additional factors, such as vitamin K1, α-tocopherol, quinones, copper,
nonheme iron, and other substances capable of undergoing reversible
oxidation-reduction changes, may be members of the respiratory chain in
addition to the classically accepted flavoprotein and cytochromes, but
the site of these in the chain is still quite uncertain. Direct
identification of the coupled carriers as a first step to determining
the identity of X also is made difficult by the probability that a
coupled form such as carrier~X reasonably could be expected to be rather
labile. However, the spectroscopic studies of the kinetics of
interaction of the carriers in the chain carried out by Chance in intact
mitochondria provide some important landmarks for such efforts.1
There is another approach to the mechanism of oxidative phosphorylation,
which sometimes is called the “back-door approach,” and which we have
chosen to [...] in our recent investigations. The back-door approach
starts with the end product of oxidative phosphorylation, namely ATP,
and works back through [...] reactions to the carrier level. Our recent
[...] mechanism of oxidative phosphorylation have been made possible by
two developments. The first was our finding, in 1955, that
phosphorylating submitochondrial fragments can be prepared from
rat-liver mitochondria by the action of digitonin. These fragments are
very much smaller than mitochondria but still retain phosphorylative
activity. They are not simply miniature mitochondria since they no
longer are capable of the Krebs-cycle reactions or of fatty-acid
oxidation. However, these fragments, which are believed to be derived
from the mitochondrial membrane, contain more or less intact respiratory
chains together with the enzymatic factors responsible for energy
coupling. Study of coupling in these so-called “digitonin fragments” is
not complicated by the morphological compartmentation and permeability
effects seen in intact mitochondria, which also catalyze many enzymatic
reactions which are extraneous to oxidative phosphorylation.

The second development is the finding that these particles catalyze two
different isotopic exchange reactions of ATP which occur in the absence
of electron transfers. The first is the ATP-Pᵢ³² exchange, in which
labeled Pᵢ³² is rapidly incorporated into the terminal phosphate group
of ATP. The second is the so-called ATP-ADP exchange, in which P³²- or
C¹⁴-labeled ADP is incorporated bodily into ATP. Both of the reactions
in fresh fragments are completely inhibited by the classical uncoupling
agent DNP, indicating their relationship to oxidative phosphorylation
and distinguishing them from many other similar exchange reactions of
ATP which have nothing to do with oxidative phosphorylation.

Close study of the interrelationship of these exchange reactions has
revealed that the ATP-ADP exchange reaction is in reality the terminal
reaction of oxidative phosphorylation

$$\mathrm{P{\sim}X + ADP \rightleftharpoons ATP + X},$$

and is itself an intermediate step in the ATP-Pᵢ³² exchange. Since
phosphate is not involved in the ADP exchange, but ADP is an obligatory
component of the Pᵢ³² exchange, the sequence of the two terminal
reactions of oxidative phosphorylation is established as

$$\mathrm{R{\sim}X + P_i \rightleftharpoons P{\sim}X + R}$$

$$\mathrm{P{\sim}X + ADP \rightleftharpoons ATP + X}$$

where R may be an electron-carrier molecule. Further, it has been found
that the terminal reaction is not itself inherently sensitive to DNP but
that sensitivity to DNP is conferred on it because it is in equilibrium
with the preceding reaction, which is DNP-sensitive.

These experiments have led very recently to the separation of the enzyme
catalyzing the ATP-ADP exchange in soluble form from the mitochondrial
fragments. It has been purified by chromatography on cellulose
derivatives. The nature of the active site participating in formation of
the phospho-enzyme intermediate is now under investigation.

With this terminal enzyme in hand as a foundation for reconstruction
approaches, we now propose to work nearer the carrier level and hope
soon to obtain definitive information on the oxidation-reduction state
of the energy-rich form of the carriers which drives the
phosphorylation, as well as the identification of the three coupled
carriers in the chain.


RESPIRATORY-ENZYME ASSEMBLIES AND THE 
STRUCTURE OF THE MITOCHONDRION 


Consider now oxidative phosphorylation in the context of its natural
habitat, namely, the mitochondrion.

FIG. 5. Diagrammatic representation of the structure of mitochondrion
according to Sjöstrand. Note the double-layer character of outer
membrane and the transverse “cristae.” The magnified cross-section of
the membrane suggests two protein monolayers separated by an oriented
double layer of lipid molecules [from F. S. Sjöstrand in Fine Structure
of Cells—Symposium, Eighth Congress on Cellular Biology, Leyden, 1954
(Interscience Publishers, Inc., New York, 1955), pp. 16, 222].

FIG. 6. Respiratory-enzyme assemblies in recurring units of
mitochondrial membrane. The assembly consists of 6 electron-carrier
protein molecules, supplemented by three sets of coupling enzymes, each
consisting of three proteins as an approximation. This is a simplified
representation since there may be several flavoprotein molecules per
mole of cytochrome-a and several DPN-dehydrogenase molecules (reference
1) [from A. L. Lehninger, C. L. Wadkins, C. Cooper, T. M. Devlin, and
J. L. Gamble, Jr., Science 128, 450 (1958)].


The structural details of mitochondria have been visualized by electron
microscopy in thin sections of intact tissues in the laboratories of
Sjöstrand, Palade, and others. Although there are some differences in
interpretation of electron micrographs of these organelles, the
diagrammatic representation of Sjöstrand (Fig. 5) shows that the
mitochondrion is surrounded by an outer double-layered membrane and also
contains double-layered lamellar structures across the lumen, called the
“cristae” by Palade. These membranes, which are easily separable into
fragmented form, are rich in lipid, containing some 30 to 40%, most of
which is phospholipid. Sjöstrand has pointed out, as shown in his
diagram, that the double-layered membranes correspond approximately to
two monolayers of protein molecules separated by a double layer of
oriented lipid molecules in the fashion suggested, and that a similar
arrangement holds in the cristae. These dimensions are constant in the
mitochondria of all cell types examined. The inner matrix of the
mitochondrion contains soluble proteins and enzymes, electrolytes, and
nucleotides which are released rather easily into the medium by
mechanical or by osmotic rupture of the membranes. The results of many
investigators suggest that most of the substrate-level enzymes of the
Krebs cycle and the fatty-acid oxidation cycle are present either in the
lumen or in the intercristal space. They are released easily in soluble
form after damage of the membrane. On the other hand, it has been a
universal finding that the enzymes involved in electron transport and
coupled oxidative phosphorylation are present entirely in the insoluble
membranes in organized “solid-state” systems; the separate enzymes of
this system are not dissociated easily in truly soluble form.5,16,34-36

In phosphorylating subfragments of the mitochondrial membranes, the
several electron carriers making up the respiratory chain occur in
approximately equimolar ratios, suggesting that each respiratory
“assembly” consists of the six (or more) carrier molecules arranged in
contiguous manner, presumably adjacent to each other. These are
supplemented at three points along the chain by three sets of two or
three (or more) auxiliary enzyme proteins which are necessary for the
energy-coupling process, as in the coupling mechanism postulated. Such
an arrangement thus would form a complete respiratory-phosphorylating
multienzyme assembly, as is shown in Fig. 6.

The question arises as to the distribution of such respiratory
assemblies along the membrane. We have subjected sonically prepared
fragments of the mitochondrial membrane to differential
ultracentrifugation and have obtained a wide spectrum of particle sizes.
The relative distribution of respiratory carriers in these particles of
widely different sedimentation rate has been examined with the finding
that, no matter what the size of the fragment, the ratio of the
respiratory carriers to total protein was always constant, and the ratio
of carriers to each other was relatively constant also. Phosphorylation
activity is preserved in all fractions. This finding indicates that the
respiratory carriers are distributed more or less evenly along the
membrane, and that fragmentation of the membrane produces pieces of
different size made up of multiples of a basically recurring unit, as is
indicated in the diagram, where each recurring unit contains a complete
and organized respiratory assembly.

Calculations involving the known extinction coeffi- 
cients of some of the respiratory carriers such as 
cytochrome-c indicate that, in the membranes of a 
single rat-liver mitochondrion, there may be several 
thousand such respiratory assemblies. Still further 
analyses of this kind make it possible to estimate the 

total fraction of the mitochondrial membrane substance 
contributed by catalytically active proteins of these 
assemblies (assuming twelve protein molecules per as- 


sembly, each having a molecular weight of 10°) to be 


as high as 20% and possibly higher. From such con- 
siderations, it can be calculated that the average par- 
ticle weight of a single respiratory assembly, together 
with its “filler” protein and lipid, may be 7X10~7 
approximately.*? 

From such considerations, it is clear that the mito- 
chondrial membrane is not simply a “dead wall,” but 
rather a highly complex fabric in which are embedded 
the respiratory assemblies in a highly oriented and 
purposeful manner. These respiratory assemblies thus 
represent the very warp and woof of the membrane fabric.


MECHANICAL PROPERTIES OF MITOCHONDRIAL MEMBRANE


The mitochondrial membrane is not only the site of highly organized
respiratory assemblies, but it is a structure capable of characteristic
and selective permeability and active-transport phenomena. By optical
measurements, it has been observed that the permeability may be altered
by the ambient concentrations of ATP, magnesium, sulfhydryl reagents,
and inorganic phosphate. In addition, we have found that the
permeability of the mitochondrial membrane may be changed very
drastically by addition of minute concentrations of the thyroid hormone,
which causes rapid swelling and increased permeability to sucrose and
other small molecules. The mitochondrial membrane is capable of quite
drastic extension and swelling under certain circumstances; it may
increase in volume as much as several fold without actual rupture of the
membranes (cf. Werkheiser and Bartley). Phase-contrast films of
mitochondria in an intact cell show that they have remarkable plasticity
of shape, and undergo rhythmic swelling and contraction as they wander
through the canaliculi of the cytoplasm. The contractile properties of
the mitochondrial membrane also can be observed in vitro when adenosine
triphosphate is added to suspensions of heart mitochondria or when
phosphorylating respiration is instituted. These agencies cause a rapid
contraction of the mitochondrial membrane with the extrusion of water,
and it has been postulated that the mitochondria, at least in
specialized tissue such as the kidney, are active in water transport.
The contractility of the mitochondrial membrane thus may resemble the
configurational changes in the myosin molecule similarly coupled to ATP
energy. The number of water molecules extruded from mitochondria is out
of all proportion to the number of molecules of ATP formed or utilized.

Very recently, we found that the permeability and 
contractility of the mitochondrial membrane are con- 
ditioned by the oxidation state of the respiratory 
carrier enzymes present in the membrane. Thyroxine 
and inorganic phosphate, which are potent swelling 
agents of mitochondria, increasing the permeability of 
the membrane, have this action only when the respira- 
tory carriers are in the oxidized state. The action of 
these compounds results in a relaxation of the mem- 
brane and in increased permeability to certain test 
substances such as sucrose. On the other hand, when 
the respiratory carriers are maintained in the fully 
reduced state, the permeability of the mitochondrial 
membrane cannot be changed by addition of phosphate 
or thyroid hormone.**:*° 

The mitochondrial respiratory assemblies thus not only are the site of
phosphorylating oxidation and major building blocks of the membrane, but
also are controlling factors of rather dramatic mechanical and
configurational changes which govern the rate of entry of essential
metabolites such as substrates and inorganic phosphate, as well as the
exit of respiration products such as H2O, ATP, carbon dioxide, etc. The
membrane-bound respiratory assembly thus becomes an example of a
“mechanoenzyme” system, to use Engelhardt’s term for myosin. Indeed, the
mitochondrial membrane has many characteristics resembling those of
myosin, such as ATP-ase activity, DNP-stimulation, striking ionic
strength relationships, as well as the configurational changes.


SOME BIOPHYSICAL PROBLEMS ASSOCIATED WITH 
RESPIRATORY-ENERGY CONVERSIONS AND 
“SOLID-STATE ENZYMOLOGY” 


Most well-known enzyme systems, such as the citric- 
acid cycle and glycolytic cycle, for example, involve low 
molecular-weight, rapidly diffusible intermediate mole- 
cules like di- and tri-carboxylic acids or phosphorylated 
sugars, as shuttles between relatively slowly diffusing 
enzyme proteins in an alternating manner. The res- 
piratory-chain system, however, differs strikingly; it 
involves protein-protein interactions exclusively with- 
out low molecular-weight intermediates (Fig. 7). Since 
the rate of diffusion of protein molecules is extremely 
low as compared with that of simple sugars, amino 
acids, etc., a multienzyme sequence of a dozen consecutive
reactions, each of which is brought about through
protein-protein collisions alone, could be expected to
occur only at very low rates if there were no means of
directing the collisions. In fact, it can be calculated readily
that the known high rates of respiration of cells could 
not occur if all of the enzymes involved were distributed 
equally in free solution within the volume of the cell.” 
The assembly of the individual proteins making up this 
system into a solid-state array, as in the mitochondrial 
membrane, provides a molecular adaptation to over- 
come this kinetic problem. A variety of questions 
arises, however. 

For one thing, do ordinary mass-law and ideal-colli- 
sion considerations govern the behavior of protein 
molecules in such a “solid-state” array? The question 
is important and underlies all attempts to analyze 
reaction kinetics and reaction mechanisms in solid-state 
enzyme systems by the application of Michaelis- 
Menten principles and other kinetic parameters. 
Chance has commented on this problem.1


CONSECUTIVE REACTION PATTERNS IN MULTIENZYME SYSTEMS

I. Low molecular-weight intermediates as “shuttles.” Example:
Glycolysis via phosphorylated intermediates.

II. Protein-protein interactions; no “shuttles.” Example: Electron
transport.

FIG. 7. Diffusion rates as factors in molecular interactions in
multienzyme systems. Low molecular-weight substrate (product) molecules
S1, S2, etc., have high diffusion coefficients and may “shuttle” between
high molecular-weight enzyme proteins; in the respiratory chain,
slow-moving proteins must interact.

FIG. 8. Modes of electron transfer along chain of fixed electron-carrier
molecules (after Chance and Williams; reference 1).

New methods thus will be required to examine the interaction of the
individual catalyst molecules in such solid-state arrays, and to
determine the translational, rotational, and vibrational degrees of
freedom involved in these protein-protein interactions. Possibly, useful
information will come from refined application of both
electron-paramagnetic and nuclear-magnetic resonance spectroscopy, to
supplement the already extensive light spectroscopic studies of Chance
and his colleagues.1
Another question arises. How can the mechanism of electron transfer
along the respiratory chain be accounted for, in view of the known
dimensions of some of these proteins and their prosthetic groups, even
assuming close juxtaposition of carriers? Chance has shown (reference 1)
two alternatives for the mechanism of electron transfer from one protein
carrier to another along the chain (Fig. 8). In one, restricted rotation
is postulated to permit collision of the prosthetic groups. In the
second, the molecules are fixed, and electrons then must pass through
the protein moieties to the prosthetic groups, possibly by π-electron
interactions as in the so-called “charge-transfer” complexes. The
passage of electrons or energy through the protein moiety of heme
proteins has been considered frequently, since the classical experiments
of Bücher on the constant quantum efficiency of 1.0 in the
photodissociation of carbon monoxide myoglobin at all wavelengths
tested. Recently, Shore and Pardee have considered the physics of such
energy transfers, which may take place through fluorescence and
sensitization phenomena. All of the respiratory carriers have quite
characteristic fluorescence spectra. Szent-Györgyi, Weber, and others
have suggested that fluorescence phenomena may conceivably play a
physiological role in energy transfers along the carrier chain.

The occurrence of such solid-state arrays of catalytically active
proteins, the individual members of which are so difficult and as yet
impossible to separate from each other with preservation of activity,
presents a whole new area of protein chemistry and techniques. To date,
it has been sufficiently difficult to examine the physical parameters of
proteins in more or less ideal solutions; the interaction of two
proteins with each other in dilute solution actually has been just
barely touched upon in recent years. However, these far more complex
solid-state arrays described above obviously present difficulties of a
much greater order of magnitude. Similarly, the separation and
identification of such proteins making up solid-state protein arrays
will involve a whole new category of experimental approaches, as
students of this area of enzyme chemistry already are painfully aware.
If new techniques finally will permit separation of all of the carrier
enzymes in soluble form, then reconstruction of these systems in vitro
also can be expected to present great difficulty, because of the problem
of restoring the spatial relationships among the members of such a
complex in the test tube. Perhaps full in vitro reconstruction to yield
soluble systems of high catalytic activity never really will be
attained.

These are but a few of the many biophysical and 
molecular problems posed by the solid-state character 
of the respiratory-energy transformers of mitochondria.


Energy Reception and Transfer in Photosynthesis*


MELVIN CALVIN 
Department of Chemistry, University of California, Berkeley 4, California 


The article by Lehninger (p. 136) presents a de-
tailed and excellent description of how a cell can 
obtain energy by the combustion of carbohydrate. This 
article describes the reverse process, namely, how the 
green cells of plants are able to transform electromag- 
netic energy into chemical energy—by the absorption 
of carbon dioxide and water, which are the end products 
of the animal cell, and by the absorption of light—and 
how they produce the foodstuffs which are the begin- 
ning of the process of combustion. Figure 1 illustrates 
diagrammatically the content of this article. 

The starting points in this case are carbon dioxide 
and water which contain the elements carbon, hydrogen, 
and oxygen in their lowest energy forms with respect to 
biological processes. The chemical energy which is ac- 
cumulated is represented here (Fig. 1) in the form of 
oxygen (molecular oxygen), on the one hand, and a 
carbohydrate, on the other. The process itself has been 
divided, both theoretically and physically, into two 
rather easily separable stages. The first of these is the 
absorption of light by chlorophyll or by some related 
pigments and the subsequent separation of water into 
a reducing agent, here represented by [H], and some 
oxidizing fragment not specifically designated here but 
presumably one of the A, B, C, series. The oxidizing 
agent, or the primary oxidant, ultimately becomes 
molecular oxygen. In the second stage, the reducing 
agent is used to reduce carbon dioxide to the level of 
carbohydrate and other plant materials. 

In order to see how the energy of light actually is 
accumulated in chemical form, it seems wise to describe 
what is known about the sequence from carbon dioxide 
to carbohydrate, and to determine at what point in that 
sequence the energy ultimately derived from the light 
enters, and from that point on to recognize and to 
define the problem of the primary quantum conversion 
into its first recognizable chemical form. Consider first
in some detail what is known about the path of carbon
so that one can define more precisely into what sort of
energy the light must be converted in order to carry
out that process.

FIG. 1. Elementary photosynthesis scheme.

* The work described herein was sponsored by the U. S. Atomic
Energy Commission.

With the availability of radiocarbon (carbon-14) from 
the nuclear reactors some 15 years ago, it became pos- 
sible for our laboratory to trace this sequence in some 
detail. The plant material used in most of the experi- 
ments was the unicellular green alga, Chlorella, and 
occasionally the alga, Scenedesmus; higher plants as well
as separated photosynthetic material were used also.
Figure 2 shows a photomicrograph of the algal cells
commonly used; these are Chlorella cells, and the
green material contained in a cup-shaped chloroplast can
be seen. The chloroplast is illustrated well by one of the
cells in the upper right-hand corner.

The steps taken to trace the carbon sequence are as 
follows. The first operation constitutes a selection of 
cultures, which are grown in 200-cc flasks, and are 
transferred later into much larger continuous one-liter 
culture flasks. These are called shake-flask cultures in 
which algae can be maintained for years at a time. 

The most recent type of culture device that is used 
in our laboratory is a continuous tube culture in which 
the density of the cells is monitored by a photoelectric 
cell which controls the automatic feeding of the medium, 
so that the cells are maintained in a steady state of 
growth. 
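
The feedback idea behind such a density-controlled culture can be sketched in a few lines. This is only an illustrative on/off model and not a description of the actual apparatus: the function name `run_turbidostat`, the growth and dilution rates, and the set point are all assumed values chosen for the example.

```python
def run_turbidostat(density, setpoint, growth_rate, dilution_rate, hours, dt=0.01):
    """Toy on/off density control for a continuous culture.

    All parameters are hypothetical: density and setpoint are in arbitrary
    optical units, the rates are per hour, and dt is the time step in hours.
    """
    trace = []
    for _ in range(int(round(hours / dt))):
        # The photocell "sees" the culture; fresh medium flows only while
        # the density is above the set point.
        feeding = density > setpoint
        net_rate = growth_rate - (dilution_rate if feeding else 0.0)
        density += dt * net_rate * density  # forward-Euler step
        trace.append(density)
    return trace


trace = run_turbidostat(density=0.5, setpoint=1.0, growth_rate=0.1,
                        dilution_rate=0.5, hours=100.0)
print("final density:", round(trace[-1], 3))
```

In this sketch the culture grows freely until it reaches the set point and is then held close to it by intermittent dilution, which is the sense in which the cells stay in a steady state of growth.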

The algal sample then is harvested and is used for 
the feeding of radiocarbon which is done in a special 
“hot box.” In this box, the cells are placed in a little 
vessel (lollipop) between lights and are adapted with 
the concentration of normal carbon dioxide of interest. 
Radioactive carbon dioxide then is administered to the 

FIG. 2. Photomicrograph of Chlorella cells.

FIG. 3. Chromatogram of extract from algae, indicating uptake
of radiocarbon during photosynthesis (60 sec).


adapted cells for a suitable length of time in order to 
trace the paths taken by the carbon atoms. The radio- 
active carbon usually is injected in the form of a solution 
of sodium bicarbonate. It is kept in contact with the 
cells for a specified period of time after which the cells 
are killed by a variety of methods; for example, the 
cells may be dropped into methanol at room tempera- 
ture. The cell extract then is analyzed by the method 
of paper chromatography for the radioactive compounds 
which it may contain. In order to achieve this analysis, 
the extracts must be concentrated and a vacuum evapo- 
rator is used in a routine fashion to reduce the volume 
from 200 cc, or a liter, down to a cubic centimeter or so. 
From this concentrated extract, an aliquot is taken 
and placed on the corner of a piece of filter paper for 
chromatographic separation. Prior to chromatography, 
the radioactivity of the origin is counted in a quantitative
way. The filter paper then is hung in a box, in a
trough in which a solvent is placed which passes over 
the filter paper and spreads the compounds down the 
side of the paper according to their relative solubilities 
in the solvent. The most soluble run the most rapidly. 
This procedure results in a set of spots along the side of
the filter paper, depending upon the properties of the
compound being analyzed. Some of these compounds
overlap each other in one solvent system. The paper
then is removed from the box and dried overnight. The
paper is rotated 90° and placed in another trough, in 
another box, with another solvent. The starting point 
is now a whole series of spots along the top edge of the 
paper. Another solvent is put in the trough and it 
spreads those spots out again in a similar operation. 
After this operation is completed, the paper is dried 
for a second time. The next problem is to locate the 
radioactive materials on the paper. 

The coordinates of the material with respect to the 
origin constitute a physical property which is useful in 
the identification of the compound being analyzed. The 
compounds are not colored, and the only property that 
can be used to locate them is their radioactivity. This 
is done by placing the paper in contact with a sheet of 
photographic single-coat blue-sensitive x-ray film which 
becomes exposed by the radioactivity on the paper. 
Wherever there is a radioactive spot on the paper, 
there appears an exposed area on the film after a suitable 
period of time. For quantitative work, the film, covered
by the paper, is placed on an x-ray viewer. The paper
is explored with a Geiger counter. The compound can
be identified then by its coordinates, and its amount
can be determined by the amount of radioactivity found
in the spot. A greater or lesser degree of resolution depends
upon the nature of the solvent systems used and
upon the time used for the chromatography.

FIG. 5. Path of carbon from CO2 to hexose during photosynthesis.

Figure 3 shows a chromatogram of the extract
of a 60-sec illumination of Chlorella. One can see
that a 60-sec illumination is much too long to find the 
earliest compounds into which the carbon enters in the 
course of its conversion from carbon dioxide to carbo- 
hydrate. Figure 4 shows a chromatogram of a shorter 
illumination (10 sec). Here, one compound, phospho- 
glyceric acid, dominates the scene. Our laboratory has 
been able to get the same type of sequence of events 
with isolated chloroplasts plus a number of cofactors. 
The phosphoglycerate appears no matter how the 
chloroplasts or the algae are killed: whether they are 
killed hot or cold, whether they are killed in alcohol,
acetone, etc. 

This, then, is an initial clue that the first isolable 
stable compound obtainable by these methods, or at 
least identifiable by these methods, is phosphoglyceric 
acid: 


CH2-CHOH-CO2H
|
OPO3H


a three-carbon compound containing a low-energy 
phosphate group. 

The next problem is to determine which of these 
carbon atoms is radioactive. This has been done by 
chemical-degradation methods. By taking the com- 
pound (phosphoglyceric acid) apart, one carbon atom 
at a time, it was found that the carboxyl group became 
radioactive first and the other two later. From this, 
together with the degradation of sugar molecules that 
came out in the same experiment, our laboratory 
was able to determine how the sugar molecule was 
constructed. 

FIG. 6. Distribution of radioactive carbon in certain sugars.


Figure 5 shows what was supposed to have occurred. 
The phosphoglyceric acid is shown as PGA. By reduc- 
tion, this goes to triose phosphate. If the ketose phos- 
phate is isomerized, and then combined with the isomer, 
a hexose diphosphate can be formed with the radioactive 
carbon atoms in the middle of the molecule. In this 
manner, the six-carbon molecule can be formed, but 
one does not know the origin of the three-carbon com- 
pound. Although two-carbon-containing molecules have 
not been found, Fig. 6 illustrates some findings of our 
laboratory. 

In addition to the PGA, there is a five-carbon-atom 
compound, a sugar (ribulose diphosphate), a seven- 
carbon-atom sugar (sedoheptulose diphosphate), and, 
of course, the six-carbon-atom sugars. The stars on
Fig. 6 give some idea of the way in which the radioactivity
is distributed in these various sugar compounds.
The heptose and the pentose can be made from the 
hexose, as shown in Figs. 7 and 8. 
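
The label bookkeeping behind these sugar rearrangements can be followed with a small sketch. It is an illustration under stated assumptions, not the laboratory's analysis: each sugar is a list of carbons from C1 down, only the carbon derived from the PGA carboxyl group is taken as labeled, recycling of label through the cycle is ignored, and the helper names `aldol_join`, `transketolase`, and `labeled` are hypothetical. The carbon mapping assumed is the standard one for aldolase-type and transketolase-type reactions.

```python
def aldol_join(triose, aldose):
    """Aldolase-type condensation (assumed mapping).

    The triose is attached head-to-head, so its C1 becomes C3 of the
    product; the carbons of the aldose follow in their original order.
    """
    return triose[::-1] + aldose


def transketolase(ketose, aldose):
    """Transfer the top two carbons (C1, C2) of the ketose to the aldose.

    Returns (new_ketose, shortened_aldose).
    """
    return ketose[:2] + aldose, ketose[2:]


def labeled(sugar):
    """Return the 1-based positions of the labeled carbons."""
    return [i + 1 for i, hot in enumerate(sugar) if hot]


# Triose derived from carboxyl-labeled PGA: only C1 carries the label.
triose = [True, False, False]

hexose = aldol_join(triose, triose)                    # two trioses -> hexose
pentose_a, tetrose = transketolase(hexose, triose)     # hexose + triose
heptose = aldol_join(triose, tetrose)                  # triose + tetrose
pentose_b, pentose_c = transketolase(heptose, triose)  # heptose + triose

for name, sugar in [("hexose", hexose), ("tetrose", tetrose),
                    ("heptose", heptose), ("pentose_a", pentose_a),
                    ("pentose_b", pentose_b), ("pentose_c", pentose_c)]:
    print(name, labeled(sugar))
```

Under these assumptions the hexose carries the label in its two middle carbons and the tetrose in its top two carbons, in agreement with the distributions described in the text.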

Figure 7 illustrates the method by which the heptose 
may be produced. From one molecule of hexose and one 

FIG. 7. Formation of a heptose from triose and hexose.


molecule of triose (taking off the top two carbon atoms 
of the hexose), a pentose and tetrose can be formed; 
the tetrose is labeled in the top two carbon atoms. The 
tetrose then can combine with a triose to make a heptose 
with the proper distribution of radioactive carbon. 
Figure 8 shows the way in which the pentose is put 
together, by combination of a heptose and a triose, in 
the same kind of reaction (the transketolase reaction) 
leading to two different pentoses which are in equilib- 
rium with each other. This analysis does not distinguish 
between the two pentoses. All of these rearrangements 
are done at the sugar level; triose, tetrose, pentose, 
hexose, and heptose are all at the same redox level. 
They are all very nearly at the same energy
level and there is thus practically no energy required 
for these rearrangements. However, no experiments of 
this type gave the desired information—namely, the 
origin of the three-carbon piece in the first place. This 
awaited a quite different kind of experiment, an experi- 
ment in which a steady state first was established in 
the organism, after which some environmental variable 
was changed suddenly. The transients that resulted 
from changing some of these variables were examined. 
Figure 9 shows the results of such an experiment. A 
steady state is established by feeding the radiocarbon 

FIG. 8. Proposed scheme for labeling of pentose.

FIG. 9. Light-dark changes in concentrations of
phosphoglyceric acid and ribulose diphosphate.


long enough to the plant to saturate the phosphoglyceric 
acid and the other compound mentioned. The lights are 
turned off suddenly, and the transient ensues. The 
phosphoglyceric acid rises suddenly and the ribulose 
diphosphate falls precipitously. This complementary 
behavior is the clue needed for the relationship between 
ribulose diphosphate and phosphoglyceric acid. It 
seemed as though the ribulose diphosphate was dis- 
appearing by combining with the carbon dioxide (that 
is, five carbons plus one, making a total of six carbon 
atoms) to produce two molecules of phosphoglyceric 
acid. If this is the case, then the relationships of the 
various compounds can be shown diagrammatically, as 
in Fig. 10. 

In Fig. 10, the ribulose diphosphate combines with 
carbon dioxide to form phosphoglyceric acid which is 
reduced by light to triose, triose then going through 
this series of sugar rearrangements (shown in Figs. 6, 
7, and 8) back again to the pentose. Turning the light 
off stops this reaction. When the reduction reaction is 
stopped, phosphoglyceric acid builds up and the ribulose 
diphosphate disappears. Figure 10 simply expresses in 
a scheme what the transient experiment revealed. 

But this scheme (Fig. 10) predicts another type of
transient. If the light is kept on and the CO2 stopped,
there should be a different kind of transient; namely,
the ribulose diphosphate should build up suddenly and
the amount of the phosphoglyceric acid should fall.
This experiment has been accomplished with considerable
difficulty. The results are shown in Fig. 11.
Figure 11 shows the steady state for ribulose diphosphate
and the steady state for phosphoglyceric acid with
CO2 at a concentration of 1%. At the vertical line, the
concentration is shifted from 1 to 0.003% by turning
the stopcocks. Under those circumstances, the predicted
changes were observed, at least in the initial phase of
the transient. The amount of phosphoglyceric acid fell
and the amount of ribulose diphosphate rose. There are
a number of kinetic oscillations here which are reminiscent
of the kinds of oscillations one gets in circuitry,
and possibly they are analogous. One or two attempts 
were made to reproduce these oscillations by putting 
first-order rate constants into the various reactions that 
are involved here and then running them through a 
digital computer. This kind of oscillation can be ob- 
tained but this work has not been pursued yet beyond 
the elementary stage of the first kind of transient. This 
kind of study will lead to much more detailed knowledge
of the mechanism of cellular response to changes in 
external or internal environments. It is a very simple 
system to use and one which is amenable to complete 
analysis, both experimentally in terms of the compounds
involved and theoretically in terms of the simple
kinetics involved.

FIG. 11. Transients in the regenerative cycle.
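
One way to see how a cyclic scheme of this kind gives the two transients is a small first-order model. The sketch below is a toy built on assumptions and is not the computation referred to in the text: it uses three pools (ribulose diphosphate, PGA, and a lumped triose/sugar pool), hypothetical rate constants `k1`, `k2`, `k3` all set to 1, takes both the reduction and the regeneration steps as proportional to light, and expresses CO2 relative to the 1% level.

```python
def simulate(state, light, co2, t_end, dt=0.001, k1=1.0, k2=1.0, k3=1.0):
    """Integrate a toy carbon cycle with forward Euler.

    state = (rudp, pga, triose) in arbitrary molar units.
    light is 1.0 (on) or 0.0 (off); co2 is relative to the 1% level.
    All rate constants are hypothetical.
    """
    rudp, pga, triose = state
    for _ in range(int(round(t_end / dt))):
        carboxylation = k1 * co2 * rudp   # RuDP + CO2 -> 2 PGA
        reduction = k2 * light * pga      # PGA -> triose (needs light)
        regeneration = k3 * light * triose  # triose pool -> RuDP (needs light)
        # Of six trioses formed, five rebuild three RuDP and one leaves as
        # product, so each triose consumed returns 5/6 * 3/5 = 0.5 RuDP.
        rudp += dt * (-carboxylation + 0.5 * regeneration)
        pga += dt * (2.0 * carboxylation - reduction)
        triose += dt * (reduction - regeneration)
    return rudp, pga, triose


steady = (1.0, 2.0, 2.0)  # a steady state of the model for light=1, co2=1
lit = simulate(steady, light=1.0, co2=1.0, t_end=1.0)
dark = simulate(steady, light=0.0, co2=1.0, t_end=1.0)
low_co2 = simulate(steady, light=1.0, co2=0.003, t_end=1.0)

print("steady  :", lit)
print("dark    :", dark)
print("low CO2 :", low_co2)
```

In this sketch, turning the light off makes the PGA pool rise and the ribulose diphosphate pool fall, while lowering the CO2 does the reverse, which is the qualitative pattern described for Figs. 9 and 11. The model is too crude to say anything about the oscillations.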

Figure 12 shows the completed photosynthetic cycle 
in which there are put together all of the rearrangements 
of hexose and triose through heptose and pentose, back
again to the ribulose diphosphate which then picks up 
carbon dioxide to make two molecules of phospho- 
glyceric acid. 

FIG. 12. The photosynthetic carbon cycle.

In trying to visualize this particular step, a proposed
mechanism for this reaction is shown in Fig. 13. Here,
the ribulose diphosphate is written as the ene-diol,
combining with bicarbonate ion to form an intermediate,
hypothetical up to this point, an α-hydroxy-β-keto acid,
which then is hydrolyzed to give two molecules of
phosphoglyceric acid. This α-hydroxy-β-keto acid, according
to our chemical knowledge, would be very unstable
either to decarboxylation, in which case it would
lead back to ribulose diphosphate, or to hydrolysis, in
which case it would lead the other way. Fortunately, 
in our laboratory, a close relative of this intermediate, 
and probably some of the compound itself, has been 
found.¹ On theoretical grounds, it was decided that one
might expect a compound of this sort to appear down 
in the diphosphate area of the chromatogram. Figure 14
shows such a chromatogram, which was run in solvent for
48 hr in each coordinate. What was originally a single
spot, which is dominantly ribulose diphosphate, now 
breaks up into at least three spots. The principal spot 
is the ribulose diphosphate; another one is hexose diphosphate
and heptose diphosphate; and the last spot
turned out to be a keto-acid diphosphate. It is not the
β-keto acid but rather the γ-keto-acid diphosphate
which apparently is an artifact of the method of killing,
but it does come from the β-keto-acid diphosphate
which is still a labeled compound and does show its
presence in small amounts.

FIG. 13. Mechanism of carboxylation reaction.

FIG. 14. Chromatogram of extract of Chlorella after three minutes
of photosynthesis in the presence of radiocarbon. The solvents
were allowed to run for 48 hr in each dimension.

In Fig. 15, one sees the diphosphate plus some dephosphorylated
compounds, particularly the γ-keto-acid
diphosphate. And here is seen a trace of the β-keto acid
in its lactone form because it is stabilized as a lactone,
enough to catch it on a chromatogram. Racker² has
carried out this whole sequence (Fig. 12) by collecting 
all of the enzymes that were indicated in that figure. 
By putting in the suitable substrates, ribulose and 
carbon dioxide, he was able to pull out glucose phosphate
from the CO2. The immediate sources of energy
are the two compounds adenosine triphosphate (ATP) 
and reduced pyridine nucleotide (TPNH). Figure 16 
shows the needed relationships. 

Figure 16 shows the photosynthetic carbon cycle in a 
simplified form. Carbon dioxide enters to make the 
β-keto acid which then goes to the phosphoglyceric acid.
The only points of entry of energy into this system, as 
it is written now, are where there is a need for ATP and 
for reduced pyridine nucleotide. (These are the points 
of entry of energy into this wheel.) These points are the 
gears which drive the cycle in a forward direction. 

FIG. 17. Photomicrograph of chloroplasts from liverwort.


Clearly the ATP and the TPNH (energy sources for 
the photosynthetic cycle) must come ultimately from 
the light. The remainder of this paper is devoted to 
this problem. 

How does the light, which is absorbed by the chloro- 
phyll, produce these two substances (ATP and TPNH) 
which are known to be required for carbon reduction? 
Before going into the details of a possible answer, con- 
sider some pictures of the apparatus which does it. 
Figure 17 shows a photomicrograph of liverwort tissue; 
one can see the cells, the cell walls, and the chloroplasts 
in which the chlorophyll is distributed very nicely inside 
the cells. Figure 18 shows isolated chloroplasts from 
spinach. (These are bigger chloroplasts and they have 
been isolated in sucrose solution.) All of this carbon 
reduction, oxygen evolution, phosphate production can 
be done with these chloroplasts removed from their 
natural habitat inside a cell. However, in order for that 
to be possible at anything approaching the rates at 
which it occurs in the living cell, one must add cofactors, 
some of which are heat stable, some of which are heat 
labile, and some of which are unknown but which are 
obtained out of the sap of the cells. In any case, this 
whole process can be done outside the cell. 

Figure 19, from the work of Steinman and Sjöstrand,³
shows an electron micrograph of a chloroplast. The 
picture on the right is shown at a higher magnification. 
The outstanding features of the chloroplast structures 
are lamellae, discussed in the following. 

For some twenty years, it has been possible to carry 
out the photochemical evolution of oxygen by isolated 
chloroplasts using a suitable hydrogen acceptor such as 
ferrocyanide or quinone. This is called the Hill reaction. 
In the last five years, by preparing the chloroplasts in 
a manner which presumably does not destroy a chloro- 
plast membrane or perhaps precipitates enzymes from 
the cytoplasm onto the chloroplast (i.e., preparing the 
chloroplasts in salt or sugar solutions), our laboratory 
has been able to carry out two other reactions with the 
chloroplasts. These reactions are the reduction of CO2
as well as the evolution of oxygen, and, finally, the
production of ATP by illumination of the chloroplasts. 
These three reactions, carbon-dioxide reduction (or, one 
step further back, the production of reduced pyridine 
nucleotide rather than CO2 reduction), ATP production,
and oxygen production are the three processes that one 
now can accomplish with the chloroplasts. The reduc- 
tion of CO2 requires two of the items and the evolution 
of oxygen may or may not require ATP. 

How many of these things can be done simultaneously 
by the chloroplasts? In a recent conference,⁴ it became
evident that all of the pair combinations of these processes
(i.e., CO2 reduction, ATP production, and oxygen
production) could be demonstrated. 

It has been demonstrated that one could make one 
mole of pyridine nucleotide for every atom of oxygen 
produced. Simultaneously, one can demonstrate the 
production of one mole of ATP for every equivalent of 
reduced pyridine nucleotide produced. (One can demon- 
strate now that one mole of ATP is created for every 
equivalent of oxygen produced simultaneously.) This is 
something beyond the oxidative phosphorylation about 
which Lehninger writes (p. 136); that is, the oxidative 
phosphorylation would be the production of ATP by a 
recombination of TPNH and intermediate oxidant. It 
now appears that all three of these things can be re- 
produced equivalently at the same time. 

The apparatus which does this in the plant has been 
shown in three magnifications—the whole chloroplasts 
in the cells, the chloroplasts outside the cells, and, 
finally, the lamellar structure of the chloroplasts as 
seen by electron microscopy. Studies of this lamellar 
structure have resulted in a particular conclusion which 
is sufficiently general to be stated; namely, the chloro- 
plast lamellae seem to be (no matter what plant cell is 
investigated) disk-like in character; they seem to be 
connected at the edges to form a hollow disk—this is 
the lamella. The lamellae are quite long, about 2000 Å


FIG. 19. Ultrastructure of chloroplasts [from E. Steinman
and F. S. Sjöstrand, Exptl. Cell Research 8, 15 (1955)].

FIG. 18. Photomicrograph of spinach chloroplasts.


in spinach chloroplasts. The lamellae do not appear in 
the chloroplast in the absence of chlorophyll or proto- 
chlorophyll. If, in some way, one prevents either the 
formation of protochlorophyll or of chlorophyll, one 
prevents the appearance of well-developed lamellae. 
Protochlorophyll alone will induce, in cells which are
normally capable of making them, structures which
look like these lamellae.

The possible function of this lamellar structure of the 
chloroplasts is discussed now. The basic problem of 
photosynthesis can be reduced to the problem of con- 
verting a 35- to 40-kcal quantum of energy into some 
chemical potential. In order to do this, one presumably 
has to find a reaction which will take up 35 kcal at one
time. The products of this reaction must not back-react. 
The 35 kcal are a great driving force for the back- 
reaction, so there must be some mechanism provided 
in the apparatus to prevent it. 
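
The size of the quantum can be checked with a short calculation. The sketch below assumes that the 35 to 40 kcal figure refers to one mole of quanta; the wavelengths it prints are derived for illustration and are not stated in the text, and the function names are hypothetical.

```python
PLANCK = 6.62607015e-34      # Planck constant, J s
LIGHT_SPEED = 2.99792458e8   # speed of light, m / s
AVOGADRO = 6.02214076e23     # particles per mole
JOULES_PER_KCAL = 4184.0


def kcal_per_mole(wavelength_nm):
    """Energy of one mole of quanta of the given wavelength, in kcal."""
    joules_per_quantum = PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)
    return joules_per_quantum * AVOGADRO / JOULES_PER_KCAL


def wavelength_nm(kcal):
    """Wavelength (nm) whose quanta carry the given energy per mole."""
    meters = PLANCK * LIGHT_SPEED * AVOGADRO / (kcal * JOULES_PER_KCAL)
    return meters * 1e9


print("700 nm light:", round(kcal_per_mole(700.0), 1), "kcal per mole of quanta")
print("40 kcal corresponds to about", round(wavelength_nm(40.0)), "nm")
print("35 kcal corresponds to about", round(wavelength_nm(35.0)), "nm")
```

On this reading the quoted range corresponds to quanta at the red and near-infrared end of the spectrum.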

There are a number of other difficult requirements 
which must be fulfilled in this quantum-conversion 
process with respect to the time constants involved. For
example, following the absorption of the quantum,
there must be a very efficient way in which the excited
state of chlorophyll can be converted very quickly into
a long-lived chemical potential because of the efficiency
of the over-all process, regardless of whether one believes
the maximum efficiency to be 30 or 60%.

FIG. 20. Hypothetical scheme for light-energy
utilization on chloroplasts.

There are a number of approaches to this problem 
which are based upon ordinary statistical solution 
photochemistry. In the past, I have looked for reactions 
unique to chlorophyll that might conceivably be used 
to store this 35 kcal of energy, such as reduction of 
chlorophyll to bacteriochlorophyll (that is, adding two 
more hydrogens to the chlorophyll molecule). Perhaps 
the process can be accomplished in the reverse manner, 
taking off two hydrogens from the chlorophyll molecule, 
making protochlorophyll, and hanging the hydrogen 
atoms onto something else. These are possible reactions
of chlorophyll.

As a matter of fact, our laboratory demonstrated both 
of these reactions some years ago. More recently, and 
more elegantly, they have been demonstrated by 
Krasnovskii⁵ in systems that are more nearly related
to those which one finds in the living organism. 

As a result of a variety of requirements, as a result 
of the recognition of this highly organized apparatus in 
which chlorophyll occurs in the chloroplast, and as a 
result of the failure to solve the problem with solution 
photochemistry, our group has turned to the notions of
cooperative phenomena of organized systems such as
those which are represented by barrier-layer cells in
physics. Our group is trying to visualize how a lamellar
structure such as this conceivably might be an unsymmetrical
layer in which one could generate, by the absorption
of light, an oxidant and a reductant, on opposite
sides of the layer, so that they could not back-react
and could persist for a long period of time.
[...] (reductant and oxidant) should live [...]
oxygen. I should like to present a proposal which fulfills
all of the necessary requirements of the molecular inter- 
actions together with the need for conductivity (elec- 
trical conductivity) in certain parts of the lamellae and 
the consequent separation of charges. 

The basic proposal is given in Fig. 20 which suggests 
how these lamellae achieve this energy conversion. 
Chlorophyll in the ground state absorbs light which 
brings it to its lowest singlet excited state. The excited 
state can move around among the chlorophyll molecules 
by resonance transfer (exciton migration) until a point
is reached where ionization occurs. Then charge separa- 
tion can take place. The exciton can be visualized as a 
charge-pair which cannot move separately—a positive 
charge and a negative charge which must move together. 
When a suitable point in the chlorophyll lattice is 
reached where the charges can be uncoupled so that 
they can move separately, there is a conduction band. 
The electrons can move in one direction and the holes, 
or positive charges, in another. The electrons and holes 
move around until they find suitable places of lower 
potential energy into which they fall, and there sit for 
times sufficiently long so that suitable chemicals can 
come up and take off electrons, on the one hand, and 
the positive holes, on the other. This leads to chemical 
reactions which then produce stable chemicals such as a 
pyridine nucleotide and perhaps hydrogen peroxide, or 
something else of that sort, ultimately going on to the 
final products. 

With this concept, consider how the structure of the 
lamella may be interpreted in terms of the molecular 
constitution. It is suggested that this layer is made up 
of at least four components [Fig. 21(b)]. The protein
enzymes involved in carbon-dioxide reduction are on 
the outside of the disk. The protein enzymes on the 
inside of the disk are involved in oxygen evolution. The 
separation of the two processes (carbon-dioxide reduction 
and oxygen evolution) is achieved by a layer of chloro- 
phyll packed in the characteristic aromatic way. This 
is a very characteristic pattern of packing. The aromatic 
rings do not pile flat on themselves; they lie at an angle, 
approximately 45° to the stacking axis. This type of 
packing is suggested for chlorophyll. 

Figure 21(a) represents chlorophyll molecules tipped 
this way. Packed between them are carotenoids and the 
phospholipids. The proposal is that, after absorption, 
the exciton can migrate around among a few of these 
chlorophyll molecules to find a suitable point of ioniza- 
tion where the electrons may move in one direction and 
the positive holes in the other. Thus, one side leads to 
oxygen production, and the other to the reduction 
of carbon. 

What kind of experimental evidence might detect this 
kind of system? Electrodes cannot be placed on these 
lamellae; they are too small. But one part of this scheme 
is susceptible to experimental observation—namely, the 

FIG. 21. Schematic representation of possible molecular structure for a lamella.


these trapped electrons are single, trapped electrons and, 
therefore, are detectable by paramagnetism. Our labo- 
ratory set out to search for photoinduced paramag- 
netism in the chloroplasts. Figure 22 shows the results 
of that search.⁶⁻⁸ This is an illustration of electron spin-
resonance signals for illuminated whole spinach chloroplasts
at 25°C and at −150°C. (Similar signals, at
least at room temperature, were reported from the 
St. Louis laboratory by Townsend.) The fact that one 
can get the signals at −150°C, either in chloroplasts or
in algae, indicates that their production is not an enzymatic
process.

FIG. 22. Light signals from whole spinach chloroplasts.

The next question that may be asked is: “How fast
can the signals be produced at −150°C as compared to
25°C?” This is shown in Fig. 23. At 25°C, when the lights
are turned on, the signals grow more rapidly than the
instrument can follow them. At −150°C, the signals
grow equally rapidly. The difference lies in the rate of
decay of the signals. They have a complex decay,
partly rapid and partly slow. At room temperature, the
decay is rather rapid. At −150°C, there may be a rapid
decay, but most of it is slow. This at least eliminates 
the possibility that the signals result from enzymatic 
formation. The questions remain, could the signal result 
from a triplet state—that is, a paramagnetic excited 
chlorophyll—or could the signal be the result of a 
[...]
questions are discussed in the next article.

FIG. 23. Signal growth and decay time curves of whole
spinach chloroplasts at 25°C and at −150°C.

BIBLIOGRAPHY


1 V. Moses and M. Calvin, Proc. Natl. Acad. Sci. U. S. 44, 260
(1958).

3 E. Steinman and F. S. Sjöstrand, Exptl. Cell Research 8, 15
(1955).

4 Brookhaven National Laboratory Biology Conference No. 11
(June, 1958).

5 A. A. Krasnovskii, J. chim. phys. (to be published).

6 M. Calvin and P. B. Sogo, Science 125, 499 (1957).

7 P. B. Sogo, N. G. Pon, and M. Calvin, Proc. Natl. Acad. Sci.
U. S. 43, 387 (1957).

8 P. B. Sogo, M. R. Jost, and M. Calvin, Radiation Research
(to be published).


REVIEWS OF MODERN PHYSICS, VOLUME 31, NUMBER 1, JANUARY, 1959

Free Radicals in Photosynthetic Systems* 


MELVIN CALVIN 
Department of Chemistry, University of California, Berkeley 4, California


ONE bit of evidence introduced in the preceding
paper concerned the possibility that the quantum-
conversion act of photosynthesis might be a production 
of unpaired electrons which were trapped successively 
and then handed down to other acceptors (chemical 
acceptors) to do their job by ultimately reducing 
carbon dioxide. 

That particular evidence was not explained. It simply 
was stated that there was evidence for unpaired elec- 
trons. The nature of this evidence now may be explained. 
The curves shown previously are known as electron
spin-resonance absorption spectra. This article describes 
electron spin resonance, how it may be used for the 
detection of unpaired electrons in biological systems, 
how it has been used, and how it further might be used 
to identify the nature of the unpaired electrons that 
might and do occur in biological systems. 

Electron spin resonance is another spectroscopic 
method. Zavoisky¹ was the first to make use of it to
detect unpaired electrons in physical systems. It is based 
on the principle that an electron has a spin giving it a 
magnetic moment such that, when placed in a magnetic 
field, the electron orientates itself with respect to the 
magnetic field in certain specific directions. In the case 
of the electron, the spin is said to be one-half of a unit, 
and this leads to only two possible orientations of the 
electron in an external field—with and against the field. 

Figure 1 shows the diagrammatic representation of 
this situation. On the left is the case in which the elec- 
trons are in between the pole faces of a magnet, but 
the electric current is not flowing yet and the electron 
magnetic moments are arranged randomly. When the 
field is turned on (as on the right), some of the electrons 
orientate themselves with the external field, and some
against it. These two orientations do not have the same
energy. The energies are designated as E₁ and E₂ in the
figure, and the difference between them is equal to the
product of the magnetic moment μ₀, the gyromagnetic
ratio g₀, and the value of the external magnetic field H₀.
This difference in energy can be expressed in terms of
the frequency as hν, and for an electron in a field of about
3000 gauss, the wavelength corresponding to this
transition is about 3 cm.

Fig. 1. Energy states of free electrons in an external magnetic field.
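The resonance condition just described can be checked numerically. The following is a minimal sketch, assuming the free-electron g value of 2.0023 and modern SI values for the constants (neither is given in the text); the function names are illustrative only.

```python
# Resonance condition h*nu = g * mu * H for a free electron.
# Constants are modern SI values, an assumption not taken from the article.
PLANCK_H = 6.62607015e-34          # J s
BOHR_MAGNETON = 9.2740100783e-24   # J/T
LIGHT_SPEED = 2.99792458e8         # m/s
G_FREE_ELECTRON = 2.0023           # assumed free-electron g factor


def resonance_frequency_hz(field_gauss, g=G_FREE_ELECTRON):
    """Frequency nu that satisfies h*nu = g * mu * H for a field in gauss."""
    field_tesla = field_gauss * 1.0e-4   # 1 gauss = 1e-4 tesla
    return g * BOHR_MAGNETON * field_tesla / PLANCK_H


def resonance_wavelength_cm(field_gauss, g=G_FREE_ELECTRON):
    """Free-space wavelength (cm) of the radiation absorbed at resonance."""
    return 100.0 * LIGHT_SPEED / resonance_frequency_hz(field_gauss, g)


if __name__ == "__main__":
    nu = resonance_frequency_hz(3000.0)
    lam = resonance_wavelength_cm(3000.0)
    print(f"nu = {nu / 1e9:.2f} GHz, wavelength = {lam:.2f} cm")
```

Under these assumptions a field of 3000 gauss corresponds to a frequency of roughly 8.4 kMc/sec, in line with the "about 3 cm" quoted in the text.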

One may observe unpaired electrons by irradiating them
with electromagnetic energy of a wavelength corresponding
to the transition, and by watching for the
absorption of this characteristic frequency, ν, when
the magnetic field, H₀, is varied. One then determines
the magnetic field at which absorption occurs.

Figure 2 indicates the manner in which the experi- 
ment is done, showing the 3-cm generator and the ex- 
ternal magnetic field. In the presence of the magnetic 
field, some of the electrons are oriented parallel with 
the field, some antiparallel. There will be more in the 
lower (parallel) state than in the upper (antiparallel), 
so there results a net energy absorption as transitions 
between states occur. 
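How small the excess in the lower state is can be estimated from the Boltzmann factor. This is a sketch under stated assumptions: thermal equilibrium between the two spin states, a 3-cm frequency of about 8.4e9 cycles/sec, and room temperature (300°K); none of these numbers is taken from the text.

```python
import math

PLANCK_H = 6.62607015e-34     # J s
BOLTZMANN_K = 1.380649e-23    # J/K


def population_ratio(frequency_hz, temperature_k):
    """N_upper / N_lower = exp(-h*nu / kT) for a two-level spin system."""
    return math.exp(-PLANCK_H * frequency_hz / (BOLTZMANN_K * temperature_k))


def fractional_excess(frequency_hz, temperature_k):
    """(N_lower - N_upper) / (N_lower + N_upper): the net absorbing fraction."""
    ratio = population_ratio(frequency_hz, temperature_k)
    return (1.0 - ratio) / (1.0 + ratio)


if __name__ == "__main__":
    for temperature in (300.0, 77.0):
        excess = fractional_excess(8.4e9, temperature)
        print(f"T = {temperature:5.1f} K  fractional excess = {excess:.2e}")
```

The net absorption comes from an excess of well under one spin in a thousand at room temperature, which is why the sensitivity of the detector matters and why cooling increases the signal from a fixed number of spins.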

Most of the apparatus in use today does not record 
direct absorption, recording instead the derivative of 
this absorption. The two graphs in the preceding paper 
(Figs. 22 and 23) are these derivatives of absorption 
rather than the absorption itself. 
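The relation between an absorption line and its recorded derivative can be sketched as follows. A Lorentzian line shape is assumed here purely for illustration (the text does not specify a shape), and the field values are hypothetical.

```python
def lorentzian(field, center, half_width):
    """Absorption at `field` for a Lorentzian line of unit peak height."""
    return half_width ** 2 / ((field - center) ** 2 + half_width ** 2)


def lorentzian_derivative(field, center, half_width):
    """First derivative of the absorption with respect to field."""
    delta = field - center
    return -2.0 * delta * half_width ** 2 / (delta ** 2 + half_width ** 2) ** 2


if __name__ == "__main__":
    center, half_width = 3000.0, 5.0   # gauss, hypothetical values
    for field in (2990.0, 2997.0, 3000.0, 3003.0, 3010.0):
        print(field,
              round(lorentzian(field, center, half_width), 4),
              round(lorentzian_derivative(field, center, half_width), 5))
```

The derivative is positive below the line center, passes through zero exactly at the absorption maximum, and is negative above it, which is the characteristic shape of a derivative record.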

This experiment thus provides a method for the specific detection of unpaired electrons (since paired electrons quench each other's magnetic moments). It provides a highly sensitive device, more sensitive in general than that provided by any mass susceptibility measurement, because the latter always must correct for the diamagnetic material in which the electron is buried. In recent years, this device has been used more and more in chemistry and now is being used on biological materials to detect the presence of free radicals and to determine whether or not free radicals and unpaired electrons are participants in biochemical transformations.

Fig. 2. Absorption of 3-cm waves resulting from transition of free ...

Fig. 3. ESR spectra of nitrosyl disulfonate [from Varian Associates Technical Information Bulletin, No. 2, 11 (1958)].

The measurements obtained by this method not only can reveal how many electrons there are, but also, by the nature of the absorption record, may reveal something of the environment in which the electrons are situated. One recalls that the energy of the transition is dependent upon the field that the electron sees; that is, the external field due to the magnet, plus any magnetic field which the molecule may cause. The molecule itself is made up of nuclei and other electrons. Most of the other electrons, of course, are paired off, but the nuclei may have magnetic moments. Some have magnetic moments that produce magnetic fields with which the unpaired electrons may interact. If the odd electron sees not only the external magnetic field, but also that of the molecule, a spectrum similar to that shown in Fig. 3 from Varian Associates² may be observed. In this case, the molecule is one which has in it a nitrogen atom, which has a nuclear spin of one unit. This means that the nitrogen nucleus can take up one of three orientations in the external field: parallel, antiparallel, and orthogonal. The electron near the nitrogen atom actually may be subjected to three different magnetic fields. In
one configuration, the magnetic field of the nucleus is 
added to the external field. In another, it is subtracted 
from the external field. In the third case, there is no 
effect on the external field. So instead of one there 
should be three peaks. Figure 3 shows the three peaks 
attributable to the interaction with the spin of the 
nitrogen nucleus: one in the middle, and one on each 
side resulting from the three possible orientations of 
the nitrogen nucleus. 
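The counting argument for a single nucleus can be written out as a short sketch. It assumes the usual rule that a nucleus of spin I has 2I+1 orientations; the splitting of 13 gauss used in the example is a hypothetical value chosen for illustration, not one read from Fig. 3.

```python
from fractions import Fraction


def nuclear_orientations(spin):
    """Allowed projections -I, -I+1, ..., +I for a nucleus of spin I."""
    spin = Fraction(spin)
    count = int(2 * spin) + 1
    return [-spin + k for k in range(count)]


def line_fields(center_gauss, splitting_gauss, spin):
    """Field positions of the lines produced by one nucleus of the given spin."""
    return [center_gauss + m * splitting_gauss for m in nuclear_orientations(spin)]


if __name__ == "__main__":
    # Nitrogen (spin 1): three lines. The 13-gauss splitting is hypothetical.
    print(line_fields(3000.0, 13.0, 1))
    # A single proton (spin 1/2): two lines.
    print(line_fields(3000.0, 13.0, Fraction(1, 2)))
```

One projection adds to the external field, one subtracts from it, and the middle one leaves it unchanged, which reproduces the three peaks described for the nitrogen nucleus.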

Nitrogen is not the only nucleus having a magnetic moment. One of the most common in organic substances is hydrogen. Hydrogen has a spin of a half, and it also can influence the spin spectrum of an odd electron in the case where molecules are constructed such that the electron interacts with these protons.

Fig. 5. Paramagnetic-resonance absorption spectrum of bisdibenzene chromium cation.

Fig. 6. Paramagnetic-resonance absorption spectrum of perinaphthenyl radical.

Figure 4, from work by Wertz and Vivo,³ shows the case for tetrachlorohydroquinone (lower right). The resonance field is not modified by any of the atoms in the molecule since their nuclei are all of spin zero (no magnetic moment), and thus only one line obtains. If, however, one removes one of the chlorines and replaces it by a proton, and if the electron can interact with that proton, the proton, having two possible orientations in the magnetic field, splits the resonance into two lines.

... change the external magnetic field if two protons are present? Both of the magnetic moments may be aligned with the field, both against it, or one with and one against the field. There are, therefore, three different ways in which these two protons can be arranged with respect to the external magnetic field. The third situation is twice as probable as either one of the other two, since either proton may be the one opposed to the field; in both of these cases, they produce no net modification of the external field. Therefore, the middle peak should be twice as high as either of the two outside ones. Indeed, this is the case. If three protons are present, there are four possible arrangements, and one gets four peaks; with four protons, there are five possible arrangements, and the amplitude ratios are the ordinary binomial coefficients, as seen in the figure.

Fig. 7. ESR spectra from coenzyme-A dehydrogenase plus substrate [from H. Beinert, EPR Talk No. 10, Varian Associates ...].

Fig. 8. Dark signal from Rhodospirillum methanol extract at room temperature.
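The binomial amplitude ratios described above for equivalent protons can be generated directly. This is a minimal sketch, assuming all n protons interact equally with the odd electron; the function name is illustrative.

```python
from math import comb


def equivalent_proton_intensities(n_protons):
    """Relative peak heights for n equivalent spin-1/2 nuclei.

    Peak k counts the arrangements with k protons aligned with the field,
    which is the binomial coefficient C(n, k).
    """
    return [comb(n_protons, k) for k in range(n_protons + 1)]


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, "proton(s):", equivalent_proton_intensities(n))
```

For two protons the middle peak is twice the outer ones, and n protons give n+1 peaks, as stated in the text.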


Not only may one determine, therefore, the number of odd electrons present in a system, but also one may ascertain something about the kind of environment in which they are. Figure 5 shows the compound, dibenzene-chromium cation, which has one unpaired electron. The question of interest is the location of this unpaired electron. It was found, in our laboratory, that this unpaired electron of the chromium actually can see the ten protons of the two benzene rings of this compound associated with the chromium atom. Ten protons have eleven possible arrangements, and there should be eleven peaks in the curve. Those on the outside are very weak, because the probability of having all ten protons oriented in the same direction is very small as compared with the mixed configurations. One can determine, therefore, something about the environment of the transition elements by this method.

Fig. 9. Light and dark signals from Chlorella methanol extracts in two atmospheres at room temperature.
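The weakness of the outermost peaks for ten protons can be made quantitative. This sketch assumes ten equivalent protons, as in the text, and simply weights each peak by its share of the 2^10 possible arrangements.

```python
from fractions import Fraction
from math import comb


def line_weights(n_protons):
    """Fraction of all 2**n proton arrangements that falls in each peak."""
    total = 2 ** n_protons
    return [Fraction(comb(n_protons, k), total) for k in range(n_protons + 1)]


if __name__ == "__main__":
    weights = line_weights(10)
    print("number of peaks:", len(weights))
    print("outermost peak :", weights[0])
    print("central peak   :", weights[5])
    print("central/outer  :", weights[5] / weights[0])
```

Each outermost peak carries 1 part in 1024 of the total intensity, against 252 parts for the central peak, so the outer lines are easily lost in the noise.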

As a final chemical example, consider a perinaphthalene
radical, a beautifully symmetrical radical (Fig. 6).
During the preparation of the perinaphthalene, it acci- 
dentally became oxidized, and a very formidable ab- 
sorption spectrum was obtained. The odd electron sees 
two different kinds of protons. On careful examination, 
one sees that there are seven groups of lines—each group 
a quadruplet. This means that there are six protons of 
one kind and three of another. The electron interacts 
more strongly with the six-proton group than it does 
with the three-proton group. 
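The pattern of seven groups of four lines follows from combining the two sets of protons. The sketch below builds a stick spectrum for six equivalent protons and three equivalent protons; the two splitting constants (7.0 and 2.0, arbitrary units) are hypothetical and were chosen only so that the larger splitting belongs to the six-proton set and the groups do not overlap.

```python
from math import comb


def two_group_stick_spectrum(n_a, split_a, n_b, split_b):
    """Return {field offset: relative intensity} for two sets of equivalent protons."""
    spectrum = {}
    for k in range(n_a + 1):        # k protons of set A aligned with the field
        for j in range(n_b + 1):    # j protons of set B aligned with the field
            offset = split_a * (k - n_a / 2) + split_b * (j - n_b / 2)
            weight = comb(n_a, k) * comb(n_b, j)
            spectrum[offset] = spectrum.get(offset, 0) + weight
    return spectrum


# Hypothetical splittings in arbitrary units (not measured values).
spectrum = two_group_stick_spectrum(6, 7.0, 3, 2.0)

if __name__ == "__main__":
    for offset in sorted(spectrum):
        print(f"{offset:+6.1f}  {spectrum[offset]}")
```

Six protons give seven groups with weights 1:6:15:20:15:6:1, and each group is split by the three protons into a 1:3:3:1 quadruplet, for 28 lines in all.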

How much of this can be used for the investigation of biological free radicals? Beinert⁴ has been working with a fatty acyl CoA dehydrogenase which he found produced a transient, colored intermediate when mixed with its substrate. This, he proposed, was a free-radical intermediate, and when it was examined at Stanford University and at Varian Associates, an electron spin-resonance ...

Fig. 10. Light and dark signals from Chlorella methanol extract at two temperatures.

Fig. 11. ESR signals from Rhodospirillum rubrum, 5 min continuous illumination.

... distinguished from that shown in the preceding paper (Fig. 22)? Firstly, radicals of this kind do not fade at low temperatures. They are frozen in. That is exactly what happens upon cooling the system of chloroplasts as described earlier. The fact is that structured systems can be found wherein the signal can be induced at a low temperature, but wherein it also fades at a low temperature at a very high rate.

The occurrence of a signal owing to an oxidation 
mechanism alone is illustrated by Fig. 8, which shows 
the behavior of a methanolic extract of Rhodospirillum 
when exposed to oxygen and nitrogen, alternately. 
When a similar methanolic extract of Chlorella is
illuminated first in an oxygen and then in a nitrogen
atmosphere, the increased signal of the former (Fig. 9)
is considered the sum of the contributions of an oxidative
process, a photooxidative process, and an odd electron
produced and trapped in a free radical. The relative
contributions of these processes are apparent from the
figure. Figure 10 shows how cooling, even in oxygen,
reduces the signal amplitudes, and one sees that cooling
to -145°C has nearly eliminated the dark spectrum.

Figure 11 shows the spin-resonance signal for Rhodospirillum after 5 min of illumination at the indicated temperatures. At 25°C, it is the next smallest signal; at -55°C, it is larger; and at -150°C, it is larger yet. When the temperature is further reduced to -160°C, the intensity falls to the smallest shown. The time behavior is shown in Fig. 12. The spectrometer is set on the peak of the signal, and one may observe the rates of rise and of decay of each of these signals at each temperature as a function of time. The room-temperature signal is the next smallest shown, and it rises as rapidly as the instrument can respond. This signal decays immediately after illumination ceases, to within the time constant of the apparatus. There is, however, no rapid decay at -15°C; at -55°C, there is a very rapid rise, a further slow rise, followed by a small component of rapid decay, and a much longer slow decay. At -160°C, all that remains is very rapid rise and very rapid fall. Thus, I believe the possibility has been eliminated that these free radicals are being produced directly by illumination. This behavior must be the result either of untrapped carriers or of trapped carriers in well-shielded traps.

Fig. 13. Hypothetical scheme for light-energy utilization on chloroplasts.

Fig. 15. Proposed scheme for various photochemical processes in photosynthesis.

How can this whole sequence of events be accounted for? Figure 13 (discussed in the previous article) indicates the first act, the absorption of light to produce an exciton. The exciton then is converted into conductive carriers, these into chemical radicals, and the chemical radicals lead to stable chemicals. To account for the data shown in Fig. 12, it is necessary to presume that each one of these successive acts has a higher temperature coefficient. The first step has no temperature coefficient, the second may have a very slight one, the third a higher one, and so on. At room temperatures, the energy is transported into the chemical radicals, but these decay very rapidly into stable chemicals by the usual enzymatic process. The possibility that a triplet state exists is eliminated by the nature of the signal. A triplet signal would be a double band very characteristic of two unpaired electrons in the same molecule. A similar sequence of temperature effects is seen in the luminescence intensity curve (Fig. 14) and similarly is accountable, presuming the integrated intensity to be a measure of the back-reaction.

Another representation of the same scheme is shown in Fig. 15. The chlorophyll ground state is represented by a band instead of by a single line, as a result of the interaction of the chlorophyll molecules in these arrays previously shown in electron micrographs, and the chlorophyll excited singlet state actually would be a broader band. The triplet state is shown as a rather narrow band overlapping the excited sequence. The triplet band has not been put directly in line because triplet light emission is not observed.

The foregoing scheme gives a working hypothesis to account for some of the problems outlined at the ..., namely, that the initial 35 kcal of energy be converted into useful chemical potential.


1. INTRODUCTION


BIOPHYSICAL investigations on a molecular level
turn naturally to the phenomena of energy trans- 
fer and electron transfer for the interpretation of the 
behavior of the ordered structures found in biological 
systems. The famous Korányi lecture!” of Szent-Györgyi 
has done much to stimulate interest in these phenomena, 
although earlier recognition of their importance, e.g., in 
the photosynthetic system, had been made. 

The general tendency in this field at present seems to 
be the adoption without caution of the ideas and term- 
inology of the solid-state physics of atomic and ionic 
semiconductors, to the interpretation of biological 
systems. Since the latter are usually molecular aggre- 
gates bound weakly by van der Waals forces (molecular 
lamellae) or very strongly by overlap forces (intra- 
molecular forces in proteins), it is worthwhile to examine 
critically the differences between the atomic and molec- 
ular array problems, to see how valid is the application 
of the ideas of solid-state physics to biological 
mechanisms. 

Since most of the treatments of the exciton theory are 
presented in fairly intricate mathematical terms, an 
attempt is made here to give a simple but accurate 
account of excitation-energy transfer as well as of elec- 
tronic conduction in atomic and molecular aggregates. 


2. EXCITONS 


The theory of excitons in atomic lattices was de- 
veloped by Frenkel,** and elaborated later by other 
physicists. ® The application of exciton theory to 
molecular crystals was first made by Davydov." ? A 
semiclassical treatment of excitons in van der Waals 
pigment polymers was given by Förster.” Qualitatively 
the same idea was also used by Förster to interpret the 
spectra of pigment dimers, and has been treated 
quantum mechanically by Simpson’s group.” The ex- 
tension to excitons in pigment aggregates of various 
geometries has been made by McRae and Kasha.’ 
The problem of energy transfer between pairs of randomly oriented molecules was treated quantum mechanically by Förster. All of the foregoing studies make use of the same physical basis, which is described in the following. Several papers of a descriptive nature on the topic of energy transfer have been published, mainly from the phenomenological viewpoint.

* This study was made [...] in the summer of 1958.

Three different definitions of the exciton have been 
given, illustrating different aspects of the idea, but not 
all of which seem equally suitable for a critical 
understanding: 

(1) Exciton states involve excitation of an assembly 
of atoms (or molecules) in concert, instead of localized 
excitation of individual species of the assembly. 

(2) An exciton can be described by a wave packet 
traveling through an assembly of atoms (or molecules), 
and arising from a superposition of exciton states 
(defined in the foregoing). 

(3) An exciton is a neutral excitation “particle,” 
consisting of an electron and a positive hole, traveling 
together through the lattice. 

Each of these definitions is next considered in order 
to clarify the picture of an exciton. 

For a discussion of the first definition of an exciton, 
consider a simple two-dimensional model of an atomic 
and a molecular lattice [Figs. 1(a) and 1(b)]. The 
atomic lattice is assumed to be an ionic one held to- 
gether by Coulombic forces (for simplicity, the positive 
and negative ions are not distinguished, the ions are 
assumed to be isoenergetic in their electronic states 
throughout, and overlap is assumed to be negligible in 
the ground state of the lattice). All but one of the ions 
are shown to be in their ground state, indicated schematically
by circles representing spherically symmetrical
orbitals.† The ion (2,2),‡ however, is shown with
an electron excited to a p-type orbital. This model of 
the excited lattice forms the basis of what is designated 
as a zeroth-order description, because it is not possible to 
localize excitation to one atom in such a system. Thus, 
atom (3,2) might equally well be excited, etc. Abstract- 
ing the second row of atoms for simplicity, the following 
zeroth-order descriptions could be given of singly excited 
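states (where the asterisk denotes excitation of atom $A_n$):

$$
\begin{aligned}
\text{(i)}   &= A_1^{*} A_2 A_3 A_4, \\
\text{(ii)}  &= A_1 A_2^{*} A_3 A_4, \\
\text{(iii)} &= A_1 A_2 A_3^{*} A_4, \\
\text{(iv)}  &= A_1 A_2 A_3 A_4^{*}.
\end{aligned}
$$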

† The biologist unfamiliar with the characteristic three-dimensional orbital wave functions of atoms and molecules will find it helpful to consult The Chemical Aspects of Light by E. J. Bowen [(Clarendon Press, Oxford, England, 1946), second edition], especially pages 68, 100-115, and 139.

‡ Row 2, column 2, from left upper corner.

Fig. 1. (a) Zeroth-order description of a section of an atomic lattice having one atom with an electron in its lowest excited orbital, the remaining atoms with electrons in their ground orbitals. (b) Zeroth-order description of a section of a molecular lattice with one molecule with an electron excited to its next highest orbital, the remaining molecules with electrons in their ground orbitals.


None of these is by itself a suitable description of the 
excited array, since they could (according to the as- 
sumption) all have the same energy, and if they could 
interact with each other, new states which are true 
stationary states in the quantum-mechanical sense will 
arise. These new states are the exciton states of the first 
definition; if only the foregoing four atoms were con- 
sidered, they would have the form 


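$$
\begin{aligned}
\mathrm{I}   &= \tfrac{1}{2}\,[\,\text{(i)} + \text{(ii)} + \text{(iii)} + \text{(iv)}\,], \\
\mathrm{II}  &= \tfrac{1}{2}\,[\,\text{(i)} + \text{(ii)} - \text{(iii)} - \text{(iv)}\,], \\
\mathrm{III} &= \tfrac{1}{2}\,[\,\text{(i)} - \text{(ii)} - \text{(iii)} + \text{(iv)}\,], \\
\mathrm{IV}  &= \tfrac{1}{2}\,[\,\text{(i)} - \text{(ii)} + \text{(iii)} - \text{(iv)}\,],
\end{aligned}
$$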


where the signs (see later) result from the requirement 
(of orthogonality) that the stationary states (I) to 
(IV) be completely independent. This statement would 
be true only if it were appropriate to assign an equal 
weighting to each of the zeroth-order states, i.e., for an 
infinite array. The present argument is artificial in the 
sense that it attempts to simplify the picture by ab- 
stracting four molecular units from a very large aggre- 
gate. It should be noted that each of the stationary 
exciton states involves all of the zeroth-order descrip- 
tions; that is, each of the stationary exciton states involves 
excitation of all of the atoms considered as locally excited. 
Thus, in the exciton states the excitation is delocalized. 
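The independence (orthogonality) requirement on states (I) to (IV) can be verified directly. This is a small sketch that treats the zeroth-order descriptions (i) to (iv) as an orthonormal basis, an assumption in the spirit of the four-atom abstraction used in the text, and represents each exciton state by its four coefficients.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Sign patterns of the coefficients of (i), (ii), (iii), (iv) in each exciton state.
EXCITON_SIGNS = {
    "I":   (1,  1,  1,  1),
    "II":  (1,  1, -1, -1),
    "III": (1, -1, -1,  1),
    "IV":  (1, -1,  1, -1),
}


def overlap(state_a, state_b):
    """Scalar product of two exciton states in the (i)..(iv) basis."""
    signs_a, signs_b = EXCITON_SIGNS[state_a], EXCITON_SIGNS[state_b]
    return sum((HALF * a) * (HALF * b) for a, b in zip(signs_a, signs_b))


if __name__ == "__main__":
    for a in EXCITON_SIGNS:
        print(a, [str(overlap(a, b)) for b in EXCITON_SIGNS])
```

Every state has unit overlap with itself and zero overlap with the other three, and every state contains all four locally excited descriptions with equal weight, which is the sense in which the excitation is delocalized.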

The second definition arises from the possibility, 
pointed out by Frenkel,** of construction of wave 
packets from the stationary exciton states. This is a 
useful aspect for discussions of electronic energy trans- 
fer in atomic and molecular systems. The properties of 
the wave packet can be calculated by quantum-me- 
chanical methods.” 

The third definition can be misleading, so its meaning 
is examined more fully. In Fig. 1(a), the atomic ion
(2, 2) is schematically shown to have an electron in a
p-type excited orbital. Both the excited atom and the 
ground-state atom are electrically neutral and electron- 
ically symmetrical. The excited atom is produced from 
a ground-state atom by the polarizing influence of the 
electric vector of the interacting light wave (the electric 
vector having been taken to be horizontal, and in plane). 
While the light wave impinges, a transitory dipole 
moment is induced in the atom, but no permanent 
charge separation, i.e., ionization, takes place. Thus, 
the definition of an exciton as an “electron-positive 
hole” combination can be meaningful largely in a 
figurative sense, since the exciton involves excited 
bound-electron states of the electron and its originating 
atom. To actually ionize the “pair” requires the addi- 
tional energy between the exciton level and the ioniza- 
tion limit. 

On the other hand, the transitory dipole induced by 
the light wave would be directly the cause of the 
interaction of the zeroth-order states, since all neigh- 
boring dipoles could interact with each other in a 
purely electrostatic way.§ Incidentally, the signs in the 
exciton states (I) to (IV) should not be taken to mean 
electron nodes (as in ordinary electronic wave func- 
tions), but nodes signifying the phase relation between 
the transitory dipole moments of the zeroth-order 
states (see Sec. 4). 

The molecular lattice of Fig. 1(b) can be discussed 
analogously. It is assumed that here van der Waals 
forces bind the molecules into ordered layers (crystals, 
or lamellae), and that negligible electron overlap is 
present in the ground and lowest excited states. The 
molecular axis is shown inclined at 45° to the layer 
direction ; for simplicity, a two-center molecular orbital 
is considered. The molecule shown excited (2, 2) has an 
electronic orbital node perpendicular to the molecular 
axis. The excited molecule is thus electrically neutral 
and symmetrical, and is produced from a ground-state 
molecule by a light wave of the right frequency with its electric vector oriented along the molecular axis. The identical symbolic description of the zeroth-order states and exciton states, used for the foregoing atomic lattice model, is applicable to this molecular model.

§ In the formal treatment of the theory, the expansion of the intermolecular interaction potential results in a series of terms, of which the dipole-dipole terms are the predominant ones for formally allowed electronic excitation.

From the foregoing discussion, it should be clear that 
exciton states cannot impart electrical conductivity. It 
is found experimentally that excitation of atomic lat- 
tices to exciton states produces no photoconduction of 
electrons (see Sec. 4). 


3. ORIGIN OF EXCITON BANDS AND 
CONDUCTION BANDS 
The origin and relation of exciton bands and conduc- 
tion bands may be clarified by a discussion of their 
dependence on interspecies distance. In Fig. 2 are given, 
schematically, energy vs interatomic (interionic) dis- 
tance (a) and energy vs intermolecular distance (b) for 
the energy levels of a crystal or aggregate taken as a whole.

Fig. 2. Schematic energy vs interspecies-distance curves for a crystal or aggregate taken as a whole. (a) atomic lattice, (b) molecular lattice or lamellar structure, showing onset of exciton bands and conduction bands. Exciton-band splitting exaggerated, and overlap interaction shown independent of dipole-dipole interaction (cf. Fig. 3).

The atomic array is assumed, as before, to be made up of ions with isoenergetic electronic states, brought together into a Coulombic lattice (positive and negative ions not distinguished). Since the Coulombic potential varies as 1/r, it would be effective at the greatest interatomic separations, and strong ground-state binding ... [Fig. 2(a)]. Next in distance dependence would be the long-range 1/r³ dipole-dipole interaction between ... . Depending upon the strength of the ..., each state of the isolated ion would split into an N-fold exciton band (where N equals the number of ions in the assembly). The exciton-band width is proportional to the intensity of the transition in the unit species, and depends upon the orientation of the transition dipole moments (besides the 1/r³ dependence, and a rather insensitive function of the number of unit species in the array). Finally, there is the extremely
short-range electron overlap interaction. This short- 
range characteristic of overlap forces arises from the 
exponential fall of electron orbital wave functions in the 
exterior of atoms (and molecules). In Fig. 2(a), it is 
assumed that negligible overlap is present in the ground 
state of the atoms, but that, in the excited state, over- 
lap may occur yielding a conduction band, because of 
orbital expansion upon high quantum-number excita- 
tion. If there is overlap in the ground state, a valence 
band is said to arise, analogous to the conduction band 
for the excited states. 
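The ordering of the three interactions by range can be illustrated numerically. This is only a sketch: distances are in arbitrary units, prefactors are set to one, and the exponential decay length of 1.0 for the overlap term is a hypothetical choice.

```python
import math


def coulomb(r):
    """Coulombic potential, falling as 1/r (prefactor set to one)."""
    return 1.0 / r


def dipole_dipole(r):
    """Dipole-dipole interaction, falling as 1/r**3 (prefactor set to one)."""
    return 1.0 / r ** 3


def overlap(r, decay_length=1.0):
    """Overlap interaction, falling exponentially (hypothetical decay length)."""
    return math.exp(-r / decay_length)


def falloff(interaction, r_near, r_far):
    """Factor by which an interaction shrinks on going from r_near to r_far."""
    return interaction(r_far) / interaction(r_near)


if __name__ == "__main__":
    for name, f in (("Coulomb", coulomb),
                    ("dipole-dipole", dipole_dipole),
                    ("overlap", overlap)):
        print(f"{name:14s} r: 5 -> 10 reduces it by a factor {falloff(f, 5.0, 10.0):.4f}")
```

On doubling the separation in this example the Coulomb term halves, the dipole-dipole term drops to one-eighth, and the overlap term drops by more than a factor of a hundred, so exciton splitting sets in at separations where overlap (and hence a conduction band) is still negligible.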

The point to be noted from the above is that, because 
of the long-range nature of the dipole-dipole interac- 
tion, nonphotoconducting exciton bands can exist 
discrete from the photoconduction bands. Neverthe- 
less, atomic exciton bands usually lie not far below the 
photoconduction band of atomic lattices. 

The molecular case has some qualitatively different features. However, it is well known that gaseous molecules exhibit converging series absorption in the vacuum ultraviolet (molecular Rydberg series) whose convergence limits represent the ionization potentials for various electrons in the molecule. One could anticipate that dipole-dipole interactions between the electrons in Rydberg levels would give rise to “atomic” exciton bands. These are indicated in the upper part of Fig. 2(b), analogous to the exciton bands of Fig. 2(a). In addition, a molecule during its formation from atoms [at r=∞ for molecular lattice; lower right corner of Fig. 2(b)] gives rise to excited (unfilled) molecular orbitals. For simplicity, only one filled and one excited orbital are indicated. In the formation of the molecular lattice, therefore, while the ground state becomes slightly lowered in energy through a weak van der Waals interaction, the molecular excited states of the assembly spread by dipole-dipole interaction into an N-fold exciton band (N equals the number of molecules in the assembly). Since molecular transitions are frequently very intense (oscillator strength f~1), a large band splitting is to be expected (see later). Moreover, since highly conjugated molecules (i.e., pigment or dye molecules) have rather low-energy excited states, exciton bands far from the conduction band should be observed.

Fig. 3. Energy-band diagram at equilibrium separation for a crystal or aggregate taken as a whole. (a) atomic lattice, (b) molecular lattice or lamellar structure.


4. PROPERTIES OF EXCITON BANDS


The electrically nonconducting characteristic of exciton bands has been referred to above, and the physical basis for this property has been discussed. Some additional characteristics of atomic exciton bands are now mentioned. It is typical of atomic lattices that exciton bands are usually observed just below the conduction band. The optical properties of, e.g., the alkali halide crystals conform roughly to a behavior described by Fig. 2(a), which for the equilibrium interatomic separation yields an energy-level diagram such as Fig. 3(a). The pure alkali halide crystals are properly described as electrical insulators rather than as semiconductors, as far as their electronic behavior is concerned. For example, the KCl crystal¹⁰ exhibits its first exciton band as a broadened line at 7.6 ev (ν̃~61000 cm⁻¹, λ~1620 A). Excitation of this band produces no photoconductivity. At 9.44 ev (ν̃~76000 cm⁻¹, λ~1300 A) photoconduction is observed. Thus, both types of bands are observed in the vacuum ultraviolet not far from each other. No contribution to dark conductivity would be expected at ordinary temperatures from this high-energy conduction band.

Another characteristic of atomic exciton bands is their occurrence in sets which fall into a converging series.¹⁰ This behavior could be anticipated from their atomic character; i.e., the orbitals which would be used in the zeroth-order description of an atomic-lattice exciton are essentially hydrogenic in character, although extending over many lattice points. The series formulas are, of course, modified by the dielectric constant of the lattice, lattice interactions, etc.¹⁰
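The energies quoted for the KCl exciton and photoconduction bands can be cross-checked against the wave numbers and wavelengths given with them. The sketch below assumes the conversion factor 8065.54 cm⁻¹ per ev (a modern value, not taken from the article); the function names are illustrative.

```python
EV_TO_WAVENUMBER = 8065.54   # cm^-1 per ev (assumed modern conversion factor)


def wavenumber_cm(energy_ev):
    """Wave number in cm^-1 corresponding to a photon energy in ev."""
    return energy_ev * EV_TO_WAVENUMBER


def wavelength_angstrom(energy_ev):
    """Wavelength in angstroms; 1 cm = 1e8 A."""
    return 1.0e8 / wavenumber_cm(energy_ev)


if __name__ == "__main__":
    for label, energy in (("first exciton band", 7.6),
                          ("photoconduction onset", 9.44)):
        print(f"{label:22s} {energy:5.2f} ev  "
              f"{wavenumber_cm(energy):8.0f} cm^-1  "
              f"{wavelength_angstrom(energy):6.0f} A")
```

Both pairs of figures in the text agree with this conversion to within the rounding implied by the "~" signs.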

The band width for molecular exciton cases has been calculated to be approximately 0.2 ev (Δν̃~1600 cm⁻¹) in linear pigment assemblies for an oscillator strength of 1 for the isolated molecule transition, at intermolecular distances of 10 A, and in a suitable geometry. Experimental studies indicate that this is the order of magnitude observed. For weaker optical transitions, or for greater intermolecular separations, smaller exciton-band widths can be expected.

Fig. 4. Selection rules for exciton levels in linear arrays (cf. reference 15), for a hypothetical 4-unit molecular array. (a) head-to-tail, (b) card-pack, (c) alternate-translational arrays of transition dipoles. Dashed line: forbidden exciton level; solid line: allowed exciton level. Diagram shows dipole-dipole splitting of zeroth-order excited states.

The foregoing comments on exciton-band width are
not meant to imply that optical transitions of isolated
molecules are necessarily to be broadened in the lamellar
aggregates. On the contrary, one of the characteristics
(references 14 and 15) of exciton bands is the restriction of
optical transitions to only certain components of an exciton
band. For example, once more consider the abstracted
linear array of four atoms of Fig. 1(a). The exciton
states (I) to (IV) can be depicted by vector models
representing the dipole moments induced by the light
wave with electric vector in the direction of the array,

    (I)    →  →  →  →
    (II)   →  →  ←  ←
    (III)  →  ←  ←  →
    (IV)   →  ←  →  ←

The dipole-dipole interaction would give (I) the
lowest energy and (IV) the highest. Also, the resultant
transition moment (vector sum of the individual moments)
is finite in the case of (I), but zero for the remaining
three states. Thus, the exciton band for this
case would be observed to have the structure shown in
Fig. 4(a), with only the lowest exciton state permitted
in optical absorption. Thus, for this example, a long-wavelength
or red shift would be observed in the linear
aggregate compared with the isolated atomic species,
rather than a band broadening. Cases 4(b) and 4(c)
correspond to other geometries for a linear array, the
former for a card-pack orientation of dipoles, and the
latter for an alternate translational orientation. The
staggered arrangement of naphthalene molecules in the
crystalline state also leads to this last result. For an
N-fold linear array, even though the exciton band will
have N components, selection rules analogous to
those illustrated in Fig. 4 still apply. Obviously, the
[...] state is not affected by the exciton process.
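
The selection rule for the head-to-tail and card-pack arrays can be illustrated numerically. The following Python sketch is not the treatment used in this paper; it assumes an open chain of identical transition dipoles with nearest-neighbor coupling only, for which the exciton states are known in closed form. The function name, the sign convention for `coupling` (negative for head-to-tail, positive for card-pack), and the unit transition moment per molecule are assumptions made for illustration.

```python
import math

def exciton_levels(n_units, coupling):
    """Return (energy_shift, relative_intensity) pairs, sorted by energy.

    Assumed model: open linear chain, nearest-neighbor coupling only.
    coupling < 0 mimics head-to-tail dipoles, coupling > 0 card-pack.
    Each molecule carries a unit transition moment along the same axis.
    """
    levels = []
    norm = math.sqrt(2.0 / (n_units + 1))
    for k in range(1, n_units + 1):
        theta = math.pi * k / (n_units + 1)
        energy = 2.0 * coupling * math.cos(theta)
        # Resultant transition moment: sum of the signed site amplitudes.
        moment = sum(norm * math.sin(theta * site)
                     for site in range(1, n_units + 1))
        levels.append((energy, moment ** 2))
    return sorted(levels)

for label, v in [("head-to-tail", -1.0), ("card-pack", +1.0)]:
    print(label)
    for energy, intensity in exciton_levels(4, v):
        print(f"  shift {energy:+.3f}  intensity {intensity:.3f}")
```

In this model the intensities sum to the number of molecules, and nearly all of the intensity falls in the lowest level for negative coupling and in the highest level for positive coupling. The idealized vector diagrams in the text assign all of it to a single state; the small residual intensity that the sketch gives to one other level is a feature of the open-chain approximation.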
Finally, in the special case of dye-molecule aggregates,
an additional result of exciton-band formation [...].
This is the enhancement of triplet [...]
accompanying molecular aggregation.
[...], four separate electronic processes
in molecular aggregates can be delineated: (a)
molecular exciton-band excitation, (b) “atomic”
exciton-band excitation, (c) photoconduction excitation,
and (d) triplet excitation following exciton-band
excitation (references 14 and 15). The recently reported, dramatic
experiments by Albert Szent-Györgyi, demonstrating great
enhancement of phosphorescence in dyes frozen in
water solutions, seem to be due to the foregoing
phenomenon.

Furthermore, since triplet states of molecules have 
lifetimes of a millisecond or longer, double quantum 
processes such as the triplet to conduction-band process 
shown in Fig. 3(b) may be expected, following exciton- 
band, then triplet excitation. The exciton-band-to- 
triplet process is radiationless. In most dyes, the radia- 
tionless excitation of triplets is very weak in isolated 
molecules. 

The Förster theory, with its phenomenological
aspect, permits useful calculations to be made on the
distance of dipole-dipole transfer of excitation energy.
Karreman and Steele have shown that excitation
energy transfer in proteins by dipole-dipole interaction
through the aromatic residues is plausible according to
the Förster formula.

In this section, the complications of intramolecular
vibrational-electronic coupling in exciton interactions
have been omitted for simplicity.


5. MOLECULAR SYSTEMS AS PHOTOCONDUCTORS 
AND SEMICONDUCTORS 


If the energy gap between the ground state (or
valence band) and the conduction band of an atomic
lattice is small, intrinsic electrical conductivity is observed.
For example, in pure germanium and pure
silicon, the energy gaps are 0.7 and 1.10 ev (ν ≈ 5600
and 9000 cm⁻¹), respectively. In these solids, enough
electrons can be thermally excited at room temperature
so that dark electrical conductivity can be observed.
The term semiconductor is applied (reference 1), arbitrarily, to
solid materials which have a resistivity in the range
10⁻² to 10⁹ ohm cm. For pure diamond, with an energy
gap of 6 ev (ν ≈ 49 000 cm⁻¹), light of λ ≈ 2100 Å would be
required to raise an electron to a conduction band.
Materials which exhibit only photoconductivity are
properly classified as insulators. Since molecular ionization
potentials (gas phase) are in the range 8 to 10 ev,
one would expect pure molecular crystals and aggregates
to act as insulators, having photoconductivity
perhaps for 6 to 8 ev quanta, not intrinsic electrical
semiconductivity. Observations of a weak photoconductivity
have been made in organic crystals absorbing
near-ultraviolet light, but evidence has accumulated
that, in pure single crystals, even this weak photoconductivity
becomes vanishingly small as surface effects,
impurities, etc., are eliminated (reference 25).
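
Energy gaps in this section are quoted interchangeably in ev, in cm⁻¹, and in Å. As an aid to checking such figures, a minimal Python sketch of the conversions follows. The function name and the rounded conversion constants are supplied here for illustration and do not come from the text.

```python
EV_TO_WAVENUMBER = 8065.54    # cm^-1 per ev (rounded)
HC_EV_ANGSTROM = 12398.4      # Planck constant times light speed, ev * Angstrom (rounded)
EV_TO_KCAL_PER_MOLE = 23.06   # kcal/mole per ev (rounded)

def describe_gap(gap_ev):
    """Express an energy gap given in ev in the other units used in the text."""
    return {
        "wavenumber_cm1": gap_ev * EV_TO_WAVENUMBER,
        "wavelength_angstrom": HC_EV_ANGSTROM / gap_ev,
        "kcal_per_mole": gap_ev * EV_TO_KCAL_PER_MOLE,
    }

for gap in (0.7, 1.10, 6.0):
    values = describe_gap(gap)
    print(f"{gap:4.2f} ev: {values['wavenumber_cm1']:8.0f} cm^-1, "
          f"{values['wavelength_angstrom']:8.0f} A")
```

The values quoted in the text are rounded, so agreement to within a few percent is all that should be expected.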

On the other hand, there are several reports of semiconductivity
in certain organic substances in the class
of molecular crystals, especially those of Eley et al.
The general observation seems to be that, if a molecule
has an optical transition at around 1 ev (ν ≈ 8067 cm⁻¹,
λ ≈ 12 500 Å), semiconductive behavior may be observed.
This seems to be a necessary condition for
semiconductivity, but not a sufficient one (see the
following). For example, metal-free phthalocyanine
exhibits semiconductivity (temperatures up to 400° C
are used), with an indicated energy gap of ~1.2 ev
(ν ≈ 9900 cm⁻¹, λ ≈ 10 200 Å). However, the lowest
allowed optical transition in phthalocyanine is near
the infrared, but not in the infrared. The side question
that arises is whether the energy gap determined by
the temperature dependence of the semiconductivity
is in error, or whether the lowest triplet state (which
seems to be in the region of λ ≈ 10 000 Å) is involved.
Contrariwise, those substances which absorb only in or
near the ultraviolet, such as diphenylbutadiene and
diphenyloctatetraene, exhibit no dark semiconductivity,
indicating an energy gap of greater than
2.2 ev (~50 kcal/mole, ν ≈ 17 000 cm⁻¹, λ ≈ 5800 Å).
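
The connection between energy gap and dark conductivity can be made concrete with a short calculation. The Python sketch below assumes the usual intrinsic-semiconductor form, conductivity proportional to exp(-Eg/2kT); that expression, the function names, and the two temperatures in the example are assumptions introduced here, not statements from the text. The second function shows how a gap would be extracted from conductivities measured at two temperatures, which is the kind of determination referred to above for phthalocyanine.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in ev per kelvin

def activation_factor(gap_ev, temperature_k):
    """Boltzmann factor exp(-Eg / 2kT), assumed here for intrinsic conduction."""
    return math.exp(-gap_ev / (2.0 * BOLTZMANN_EV_PER_K * temperature_k))

def gap_from_conductivities(sigma_1, temp_1, sigma_2, temp_2):
    """Energy gap (ev) implied by conductivities at two temperatures (kelvin)."""
    slope = math.log(sigma_2 / sigma_1) / (1.0 / temp_1 - 1.0 / temp_2)
    return 2.0 * BOLTZMANN_EV_PER_K * slope

for name, gap in [("germanium", 0.7), ("silicon", 1.10), ("diamond", 6.0)]:
    factor = activation_factor(gap, 300.0)
    print(f"{name:10s} {gap:4.2f} ev  factor at 300 K = {factor:.1e}")
```

Under this assumption the factor falls by many orders of magnitude between a gap near 1 ev and a gap of several ev, which is the sense in which ultraviolet-absorbing crystals are expected to be insulators in the dark.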

For intrinsic semiconductivity, not only must there 
exist a small gap between the filled levels and the un- 
filled levels, but there must exist sufficient intermolec- 
ular orbital overlap for a true conduction band to 
develop. Apparently, in the case of near ultraviolet- 
absorbing hydrocarbons, only a nonconducting exciton 
band may exist in the crystal. The stronger the van 
der Waals forces become, the greater is the chance that 
overlap interaction resulting in a conduction band will 
arise. Especially in the case of such dye-like molecules 
as pure phthalocyanines, chlorophylls, and cyanine 
dyes, one might anticipate significant development of 
a conduction band from overlap of the lowest excited- 
state orbitals. However, since resistivity measurements 
cover a wider decadic range (reference 1) than almost any other
physical property, and since conductivity measure- 
ments using powder samples are so subject to experi- 
mental artifacts, one cannot accept the observed 
results without caution as indicating that molecular 
crystals are good semiconductors at room temperature. 

The semiconductivity reported for proteins occupies
a strange position. It is well known that proteins
absorb in the ultraviolet, which would indicate
that only photoconductivity could result, again if there
were sufficient orbital overlap. Evans and Gergely, in
a widely misinterpreted paper, proposed that π-electron
overlap across hydrogen bonds could give rise to sufficient
overlap interaction to yield a “conduction” band.
Using three different models, they obtained choices for
the energy gap between the valence band (ground
state) and conduction band of 3.5, 4.8, and 3.2 ev
(λ ≈ 3500, 2600, and 3800 Å), respectively. [Eley
et al. quote an energy gap as given by Evans and
Gergely to be ~2 ev (λ ≈ 6200 Å), but this is their
value for the gap between two filled valence bands.]
Thus, according to this discussion, and according to the
pattern of results of Eley et al., Evans and Gergely
proved (within the severe limits of their calculation)
that a pure protein should be an electrical insulator,
exhibiting only photoconductivity and not semiconductivity.

The semiconductivity observed in proteins cannot
be intrinsic, but if it is truly electronic, it may be
extrinsic. Extrinsic semiconductivity arises generally 
from the presence of lattice imperfections or impurity 
centers, which introduce new energy levels into the 
term scheme for a lattice structure. In biological sys- 
tems, such perturbations of the perfect lattice probably 
play a very important role, and all of these discussions 
of the pure or perfect molecular lamellar structures are 
modified by their consideration. New energy levels 
appear in such systems, often greatly lowering the 
energy required for conductivity phenomena.

The experiments of Arnold and Sherwood demonstrate
that, in the complex, heterogeneous structure of
chloroplast, electron traps exist and are involved in 
luminescence and conductivity properties of the chloro- 
plast. This type of investigation probably has valuable 
extension to other complex biological systems, and 
much valuable correlation with solid-state physics 
might be made in this connection. On the other hand, 
in the extremely small conductivity of weak semiconductors
and insulators, it is difficult to unravel ionic
from electronic semiconductivity. Probably in proteins 
and chloroplasts, there is an appreciable ionic com- 
ponent of electrical conductivity. 


6. PHOTOPHYSICAL STEPS IN THE PRIMARY 
PROCESS OF PHOTOSYNTHESIS 

Since the work of Emerson and Arnold and of
Gaffron and Wohl, it has been evident that absorption
of energy by chlorophyll in chloroplast involves
exciton behavior (called “resonance energy migration”
in most of the literature). The review by Rabinowitch
summarized the recent status of this topic. However,
since newer investigations have revealed that in pigment
lamellar aggregates triplet excitation is facilitated,
the detailed primary photophysical steps
require re-examination.

The lowest triplet states of chlorophyll-a and
chlorophyll-b have been established spectroscopically by
Becker and Kasha to be at approx 1.42 ev (ν ≈ 11 500
cm⁻¹, λ ≈ 8700 Å). The quantum yield of triplet excitation
(via excited singlet state) is so small in monomeric
chlorophyll as to suggest that the lowest triplet state
could not participate in the efficient process of photosynthesis.
However, the finding that triplet excitation
is greatly enhanced by aggregation of pigment
molecules suggests strongly that this mechanism may
play an important role in the chloroplast. The process
would involve excitation to the chlorophyll exciton
band, followed by triplet excitation [Fig. 3(b), first
steps of process d]. As in other pigments (references 14 and 15),
the quantum yield of fluorescence of aggregated chlorophyll (in
the leaf) is diminished many-fold over that of isolated
molecules in solution, as would be required if enhancement
of triplet excitation occurred.

The detection of photoproduced radicals in chloroplast
by electron paramagnetic-resonance studies made
by Commoner et al. and investigated further by
Calvin et al. (p. 157 and references 39-41) poses the
question of the relation of the primary photophysical
process of exciton-band excitation to the secondary
photochemical process of radical production. The picture
presented by Calvin and co-workers proposes
ionization at the exciton stage. It is possible that in the
heterogeneous [...] Moreover,
the lowest exciton state is very short-lived, somewhat
shorter in life than the lowest excited singlet state [...]

[...]sive review covers additional researches not included for
consideration in the foregoing and emphasizes the difficulty
of making significant semiconductivity measurements.

The quantitative treatment of excitons in molecular
aggregates is in preparation by E. G. McRae and M.
Kasha, and will be published shortly elsewhere. Reference
to a preliminary communication on this work
has been made in the text.

ACKNOWLEDGMENTS

The author gratefully acknowledges a very useful
discussion with Professor R. E. Peierls, who was
Visiting Professor of Physics at the University of
Colorado concurrently with the Boulder conference.
The author is indebted to Dr. C. G. B. Garrett and
to Professor C. A. Hutchison, Jr., for preliminary copies
of their manuscripts, referred to in the text.

General Patterns of Biochemical Synthesis

THE general pattern of biochemical synthesis is
outlined in Fig. 1. The first block indicates the
processes of photosynthesis which Calvin describes 
(p. 147). The second block represents the processes of 
intermediary metabolism wherein the raw materials of 
the medium are converted into the building blocks and 
energy needed for synthesis of the macromolecules. 
The third block represents the processes of ordering and 
polymerizing the intermediates. And lastly, the fourth 
block indicates the final folding and arranging of the 
macromolecules into products which may be excreted, 
used to maintain existing cells, or used to form new 
cells. 

The division into these blocks is quite obvious from 
consideration of the growth requirements of various 
cells. A large variety will grow with CO₂ and light as
the sole carbon and energy sources whereas others lack 
this capacity. Furthermore, even those cells which have 
this ability can hold it in abeyance and use chemical 
energy when the occasion demands. Accordingly, it 
seems quite reasonable to segregate the photosynthetic 
processes from the rest of the cell. Likewise, there are 
a number of cells which can synthesize all of their re- 
quirements from a single carbon source, such as glucose, 
whereas others require an exogenous supply of one or 
more of the more complicated molecules. In addition, 
most cells utilize preformed intermediates whenever 
they are available. Again, it seems reasonable to indi- 
cate a distinction between those processes and other 
operations of the cell. The division of the third and 
fourth stages is somewhat more arbitrary, but Ken- 
drew’s model of a protein molecule (p. 94) is quite 
convincing evidence that these are at least two distinct 
steps in the synthesis of protein. 

Actual cells, in fact, do show some of the localizations 
of function indicated in the model cell. Calvin showed 
pictures of the chloroplasts which correspond to region 
1, and Lehninger showed a similar localization of the 
Krebs-cycle reactions in the mitochondria (p. 136). 
Synthesis of protein and nucleic acid may occur in
other special regions of the cell, such as the microsomes
and the nucleus.

Fig. 1. General pattern of biochemical synthesis.

However, the main purpose of this diagram is not to 
indicate the workings of a cell, but to indicate the out- 
line of this paper. After Calvin and Lehninger, nothing 
remains to be said about photosynthesis or the energy- 
yielding reactions. Macromolecular synthesis is covered 
in later papers. Thus, emphasis here is on the flow of 
material in the synthesis of the precursors of the 
macromolecules. 

Before getting into the details of the synthetic path- 
ways, there are two assigned topics which should be 
disposed of. The first is “least common denominators of 
life.” One of the pleasures of biophysics is that such 
fascinating subjects come within the legitimate field of 
interest. How can life be defined? If one attempts to 
create life, how can success be recognized? Is the slimy 
precipitate in the bottom of an autoclaved mixture of 
organic chemicals alive or not? In part, this is a ques- 
tion of semantics, but the time spent in framing a 
definition is not wasted as it is spent in trying to select 
the essential from a multitude of nonessential properties. 

In the model cell of the diagram, clearly photosyn- 
thesis is not essential; neither is the synthesis of inter- 
mediates, as many obviously living cells lack these 
capacities. Cells lacking an energy system and requiring 
ATP surely would be considered living. In a similar 
way, if some of the cells grown in tissue culture are 
shown to require one or more proteins, they are none 
the less alive! What then is essential? One obvious 
feature of living organisms is their capacity for growth 
and reproduction. Yet, these qualities alone do not 
define life. In framing a definition, it is necessary to 
choose words and concepts that include such creatures 
as mules which cannot reproduce and that exclude salt 
crystals which can grow. Furthermore, objects such as 
seeds and perhaps even viruses should be included. 
Both have the capability of catalyzing the synthesis of 
more of their kind even though they may not exhibit 
growth or other metabolic activity for long periods of 
time.

There presently seems to be no simple and universally 
acceptable answer as to how newly created life might 
be recognized. On the other hand, there are two par- 
ticularly important properties. One is autocatalysis 
with its implications for growth and reproduction;
another is a capacity for evolution. It would be difficult
to recognize anything as alive which lacked either of
these capabilities. Yet, any definition even framed in
these terms is arbitrary, as the choice to include or
exclude viruses is largely a matter of taste.

Another assigned topic is “comparative biochemistry
at the molecular level.” At this level, the processes are 
amazingly similar in different organisms. The reactions 
of glycolysis originally were worked out in yeast and 
in mammalian muscle; the synthesis of certain amino 
acids is identical in a wide variety of creatures from 
bacteria to man. The ribonucleoprotein particles of the 
microsome fraction (ribosomes for short) appear iden- 
tical whether isolated from bacteria, yeast, plants, or 
animals. Thus, to a first approximation, all cells are the 
same. One can consider E. coli as the problem to be
solved and treat all other cells as perturbations of coli. 
Thus, Chlorella is a variant of coli which has gained some 
additional capacities, while man is a deficient mutant 
which has lost a number of important synthetic 
pathways. 

The differences between cells are often quantitative 
and much less frequently qualitative. The most fre- 
quently observed difference is that a pathway is com- 
pletely lacking or quantitatively unimportant. It is 
much less common to find that two cells have found 
quite different solutions to a synthetic problem. Per- 
haps the space biologists will find some radically 
different mechanisms on another planet. 

Consider, now, some of the details of synthetic pathways.
Suppose there exists in the cell a series of reactions
catalyzed by the enzymes E₁, E₂, E₃; how can
these be demonstrated?

        E₁       E₂       E₃
    A ────→ B ────→ C ────→ D


The most common approach is to disrupt the cells and 
then to isolate the enzymes and intermediates from the 
extract. The individual reactions can be studied then 
in cell-free systems. This is the classical method of 
biochemistry mentioned in Kornberg’s introduction 
(p. 200). In spite of the facts that the intermediates are 
often present in extremely low concentrations and that 
the enzymes are frequently unstable, most of the im- 
portant pathways of intermediary metabolism have 
been worked out in detail by this method. As Lehninger 
mentions, there are difficulties when the spatial ar- 
rangement of the enzymes in the cell is important. This 
method is amply illustrated in other papers (Korn- 
berg, p. 200; Meister, p. 210; Lehninger, p. 136). 
Another versatile tool is the deficient-mutant tech- 
nique. Mutants which require a certain compound (D) 
for growth are isolated. Among these mutants, some 
are able to grow with one or another precursor (B or C) 
depending upon which enzyme has been lost in the 
mutation. The growth requirement of these different 
classes of mutants often suggests the reaction sequence. 
Such evidence also can be strengthened by demonstrating
the accumulation of intermediates or the absence
of a particular enzyme.
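
The logic of the deficient-mutant technique can be sketched in a few lines of Python. The sketch assumes a single unbranched pathway in which each mutant class lacks exactly one enzyme, so that a mutant grows on any compound lying beyond its block. The mutant names and the growth table are hypothetical, chosen only to match the A, B, C, D scheme above.

```python
def infer_order(growth):
    """Order supplements from earliest to latest position in the pathway.

    growth maps a mutant class to the set of supplements that allow growth.
    Assumption: linear pathway, one enzyme lost per mutant class, so a
    later intermediate rescues more mutant classes than an earlier one.
    """
    compounds = set().union(*growth.values())
    support = {c: sum(c in grows for grows in growth.values())
               for c in compounds}
    return sorted(compounds, key=lambda c: (support[c], c))

# Hypothetical growth requirements of three mutant classes.
growth_table = {
    "lacks_E1": {"B", "C", "D"},   # block between A and B
    "lacks_E2": {"C", "D"},        # block between B and C
    "lacks_E3": {"D"},             # block between C and D
}
print(infer_order(growth_table))
```

A branched pathway, or a compound that is merely converted to an intermediate, would defeat this simple counting, which is why the accumulation of intermediates and direct enzyme assays are useful as supporting evidence.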


Calvin describes the kinetic approach to the identification
of intermediates, and his experiments demonstrate
its power (p. 147). It presents certain difficulties
when applied to amino-acid synthesis, because the
reactions are rapid and the quantities of intermediates
small. The kinetic delays, consequently, are extremely
short.

Fig. 2. Radioautograph of hydrolyzed protein of cells grown with
C¹⁴ acetate as sole carbon source.

Another technique, one used extensively in our lab- 
oratory, is isotopic competition. Two different mole- 
cules, one having an isotopic label, are allowed to 
compete for a place in the final product. Thus, if the 
starting material (A) has a C¹⁴ label, C¹⁴ will appear in
the final product (D). If, however, nonradioactive 
precursors (B, C) also are supplied, the radioactivity 
of the product (D) often is reduced strongly. Alterna- 
tively, radioactive forms of suspected precursors (B, C) 
may be added to compete with a nonradioactive carbon 
source (A). 
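
The arithmetic behind isotopic competition is that of simple dilution. In the Python sketch below, the product D is assumed to be fed by two streams, one derived from the labeled carbon source and one from an unlabeled competitor, and the two are assumed to mix completely before incorporation. The function name and the flux values are hypothetical.

```python
def product_specific_activity(source_activity, flux_from_label, flux_from_competitor):
    """Specific radioactivity of the product under complete mixing.

    source_activity: specific radioactivity of the labeled carbon source.
    flux_from_label: carbon reaching the product from the labeled source.
    flux_from_competitor: carbon reaching it from the unlabeled competitor.
    """
    total = flux_from_label + flux_from_competitor
    if total <= 0:
        raise ValueError("at least one flux must be positive")
    return source_activity * flux_from_label / total

# No competitor: the product is as radioactive as the source.
print(product_specific_activity(1.0, 10.0, 0.0))
# Competitor supplies nine-tenths of the carbon: radioactivity falls tenfold.
print(product_specific_activity(1.0, 1.0, 9.0))
```

The reverse experiment, with a labeled suspected precursor competing against an unlabeled carbon source, follows from the same expression with the roles of the two fluxes exchanged.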

The figures below show some examples of isotopic
competition. Cells are grown with C¹⁴ acetate as the
sole carbon source; they then are harvested; the protein
is isolated and hydrolyzed, and the amino acids are
displayed on a paper chromatogram. A radioautogram
of the paper (Fig. 2) shows all of the amino acids. No
competition is involved in this case and no information
is gained concerning the synthetic pathways. Measurement
of the different spots gives only the amino-acid
content of the protein.

If C¹² glucose also is present, the acetate enters only
a limited group of amino acids (Fig. 3). Thus, a class of
amino acids, which are derived more directly from
acetate, is immediately segregated. If, in addition, C¹²
aspartic acid is added (Fig. 4), little radioactivity
appears in aspartic acid, threonine, methionine, isoleucine,
and lysine, as these amino acids are derived
from the exogenous aspartic acid. Figure 5 shows the
effect of adding glutamic acid. The incorporation is
suppressed in all of the amino acids except leucine. C¹²
leucine blocks the incorporation into leucine (Fig. 6)
but has no effect on the other amino acids. Further
details of pathways can be established by adding suspected
intermediates as competitors.

Fig. 3. Cells grown with C¹² glucose and C¹⁴ acetate show radioactivity
in the amino acids which incorporate acetate.

Fig. 4. C¹² aspartic acid (in addition to C¹² glucose and C¹⁴
acetate) reduces the radioactivity of aspartic acid and the amino
acids derived from it.

Fig. 5. C¹² glutamic acid prevents the incorporation of acetate
into the amino acids derived from the Krebs cycle, leaving only
leucine strongly radioactive.

This technique also gives information concerning the 
flow along the pathway. When radioproline is added, 
its sole end product is protein-bound proline. The rate 
of incorporation is equal to the proline content times 
the growth rate, as there is little if any incorporation 
by exchange in the growing coli. Glutamic acid, on the 
other hand, supplies carbon for proline and arginine. 
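
The statement that the rate of incorporation equals the proline content times the growth rate can be written out explicitly. The Python sketch below assumes balanced exponential growth, for which the specific growth rate is ln 2 divided by the generation time. The function names and the numerical values are hypothetical and serve only to show the units.

```python
import math

def specific_growth_rate(generation_time):
    """Specific growth rate (per unit time) for exponential growth."""
    return math.log(2.0) / generation_time

def incorporation_rate(content_per_gram, generation_time):
    """Flow into an end product: its content times the growth rate.

    content_per_gram: amount of the product per gram dry cells.
    generation_time: doubling time of the culture.
    """
    return content_per_gram * specific_growth_rate(generation_time)

# Hypothetical example: 200 micromoles of protein-bound proline per gram
# dry cells and a 50-minute generation time.
rate = incorporation_rate(200.0, 50.0)
print(f"{rate:.2f} micromoles per gram dry cells per minute")
```

A compound such as glutamic acid, which supplies carbon to more than one product, would require the sum of the corresponding terms.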

A few additional experiments permit an analysis of 
the flow pattern in the Krebs cycle (Fig. 7). The flow of 
four carbon units into the cycle is equal to the flow out 
of the cycle into glutamic acid and its products, aspar- 
tic acid and its products, plus leakage products found 
in the medium. The circulation in the cycle can be 
calculated from the relative specific radioactivities of 
aspartic acid and glutamic acid when CO₂ or acetate is
used as tracer. The resulting flow pattern shows that,
when these cells grow on glucose, the Krebs cycle is not
the main source of energy in the cell but is principally
concerned with synthesis.

In contrast, when acetate is the sole carbon source, 
the cells must derive all of their energy from the reac- 
tions of the Krebs cycle. As a consequence, the circula- 
tion increases so that the input of acetate is roughly ten 
times the input of four carbon units. The reactions of 
the cycle must be considered quite flexible, responding
to changes in the environment.

This type of analysis can be extended to include the
flow of carbon into nucleic acids, lipids, and the other 
families of amino acids. More than 80% of the carbon 
incorporated can be accounted for. Minor flow patterns 
can be discerned, and the changes in the flow patterns 
caused by adaptation to other carbon sources can be 
observed. 

Several features are notable. First, the flows are well 
balanced. The amino acids, although synthesized in 
several different systems, are each supplied at the rate 
needed for protein synthesis. Usually, there is very 
little leakage of amino acids to the medium. On occa- 
sion, alanine and valine are synthesized in excess and 
pour out into the culture fluid. 

One clue to this balance is the control of synthesis 
by the product. When proline is added to the medium, 
even at low concentrations, the synthesis of proline 
promptly stops. This is not simply a reversal of the 
usual reactions, as the exogenous proline only is con- 
verted back to glutamic acid when present at very high 
concentrations. Other amino acids which strongly in- 
hibit their own synthesis are arginine, threonine, serine, 
and methionine. A less complete inhibition is observed 
with lysine, cystine, leucine, and isoleucine. 

Glutamic acid, aspartic acid, glycine, alanine, and 
valine, however, continue to be synthesized from glu- 
cose even though they are present in the medium. In 
this case, the observed competition effects must come 
about by a simple isotopic dilution of the internal pools 
of amino acids. The balance of synthesis among these
and the other amino acids must be the result of other
controls.

Fig. 6. C¹² leucine supplies the carbon for leucine without
influence on the radioactivity of the Krebs-cycle group of amino
acids.

Fig. 7. Flow pattern of the Krebs cycle for cells growing on
glucose as carbon and energy source. The flow numbers are expressed
as micromoles per gram dry cells per 100 seconds.

Perhaps the enzymes are present in exactly the 
distribution required to achieve the balance, but there 
may also be other less obvious control mechanisms. 
For example, there is a coupling between phosphorus 
and nitrogen metabolism which is quite obscure; ex- 
ternal phosphate does not exchange with internal phos- 
phate unless a nitrogen source is present. This is an 
unexpected result, as glycolysis continues. Growth stops 
when the exogenous adenine is exhausted, but the pools 
of adenine ribotides are not depleted. 

It is clear, however, that the cell has an immediate 
chemical response to changes in its environment, in 
addition to the biological response of enzyme induc- 
tion. Also, the entire system is closely coupled so that 
seemingly unrelated parts interact strongly. 

Competition studies provide some other clues to the 
degree of organization of the cell. Exogenous threonine 
is cleaved promptly to supply the carbon of glycine. 
Internally synthesized threonine contributes no carbon 
to glycine. The enzyme which splits threonine is clearly 
present, yet it does not have access to the endogenous 
threonine. As another example, the intermediates of 
glycolysis do not compete with glucose. Fructose-6- 
phosphate, in particular, contributes no carbon to cells 
growing on glucose even though it supplies all of the 
phosphorus and completely suppresses the incorpora- 
tion of orthophosphate. It will, however, compete suc- 
cessfully with acetate as a carbon source. These anom- 
alies suggest highly organized enzyme systems wherein 
substrate molecules proceed from one active site to 
another, perhaps by surface diffusion. 

Isotopic competition, in common with the mutant
technique, has one basic ambiguity. It fails to distinguish
whether a compound is a true intermediate or
merely on a side line. For example, it is not possible to
tell by competition studies with C¹⁴ whether the keto
analog of leucine is on the main line of the sequence
from valine to leucine or whether it is merely a compound
which is readily aminated to give leucine. In
this case, additional experiments using N¹⁵ tracers are
needed.*

Fig. 8. The incorporation of phosphate shows transfer of P³²
from internal pool of small TCA-soluble molecules to TCA-insoluble
material. A precursor-product relationship is indicated.

As a next stage in intermediary metabolism, one 
might expect to find the synthesis of small peptides and 
small polynucleotides. Unfortunately, the same methods 
cannot be used. Peptides are seldom found in the cells; 
mutants requiring peptides have not been isolated; 
isotopic competition fails because the peptides used as 
competitors are rapidly split and incorporated as indi- 
vidual amino acids. To investigate the reactions of 
amino acids and nucleotides prior to their incorporation 
into macromolecules, it is necessary to use kinetic 
methods. 

In our laboratory, we have used a technique of adding 
a tracer compound to a growing culture and then taking 
a rapid succession of samples with a syringe. The cells 
are separated from the culture medium by filtration 
which requires less than two seconds. The filter can then 
be counted directly or extracted for chromatography. 

Alternatively, the samples can be squirted directly
into 5% trichloracetic acid (TCA) which extracts the
small molecules. Subsequent filtration provides a sample
which measures incorporation into the large molecules.

* The experimental work described in this paper was done by
the Biophysics Section, Department of Terrestrial Magnetism,
Carnegie Institution of Washington, which at present includes E. T.
Bolton, R. J. Britten, D. B. Cowie, and R. B. Roberts. The material
up to this point is described in detail in “Studies of Biosynthesis
in E. coli,” R. B. Roberts, P. H. Abelson, D. B. Cowie,
E. T. Bolton, and R. J. Britten, Carnegie Institution of Washington
Publication No. 607, 1955. A more complete account of the
remainder of the experimental work can be found in the Carnegie
Institution of Washington Yearbooks Nos. 54, 55, 56, and 57
(1955-1958) and in Microsomal Particles and Protein Synthesis,
R. B. Roberts, editor (Pergamon Press, New York, 1958).

Figure 8 shows the incorporation of phosphate and 
its subsequent transfer from small molecules to large 
ones. To the accuracy of this experiment, the TCA- 
soluble and the TCA-precipitable material follow the 
curves expected for a precursor and its product. 
Greater accuracy and more-detailed information can 
be obtained by using a “pulse” of radioactivity. Figure 
9 shows such an experiment; the pulse was terminated 
at 63 sec by adding an excess of P³¹O₄. Also, the time
scale was stretched out by lowering the temperature to 
15° C to increase the generation time to 3.5 hours. 
There is a noticeable delay in the incorporation into 
nucleic acid after which the maximal rate is maintained 
for nearly 8 min. Figure 10 shows radioactivity of two 
nucleotides of uracil separated out by chromatography. 
The radioactivity of the individual phosphorus atoms 
of adenosine triphosphate is shown in Fig. 11. The
outermost phosphorus equilibrates rapidly as it enters
into the energy-transfer reactions. The innermost phosphorus
appears to be the precursor of nucleic acid, as
its specific radioactivity corresponds to the slope of the
nucleic-acid curve.

Fig. 9. Incorporation of P³² into total cell and nucleic-acid fraction
when the radioactivity is available for a short interval.

A large number of similar experiments have been 
performed to follow the incorporation of amino acids 
into protein. As a first step, the amino acids enter the 
cell to form a pool wherein the concentration is typically 
several thousand times the concentration of the medium. 
This pool has extremely complicated characteristics 
which cannot be interpreted either in terms of simple 
adsorption on sites or in terms of simple transport 
across an impermeable membrane. A model which includes transport by a limited quantity of carrier to an array of sites, however, seems adequate for E. coli. In yeast, the pools are considerably more complicated.


Fig. 12 shows a typical time course of incorporation. As the size of the pool depends on the concentration in the medium, the cells first are given a C¹² amino acid
to establish a pool of the desired size, and then the 
tracer is added. The pool exchanges rapidly with the 
external medium, as shown by the curvature of the line 
representing the total incorporation. Often the outflow 
is two-thirds the inflow. The rate of incorporation into 
the protein is proportional to the specific radioactivity 
of the pool as would be expected in a precursor-product 
relationship. Chromatography shows that the material 
extracted (either by TCA, alcohol, or water) is the 
unaltered amino acid and the nonextractable material 
is protein. Other tests showed that the amino acids 
incorporated during the first fifteen seconds are not 
concentrated at the ends of peptide chains and that 
they are released by acid hydrolysis at the usual rate. 
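
The precursor-product relationship described above can be illustrated with a small sketch. It assumes, purely for illustration, that the specific radioactivity of the pool rises exponentially toward its final value with a single turnover time; the function names and the numerical turnover time are hypothetical and are not taken from the experiments. The label in protein is then the time integral of the pool activity, so the only kinetic delay is the time needed to label the pool.

```python
import math

def pool_specific_activity(t, turnover_time):
    """Fraction of its final specific radioactivity reached by the pool at time t."""
    return 1.0 - math.exp(-t / turnover_time)

def protein_label(t, turnover_time, synthesis_rate=1.0):
    """Label in protein: synthesis_rate times the time integral of the pool activity."""
    return synthesis_rate * (t - turnover_time * (1.0 - math.exp(-t / turnover_time)))

if __name__ == "__main__":
    T = 10.0  # hypothetical pool turnover time in seconds (an assumption)
    for t in (5.0, 20.0, 60.0):
        print(t, round(pool_specific_activity(t, T), 3), round(protein_label(t, T), 3))
```

At long times the protein curve becomes a straight line displaced along the time axis by the turnover time of the pool, which is the kind of delay the text attributes to building up radioactivity in the pool.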

These curves show that there is no appreciable quantity of partially formed proteins. If such were the case, the first protein to appear would be deficient in radioactivity. The only kinetic delay is accounted for by the time to build up radioactivity in the pool.

Fig. 10. Corresponding incorporation of P³² into uridine diphosphate and uridine triphosphate.

Similar experiments using radiosulfate as a tracer 
showed that glutathione (the only peptide found in 
E. coli) is not an intermediate in protein synthesis 
although it can serve as a sulfur source in sulfur-defici- 
ent medium. No evidence of other peptides was found 
even when protein synthesis was blocked either by
chloramphenicol or by lack of a required amino acid.

Activated amino acids and amino acids attached to 
ribonucleic acid are suspected intermediates which are 
presently under intense study. These compounds and 
their possible role in protein synthesis are discussed in 
detail by Meister (p. 210). Unfortunately, they are 
present in E. coli in such small quantities that our 
methods have not given any conclusive evidence of 
whether or not they are intermediates of protein synthesis.

Fig. 11. Corresponding incorporation into individual phosphorus atoms of adenosine triphosphate.

These various attempts to find intermediates beyond the amino acids and nucleotides have failed. Looking at
macromolecular synthesis from the point of view of 
the intermediates, one sees both amino acids and 
nucleotides held in the cell but exchangeable with the 
outside medium, until suddenly and without further 
preparation they appear linked into macromolecules. 
At this stage, it becomes necessary to determine in 
which macromolecules they appear. The ribosomes are 
of particular interest because it has been suspected for 
some time that RNA plays an important role in protein 
synthesis. 

A column of Sober’s modified cellulose (DEAE) pro- 
vides an adequate resolution of the juice from disrupted 
cells. Using a salt-gradient elution, the proteins are well 
spread out and the protein of the ribosomes stands out 
as a distinct peak in the distribution. The ultraviolet 
absorption shows a corresponding peak with a second- 
ary peak of nucleic acid. 

Following a short exposure to radiosulfate or to 
radioactive amino acids, the specific radioactivity of 
the ribosome peak is somewhat lower than that of the 
other proteins. Neither is there any extra radioactivity 
in the proteins which spin down with the ribosomes. 

Corresponding experiments with P³²O₄ show at first […] and distinct peak with its radioactivity transferred only after a considerable kinetic delay to the […]

[…] product of synthesis, differ from the observations of several other laboratories that radioactive amino acids appear first in the ribosomes. Two explanations of the discrepancy seem plausible. One is that in […] mammalian tissues may be owing to exchange. Another possibility is that the ribosomes of E. coli work much more rapidly, and there is correspondingly much less […]formed protein adhering to them. It should be pointed out also that the recovery from the columns is incomplete and an important part of the kinetics […] may be missed. In particular, the cell membranes which are lost on the column are initially high in radiophosphate.

Fig. 12. Radioactive proline is rapidly incorporated by E. coli. The difference between the total and the TCA-insoluble fraction measures the pool of extractable amino acid. The rate of incorporation into protein is proportional to the specific radioactivity of the pool, indicating a precursor-product relationship.

In the space allowed, it is difficult to present more 
than a brief glimpse of the operation of the synthetic 
machinery of a cell, of the way its macromolecules are 
bathed in an internal medium which is different from 
but rapidly responding to the external environment, of 
the way the reactions change to meet changing conditions in the environment. It has been necessary to
discuss the various elements separately, but the inter- 
couplings and interactions are equally important to 
maintain a balance among the different systems. I have 
attempted to emphasize those findings in the study of 
intermediary metabolism which throw some light on 
the more obscure processes of protein and nucleic-acid 
synthesis, but they provide little information, only a 
few boundary conditions which must be met by any 
model of macromolecular synthesis.

ALMOST all reactions in living things occur spontaneously at insufficient rates and have to be catalyzed. The biological catalysts, enzymes, have all been found to be proteins. With respect to their turnover numbers, specificity, and variety of reactions catalyzed, enzymes are the most remarkable catalysts known, especially since they provide the means by which the rates of various reactions are controlled so that the rates of the large number of necessary reactions have a proper relation to each other and to the requirements of the situation.

Thus, one is presented with the tremendous challenge of trying to understand these catalytic actions. One can distinguish roughly between two types of approaches to this problem, the kinetic approach and the chemical approach. Of course, these distinctions are arbitrary and it is really not possible to make a clear separation. Through the kinetic approach, which is emphasized in this article, one studies the effect of various independent variables on the rate of reaction and the interpretations of these effects. Through the chemical approach, one studies the effect of chemical modification of the substrate and of chemical reactions of the enzyme.

The general features of the kinetic approach have been reviewed in the literature. For any given mechanism, it is possible to write an equation for the rate of formation of each substance involved. These equations consist of summations of concentrations or products of concentrations, each multiplied by a rate constant. These rate constants are constants under a given set of conditions, but their values are affected in a number of ways by components of the solution. The effects of constituents of the reaction medium may be classified according to: (a) whether they have a significant effect on the activity coefficients of reactants and the transition state; or (b) whether they are directly involved in steps in the mechanism. Effects of the first type are encountered always when the electrolyte concentration is changed. Effects of the second type are produced by hydrogen ions, metal ions, inhibitors, coenzymes, and even by buffer ions.

The solution of the simultaneous differential equations for a mechanism yields the functional form for the dependencies of the various concentrations upon time. As a further aid in this analysis, there are two types of relations between the various rate constants in a
mechanism. According to the first, provided by thermo- 
dynamics, the values of the rate constants for all of the 
forward and reverse steps must give the correct equi- 
librium constant for the over-all reaction. According 
to the second, provided by irreversible thermodynamics, 
the frequency of transitions from A to B must be equal 
to the frequency of transitions from B to A at equi- 
librium, even though there are other indirect paths by 
which this equilibrium may be maintained. This
principle of detailed balancing applies when there
are cyclic paths in a mechanism.

The kinetic behavior of a given mechanism always 
can be represented by a theoretical rate equation, but 
in the laboratory one is faced with the inverse problem 
of having to devise a mechanism to explain certain 
kinetic observations. This process requires imagination 
and is not one that can be summarized by a series of 
rules. Kinetic data show the composition of the acti- 
vated complexes for the various steps in the reaction 
but do not prove a particular mechanism. Many 
conceivable mechanisms, however, may be eliminated 
by the use of such data. 

A minimum requirement for devising a mechanism 
to account for kinetic data is a complete understanding 
of the kinetic consequences of any given mechanism. 
As an illustration, the simplest type of enzymatic mechanism is considered first, in which an intermediate X is formed by reaction of enzyme and substrate or of enzyme and product. The concept of an intermediate complex dates back at least to 1902 when Henri and Brown independently sought to explain the fact that the velocity of inversion of sucrose by invertase is independent of the sucrose concentration at high concentrations and is directly proportional to sucrose concentration at low concentrations. In 1913, Michaelis and Menten attracted considerable attention with their derivation of the initial-velocity equation assuming that the reaction $E + S \rightleftharpoons X$ was in equilibrium, a rather restrictive assumption.


ENZYMATIC REACTION WITH ONE SUBSTRATE


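$$
E + S \underset{k_2}{\overset{k_1}{\rightleftharpoons}} X \underset{k_4}{\overset{k_3}{\rightleftharpoons}} E + P \tag{1}
$$

Initial concentrations: $e = e_0$, $s = s_0$, $x = 0$, $p = 0$.
Concentrations at time $t$: $e = e_0 - x$, $s = s_0 - x - p$, $x$, $p$.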


If there is more than one site per enzyme molecule, this derivation will be applicable if there is no site-site interaction. If the reactants initially present are E and S, it is convenient to introduce the conservation equations ($e_0 = e + x$ and $s_0 = s + p + x$) and express the various concentrations at any time in terms of the instantaneous concentrations $x$ and $p$ of intermediate X and product P and the initial molar concentrations $e_0$ and $s_0$ of enzymatic sites and S. Owing to the conservation equations, only two of the four rate equations which may be written are independent. According to mechanism 1, the rates of change of concentration of X and P are given by


$$ dx/dt = k_1 s e + k_4 p e - (k_2 + k_3)x \tag{2} $$

$$ dp/dt = k_3 x - k_4 p e. \tag{3} $$


When these equations are written in terms of the two unknown concentrations $x$ and $p$ by use of the expressions under mechanism 1 and are rearranged, they become


$$ dx/dt = k_1 s_0 e_0 + (k_4 - k_1)e_0 p - [k_1 e_0 + k_1 s_0 + (k_4 - k_1)p + k_2 + k_3]x + k_1 x^2 \tag{4} $$

$$ dp/dt = (k_3 + k_4 p)x - k_4 e_0 p. \tag{5} $$


Transient Phase


Apparently, these simultaneous nonlinear differential equations can be solved in complete generality only with the use of a differential analyzer. The general character of the solutions is that $x$ rises from zero and “it may or may not go through a maximum.”¹⁵ Families of plots of $x$ and $p$ for this mechanism have been given by Chance for the case that $k_4 = 0$. Various approximate solutions have been discussed for the transient phase of the reaction when $k_4 = 0$.¹⁷
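
As a modern stand-in for the differential analyzer, the two rate equations, Eqs. (2) and (3), can be integrated numerically. The following is a minimal sketch using a fourth-order Runge-Kutta step; the function names, the rate-constant values, and the step size are illustrative assumptions, not values from the text.

```python
def rates(x, p, k, e0, s0):
    """Right-hand sides of Eqs. (2) and (3); k = (k1, k2, k3, k4)."""
    k1, k2, k3, k4 = k
    e = e0 - x        # free enzymatic sites, from e0 = e + x
    s = s0 - x - p    # free substrate, from s0 = s + p + x
    dx = k1 * s * e + k4 * p * e - (k2 + k3) * x
    dp = k3 * x - k4 * p * e
    return dx, dp

def integrate(k, e0, s0, t_end, n_steps):
    """Fourth-order Runge-Kutta integration starting from x = p = 0 at t = 0."""
    h = t_end / n_steps
    x, p = 0.0, 0.0
    for _ in range(n_steps):
        ax, ap = rates(x, p, k, e0, s0)
        bx, bp = rates(x + 0.5 * h * ax, p + 0.5 * h * ap, k, e0, s0)
        cx, cp = rates(x + 0.5 * h * bx, p + 0.5 * h * bp, k, e0, s0)
        dx, dp = rates(x + h * cx, p + h * cp, k, e0, s0)
        x += h * (ax + 2 * bx + 2 * cx + dx) / 6.0
        p += h * (ap + 2 * bp + 2 * cp + dp) / 6.0
    return x, p

if __name__ == "__main__":
    k = (1.0, 1.0, 2.0, 1.0)  # hypothetical k1..k4 in arbitrary consistent units
    x, p = integrate(k, e0=0.01, s0=1.0, t_end=2.0, n_steps=2000)
    print("x and p shortly after the transient phase:", x, p)
```

A fixed step is adequate here only because the assumed constants are mild; a stiff case (very large $k_1 s_0$) would call for a smaller step or an implicit method.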

For the particular case that $k_1 = k_4$, a condition which is met in practice at certain pH values, Eq. (4) is simplified so that it is immediately integrable to yield $x$ as a function of time. Substitution of this relation into Eq. (5) and integration yields $p$ as a function of time. Rather than give the resulting equations, which are rather complicated, only the forms obtained when $s_0 \gg e_0$ are given; they are:

$$ x = \frac{k_1 e_0 s_0}{k_1 s_0 + k_2 + k_3}\{1 - \exp[-(k_1 s_0 + k_2 + k_3)t]\} \tag{6} $$


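$$ p = -\frac{k_1 k_3 e_0 s_0}{(k_1 s_0 + k_2 + k_3)^2}\{1 - \exp[-(k_1 s_0 + k_2 + k_3)t]\} + \frac{k_3 s_0}{k_2 + k_3}\left\{1 - \exp\left[-\frac{k_1 (k_2 + k_3) e_0 t}{k_1 s_0 + k_2 + k_3}\right]\right\}. \tag{7} $$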


Thus, for this special case, the concentration of the intermediate X rises in a first-order manner with a half-life

$$ \tau_1 = \frac{0.693}{k_1 s_0 + k_2 + k_3}. \tag{8} $$


ROBERT A. ALBERTY


Equation (7), for the concentration of P, contains two first-order terms. The half-life for the first term is given by $\tau_1$ and that for the second term is given by $\tau_2$.

$$ \tau_2 = \frac{0.693\,(k_1 s_0 + k_2 + k_3)}{k_1 (k_2 + k_3) e_0}. \tag{9} $$


Since $s_0 \gg e_0$, $\tau_2 \gg \tau_1$; that is, the half-life for the appearance of P is very large as compared with that for the appearance of the intermediate X. The second term in Eq. (7) is identical with the solution obtained below with the steady-state approximation.
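
The separation of the two time scales can be checked numerically from Eqs. (8) and (9). The sketch below uses hypothetical rate constants (an assumption made only for illustration) and `math.log(2)` in place of the rounded value 0.693.

```python
import math

def half_lives(k1, k2, k3, e0, s0):
    """Half-lives of Eqs. (8) and (9) for the special case k1 = k4 and s0 >> e0."""
    lam = k1 * s0 + k2 + k3
    tau1 = math.log(2.0) / lam                           # rise of intermediate X
    tau2 = math.log(2.0) * lam / (k1 * (k2 + k3) * e0)   # appearance of product P
    return tau1, tau2

if __name__ == "__main__":
    tau1, tau2 = half_lives(k1=1.0, k2=1.0, k3=2.0, e0=0.01, s0=1.0)
    print("tau1 =", tau1, "tau2 =", tau2, "ratio =", tau2 / tau1)
```

The ratio $\tau_2/\tau_1 = (k_1 s_0 + k_2 + k_3)^2 / [k_1 (k_2 + k_3) e_0]$ grows without limit as $e_0/s_0$ decreases, which is why the transient phase can be neglected in ordinary steady-state work.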

If $k_1 \neq k_4$ and $s_0 \gg e_0$, Miller and Alberty have obtained a perturbation solution which is useful if $0.1k_1 < k_4 < 10k_1$.

The study of the transient phase of enzymatic
reactions requires very sensitive and rapidly responding
analytical methods. When suitable methods can be
developed as Chance has done for the peroxidase and
catalase reactions and for the respiratory chain, more
information can be obtained about the intermediates 
than could be obtained in general from steady-state 
measurements. 




Steady State 


If the substrate concentration is high as compared with the concentration of enzymatic sites, E must be converted to X and reconverted to E many times before equilibrium is reached. As a result, $dx/dt$ is very small as compared with other terms in the rate equation and, to a high degree of approximation, may be taken equal to zero. This is an approximation of considerable value for most enzyme kinetic experiments. Setting $dx/dt$ in Eq. (4) equal to zero for the case that $s_0 \gg e_0$ yields

$$ x = \frac{[k_4 s_0 + (k_1 - k_4)(s_0 - p)]\,e_0}{(k_2 + k_3) + k_4 s_0 + (k_1 - k_4)(s_0 - p)}. \tag{10} $$


The concentration of the intermediate X may either increase or decrease during the steady-state phase of the reaction, depending upon the sign of $k_1 - k_4$. If $k_4 > k_1$, the concentration of X increases; if $k_1 > k_4$, the concentration of X decreases; and if $k_1 = k_4$, the concentration of X remains sensibly constant for $t \gg \tau_1$. If the concentration of X increases in the steady state, it does not pass through a maximum; if the concentration of X decreases in the steady state, it does pass through a maximum before the steady state is established.

Substitution of Eq. (10) into Eq. (5) yields the steady-state rate equation for mechanism 1.‡

$$ -\frac{ds}{dt} = \frac{dp}{dt} = \frac{(V_S/K_S)\,s - (V_P/K_P)\,p}{1 + s/K_S + p/K_P} \tag{11} $$

‡ Writing the equation in this form rather than in terms of the individual rate constants has the advantage that the constants are those directly determinable from experiment and that the steady-state rate equation for the n-intermediate case can be written in the same form.

where

$$ V_S = k_3 e_0, \quad V_P = k_2 e_0, \quad K_S = (k_2 + k_3)/k_1, \quad K_P = (k_2 + k_3)/k_4. \tag{12} $$

The values of the kinetic constants usually are obtained by measuring the steady-state velocity $v_f$ of the forward reaction immediately after the transient phase of the reaction but before the second term in the numerator and the third term in the denominator of Eq. (11) become important. Under these conditions, the velocity of the over-all reaction is given by the Michaelis-Menten equation.

$$ v_f = \frac{V_S}{1 + K_S/s_0}. \tag{13} $$


The initial steady-state velocity of the reverse reaction may be measured similarly if the equilibrium constant is not too large.

$$ v_r = \frac{V_P}{1 + K_P/p_0}. \tag{14} $$

According to this mechanism, as $s_0$ (or $p_0$) is increased, the initial steady-state velocity approaches $V_S = k_3 e_0$ (or $V_P = k_2 e_0$), which is referred to as the maximum initial velocity. If $V$ is expressed in moles per liter per unit time and if $e_0$ is expressed in moles per liter, $k$ is the turnover number of the enzyme, the number of moles of product produced per mole of enzyme (or better, per enzymatic site) per unit time. Under optimal conditions, the turnover number for catalase is greater than 10⁸/min; for fumarase it is 10⁵/min; and for chymotrypsin it is 0.01-10/min, depending upon the substrate.

The constants Ks and Kp in Eqs. (13) and (14) are 
referred to as Michaelis constants. It is evident that 
Michaelis constants have the units of concentration and 
are equal to the substrate concentration required to 
produce an initial steady-state velocity half as large 
as the maximum initial velocity. At substrate concen- 
trations well below the Michaelis constant, the initial 
velocity is directly proportional to the substrate 
concentration. 
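
The two limiting properties of the Michaelis constant stated above follow directly from Eq. (13), $v = V_S/(1 + K_S/s_0)$, and can be verified with a few lines. The numerical values are hypothetical and serve only as an illustration.

```python
def initial_velocity(s0, V_s, K_s):
    """Initial steady-state velocity of the forward reaction, Eq. (13)."""
    return V_s / (1.0 + K_s / s0)

if __name__ == "__main__":
    V_s, K_s = 10.0, 2.0  # hypothetical maximum velocity and Michaelis constant
    print("at s0 = K_s:", initial_velocity(K_s, V_s, K_s))           # half of V_s
    print("at s0 << K_s:", initial_velocity(1e-6 * K_s, V_s, K_s))   # about V_s * s0 / K_s
    print("at s0 >> K_s:", initial_velocity(1e6 * K_s, V_s, K_s))    # approaches V_s
```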

In order to determine $v_f$ or $v_r$, a very sensitive
analytical method is needed since these
velocities must generally be determined during the first
several percent of reaction. Spectrophotometric, fluoro-
metric, and titrimetric methods are those most often
used, and there is a real need for better analytical
methods for measuring small changes.

If the Michaelis constant for the product is small so that the third term in the denominator of Eq. (11) quickly becomes important or the equilibrium constant is unfavorable, the steady-state velocity decreases rapidly during a kinetic experiment, making it difficult to determine the initial velocity. To determine $v_f$, it is convenient to plot $p/t$ or $-(1/t)\ln(1 - p/p_{eq})$ versus $t$ and extrapolate to $t = 0$ to obtain $v_f$ or $v_f/p_{eq}$, respectively. Such an extrapolation is linear at sufficiently short times and further kinetic information may be obtained from the initial slope.

The values of the four kinetic parameters are not independent of the equilibrium constant for the over-all reaction. At equilibrium, $ds/dt = 0$; therefore, the following equation is obtained from Eq. (11):

$$ K_{eq} = \frac{p_{eq}}{s_{eq}} = \frac{V_S K_P}{V_P K_S} = \frac{k_1 k_3}{k_2 k_4}. \tag{15} $$


This useful test of the validity of kinetic data applies 
even if there are further complications in the mecha- 
nism, such as hydrogen-ion equilibria, or metal-ion 
binding, provided only that the effects of substrate and 
product concentration under a given set of conditions 
are represented by Eq. (11). Also, when the reaction 
goes essentially to completion, Vp may be calculated 
using this relation, provided that the other three 
kinetic constants have been determined and that Keq 
is known. 
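
The consistency test of Eq. (15) is easy to apply in practice. The sketch below builds the four kinetic parameters from an assumed set of rate constants through Eq. (12) and confirms that $V_S K_P/(V_P K_S)$ reproduces $k_1 k_3/(k_2 k_4)$; the function names and numbers are illustrative assumptions.

```python
def kinetic_parameters(k1, k2, k3, k4, e0):
    """Maximum velocities and Michaelis constants of mechanism 1, Eq. (12)."""
    V_s = k3 * e0
    V_p = k2 * e0
    K_s = (k2 + k3) / k1
    K_p = (k2 + k3) / k4
    return V_s, K_s, V_p, K_p

def haldane_keq(V_s, K_s, V_p, K_p):
    """Equilibrium constant implied by the four kinetic parameters, Eq. (15)."""
    return V_s * K_p / (V_p * K_s)

if __name__ == "__main__":
    k1, k2, k3, k4 = 3.0, 1.0, 2.0, 0.5  # hypothetical rate constants
    params = kinetic_parameters(k1, k2, k3, k4, e0=1e-3)
    print("K_eq from kinetics:", haldane_keq(*params))
    print("K_eq from rate constants:", k1 * k3 / (k2 * k4))
```

The same function can be rearranged to give $V_P$ when the reaction goes essentially to completion and only $V_S$, $K_S$, $K_P$, and $K_{eq}$ are measurable.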


Integrated Steady-State Rate Equation 


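Integration of Eq. (11), which is applicable if $s_0 \gg e_0$, and introduction of Eq. (15) yields

$$ V_S\left(1 + \frac{1}{K_{eq}}\right)t = \left(1 - \frac{K_S}{K_P}\right)p - \left[K_S + s_0 + \left(\frac{K_S}{K_P} - 1\right)p_{eq}\right]\ln\left(1 - \frac{p}{p_{eq}}\right). \tag{16} $$

If $K_{eq} \gg 1$ and $K_P$ is much larger than both $s_0$ and $K_S$, this equation becomes

$$ V_S t = p - K_S \ln(1 - p/s_0). \tag{17} $$

Thus, plots of $(1/t)\ln(1 - p/p_{eq})$ versus $p/t$ are linear. However, since it is rather complicated to calculate the maximum initial velocities and Michaelis constants from the slopes and intercepts, this approach has not been used very often.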
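
For the simple limit in which the reaction goes to completion and the product does not inhibit, the integrated equation has the form $V_S t = p - K_S \ln(1 - p/s_0)$, and it gives directly the time needed to reach a given extent of reaction. The following sketch evaluates that time; the parameter values are hypothetical.

```python
import math

def time_to_reach(p, s0, V_s, K_s):
    """Time at which product concentration p is reached, from V_s t = p - K_s ln(1 - p/s0)."""
    return (p - K_s * math.log(1.0 - p / s0)) / V_s

if __name__ == "__main__":
    s0, V_s, K_s = 1.0, 1.0, 1.0  # hypothetical values in consistent units
    for fraction in (0.01, 0.5, 0.9):
        print(fraction, time_to_reach(fraction * s0, s0, V_s, K_s))
```

At small extents of reaction the result reduces to $t = p(1 + K_S/s_0)/V_S$, that is, to the initial velocity of the Michaelis-Menten equation.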

Once the values of $K_S$, $V_S$, $K_P$, and $V_P$ have been determined, the values of the rate constants in mechanism 1 may be calculated from

$$ k_1 = (V_S/e_0 + V_P/e_0)/K_S, \tag{18} $$

$$ k_2 = V_P/e_0, \tag{19} $$

$$ k_3 = V_S/e_0, \tag{20} $$

$$ k_4 = (V_S/e_0 + V_P/e_0)/K_P. \tag{21} $$


Thus, from steady-state kinetic studies, it is possible to obtain the values of all four of the rate constants in mechanism 1. If the number $n$ of enzymatic sites per molecule is unknown, the calculated values of $k_1$ to $k_4$, using the molar concentration of the enzyme in Eqs. (18) to (21), will be larger than the true constants by an integer factor $n$.
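
Equations (18) to (21) amount to a short calculation, sketched below with hypothetical input values. The second call illustrates the remark about the number of sites: if the enzyme concentration used is the molar concentration of a two-site enzyme, all four constants come out twice as large as the true values.

```python
def rate_constants(V_s, K_s, V_p, K_p, e0):
    """Rate constants of mechanism 1 from the steady-state parameters, Eqs. (18)-(21)."""
    k1 = (V_s + V_p) / (e0 * K_s)
    k2 = V_p / e0
    k3 = V_s / e0
    k4 = (V_s + V_p) / (e0 * K_p)
    return k1, k2, k3, k4

if __name__ == "__main__":
    # Hypothetical measured parameters; e_sites is the concentration of enzymatic sites.
    V_s, K_s, V_p, K_p, e_sites = 2e-3, 1.0, 1e-3, 6.0, 1e-3
    print("true constants:", rate_constants(V_s, K_s, V_p, K_p, e_sites))
    # Using the molar enzyme concentration of a two-site enzyme (n = 2) instead:
    print("apparent constants:", rate_constants(V_s, K_s, V_p, K_p, e_sites / 2))
```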

Although mechanism 1 has certain features which are common to many enzymatic reactions, it can be stated confidently that mechanism 1 is too simple to represent the mechanism of any enzymatic reaction. However, this mechanism is of great importance for two reasons. First, it can be extended in many ways to include further steps. Second, under a given set of conditions, reactions following more-complicated mechanisms may appear to follow mechanism 1. Thus, it is often useful to calculate the magnitudes of the four apparent rate constants. As an example of the extension of the simple mechanism, the mechanism of the fumarase reaction is considered.


Effect of pH Changes on the Fumarase Reaction 


Fumarase catalyzes the hydration of fumarate (F) to L-malate (M).

$$ \mathrm{{}^{-}O_2C{-}CH{=}CH{-}CO_2^{-}} + \mathrm{H_2O} \rightleftharpoons \mathrm{{}^{-}O_2C{-}CH(OH){-}CH_2{-}CO_2^{-}} \tag{22} $$

The effect of pH changes on the maximum initial velocities and Michaelis constants for the forward and reverse reactions can be accounted for in terms of the mechanism shown in Fig. 1. Two intermediates are required because the maximum initial velocities for the forward and reverse reactions have different pH optima.

As expected for the steady-state treatment of a 
mechanism of this type, only the equilibrium constants 
for the acid dissociations (represented by K) are 
involved and not the individual rate constants which 
make up these equilibrium constants. The steady-state 
rate equations may be arranged in the form of Eqs. (13)
and (14). The Michaelis constants and maximum
velocities are given by


$$ V_F = \frac{V_F' e_0}{1 + (H^+)/K'_{aEF} + K'_{bEF}/(H^+)} \tag{23} $$

$$ K_F = K_F' \frac{1 + (H^+)/K_{aE} + K_{bE}/(H^+)}{1 + (H^+)/K'_{aEF} + K'_{bEF}/(H^+)} \tag{24} $$

$$ V_M = \frac{V_M' e_0}{1 + (H^+)/K'_{aEM} + K'_{bEM}/(H^+)} \tag{25} $$

$$ K_M = K_M' \frac{1 + (H^+)/K_{aE} + K_{bE}/(H^+)}{1 + (H^+)/K'_{aEM} + K'_{bEM}/(H^+)} \tag{26} $$

The ten parameters in these four equations […] experimental data over the range pH […] have been determined at five ionic strengths using tris-(hydroxymethyl)-aminomethane […] buffers. Only two of them, $K_{aE}$ and $K_{bE}$, can be identified with constants in the mechanism. The other eight kinetic parameters are related to the constants in the mechanisms by


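$$ V_F' = \frac{k_3 k_5}{k_3 + k_4 + k_5} \tag{27} $$

$$ K_F' = \frac{k_2 k_4 + k_2 k_5 + k_3 k_5}{k_1 (k_3 + k_4 + k_5)} \tag{28} $$

$$ V_M' = \frac{k_2 k_4}{k_2 + k_3 + k_4} \tag{29} $$

$$ K_M' = \frac{k_2 k_4 + k_2 k_5 + k_3 k_5}{k_6 (k_2 + k_3 + k_4)} \tag{30} $$

$$ K'_{aEF} = \frac{k_3 + k_4 + k_5}{(k_4 + k_5)/K_{aEF} + k_3/K_{aEM}} \tag{31} $$

$$ K'_{bEF} = \frac{(k_4 + k_5) K_{bEF} + k_3 K_{bEM}}{k_3 + k_4 + k_5} \tag{32} $$

$$ K'_{aEM} = \frac{k_2 + k_3 + k_4}{(k_2 + k_3)/K_{aEM} + k_4/K_{aEF}} \tag{33} $$

$$ K'_{bEM} = \frac{(k_2 + k_3) K_{bEM} + k_4 K_{bEF}}{k_2 + k_3 + k_4}. \tag{34} $$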


The $K_a'$ and $K_b'$ values are calculated from the plots of $V$ versus pH and $V/K$ versus pH. The eight relations of Eqs. (27) to (34), however, are insufficient to determine the values of the other ten constants for the mechanism illustrated in Fig. 1. But, these relations do place considerable restrictions on the values of the various rate constants and acid-dissociation constants. A detailed analysis of the data shows that $k_3$ and $k_4$ may be determined within their experimental uncertainties, and minimum values are obtained for $k_1$, $k_2$, $k_5$, and $k_6$.
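
Each of the pH functions in Eqs. (23) to (26) has the form $1/[1 + (H^+)/K_a + K_b/(H^+)]$, which gives a bell-shaped curve with its maximum at $(H^+) = (K_a K_b)^{1/2}$, that is, at pH $= (pK_a + pK_b)/2$. The sketch below evaluates this factor; the pK values are hypothetical and are not the fumarase constants.

```python
def ph_factor(pH, pKa, pKb):
    """Fraction 1 / (1 + (H+)/Ka + Kb/(H+)) that multiplies a pH-independent maximum velocity."""
    h = 10.0 ** (-pH)
    Ka = 10.0 ** (-pKa)  # dissociation of the more acidic group
    Kb = 10.0 ** (-pKb)  # dissociation of the less acidic group
    return 1.0 / (1.0 + h / Ka + Kb / h)

def ph_optimum(pKa, pKb):
    """pH at which the factor is largest."""
    return 0.5 * (pKa + pKb)

if __name__ == "__main__":
    pKa, pKb = 6.0, 8.0  # hypothetical values for illustration
    for pH in (5.0, 6.0, 7.0, 8.0, 9.0):
        print(pH, round(ph_factor(pH, pKa, pKb), 4))
    print("optimum at pH", ph_optimum(pKa, pKb))
```

Because the forward and reverse maximum velocities involve different pairs of apparent constants, their optima generally fall at different pH values, as the text notes for fumarase.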

Fig. 1. Mechanism of the fumarase reaction required to explain the effect of pH on the kinetics of the forward and reverse reactions. [Scheme: the reaction EH + F ⇌ EHF ⇌ EHM ⇌ EH + M, with rate constants $k_1$ to $k_6$, in which each of the species EH, EHF, and EHM is in acid-base equilibrium with a protonated form (EH₂, EH₂F, EH₂M) and a deprotonated form (E, EF, EM).]

Fig. 2. Schematic representations of the enzyme-fumarate and enzyme-L-malate complexes indicating the two weak acidic groups R and R' in the catalytic site and accounting for the cis-hydration [from T. C. Farrar, H. S. Gutowsky, R. A. Alberty, and W. G. Miller, J. Am. Chem. Soc. 79, 3978 (1957)].

The experimental data show that the minimum values of the rate constants for the bimolecular reaction of the substrate with the enzymatic site are of the order of 10⁹ liter mole⁻¹ sec⁻¹ when extrapolated to zero ionic strength. This high value for the rate constant of a second-order reaction in solution indicates that the reaction rate is controlled by diffusion. The maximum rate with which the substrate could diffuse into a hemispherical sink on a flat surface of a protein molecule is expressed as a second-order rate constant (in liter mole⁻¹ sec⁻¹) by


$$ k = \frac{2\pi N}{1000} R_{12} D_{12} f, \tag{35} $$


where $N$ is Avogadro's number, $R_{12}$ is the reaction radius (sum of the radii of site and substrate), $D_{12}$ is the sum of the diffusion coefficients of substrate and enzyme, and $f$ is an electrostatic factor which corrects for the slowing down or speeding up of the diffusion process owing to electrostatic interactions between the substrate and enzymatic site. The theoretical values of the second-order rate constant calculated for fumarase using Eq. (35) are greater than the minimum value obtained from the kinetics by a factor of 3 at most. Therefore, it is likely that the first step in the enzymatic reaction is simply the formation of an ion pair which subsequently undergoes an intramolecular reaction. The effect on the second-order rate constant of changing the ionic strength indicates that the enzymatic site has two positive charges.
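
An order-of-magnitude evaluation of Eq. (35) shows why a rate constant near 10⁹ liter mole⁻¹ sec⁻¹ points to diffusion control. The reaction radius and diffusion coefficient used below are assumed round numbers (5 Å and 10⁻⁵ cm² sec⁻¹), not the values used in the fumarase analysis, and the electrostatic factor is set to unity.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole

def diffusion_limited_k(R12_cm, D12_cm2_per_sec, f=1.0):
    """Second-order rate constant of Eq. (35) in liter mole^-1 sec^-1 (hemispherical sink)."""
    return 2.0 * math.pi * AVOGADRO * R12_cm * D12_cm2_per_sec * f / 1000.0

if __name__ == "__main__":
    k = diffusion_limited_k(R12_cm=5e-8, D12_cm2_per_sec=1e-5)  # assumed values
    print("k = %.2e liter/mole/sec" % k)
```

With these assumptions the result is about 2 × 10⁹ liter mole⁻¹ sec⁻¹; an attractive interaction between a negatively charged substrate and a positively charged site would make $f$ greater than unity and raise the limit further.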

Because of the high value of this second-order rate 
constant, the half-time for the rise in concentration of 
intermediates in the fumarase reaction is of the order of 
10-5 to 10-* sec. Such rapid changes cannot be studied 
by mixing methods, but may be measured by relaxation 
methods such as those developed by Eigen. 

Our interpretation of the total effect of two acid 
groups in the enzymatic site is that these groups donate 
and accept protons in the catalytic process. This is 
illustrated in Fig. 2 which shows the ionizable groups 
on one side of the bound substrate ion as required by 
the stereochemistry of the deuteromalic acid obtained 


when the reaction is carried out in D₂O. According to
this mechanism, the enzyme is a clever acid-base 
catalyst which makes available a proton at exactly 
the required position in space and which has a site to 
accept a proton at precisely the right position. This 
sort of mechanism accounts for the absolute stereo- 
chemical specificity. 


Inhibition and Activation 


Many substances combine with enzymes in such a 
way as to reduce or even to increase the steady-state 
velocity. Compounds which are structurally similar to 
the substrate often combine at the enzymatic site. This 
may be represented by adding the reaction 


$$ E + I \rightleftharpoons EI, \tag{36} $$

where EI cannot combine with substrate and I is the inhibitor. The steady-state rate equation for reactions (1) and (36) becomes

$$ v = \frac{V_S}{1 + (K_S/s)(1 + i/K_I)}, \tag{37} $$

where $i$ is the concentration of the inhibitor I and $K_I$ is the dissociation constant for the EI complex. An
inhibitor whose effect on the rate may be represented 
in this way is called a competitive inhibitor. Information 
on the competitive-inhibition constants for a number 
of inhibitors is useful in learning about the nature of 
the enzymatic site. However, since inhibition constants 
will, in general, depend upon the pH, what is needed in 
the study of inhibitors are kinetic data over a range of 
pH, so that the affinity of a particular ionized form of 
the enzymatic site for the inhibitor may be calculated.³⁵
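
The characteristic signature of competitive inhibition is that the maximum velocity is unchanged while the apparent Michaelis constant is raised by the factor $(1 + i/K_I)$. The sketch below assumes the standard form of the rate law, $v = V_S/[1 + (K_S/s)(1 + i/K_I)]$, and uses hypothetical parameter values.

```python
def velocity_competitive(s, i, V_s, K_s, K_I):
    """Steady-state velocity in the presence of a competitive inhibitor at concentration i."""
    return V_s / (1.0 + (K_s / s) * (1.0 + i / K_I))

if __name__ == "__main__":
    V_s, K_s, K_I = 10.0, 2.0, 1.0  # hypothetical values
    for i in (0.0, 1.0, 4.0):
        K_app = K_s * (1.0 + i / K_I)  # apparent Michaelis constant
        print(i, K_app, velocity_competitive(K_app, i, V_s, K_s, K_I))  # always V_s / 2
```

Plots of $1/v$ versus $1/s$ at several inhibitor concentrations therefore share a common intercept and differ only in slope.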

If the inhibitor combines only at a site where substrate does not combine, the effect on the kinetics is not competitive and the inhibitor concentration enters the rate equation in a different way. A mixture of competitive and noncompetitive effects may be accounted for by a mechanism of the type

$$
\begin{aligned}
E + S &\underset{k_2}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_3}{\longrightarrow} E + P \\
E + I &\underset{k_5}{\overset{k_4}{\rightleftharpoons}} EI \\
E + I &\underset{k_7}{\overset{k_6}{\rightleftharpoons}} IE \\
IE + S &\underset{k_9}{\overset{k_8}{\rightleftharpoons}} IES \overset{k_{10}}{\longrightarrow} IE + P \\
IE + I &\underset{k_{12}}{\overset{k_{11}}{\rightleftharpoons}} IEI
\end{aligned} \tag{38}
$$

where writing I to the left of E means that it is combined at a neighboring site. The combination of an ion at a neighboring site may increase the rate of reaction of the enzyme-substrate intermediate to yield product.


The steady-state rate equation for this mechanism is the Michaelis equation with

$$ V = V_S \frac{1 + k_{10} K_{ES}\, i/(k_3 K_{IES} K_{IE})}{1 + K_{ES}\, i/(K_{IES} K_{IE})} \tag{39} $$

$$ K = K_{ES} \frac{1 + i(1/K_{EI} + 1/K_{IE}) + i^2/(K_{IE} K_{IEI})}{1 + K_{ES}\, i/(K_{IES} K_{IE})} \tag{40} $$

where $V_S = k_3 e_0$, $K_{ES} = (k_2 + k_3)/k_1$, $K_{IES} = (k_9 + k_{10})/k_8$, $K_{EI} = k_5/k_4$, $K_{IE} = k_7/k_6$, and $K_{IEI} = k_{12}/k_{11}$. Accordingly, linear plots of $1/v$ versus $1/s$ are obtained, but both the slope and intercept vary with the concentration of inhibitor in this general case. Mechanisms with further steps have been discussed in detail in the literature.³⁶
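
The dependence of the apparent maximum velocity and Michaelis constant on the concentration $i$ of a substance that binds both at the catalytic site and at a neighboring site can be explored numerically. The sketch below evaluates the expressions obtained from a steady-state treatment of the five-step mechanism above (Eqs. (39) and (40)); the function names and all numerical values are hypothetical.

```python
def apparent_parameters(i, e0, k3, k10, K_ES, K_IES, K_EI, K_IE, K_IEI):
    """Apparent maximum velocity V and Michaelis constant K at modifier concentration i."""
    g = K_ES * i / (K_IES * K_IE)
    V = k3 * e0 * (1.0 + (k10 / k3) * g) / (1.0 + g)
    K = K_ES * (1.0 + i * (1.0 / K_EI + 1.0 / K_IE) + i * i / (K_IE * K_IEI)) / (1.0 + g)
    return V, K

def velocity(s, i, e0, k3, k10, K_ES, K_IES, K_EI, K_IE, K_IEI):
    """Michaelis-type velocity V / (1 + K/s) with the apparent parameters."""
    V, K = apparent_parameters(i, e0, k3, k10, K_ES, K_IES, K_EI, K_IE, K_IEI)
    return V / (1.0 + K / s)

if __name__ == "__main__":
    # Hypothetical constants with k10 > k3, so binding at the neighboring site activates.
    args = dict(e0=1e-3, k3=2.0, k10=5.0, K_ES=1.0, K_IES=2.0, K_EI=3.0, K_IE=4.0, K_IEI=5.0)
    for i in (0.0, 1.0, 10.0):
        print(i, apparent_parameters(i, **args))
```

With $k_{10} > k_3$ the apparent maximum velocity rises with $i$ (activation), while with $k_{10} < k_3$ it falls, so one mechanism covers both kinds of behavior.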

Buffer ions, and even substrate ions, may cause activation or inhibition as discussed above for substance I.³⁷ If substrate is bound at a neighboring site and this alters the rate constants in the catalytic mechanism, the steady-state kinetics will not follow the Michaelis-Menten rate equation except in the limit of low substrate concentrations. The binding of substrate at higher substrate concentrations may cause either inhibition or activation, effects which are not provided for by the simple Michaelis-Menten mechanism.


ENZYMATIC REACTIONS WITH
TWO SUBSTRATES


Such a reaction may be represented in general by

$$ A + B \rightleftharpoons C + D. \tag{41} $$

An example is the alcohol-dehydrogenase reaction

$$ \mathrm{alcohol + DPN^{+} \rightleftharpoons aldehyde + DPNH + H^{+}}, \tag{42} $$

where DPN is diphosphopyridine nucleotide (a coenzyme) and DPNH is the reduced form. This reaction is catalyzed by the enzyme alcohol dehydrogenase. The kinetics of this reaction have been extensively investigated by Theorell and co-workers.³⁸⁻⁴¹

Steady-state treatments have been given for a 
number of simple mechanisms that can be conceived 
of for a reaction of this type. The objective of the 
kinetic experiments is to determine which of the paths 
is followed and which are the important intermediates 
and rate-determining steps. The same type of rate law 
is obtained for several of these mechanisms, and, as 
shown by Dalziel, it is convenient to write the steady-state rate equations for the initial velocity when A and B are mixed with enzyme in the general form

$$ \frac{e_0}{v} = \phi_0 + \frac{\phi_1}{a} + \frac{\phi_2}{b} + \frac{\phi_{12}}{ab} \tag{43} $$

wherever this is possible. Here $a$ and $b$ represent the initial concentrations of A and B, and $\phi_0$, $\phi_1$, $\phi_2$, and $\phi_{12}$ are parameters which may be obtained readily from the experimental data by making plots of $1/v$ versus $1/a$ with $b$ held constant and $1/v$ versus $1/b$ with $a$ held constant. There are certain relationships between these parameters for certain mechanisms and various relations between the parameters for the forward and reverse reactions. This is illustrated for the simplest type of mechanism only. Both steady-state and transient-state kinetic data consistent with the following mechanism have been obtained for the alcohol-dehydrogenase reaction


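$$
\begin{aligned}
E + A &\underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} EA \\
EA + B &\underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}} EC + D \\
EC &\underset{k_{-3}}{\overset{k_3}{\rightleftharpoons}} E + C.
\end{aligned} \tag{44}
$$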


The steady-state rate equations for the forward and reverse reaction are conveniently written in the forms

$$ \frac{e_0}{v_f} = \frac{1}{k_3} + \frac{1}{k_1 a} + \frac{1}{k_2 b} + \frac{k_{-1}}{k_1 k_2 ab}, \tag{45} $$

$$ \frac{e_0}{v_r} = \frac{1}{k_{-1}} + \frac{1}{k_{-3} c} + \frac{1}{k_{-2} d} + \frac{k_3}{k_{-2} k_{-3} cd}. \tag{46} $$

It can be seen that, from studies of the forward reaction, the values of $k_1$, $k_2$, $k_3$, and $k_{-1}$ may be obtained, and that, from the reverse reaction, the values of $k_{-1}$, $k_{-2}$, $k_{-3}$, and $k_3$. Thus, all six of the rate constants are obtained and check values are obtained for $k_3$ and $k_{-1}$. There is a further relation between these kinetic parameters which must be obeyed; namely,


- Ceqdeq Rykoks 
Keq= = ; 
leqbeq k—ık—zk—3 


(47) 
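
The bookkeeping described here can be sketched in a few lines of Python. This is a minimal illustration, assuming the three-step mechanism (44) and reciprocal rate laws of the form (45) and (46); the function names and the numerical rate constants are hypothetical, chosen only for the example, and are not measured values for alcohol dehydrogenase.

```python
def forward_coefficients(k1, km1, k2, k3):
    """Coefficients of 1, 1/a, 1/b, 1/(ab) in (E)0/v_f, Eq. (45)."""
    return (1.0 / k3, 1.0 / k1, 1.0 / k2, km1 / (k1 * k2))


def reverse_coefficients(km1, km2, km3, k3):
    """Coefficients of 1, 1/c, 1/d, 1/(cd) in (E)0/v_r, Eq. (46)."""
    return (1.0 / km1, 1.0 / km3, 1.0 / km2, k3 / (km2 * km3))


def constants_from_forward(phi0, phi1, phi2, phi12):
    """Invert Eq. (45): returns k1, k-1, k2, k3."""
    k3 = 1.0 / phi0
    k1 = 1.0 / phi1
    k2 = 1.0 / phi2
    km1 = phi12 / (phi1 * phi2)
    return k1, km1, k2, k3


def constants_from_reverse(psi0, psi1, psi2, psi12):
    """Invert Eq. (46): returns k-1, k-2, k-3, k3."""
    km1 = 1.0 / psi0
    km3 = 1.0 / psi1
    km2 = 1.0 / psi2
    k3 = psi12 / (psi1 * psi2)
    return km1, km2, km3, k3


def equilibrium_constant(k1, km1, k2, km2, k3, km3):
    """Equilibrium constant from the six rate constants, Eq. (47)."""
    return (k1 * k2 * k3) / (km1 * km2 * km3)


# Hypothetical rate constants in arbitrary but consistent units.
K = dict(k1=4.0e6, km1=50.0, k2=2.0e5, km2=8.0e4, k3=3.0, km3=1.0e7)

fwd = forward_coefficients(K["k1"], K["km1"], K["k2"], K["k3"])
rev = reverse_coefficients(K["km1"], K["km2"], K["km3"], K["k3"])

k1, km1_f, k2, k3_f = constants_from_forward(*fwd)
km1_r, km2, km3, k3_r = constants_from_reverse(*rev)

# k3 and k-1 are recovered from both directions, giving the check values.
keq = equilibrium_constant(k1, km1_f, k2, km2, k3_f, km3)
```

In practice the coefficients would come from the intercepts and slopes of the double-reciprocal plots rather than from known rate constants; the round trip above only shows that four coefficients per direction are enough to fix the constants named in the text.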


Further checks may be obtained by the use of experi- 
ments in which one of the products is added along with 
the two reactants.“ For example, if A, B, and C are 
added, the steady-state rate equation becomes 


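$$\frac{(E)_0}{v_f} = \frac{1}{k_3} + \frac{1}{k_2 b} + \left(\frac{1}{k_1 a} + \frac{k_{-1}}{k_1 k_2 a b}\right)\left(1 + \frac{k_{-3} c}{k_3}\right). \tag{48}$$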


Thus, by determining the effect of C on the initial
steady-state velocity, it is possible to obtain $k_{-3}/k_3$
and, consequently, a value of $k_{-3}$ from the forward
reaction, since $k_3$ is known. By adding D, it is possible
to obtain $k_{-2}$. Thus, all six of the rate constants can be
obtained from steady-state velocities of the forward
reaction only.
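
The effect of added C can be checked numerically. The sketch below is an illustration under stated assumptions: it solves the steady-state conditions of mechanism (44) directly with A, B, and C present and D absent, compares the result with a closed form in which the factor $(1 + k_{-3}c/k_3)$ multiplies the terms in 1/a, and then recovers $k_{-3}/k_3$ from the slopes of $(E)_0/v$ against 1/a. All names, concentrations, and rate constants are hypothetical.

```python
def velocity_with_product_c(a, b, c, k1, km1, k2, k3, km3, e0=1.0):
    """Initial steady-state velocity for mechanism (44); A, B, C present, D absent."""
    ea = 1.0                                   # relative [EA]; normalized below
    e = ea * (km1 + k2 * b) / (k1 * a)         # from d[EA]/dt = 0
    ec = (k2 * b * ea + km3 * c * e) / k3      # from d[EC]/dt = 0
    scale = e0 / (e + ea + ec)                 # enzyme conservation
    return k2 * b * ea * scale                 # rate of formation of D


def reciprocal_rate_closed_form(a, b, c, k1, km1, k2, k3, km3):
    """(E)0/v with product C added, written in the form of Eq. (48)."""
    return (1.0 / k3 + 1.0 / (k2 * b)
            + (1.0 / (k1 * a) + km1 / (k1 * k2 * a * b)) * (1.0 + km3 * c / k3))


def slope_vs_inverse_a(b, c, params, a_lo=1.0e-5, a_hi=1.0e-4):
    """Slope of (E)0/v against 1/a at fixed b and c (e0 = 1)."""
    y_lo = 1.0 / velocity_with_product_c(a_lo, b, c, **params)
    y_hi = 1.0 / velocity_with_product_c(a_hi, b, c, **params)
    return (y_lo - y_hi) / (1.0 / a_lo - 1.0 / a_hi)


# Hypothetical constants and concentrations (arbitrary consistent units).
PARAMS = dict(k1=4.0e6, km1=50.0, k2=2.0e5, k3=3.0, km3=1.0e7)
B_FIXED, C_ADDED = 1.0e-3, 1.0e-6

slope_without_c = slope_vs_inverse_a(B_FIXED, 0.0, PARAMS)
slope_with_c = slope_vs_inverse_a(B_FIXED, C_ADDED, PARAMS)

# The slope ratio equals 1 + (k-3/k3) c, so k-3/k3 follows directly.
km3_over_k3 = (slope_with_c / slope_without_c - 1.0) / C_ADDED
```

Because $k_3$ is already available from the intercept term of the forward reaction, multiplying `km3_over_k3` by it gives $k_{-3}$ without any measurement on the reverse reaction.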

If there is a ternary complex in the second step of 
mechanism 44, there are eight rate constants in the 
mechanism, and all eight may be determined by means 
such as those just described. Actually, there must be a 
ternary complex in this mechanism, since a proton is 
transferred stereochemically specifically from DPNH 
to aldehyde to form an optically active monodeutero- 
ethanol, but apparently its steady-state concentration 
is very low under the usual experimental conditions. 

Hearon? has given a generalized mathematical 
treatment of mechanisms of reactions of this type which 
provides for the possibility of ternary complexes and 
for further reactants and products. Some rather unusual 
kinetic features are encountered in the steady-state 
studies of coenzyme mechanisms. 


USES OF ISOTOPES IN ELUCIDATING 
MECHANISMS 


The use of isotopes often permits the determination 
of the cleavage point in an hydrolysis reaction, as 
represented by*>*6 


$$\mathrm{AOB + H_2O^{*} = AO^{*}H + BOH \quad or \quad AOH + BO^{*}H,} \tag{49}$$


or of the position of the hydrogen atom transferred, 
as in 

$$\mathrm{CH_3CD_2OH + DPN^{+} = CH_3CDO + DPND.} \tag{50}$$


Isotope-exchange experiments can be used to dis- 
tinguish between the following two mechanisms for a 
transfer reaction. 


AB + D = A + BD

(I)   AB + E = EAB
      EAB + D = EABD = EBD + A
      EBD = E + BD

(II)  AB + E = BE + A
      D + BE = BD + E


If the mechanism is given by (II), a small amount 
of free A is formed when AB is added to the enzyme, 
even when D is absent. If labeled A is placed in the 
medium, the label appears in AB. If the mechanism is
given by (I), no incorporation of label into AB occurs
unless D is present also.

When the fumarase reaction is carried out in D2O, a 
specific monodeutero-L-malate is formed. The absence 
of an isotope effect in the enzymatic dehydration of this 
monodeutero-L-malate suggested that the deuterium 
might be incorporated in a rapid reversible step. In 
order to determine whether such an exchange precedes 
the rate-determining step in the fumarase reaction, an 
experiment was carried out with L-malate as substrate 
in D2O, and L-malate was isolated at various times and
analyzed for deuterium.“ The experiment was carried 
out under conditions where the Michaelis constants for 
malate and fumarate were equal so that the over-all 
reaction was pseudo first order, and the incorporation 
expected in the absence of exchange could be calculated. 
The experimental data were in agreement with this 
calculation showing that there is no direct exchange of 
the specific proton of malate with the medium. 

Investigations of the type discussed in the foregoing 
reveal the nature of the kinetically important steps in 
the reaction and the values of the rate constants, but 
they do not reveal much about the chemical nature of 
these intermediates. A basic difficulty in understanding 
the chemistry of the intermediate compounds at the 
present time is the lack of information about the 
chemical structure of the enzymatic site. Progress is
being made, however, and recent gains in the
understanding of the chemical action of chymotrypsin
(Neurath, p. 185) are an example, where the chemical
mechanism has been filled into these blank spaces in the
formal mechanism.
THE idea that enzymes are proteins is so firmly
entrenched in our minds that it may be regarded 
as a “self-evident truth.” However, only a few decades 
ago, this concept was the subject of a lively controversy, 
and the first report, by J. B. Sumner,! of the crystalliza- 
tion of a protein having all of the properties of the en- 
zyme urease, was greeted with critical reservations. The 
subsequent crystallization of other enzymes, including 
several proteolytic enzymes, notably by Northrop, 
Kunitz, and Herriott, has added support to the identi- 
fication of enzymes as proteins, and today over 130 
enzymes of known specificity and function have been 
obtained in crystalline form (see Dixon and Webb). 
Of course, crystallinity, per se, is neither a necessary 
nor a sufficient criterion for the protein nature of a 
compound. Suffice it to say that every criterion of 
protein chemistry has been found applicable to purified 
enzymes, and no enzyme has yet been found which 
would be an exception to this statement. Thus, it has 
been known for many years that processes leading to 
protein denaturation, in turn, are likely to cause 
enzyme inactivation, and, when denaturation could be 
reversed, enzyme activity would reappear. Typical 
examples for this statement are the work of Kunitz,* on 
the heat denaturation of the soy-bean trypsin inhibitor, 
and the work of Eisenberg and Schwert, on the 
reversible denaturation of chymotrypsinogen. As de- 
naturation may be described as an effect on the 
secondary and tertiary structure of proteins, it may be 
said, therefore, that enzymatic activity is related to a 
specific configuration of the protein molecule or of a 
part thereof. The enzymatic activity is evidently also 
related to the primary structure of the protein molecule; 
i.e., the sequence of amino acids along the polypeptide
chain, but not all amino-acid side chains are required for
activity. Thus, in some cases, some of the side chains
may be chemically modified without adversely affecting 
catalytic functions, and in others, some portions of the 


polypeptide chain may be split off without deleterious 
effects on activity. Furthermore, enzymes of like 
specificity but of different origin have been found to 
differ from one another in amino-acid composition. It is 
evident, therefore, from these sketchy comments, that 
not the entire molecular structure of the protein is 
necessarily involved in the enzymatic process, a concept 
which is fully consistent, on the molecular level, with 
the idea previously held by enzymologists that an 
enzyme has an “active site” or “active center” which is 
the primary locus of the catalytic process. Such a view, 
though probably an oversimplification, provides a 
convenient starting point for an inquiry into the relation 
of protein structure to enzymatic function. 

It seems desirable at this point to differentiate be- 
tween the catalytic activity as such and the specificity 
of the enzyme, because, unlike inorganic catalysts or 
catalysts of simpler molecular structure, enzymes are 
endowed with a high degree of substrate specificity. 
Though the same type of bond may be hydrolyzed by an 
enzyme in different substrates, the chemical environ- 
ment of the bond being split will determine the speci- 
ficity requirements of an enzyme. Consider, as an 
example, the proteolytic enzymes which, as the name 
implies, catalyze the hydrolysis of proteins. A protein 
may be considered a repeating pattern of peptide bonds, 
but each peptide bond will find itself in a different
environment, depending on the nature of amino acids
contributing the CO and NH groups. Table I indicates the
specificity requirements of trypsin, chymotrypsin, and 
pepsin. 

This specificity is related to the binding of substrate 
by the enzyme, the simplest postulate being that the 
enzyme has a structure complementary to that of the 
substrate. Conversely, if the enzyme is presented with 
a typical substrate, the bond being hydrolyzed can be 
varied within relatively wide limits provided that the 
various bonds do not differ greatly in bond strength. 


TABLE I. Peptide bonds hydrolyzed by proteolytic enzymes.

Enzyme          Substrate                          Bonds hydrolyzed
Trypsin         Insulin (oxidized B)
                Ribonuclease (oxidized)
Chymotrypsin    Insulin (oxidized B)
                Melanophore-stimulating hormone
Pepsin          β-Corticotropin

TABLE II. Types of bonds hydrolyzed by chymotrypsin.

Peptide        R1CONH-CHR2-CO-|-NH-
Ester          R1CONH-CHR2-CO-|-OR3
Amide          R1CONH-CHR2-CO-|-NH2
Hydroxamide    R1CONH-CHR2-CO-|-NHOH
Hydrazide      R1CONH-CHR2-CO-|-NHNH2
C-C            C6H4OH-CH2-CH2-CO-|-CH2-COOC2H5


Thus, chymotrypsin can hydrolyze not only peptide 
bonds but also amide, ester, hydroxamide, hydrazide, 
and certain labile C—C bonds (Table II). The feature 
common to all these substrates is that the surroundings 
of the bonds are similar and meet the specificity require- 
ments of the enzyme. 

Having thus proposed the existence of a catalytic site 
and a specificity site within the enzyme molecule, one 
may start the inquiry by posing the following questions: 

(1) What is the chemical configuration of the active 
site responsible for catalysis? 

(2) Is the same or a different site responsible for the 
specificity of the enzyme, and what is the chemical 
nature of that site? 

(3) How many active sites (including both the 
catalytic and the specificity sites) exist in an enzyme 
molecule? 

(4) What is the function of the remainder of the mass 
of the enzyme molecule, or, to put this question in an 
operational frame, how extensively can an enzyme be 
degraded before it loses its enzymatic activity? 

At this point, one should differentiate clearly between 
the protein portion of the enzyme and the nonprotein 

portion, which may also be part of the active site. The 
nonprotein moiety, usually referred to as a prosthetic 
group, may be an organic molecule of a different kind 
such as the flavin group in a flavoprotein, it may be a 
metal ion, or it may be a combination of an organic 
molecule and a metal ion. Such a prosthetic group and 
the particular group of the protein involved in the 
specific interaction are operationally included in the 
definition of the active site. The important role of 
metals in these catalytic processes is discussed in recent 
reviews by Vallee and by Smith et al., which should
be consulted. 

Consider now enzymes which, as far as is known, do
not contain a prosthetic group, and where the active
site seems to be composed entirely and exclusively of
specific amino-acid residues. Such enzymes include
certain, but by no means all, proteolytic enzymes.
Trypsin, chymotrypsin, and papain are proteolytic
enzymes which are believed to be free of nonprotein
prosthetic groups, whereas carboxypeptidase is definitely
known to be a zinc-metalloenzyme and is not to be
considered here. The three proteolytic enzymes mentioned
in the foregoing have each been fully characterized with
respect to molecular weight and certain other
physicochemical properties, some of them with respect to
their amino-acid composition and partial amino-acid
sequences.* The physicochemical properties of an
inactive precursor† of one of these enzymes,
chymotrypsinogen, are summarized in Table III. It is clear
that chymotrypsinogen has a molecular weight of about
25 000, a well-defined isoionic point, and is composed
of a single polypeptide chain with a single half-cystine
as the N-terminal residue and a single asparagine as
the C-terminal residue.‡ The 25 000 molecular-weight
unit contains 242 amino-acid residues.

TABLE III. Structural characteristics of chymotrypsinogen.
Molecular weight: chemical analysis; sedimentation, diffusion.

The relation of the primary structure of the enzyme
to its catalytic function may be elucidated by a study
of reactions which involve the hydrolysis of a limited
number of peptide bonds ("limited proteolysis"). Two
types of reactions may be considered: (1) those
which involve the conversion of enzyme precursors to
the active form (zymogen activation), and (2) those
which lead to the partial degradation of active enzymes.

FIG. 1. Schematic representation of the tryptic
activation of trypsinogen.
Trypsinogen:  Val-(Asp)4-Lys-Ileu-
Products:     Val-(Asp)4-Lys (peptide) + Ileu- (trypsin)

The controlled hydrolysis of one or more peptide
bonds in trypsinogen or chymotrypsinogen causes these
zymogens, having potential enzymatic activity, to
become converted to active enzymes. As a result of
this primary event, an intramolecular rearrangement
occurs which presumably leads to the formation of
the active site. Figure 1 shows schematically the
conversion of trypsinogen to trypsin. The
trypsin-catalyzed hydrolysis of a specific, single peptide
bond near the N-terminal region of the molecule causes the
hexapeptide, H-Val-Asp-Asp-Asp-Asp-Lys-OH, to
be released and the N-terminal valine residue of the
protein to be replaced by an isoleucine group. This
seems to be the one and only chemical event which
underlies the conversion of the inactive zymogen into
the active trypsin enzyme.

* In the case of one enzyme, ribonuclease, the entire sequence of
amino acids has recently been determined by Moore, Stein, and
co-workers at the Rockefeller Institute. Ribonuclease is not a
proteolytic enzyme, but catalyzes the hydrolysis of ribonucleic
acid.

† The suffix -ogen denotes a precursor of an enzyme which, after
certain enzymatic changes, becomes converted into an active
enzyme. Such an enzyme precursor is also known as a zymogen.

‡ The C-terminal asparagine residue appears to be inaccessible
to carboxypeptidase in intact chymotrypsinogen, and it remains
inaccessible in the chymotrypsins. It can be detected after rupture
of the disulfide bonds, by reduction or by oxidation.

Figure 2 illustrates similarly the conversion of chymo- 
trypsinogen to chymotrypsin. Here, only one of the 
seventeen bonds susceptible to hydrolysis by trypsin 
needs to be split in order to convert the zymogen into an 
active enzyme. These are fruitful processes in the sense 
that an active enzyme is formed from an inactive 
precursor. 

Another example of peptide-bond hydrolysis relates
to the partial degradation of an enzyme with or without
concomitant loss of biological activity. For instance,
the removal from chymotrypsin of a few amino acids
from the carboxyl end, with the aid of carboxypeptidase,
yields a derivative which is still fully active. Similarly,
in the case of ribonuclease, which consists of a single
polypeptide chain, removal of as many as three amino
acids from the carboxyl terminus yields an active
derivative. If, however, four or more amino acids are
removed from the same position, the product will be
inactive. Another case is that of ACTH, the
adrenocorticotrophic hormone, which contains 39 amino
acids. The removal of three amino acids from the
C-terminal portion of the chain has no effect on
hormonal activity, whereas the removal of one or two
amino acids from the N-terminal portion, by use of
leucine aminopeptidase, causes the hormone to become
inactive. Perhaps the most dramatic case known today
is that of papain, studied by Emil Smith and
co-workers, a proteolytic enzyme containing 180 amino
acids. One hundred and twenty of these can be removed
by stepwise hydrolysis with leucine aminopeptidase,
and the derivative still maintains enzymatic activity,
suggesting that the active center, including both the
catalytic and specificity sites, is located in the carboxyl
terminal region of the polypeptide chain and that the
remainder of the molecule is not essential for over-all
catalytic function.

In order to identify the chemical environment of the 
active site, it is necessary that it be labeled. In what 
follows, consider the reactions which are involved in the 
labeling of the active site of an enzyme and in the 
subsequent identification of the amino-acid composition 
of its environment. 

The kinetics of the reaction of an enzyme—in this 
particular case, a proteolytic enzyme reacting with a 
substrate—indicate that at least two steps are involved. 
In the first step, the enzyme reacts with the substrate 
forming an acyl derivative; in the second step, this 
intermediate then reacts with water to form a free acid, 
and the enzyme is reconstituted. 


RCOX+EH — RCOE+HX, 
RCOE+HOH = RCOOH+EH. 


It is this type of reaction which may be thought to
underlie the processes which Alberty describes for
other systems (p. 177). In the reaction of a proteolytic 
enzyme with esters, the acyl-enzyme is formed first 
with the liberation of the alcohol, subsequent hydrolysis 
of the acyl-enzyme leading to the free acid. This type of
reaction has been studied in considerable detail, because
substrates such as p-nitrophenylacetate, which contain
a chromophoric group absorbing in the visible region
of the spectrum, lend themselves excellently to precise
kinetic studies.17

In the reaction of chymotrypsin or trypsin with
organic phosphates such as diisopropyl
phosphorofluoridate (DFP), which has been studied for a
number of years, a phosphoryl-enzyme is formed instead of
an acyl-enzyme. This is a convenient reaction for
purposes of labeling, because the second step normally
does not occur.

$$\mathrm{EH + F{-}P(O)(OC_3H_7)_2 \rightarrow E{-}P(O)(OC_3H_7)_2 + HF} \quad \text{(stable)}.$$

TABLE IV. DIP-peptides from esterases.

FIG. 3. Proposed mechanism of enzymatic ester hydrolysis,
involving the interaction of the imidazole nitrogen of a
histidine side chain and the hydroxyl group of a serine
side chain.


It requires the action of a more strongly nucleo- 
philic reagent than water for the phosphoryl group 
to be removed from the enzyme. It is, therefore, 
possible to react an enzyme such as chymotrypsin with
DFP32, leading to the incorporation of the diisopropyl
phosphate into the protein, and then to degrade the
protein and see where the P32 went.

This type of study has shown that, in the case of 
trypsin, chymotrypsin, cholinesterase and in the case 
of the enzyme thrombin which is involved in the blood 
coagulation process, only one mole of DFP was specifi- 
cally taken up per mole of enzyme, suggesting that these 
enzymes contain only one active center. In each case, 
the organic phosphate was found to be esterified to the 
hydroxyl group of serine. The numerous studies de- 
signed to identify the peptide sequence around the 
reactive serine residue have been described in con- 
siderable detail in recent reviews and are not considered 
here.?4.18 Suffice it to summarize, as shown in Tables 
IV and V, the composition and structure of the peptides
obtained from DIP-trypsin and chymotrypsin,
respectively. The serine group which is involved here must be
of unusual reactivity since there are twenty to thirty
serine groups in trypsin and chymotrypsin, respectively,
but only one of these is involved in the binding of DFP.
Incidentally, the same seryl group is involved in the
reaction of these enzymes with acylating agents such as
p-nitrophenylacetate. Simpler seryl peptides do not
behave in this way, and, if the enzyme is first denatured,
the reaction with DFP does not take place at all.
Conversely, if chymotrypsin is allowed to become
acylated at pH 5, the acetyl group can be removed at
this pH by reaction with hydroxylamine; but, if the
protein is denatured by 8 molar urea after acylation,
the acetyl group cannot be removed until denaturation
is reversed, suggesting that separation and uncoiling
of polypeptide chains removes in space another group
or grouping which confers the high reactivity on this
particular serine. A number of lines of reasoning and
experimental data have led to the conclusion that the
second group involved in the enzymatic reaction is the
imidazole group of a histidine side chain. This subject
has been considered in detail in a recent review, and
no attempt is made, therefore, to evaluate the
experimental evidence for this hypothesis. The presence of
histidine in the active region could account for the
special reactivity of an adjacent serine residue, and, in
a scheme developed by Cunningham, such an
interaction has been proposed to include a hydrogen bonding
between serine and histidine, as shown in Fig. 3.
According to this scheme, the hydroxyl group of the
serine and the nitrogen of the imidazole group of the
histidine interact by hydrogen bonding, forming the
active configuration; the pH dependence of the reaction
is related to the pH dependence of the formation of
these bonds.

FIG. 4. Fragmentary representation of the amino-acid
sequence of the single polypeptide chain of
α-chymotrypsinogen. The dipeptide sequences delineated by
the dotted lines are split off during the activation
(see Fig. 2).

FIG. 5. Schematic representation of the structural
changes involved in the tryptic activation of
trypsinogen. Rupture of the lysyl-isoleucine bond in the
N-terminal region (dotted arrow) leads to the liberation
of the activation peptide, and causes the newly formed
N-terminal region of the polypeptide chain to assume a
more nearly helical configuration. This, in turn, permits
a histidine and serine side chain to come into
juxtaposition so as to form the esteratic site (see Fig. 3).
The specificity site (X) is believed to be pre-existent
in the zymogen molecule.

The exact position of the serine and histidine within 
the structure of trypsin or chymotrypsin is unknown. 
In order to describe the structure of the active site, it 
would be necessary to know the complete amino-acid 
sequence, whereas, in fact, only partial sequences can 
be established, as shown in Fig. 4 for α-chymotrypsinogen.
The exact tertiary structure of the polypeptide
chain likewise needs to be elucidated. While this goal 
is still a long way off, nevertheless certain interpreta- 
tions can be advanced on the basis of the data at hand, 
as shown schematically in Fig. 5, for the formation and 
the structure of the active site of trypsin. 


[Fig. 5 panels: Trypsinogen; Trypsin. Key: A = Asp, G = Gly,
H = His, I = Ileu, S = Ser, V = Val, X = specificity site.]
The active site does not exist prior to activation of
trypsinogen (or chymotrypsinogen), although the
ability to bind the substrate may be pre-existent in the
zymogen molecule. Furthermore, although the
serine-histidine interaction would account for a generalized
process of ester or peptide hydrolysis, it would fail to
account for the high degree of specificity characteristic
of proteolytic enzymes. It becomes of paramount
importance, therefore, to elucidate those aspects of
structure which differentiate between trypsin and
chymotrypsin, and which, at the same time, may be
correlated with their respective enzymatic specificities.
It is not fruitful to speculate about the reasons why
enzymes are as large as they are, just as no one has given
a physical interpretation for the role of the complex and
large structures of cholesterol or porphyrin. The fact
that one has just begun to identify the functional
relations among a few amino-acid residues in a molecule
containing 200 or more does not justify the conclusion
that the remainder is without function. On the other
hand, the demonstration that in vitro one-half of a
molecule may have similar if not identical functions as
the whole does not lead to the compelling conclusion
that, in a complex system such as a living cell, the other
half is of no consequence. It remains for the future to
R fully what today can at best be considered as a 
ONE of the most useful, although simplified, views
of the role of the nucleic acids in cells is that they 
function in a manner analogous to a punched tape in a 
computer. The punched tape and the nucleic acids are 
very long elements which direct the larger unit, either 
computer or cell, in which they are located. They both 
carry information through the linear arrangement of a 
few fundamental repeating units along their length. 
The nucleic acids are very long polymeric molecules 
which are built up by the repeated connection of a very 
few small molecules. It is generally assumed that the 
specificity of nucleic-acid function arises from the par- 
ticular sequence of its constituent residues, as well as 
from their geometrical configuration. 

There are two classes of nucleic acids, deoxyribo- 
nucleic acid (DNA) and ribonucleic acid (RNA), and 
they differ in structure as well as in function. However, 
they are made up of similar though not identical 
chemical units. The fundamental building block of the 
nucleic acids is the nucleotide. It is a complex molecule 
consisting of a purine or pyrimidine base, a sugar resi- 
due, and a phosphate group. In DNA, the sugar is 
deoxyribose, while in RNA, it is ribose. These sugars 
differ by the presence of a hydroxyl group on C2’. Both 
of the nucleic acids have four types of bases, two 
purines and two pyrimidines, and three of these are 
found both in the deoxyribose and in the ribose poly- 
mers. DNA and RNA contain the purines adenine (A) 
and guanine (G), as well as the pyrimidine, cytosine 
(C) (see Figs. 3 and 4). In addition, RNA has the 
pyrimidine uracil (U), whereas DNA has the closely
related pyrimidine, 5-methyluracil (thymine, T). Thus,
both of the nucleic acids have a similar chemical com- 
position, and only differ by the presence of a systematic 
hydroxyl group on each nucleotide of RNA and by the 
absence of a methyl group on one of the bases. While this 
description of the chemical composition of the nucleic 
acids is roughly correct, it should be pointed out that 
some nucleic acids have modified purine or pyrimidine 
residues, such as a — CH2OH group or a glucose residue 
attached to cytosine in the bacteriophages. 

Both ribose and deoxyribose are five-carbon sugars 
which are in the furanose form—i.e., in the form of a 
ring involving four of the carbons and one oxygen. The 
nucleotides of DNA and RNA are connected by the
same linkage through the phosphate group, which is
attached to the C3' and the C5' atoms of successive
sugar residues. In a schematic way, the polynucleotide
chain for both RNA and DNA can be written as shown
in Fig. 1. It should be pointed out that the chain in
Fig. 1 is asymmetric in that it has a direction which is
most easily seen by the sense of the C3'-C5' linkage in
the sugar residue.


INFORMATION TRANSFER AND THE 
NUCLEIC ACIDS 


It is generally believed that DNA alone functions as 
the carrier of genetic information. This understanding 
is based upon the classic experiments of Avery who dis- 
covered bacterial transformation—that is, the ability 
of purified DNA from one bacterial species to alter 
the metabolic characteristics of another bacterial species 
in an inheritable manner. This interpretation of DNA 
function was further strengthened by the demonstra- 
tion by Hershey and Chase! that DNA alone is the in- 
fective component and hence the carrier of genetic 
information in the bacterial viruses. 

All cells of a given organism have the same DNA 
content. The only exception to this statement is to be 
found in spermatozoa and ova, where the DNA content 
is one-half of the normal amount. Further, if one ana- 
lyzes the chemical composition of the DNA in all tissues 
of a given animal, it is found to be the same. Thus, it is 
believed that all cells contain the same set of DNA 
molecules. The DNA is located in the chromosomal 
material of the nucleus, and during cell division the 
DNA is replicated in some manner so that an equal 
amount is found in the two daughter cells, and with the 
same chemical composition as that found in the paren- 
tal cell. One of the goals of molecular structural work 
in the nucleic acids is to discover the fundamental in- 
terpretation for these phenomena. 
FIG. 1. Diagram of the nu-

FIG. 2. Schematic diagram of DNA. The two chains are
antiparallel, as shown by the arrows. The dotted lines between the
bases represent hydrogen bonding. Although the chains are drawn
as flat in the diagram, they are actually wound around each other
in the molecule.


Although the amount of DNA is the same in each 
cell of a given organism, the amount varies from species 
to species. In general, the more complex species have 
more DNA. A bacterial cell has of the order of $10^{8}$
nucleotides in its DNA, which would make a molecular strand
about 2 cm in length. In mammalian species, there are
of the order of $10^{10}$ nucleotides, which corresponds to a
total molecular length of 1 to 2 m/cell. Thus, the actual
length of the primary coding material in a living cell is 
in the range of macroscopic dimensions and is much 
longer than the metabolic machine (or cell) which it 
directs. The same is often true for the punched tapes 
which are fed into computers. 
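
The lengths quoted here follow from simple arithmetic, sketched below in Python. Two things are assumptions of this illustration rather than statements of the text: the axial rise of 3.4 Å per base pair, and the reading of the nucleotide counts as 10^8 (bacterial) and 10^10 (mammalian), which is the reading consistent with the quoted lengths. The function and constant names are hypothetical.

```python
RISE_PER_BASE_PAIR_M = 3.4e-10  # assumed axial rise per base pair, in meters


def duplex_length_m(nucleotides, rise=RISE_PER_BASE_PAIR_M):
    """Contour length of double-stranded DNA holding `nucleotides` nucleotides."""
    base_pairs = nucleotides / 2  # two nucleotides per base pair
    return base_pairs * rise


bacterial_length_m = duplex_length_m(1e8)    # of the order of 2 cm
mammalian_length_m = duplex_length_m(1e10)   # between 1 and 2 m
```

Both results are far larger than the micron-scale dimensions of the cells that contain the DNA, which is the point of the punched-tape comparison.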


MOLECULAR STRUCTURE OF DEOXYRIBO- 
NUCLEIC ACID 


One of the most stimulating suggestions in molecular 
biology was a proposal made by Watson and Crick? 
that the molecular structure of DNA may consist of 
two polynucleotide chains helically wrapped around 
each other, with the sugar-phosphate chain on the out- 


ing the hydrogen bonding between ade- 
NA The AOE of the base pair are 


and Corey.’ 


side and the purine and pyrimidine bases on the inside. 
They suggested that the purine and pyrimidine bases 
from the two chains are joined by hydrogen bonds to 
form specific pairs. Thus, the adenine residue hydrogen 
bonds with thymine, and guanine hydrogen bonds with 
cytosine. These hydrogen-bond pairs are specific in that 
only these combinations have the necessary stereo- 
chemistry to fit into the repeating lattice formed by the 
regular helical polynucleotide chains. In a schematic 
way, the DNA molecule is illustrated in Fig. 2 which 
shows the pairing relationship between the two poly- 
nucleotide chains. The arrows indicate the direction of 
the sugar-phosphate backbone. The two strands are 
organized in an antiparallel fashion so that the molecule 
looks the same even if it is turned about by 180°. If 
one ignores the varied base sequence, the backbone 
sugar-phosphate chains are organized about a diad axis 
perpendicular to the fiber axis and passing through the 


To chain 


Fic. 4, Diagram showing the hydrogen bonding between gua- 
nine and cytosine in DNA. This pair is held together by three 
hydrogen bonds in contrast to the two found in the adenine- 
thymine pairing [from L. Pauling and R. B. Corey, Arch. Bio- 
chem. Biophys. 65, 164 (1956) ]. 


center of each base pair. The pairing of the bases is 
shown in Figs. 3 and 4. An important feature of the 
Watson-Crick hypothesis is the identity of the two 
types of base pairs. That is, the distance between the 
two sugar-phosphate chains must be the same both for 
the adenine-thymine pair and for the guanine-cytosine 
pair. In this way, both base pairs could fit into the helix 
interchangeably. Pauling and Corey* have made a 
critical survey of x-ray diffraction results obtained from 
crystals containing purines and pyrimidines. From this, 
they concluded that the cytosine and guanine residues 
are probably held together by three hydrogen bonds 
(Fig. 4), while the adenine-thymine residues are held 
together by two. The dimensions of the hydrogen-bond 
pairs suggested by Pauling and Corey are shown in 
Figs. 3 and 4. Within experimental error, the positions 
and angles of the two chains relative to the base pairs 
are identical. 
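
The pairing rule and the antiparallel arrangement described above can be illustrated with a short Python sketch. The function names and the example sequence are assumptions introduced here for illustration; only the pairing rules (adenine with thymine, guanine with cytosine) and the hydrogen-bond counts of Pauling and Corey come from the text.

```python
# Pairing rules from the text: adenine-thymine and guanine-cytosine.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair, following Pauling and Corey:
# two for adenine-thymine, three for guanine-cytosine.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def partner_strand(strand: str) -> str:
    """Return the partner chain, read along its own backbone direction.

    The two chains are antiparallel, so the partner is the complement
    of the input taken in reverse order.
    """
    return "".join(PAIRING[base] for base in reversed(strand))


def hydrogen_bond_count(strand: str) -> int:
    """Total hydrogen bonds holding this strand to its partner."""
    return sum(HYDROGEN_BONDS[base] for base in strand)


if __name__ == "__main__":
    example = "AAGC"  # hypothetical sequence, for illustration only
    print(partner_strand(example))
    print(hydrogen_bond_count(example))
```

Applying `partner_strand` twice returns the original sequence, which is one way of expressing the statement that the molecule looks the same when turned about by 180°.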


MOLECULAR STRUCTURE OF THE 


Early x-ray diffraction studies by Wilkins‘ and by 
Franklin,® and their collaborators showed that DNA is 
a helical molecule with 10 residues per turn. The gross 
features of the diffraction pattern were, at this early 
stage, shown to be compatible with the double-stranded 
form proposed by Watson and Crick. Wilkins and his 
collaborators have continued to carry out extensive 
studies on the diffraction patterns of DNA, and they 
are responsible for most of the knowledge concerning 
the detailed configuration of DNA.*$§ 

The DNA molecule exists in several forms. At lower 
relative humidities (about 70%), the molecule crystal- 
lizes in the A-form—that is, a face-centered monoclinic 
lattice with a=22.2 A, b=40.0 A, c=28.1 A, and β=97.1°. 
This unit cell contains a repeat unit of two 
DNA molecules with the helical axis along c. The water 
content of this lattice is about 40%, and the bases in it 
are tilted about 25° from the fiber axis. In this form, 
the DNA is a true crystal and produces about 100 
independent reflections. This implies that there is a high 
degree of regularity in the structure in all directions. 

At higher relative humidities, the B-form which is 
paracrystalline appears with the molecules all parallel 
to each other but with random rotation about their 
molecular axes. The layer lines on these paracrystalline 
diffraction patterns show a continuous distribution of 
scattering intensity rather than the sharp spots charac- 
teristic of a crystalline lattice. In the B-form, the fiber- 
axis repeat is 34.6 A, and there is an extremely strong 
x-ray reflection on the meridian at 3.4 A. More recently, 
Wilkins and his collaborators® have been able to obtain 
truly crystalline diffraction patterns of the lithium salt 
of DNA in the B-configuration. In this form, the 
lithium DNA crystals are orthorhombic and have the 
dimensions a=22.7 A, b=31.3 A, and c=33.6 A. The 
axis of the helical molecules is along c, and two mole- 
cules pass through the unit cell. 

The B-form of the DNA molecule is shown in Fig. 5 
where atoms have been drawn with approximately their 
van der Waals radii. The base pairs are shown horizon- 
tally in the middle of the diagram, and the two sugar- 
phosphate chains are helically wrapped around the 
stacked bases. As can be seen in the diagram, there are 
two helical grooves which go round the DNA molecule. 
One of them is wider than the other because of the 
orientation of the sugar-base bonds shown in Figs. 3 and 
4. The phosphates on the outside of the molecule are 
found at a radius of 9 A, and they are just over 7 A 
apart along a given chain. Wilkins and his co-workers 
have studied the organization of the polypeptide chain 
of protamine in a DNA-protamine combination.’ The 
polypeptide chain is arranged as a third coaxial helix 
which fills the narrow groove in the DNA structure. In 
this position, the positively charged arginine side chains 
can interact with the negatively charged phosphate 
groups and stabilize the molecule. 


Fig. 5. A drawing of the DNA molecule using solid circles to 
illustrate atoms. It can be seen that there are two helical grooves 
of unequal size on the outside of the DNA molecule (after 
M. Feughelman et al.). 


Whenever a parent cell divides, the genetic information 
in that cell has to be replicated in order to insure 
continuity of inheritance to the daughter cell. The 
structural model of DNA suggested to Watson and 
Crick’ a method for the replication of this molecule. 
They felt that the parent molecule could unwind as 
shown in Fig. 6 so that its two strands separated. These 
individual strands could then serve as a template for 
organizing the individual nucleotides which are neces- 
sary to make the second strand of the DNA daughter 
molecules. The template specificity is assured by the 
specificity of the hydrogen bonds between the purine- 
pyrimidine base pairs. In this way, a single molecule 
could twist about its axis, and two daughter helixes 
would form on the unraveled ends. Although this mo- 
lecular model of genetic replication has not been estab- 
lished, recent experiments suggest that it is probably 
correct. This is a good example of the way in which a 
molecular structure can suggest a molecular mechanism. 
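
The template idea can be made concrete with a small Python sketch. It assumes, as the proposed scheme does, that each separated parent strand directs the assembly of its partner purely through the base-pairing rule; the names `replicate` and `partner_strand` and the example duplex are hypothetical and serve only as illustration.

```python
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand: str) -> str:
    """Complement of a strand, read in the antiparallel direction."""
    return "".join(PAIRING[base] for base in reversed(strand))


def replicate(duplex):
    """Separate a two-stranded molecule and rebuild two daughters.

    Each parent strand serves as the template for a newly assembled
    partner, so every daughter holds one old and one new strand.
    """
    strand_1, strand_2 = duplex
    daughter_1 = (strand_1, partner_strand(strand_1))  # new second strand
    daughter_2 = (partner_strand(strand_2), strand_2)  # new first strand
    return daughter_1, daughter_2


if __name__ == "__main__":
    parent = ("AAGC", "GCTT")  # hypothetical duplex
    for daughter in replicate(parent):
        print(daughter)
```

In this idealized picture both daughters are identical to the parent, which is the continuity of inheritance the mechanism is meant to provide.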


ROLE OF RIBONUCLEIC ACID 


In addition to carrying out its replication activities, 
it is necessary for the DNA molecule, acting as a genetic 
material, to influence and guide the metabolism of the 


Fig. 6. Diagram showing a possible replication mechanism for 
DNA. The parent helix unwinds, and the two separated strands 
serve as sites for forming two daughter molecules. Both parent and 
daughter helices must wind simultaneously [from M. Delbrück 
and G. S. Stent in The Chemical Basis of Heredity, W. D. McElroy 
and B. Glass, editors (The Johns Hopkins Press, Baltimore, 
Maryland, 1957), p. 699]. 


cell. It is not known for certain how this is carried out. 
However, there is a large body of indirect information 
which suggests that this is carried out by using another 
nucleotide polymer which is somewhat similar to 
DNA, namely ribonucleic acid (RNA), which probably acts as 
an intermediary between DNA and the proteins which 
are synthesized and constitute the working chemical 
machinery of the cell. 

The relationship between RNA and protein synthesis 
is not completely worked out at the present time. Proteins 
appear to be polymerized from their amino acids 
at the site of the small particulate bodies which are 
known as microsomal particles. These units are widely 
distributed in the cytoplasm, and appear as approximately 
spherical particles with a diameter of 180 A. 
They usually contain about half RNA and half protein, 
and it is likely that the RNA component plays a fundamental 
role in the protein-synthetic process, such as 
partly or wholly determining the sequence of amino 
acids. Thus, in order to control protein specificity, the 
information present in the DNA molecule in its nucleotide 
sequence may pass through the RNA molecule before 
ultimately emerging in a particular sequence of 
amino acids which define a particular protein. If these 
views are correct, one would like to know how it is that 
the DNA molecule "makes" RNA and how in turn this 
RNA molecule organizes amino acids. Unfortunately, 
at the present time, these questions cannot be answered. 

The role of RNA in the metabolic cycle is not simple. 
In addition to being implicated in protein synthesis, 
RNA has also been shown to function as 
a carrier of genetic information in a manner very 
similar to that of DNA. The pure RNA isolated from 
the tobacco mosaic virus is capable of infecting the 
tobacco leaf, and this infection ultimately produces a 
large number of new virus particles which carry the 
same genetic markers as those of the original virus.⁹⁻¹¹ 
Thus, RNA has some functions in common with DNA, 
but also appears to have unique activities as well. 
Unfortunately, it is not possible to describe the molec- 
ular structure of RNA in a final manner, as is the case 
for DNA. RNA has been isolated in fibrous form and 
x-ray diffraction photographs taken of this material 
show a helical diffraction pattern with a fiber-axis 
repeat of approximately 26 to 28 A and strong meridi- 
onal reflections at 3.3 and 4.0 A. However, the diffrac- 
tion photographs are not of sufficiently high resolution 
to allow one to make a unique structural interpretation. 
Experimental work on the structure of RNA offered 
little hope of achieving a solution to this molecular 
structural problem until the discovery of an enzyme 
capable of polymerizing synthetic polyribonucleotides. 


Synthetic Polyribonucleotides 


Grunberg-Manago and Ochoa¹² discovered an enzyme 
which converts nucleoside diphosphates into polyribonucleotides. 
The enzyme removes the terminal phosphate 
groups from the diphosphates and assembles the 
resultant nucleotide residues to form polyribonucleotides. 
Polymers obtained in this fashion resemble naturally 
occurring ribonucleic acid in that they have the 
same covalent ribophosphate backbone, and they have 
been shown to undergo similar enzymatic hydrolysis.¹³,¹⁴ 
Polymers have been made from all of the purine and 
pyrimidine bases which are found in RNA. In addition, 
the enzyme will also make polymers which contain other 
purine or pyrimidine bases; e.g., polyinosinic acid has 
been made which contains the purine hypoxanthine, 
and recently a polymer has been made which contains 
thymine, a normal constituent of DNA but not of 
RNA. These polynucleotide molecules can be made 
either as homopolymers involving only one kind of residue, or 
as copolymers involving two or more of the purine-pyrimidine 
side chains. The similarity between RNA 
and the synthetic polyribonucleotides can be shown by 
an x-ray diffraction study of synthetic copolymers, 
since they produce an x-ray diffraction pattern identical 
with that of native RNA.¹⁵ This suggests that it 
might be possible to study the molecular configuration 
of the synthetic polyribonucleotides and thereby learn 
something about the configuration of naturally occurring 
RNA. 

The synthetic polyribonucleotides are very reactive 
molecules. Soon after they were polymerized, it was 
shown that a complex formed when polyadenylic 
acid was mixed with polyuridylic acid.¹⁶ Using x-ray 
diffraction analysis, it was found that these two molecules 
wrap around each other in solution to form a 
two-stranded helical molecule very similar to naturally 
occurring DNA.¹⁷ The discovery of this remarkable 
interaction has been followed by a variety of similar 
discoveries among the other polynucleotides. At present, 
we know of the existence of several of these elongate 
macromolecules which form two-stranded and 
three-stranded helical complexes. 


Formation of Synthetic Two-Stranded 
Helical Molecules 


If a dilute salt solution at neutral pH contains both 
polyuridylic acid (Poly U) and polyadenylic acid 
(Poly A), these two molecules complex together. The 
reaction is shown schematically in Fig. 7, where the 
bands represent the polynucleotide chains. On meeting 
each other, the molecules wrap about to form a two- 
stranded helix. Evidence for this reaction can be ob- 
tained from an x-ray diffraction study of a fiber drawn 
from a lyophilized mixture of the two polymeric species. 
The fiber has strong negative birefringence, and pro- 
duces an x-ray diffraction pattern which has many 
similarities to a diffraction pattern of DNA. The dis- 
tribution of scattering intensity is that which is char- 
acteristic of a helix: it has a large area on the meridian 
which is clear, and the scattering intensity is dis- 
tributed in the form of a “cross” through the origin. 
The layer-line spacing varies slightly with humidity.¹⁸ 
However, both DNA and (Poly A+Poly U) have a 
layer-line spacing of 34 A. This spacing represents the 
helical pitch of the molecule. From the strong meridional 
reflections in the region of 3 to 4 A, it can be shown 
that there are 10 residues per turn of the helix in both 
DNA and the (Poly-A+Poly-U) molecules. The birefringence 
of both materials is identical when the 
(Poly-A+Poly-U) molecules crystallize in a hexagonal lattice 
with a distance between the molecules of 28.8 A. This 
is approximately 6 A greater than that observed for the 
DNA molecule. 

Fig. 7. Diagram illustrating the chemical reaction between the 
two polymers, polyadenylic acid (A) and polyuridylic acid (U). 
The irregular contours on the left represent the molecules in a 
random-coil configuration. After reacting on the right, the molecules 
are organized into a regular two-stranded helix. The bases 
connecting the two molecules are not shown in this diagram. 

With the exception of the diameter of the molecule, 
the two diffraction patterns are similar enough to 
suggest that they arise from a similar helical structure. 
In the solution, the adenine residues of polyadenylic 
acid meet with and hydrogen bond onto the uracil 
residues of polyuridylic acid in a way which is identical 
to the kind of hydrogen bonding which occurs between 
adenine and thymine in DNA (Fig. 3). The only differ- 
ence is that the uracil does not have the methyl group 
which is found on thymine. However, this does not 
affect the hydrogen bonding. Since the remainder of the 
molecule is similar to DNA, it forms the stablest struc- 
ture possible—i.e., a DNA-like configuration. There is 
an additional hydroxyl group in the sugar residue of the 
polyribonucleotides relative to DNA, and this increases 
the diameter of the molecule slightly through its inter- 
action with the other atoms of the sugar ring. This 
alters the hexagonal spacing mentioned above. 

There are other methods which can be used to study 
the interaction between these two molecular species. 
When they react in solution, the optical density at 
259 mμ decreases. This effect has been utilized in a 
quantitative study of the interaction. A series of 
mixtures of polyadenylic acid and polyuridylic acid is 
made wherein the total concentration of phosphate 
groups remains constant, but the mole ratio of the two 
species varies continuously. The optical density is 
measured for this continuous series of solutions and the 
results are shown in Fig. 8. The dashed line shows the 
optical density of various mixtures of polyadenylic acid 
and polyuridylic acid at neutral pH in 0.1 molar sodium 
chloride, plotted as a function of mole ratio. It can be 
seen that the optical density falls quite sharply, and a 
minimum is reached at 50% mole ratio when the num- 
ber of adenine residues in the solution just equals the 
number of uracil residues. This strongly suggests that a 
new species is being formed which is a 1:1 mixture of the 
two polymeric molecules. This interpretation is rein- 
forced by studying this reaction in an ultracentrifuge, 
since there is an increase in molecular weight and sedi- 
mentation velocity when the two molecules combine. 
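
The reasoning behind this mixing experiment can be sketched numerically. The Python below is an idealized model, not a fit to the data of Fig. 8: it assumes that the reaction goes to completion, that the unreacted polymers absorb equally per residue, and that the optical density falls in proportion to the fraction of residues bound in the complex. The function names and the value of `drop` are assumptions chosen for illustration.

```python
def complexed_fraction(x_u: float, u_per_a: int) -> float:
    """Fraction of all residues bound in a complex of `u_per_a` uracil
    residues per adenine residue, for a mole fraction `x_u` of
    polyuridylic acid and complete reaction (an idealization)."""
    x_a = 1.0 - x_u
    # The scarcer partner limits how many complex units can form.
    complex_units = min(x_a, x_u / u_per_a)
    return (u_per_a + 1) * complex_units


def optical_density(x_u: float, u_per_a: int, drop: float = 0.3) -> float:
    """Relative optical density: 1.0 if nothing reacts, lowered in
    proportion to the complexed fraction. `drop` is an arbitrary
    illustrative value, not a measured quantity."""
    return 1.0 - drop * complexed_fraction(x_u, u_per_a)


def minimum_position(u_per_a: int, steps: int = 300) -> float:
    """Mole fraction of polyuridylic acid at which the density is lowest."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda x: optical_density(x, u_per_a))


if __name__ == "__main__":
    print(minimum_position(1))  # one uracil strand per adenine strand
    print(minimum_position(2))  # two uracil strands per adenine strand
```

Under these assumptions a complex with one uracil residue per adenine residue places the minimum at a mole fraction of one half, and a complex with two uracil residues per adenine residue places it at two thirds. The position of the minimum therefore reads off the stoichiometry of the species formed.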

Making careful measurements of the type shown in 
Fig. 8 for the 1:1 complex, it can be shown that over 
95% of the residues have reacted, as judged by the 
sharpness of the drop in optical density. This is a 
measure of the high equilibrium constant for the reac- 
tion. One of the consequences of this figure is the 
inference that the reaction must be reversible in order 
to have all of these residues react. 

Fig. 8. The optical density of various mixtures of polyadenylic 
acid (A) and polyuridylic acid (U) at 259 mμ. The total number 
of moles of polymer is constant for all points, but the ratio of 
molecular species varies as indicated. All solutions are in 0.1 molar 
sodium chloride at neutral pH. The dashed line shows the formation 
of a 1:1 complex. The addition of a small number of divalent 
cations induces the formation of the three-stranded molecule [from 
G. Felsenfeld and A. Rich, Biochim. et Biophys. Acta 26, 457 
(1957)]. 


These experiments with mixtures of polyadenylic 
acid and polyuridylic acid clearly demonstrate the stability 
of the DNA configuration. In addition, they show 
that it is possible for the RNA covalent backbone to 
assume the form of a two-stranded complementary 
duplex of the DNA type. This is significant because the 
work mentioned above on tobacco mosaic virus showed 
that the RNA molecule from the virus carries the 
genetic information residing in the virus. The molecule 
is also probably capable of carrying out the molecular 
replication necessary to virus multiplication in the leaf. 
In view of the fact that the RNA backbone can assume 
the DNA configuration, it seems quite reasonable to 
assume that the molecular replication of RNA may be 
carried out by a mechanism very similar to that involved 
in the molecular replication of DNA. 

The reaction between polyadenylic acid and polyuridylic 
acid was perhaps not completely unexpected 
in view of the fact that DNA is composed of two 
strands held together by hydrogen-bonded purine-pyrimidine 
base pairs. Since uracil is so close to thymine, 
it is expected that the stability of (Poly A+Poly 
U) may be related to the stability of DNA itself. However, 
it is perhaps unexpected to find that it is possible 
to make a stable two-stranded helical molecule composed 
of polyadenylic acid and polyinosinic acid, i.e., 
a molecule very similar to the DNA molecule, except 
that it has purine-purine base pairs instead of 
purine-pyrimidine base pairs. 

The evidence for this combination parallels that 
described in the foregoing for polyadenylic acid and 
polyuridylic acid. When polyadenylic acid is mixed 
with polyinosinic acid under appropriate conditions, a 
lowering of the optical density occurs, producing a minimum 
at the 1:1 mole ratio point, just as shown in 
Fig. 8.²¹ Further evidence for the formation of this 
complex is also seen in ultracentrifuge experiments, 
since the complex has a larger molecular weight and 
sedimentation constant than either original molecule. 

An x-ray diffraction pattern of a fiber of polyadenylic 
acid plus polyinosinic acid (Poly A+Poly I) is similar 
also to the B-form of deoxyribonucleic acid. The 
(Poly-A+Poly-I) molecules crystallize in a hexagonal 
array on the equator with a= 24.4 A. The fundamental 
screw operation for generating the (Poly-A+Poly-I) 
helix is a translation of 3.4 A and a rotation of 31.5°, 
just slightly less than the DNA and the (Poly-A+ Poly- 
U) molecules. 
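
The relation between a screw operation and the resulting helix can be checked with a few lines of Python. This is a minimal sketch: the function name is an assumption, and the only inputs are the translation and rotation per residue quoted in the text, with 36° taken for DNA because 10 residues complete one turn.

```python
def helix_parameters(rise_angstrom: float, rotation_deg: float):
    """Residues per turn and helical pitch for a screw operation.

    One full turn is 360 degrees, so the number of residues per turn is
    360 / rotation, and the pitch is that number times the axial rise.
    """
    residues_per_turn = 360.0 / rotation_deg
    pitch = residues_per_turn * rise_angstrom
    return residues_per_turn, pitch


if __name__ == "__main__":
    # DNA B-form: 3.4 A rise and 10 residues per turn, i.e. 36 degrees.
    print(helix_parameters(3.4, 36.0))
    # (Poly-A+Poly-I): 3.4 A rise and 31.5 degrees, as quoted above.
    print(helix_parameters(3.4, 31.5))
```

With the quoted values, the smaller rotation of 31.5° corresponds to a little over 11 residues per turn and a pitch near 39 A, compared with 10 residues and 34 A for a rotation of 36°.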

The purine base in polyinosinic acid is hypoxanthine. 
It is closely related to guanine in that it has an oxygen 
on C6 of the purine ring, even though it lacks the 
amino group present in guanine on C2. It is likely that 
the hypoxanthine is in the keto tautomeric form and 
that it hydrogen bonds to the adenine residue as shown 
in Fig. 9. The keto oxygen of hypoxanthine is hydrogen 
bonded to the amino group of adenine, while the hydrogen 
on N1 of hypoxanthine is bonded to the corresponding 
ring nitrogen in adenine. As can be seen when 
comparing this with Figs. 3 and 4, the hydrogen-bonding 
system has some similarities to what is observed in 
DNA. The major difference is the additional imidazole 
ring present in the hypoxanthine base. The hydrogen 
bonding shown in Fig. 9 could be used in the naturally 
occurring nucleic acids if the hypoxanthine base were 
replaced by guanine, since the additional amino group 
attached to C2 of the purine ring would not introduce 
any steric interference. 


Three-Stranded Helical Molecules 


It was found that small amounts of divalent salts had 
a profound effect on the optical density-composition di- 
agram in the Poly-A and Poly-U system.” The solid line 
in Fig. 8 shows the change brought about by making 
the solution 10~* molar in magnesium chloride. A new 
minimum appears at 67% polyuridylic acid and 33% 


Fig. 9. Diagram showing the hydrogen bonding between the 
adenine base of polyadenylic acid and the hypoxanthine base of 
polyinosinic acid [from A. Rich, Nature 181, 521 (1958)]. 

polyadenylic acid. The new minimum is quite sharp, 
indicating that the new complex is quite stable. This 
was the first indication that a three-stranded complex 
was forming from two strands of polyuridylic acid and 
one of polyadenylic acid. 

Additional evidence for the formation of the three- 
stranded molecule is obtained by studying the sedimen- 
tation constant of the 1:1 complex as compared with 
the 2:1. The mean sedimentation constant for the 
three-stranded complex is about 50% greater than for 
the two-stranded complex. This increase would be ex- 
pected if the two-stranded molecule were to take on a 
third strand. The third strand probably fills the deep 
helical groove in the two-stranded molecule which is 
similar to the deep groove seen in DNA. Since it would 
displace water molecules from that site and would not 
appreciably alter the frictional forces or shape factor of 
the molecule, the net density increment of the molecule 
over the solvent would result in approximately a 50% 
increment in sedimenting velocity. 
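
The argument for a 50% increase can be written out using the Svedberg relation between the sedimentation constant and the molecular weight. The following is a sketch under stated assumptions: the three strands are taken to be of equal length and of roughly equal mass per residue, and the frictional coefficient $f$ and partial specific volume $\bar{v}$ are taken to be unchanged by the third strand, as the text supposes. The subscripts 2 and 3 denote the two- and three-stranded molecules.

```latex
s = \frac{M\,(1 - \bar{v}\rho)}{N_A f},
\qquad
\frac{s_3}{s_2}
  = \frac{M_3}{M_2}\cdot
    \frac{1 - \bar{v}_3\rho}{1 - \bar{v}_2\rho}\cdot
    \frac{f_2}{f_3}
  \approx \frac{M_3}{M_2}
  = \frac{3}{2}.
```

Here $\rho$ is the density of the solvent and $N_A$ is Avogadro's number. A ratio of 3/2 is the approximately 50% increase in sedimentation velocity described above.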

It has been suggested that the second uracil residue 
is hydrogen bonding to the original adenine-uracil pair 
by forming two strong hydrogen bonds onto N; and 
No of the adenine ring. Such an additional third strand 
would not involve an increase in radius or helical pitch 
of the molecule, but could account for the approximately 
50% increase in sedimentation velocity. X-ray diffrac- 
tion photographs have also been obtained from this 
three-stranded complex. 

The kinetics of the formation of the three-stranded 
molecules from a mixture of two-stranded (Poly-A 
+ Poly-U) molecules and single-stranded Poly-U mole- 
cules have been investigated by measuring the optical 
density at 259 mμ as a function of time. Study of these 
curves for various concentrations of magnesium chloride 
or manganese chloride has shown that the reaction is 
second order for divalent cations; i.e., two divalent ca- 
tions are present for each triplet of bases. 

It is important to note that divalent cations do not 
have a unique role in forming the three-stranded mole- 
cule, since this complex will fully form in sodium-chlo- 
ride solutions which are 0.7 molar. Thus, it is likely that 
the cations are necessary to overcome the electrostatic 
repulsion between the negatively-charged phosphate 
groups in the three polynucleotide chains. Divalent 
cations are much more effective than monovalent cat- 
ions, probably because they form stable complexes 
with phosphate groups. Nonetheless, they are not necessary, 
since monovalent cations alone are capable of 
carrying out this reaction. 

In a completely analogous fashion, the two-stranded 
polyadenylic-acid plus polyinosinic-acid molecule will 
take on a third strand of polyinosinic acid to become 
three-stranded. This has been shown both spectrophotometrically 
and in the ultracentrifuge. 

Up to this point, the discussion has been concerned 
with two- and three-stranded molecules composed of 
different kinds of residues hydrogen bonded together. 
However, polyinosinic acid forms another kind of helical 
structure which involves only one type of molecule. 

Fig. 10. Diagram showing the hydrogen bonding between three 
hypoxanthine bases in the three-stranded model of polyinosinic 
acid. The molecules are organized around a threefold rotation 
axis [from A. Rich, Biochim. et Biophys. Acta 29, 502 (1958)]. 

If polyinosinic acid is prepared as a high molecular- 
weight polymer, it can be drawn into an oriented fiber 
which is negatively birefringent and which produces an 
unusual diffraction pattern when compared with the 
diffraction photograph of the mixtures described above. 
One unusual feature is that the first layer line is found 
at a spacing of 9.8 A, in contrast to the 30- to 40-A 
spacings discussed in the foregoing. Even more unusual 
is the appearance of the second layer line at 5.2 A and 
of the third intense meridional layer line at 3.4 A. 
These are nonintegral; i.e., they are not successive 
orders of one fundamental repeat distance. This feature 
and the fact that the first and second layer lines do not 
appear on the meridian of the diagram point uniquely 
to a helical configuration for the molecule. The meridi- 
onal reflection at 3.4 A is undoubtedly attributable to 
the stacking of the purine residues at right angles to the 
fiber axis in agreement with the negative birefringence. 
The largest equatorial reflections occur at a spacing of 
23.8 A. 

This is an example of a helical diffraction pattern 
which points to a multiple-stranded structure, largely 
because it will not fit any single-stranded model. A 
three-stranded model will fit the diffraction data if all 
three strands are parallel to each other. The bases in this 
model are organized around a threefold rotation axis, 
utilizing the hydrogen-bonding system shown in Fig. 10. 
Here, the three hypoxanthine residues are hydrogen 
bonded in a cyclic manner, so that the keto oxygen 
attached to position 6 in the purine ring is hydrogen 
bonded to the nitrogen at position 1 on an adjoining 
ring. The threefold rotation axis has the effect of decreas- 
ing the repeat distance along the fiber axis from 29.4 to 
9.8 A. 
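
Two numerical points in this description lend themselves to a quick check: the observed layer-line spacings are not successive orders of a single repeat, and a threefold rotation axis divides the single-chain repeat by three. The Python below is an illustrative sketch; the function names and the tolerance are assumptions, and the spacings are those quoted in the text.

```python
def are_successive_orders(spacings, tolerance=0.03):
    """True if the spacings are the 1st, 2nd, 3rd, ... orders of the
    first one, i.e. if the n-th spacing equals spacings[0] / n to
    within the given fractional tolerance."""
    fundamental = spacings[0]
    return all(
        abs(fundamental / order - d) <= tolerance * d
        for order, d in enumerate(spacings, start=1)
    )


def repeat_with_rotation_axis(single_chain_repeat, fold):
    """Axial repeat of a molecule whose strands are related by an
    n-fold rotation axis along the fiber axis."""
    return single_chain_repeat / fold


if __name__ == "__main__":
    # Layer lines of polyinosinic acid, in angstroms, from the text.
    print(are_successive_orders([9.8, 5.2, 3.4]))
    # Threefold axis acting on a 29.4 A single-chain repeat.
    print(repeat_with_rotation_axis(29.4, 3))
```

Integral orders of 9.8 A would fall at 4.9 A and about 3.27 A rather than at the observed 5.2 A and 3.4 A, which is the sense in which the layer lines are nonintegral.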

Polyinosinic acid in solution can be converted from 
an organized helix into a random coil, and vice versa. 
This alteration is characteristic of all the polynucleotide 
molecular complexes. There are certain conditions under 
which they form stable, organized aggregates, and other 
conditions under which they separate in solution and 
are no longer organized. It is only when polyinosinic 
acid is in a single-chain random coil that it is able to 
react with polyadenylic acid to form the two- and three- 
stranded molecules already described; if polyadenylic 
acid and polyinosinic acid are mixed together in high 
salt concentration, no reaction occurs at all. 

Several other combinations of polynucleotides have 
been studied and are mentioned only briefly here. Polyadenylic-acid 
chains combine with themselves to form two-stranded 
helixes in which both chains are parallel to each other. 
Polyinosinic acid, in addition to combining with itself 
and with polyadenylic acid, will also combine with 
polycytidylic acid.” Recently, it has been possible to 
synthesize polyribothymidylic acid.” This is a polymer 
with an RNA backbone even though it contains a DNA 
base. This molecule also combines with polyadenylic 
acid to form two- and three-stranded molecules, in a 
manner analogous to the combination with polyuridylic 
acid.?® 


DISCUSSION 


The synthetic polyribonucleotides are extraordinarily 
reactive, and they can readily form a variety of multistranded 
helical structures, in some cases with molecules 
which are all alike, and in other cases with molecules 
which are different. There are, however, some generalities 
which emerge from this study which undoubtedly 
reflect some of the fundamental stabilizing features 
found in helical nucleic-acid molecules. For example, all 
of the helical molecules discovered so far have been 
either two- or three-stranded, but in no case has there 
been a stable single-stranded molecule. In the structures 
described in the foregoing, including DNA, the over-all 
architecture has been similar in that the purine or 
pyrimidine bases appear to be largely on the inside of 
the molecule, whereas the charged sugar-phosphate 
chain is on the outside of the molecule. The bases are 
hydrogen bonded together, usually with two or occasionally 
three strong hydrogen bonds holding each base. 
The molecules are stabilized by the van der Waals 
packing of the stacked planar purines or pyrimidines, 
in addition to the hydrogen bonding. It may be that a 
single-chain helix cannot form in a polyelectrolyte molecule 
such as the nucleic acids, because of the electrostatic 
repulsion. When two or more chains are present, 
the electrostatic repulsion in the helical molecule 
tends to pull the two chains closer to each 
other since they are coiled. In a single-stranded molecule, 
the effect of electrostatic repulsion would be only 
to elongate the molecule and break up any organized 
helical structure. 

All of the synthetic organized helices are formed 
reversibly, and they can be made to assume a random 
coil in solution when conditions are appropriately 
modified. In most cases, a simple reduction of ionic 
strength is enough to drive the two chains apart; in 
other cases, altering the pH is sufficient. Usually, there 
is only a given pH range over which a multistranded 
molecule is stable. For example, the polyinosinic-acid 
helix breaks up when the pH is raised above 10 owing 
to the fact that a proton necessary in the hydrogen- 
bonding system is removed. Similarly, the other helices 
are stable only over pH ranges in which the necessary 
tautomeric forms exist. Studies such as these are useful 
in developing an understanding of the stability of the 
naturally occurring nucleic acids and, since several of 
these molecules are similar to DNA, they can be used 
as model systems for studying nucleic-acid reactivity. 

In this regard, an especially attractive hypothesis 
may be made concerning the three-stranded molecules. 
It is possible that these may be analogs for the forma- 
tion of a single-stranded RNA by a two-stranded DNA. 
DNA has a deep helical groove in it between the two 
strands. It is this groove which is filled in the (poly- 
adenylic plus polyuridylic) molecule by the oncoming 
strand of polyuridylic acid, and, as such, it may be an 
example of a physiologically important type of reaction. 
Since DNA itself has two kinds of base pairs, there are 
a total of four different sites on the DNA molecule. 
These four types of sites may serve as templates for the 
four kinds of ribonucleotides which must be polymer- 
ized together to make the RNA molecule. Schemes such 
as this have been worked on by several investigators for 
many years, since it is an attractive and simple method 
for transferring sequence information from DNA to 
RNA. However, there has been as yet no convincing 
demonstration of a detailed molecular mechanism which 
has the requisite specificity. Further work is necessary 
to fully evaluate mechanisms of this sort. 

Despite the variety of polynucleotide structures 
which are now understood, the structure of naturally 
occurring RNA itself remains unknown. There is un- 
doubtedly a difference in the configuration of the RNA 
which is found in the microsomal particle from that 
found in the small RNA molecules present in the soluble 
supernatant. In addition, it is possible that the nuclear 
RNA is in yet a different configuration. The most in- 
teresting configuration for RNA is perhaps that which 
is found in the microsomal particle. According to the 
current hypothesis, the nucleotide sequence in this 
molecule is translated by some means into the amino-acid 
sequence in the proteins which are synthesized at 
that site. The configuration of RNA in these particles 
is indeed an interesting problem, and there is no way of 
knowing whether or not the studies on the synthetic 
polyribonucleotides will yield an answer which is at all 
applicable to the problem. Nonetheless, these studies 
are producing a variety of structures and configurations 
and it can only be hoped that future work will yield 
promising results in this most interesting aspect of the 
problem.

INTEREST in the biosynthesis of nucleic acids stems
from the crucial importance of these compounds in
heredity and reproduction and in the development of 
the cellular machinery. 

Every cell has DNA in its nucleus and, from a variety 
of considerations, DNA is regarded as the prime source 
of genetic information. Consider a cell which just 
has been pinched from its mother and which has little 
more than a faithful replica of the maternal DNA. In 
its growth to maturity, the young cell will have to 
make the enzymatic machinery to carry on energy 
metabolism on an expanding scale. At the same time, 
the cell must synthesize all of the special structures (be 
they lipid, carbohydrate, protein, or of any other 
special nature) that will characterize the shape, struc- 
ture, appearance, and behavior of the cell. The sequence 
can be represented, oversimplified, as 


DNA → enzymes → everything else.


From all indications, it can be said with some confi- 
dence that the structure of DNA is fixed, that it does 
not undergo turnover under the most extreme varia- 
tions of cellular nutrition and physiology, and, face- 
tiously, that it is immutable, except for mutations. 
Aside from having to serve as an original template for 
the synthesis of the enzymes, it has one other major 
function and that is to provide the template for the 
synthesis of a replica of itself in the reproductive act 
of forming a daughter cell. Discussion of the biosyn-
thesis of DNA is limited here to its replication during
reproduction. 


RNA BIOSYNTHESIS


What is known about the function and biosynthesis
of RNA? In contrast with DNA, this problem appears
more complex. RNA occurs both in the nucleus and in a
variety of cytoplasmic structures. Its concentration in
the cell may vary within wide limits depending upon
the age, nutrition, and other environmental factors,
and it obviously undergoes considerable turnover and
replacement. While the DNA of the chromosome might
be regarded as a single, exceedingly complex molecular
unit, RNA is, by comparison, physically and meta-
bolically heterogeneous and there is no reason to speak
of RNA in the singular, except in the generic sense
that one regards protein.

It appears from current studies that RNA may
provide a link between DNA and protein synthesis,
translating the information in the DNA code to a
proper template for the manufacture of
distinctive proteins. What is known, in fact, from the
work of Zamecnik et al.¹ with animal cells, is that RNA
is essential for the assembly of amino acids into poly-
peptides, and from the work of Berg² with a bacterial
system, is that only about 5 to 10% of the RNA seems
to be active in the fixation of amino acids. This func- 
tion of RNA as a link between DNA and enzyme syn- 
thesis is at present an educated and attractive specula- 
tion only. 

Regarding the biosynthesis of RNA, the problem is 
complicated by its heterogeneity and by ignorance of 
its functions. There is a large and rapidly growing litera- 
ture on the subject, but brief reference is made to only 
a few of the contributions. 

(1) Ochoa³ isolated from a variety of bacterial cells
an enzyme which condenses various nucleoside di-
phosphates to high molecular-weight ribonucleic-acid-
like polymers. These, discussed by Doty (p. 107) and
by Rich (p. 191), are polymers of adenylic, cytidylic,
or uridylic acid, or of other nucleotides or mixtures of
them. As seen in Fig. 1, the reaction involves a con-
densation of the hydroxyl group in position 3′ of
one nucleotide with the inner phosphate group of the
other nucleotide, with a consequent elimination of in-
organic orthophosphate (Pi). There is little free-energy
change in this reaction and, as indicated, the polymers 
are split by inorganic phosphate at physiological levels. 
For this enzyme to form the highly specific polymers 
required in protein synthesis, it must have controls in 
the cell from which it has escaped upon isolation. A 
reasonable possibility is that this enzyme provides for 
the conservation and storage of nucleotide units which 
can be called upon to form the adenylic, uridylic, 
cytidylic, and other coenzymes essential for energy 
metabolism, for carbohydrate, protein, and lipid bio- 
synthesis, and for specific types of RNA. 

(2) Several groups of investigators‘? have recog- 
nized, during the past year or so, the existence of 
enzyme systems which extend ribonucleic-acid chains 
by one or a few nucleotide units, and which employ 
nucleoside triphosphates rather than the diphosphates. 
There are suggestive indications of relationships of 
these reactions to amino-acid fixation leading to poly- 
peptide formation. 

(3) A reaction which has the intriguing possibility 
of relating RNA synthesis to DNA is the one recently 
described by Hurwitz.” An enzyme from E. coli was 
found which required DNA for the incorporation of 
ribonucleotides into a polymeric structure. With a 
mixture of radioactive ribonucleoside triphosphates and 
deoxynucleoside triphosphates as substrates, the prod-
uct formed was shown to contain the incorporated units
in typical 3′-5′-phosphodiester linkage. Samples of
RNA from several sources failed to replace the DNA
requirement. Further purification of this enzyme system
should clarify the specificities of the reaction, and the
results are awaited with great interest.

Fig. 1. Mechanism of RNA synthesis from ribonucleoside diphosphates.

(4) Turning now to studies with whole cells which 
attempt to relate nucleic-acid and protein synthesis, 
mention should be made first that, by the addition of 
chloromycetin to growing bacterial cultures, it is 
possible to suppress protein synthesis completely and 
still maintain the new formation of RNA and DNA. 
The synthesis of RNA, however, can be shown to be 
dependent upon the availability of all of the amino 
acids essential for protein synthesis.!"“" Thus, a strain 
of E. coli requiring tryptophan for growth fails to make 
RNA in the presence of chloromycetin unless trypto- 
phan is added to the medium. Yet this and other amino 
acids for which this effect has been demonstrated do 
not appear to be precursors of RNA. A dependence of 
RNA synthesis upon amino acids, despite the exclusion 
of protein formation, is implied, therefore, but an en- 
zymatic basis for these observations as yet is not 
apparent. 

(5) Finally, the studies of RNA biosynthesis in 
virus-infected bacterial cells are noteworthy. In a 
logarithmically growing population of bacterial cells, 
there is an exponential increase of protein, RNA, and 
DNA. At the time of virus infection, as with T2 infec- 
tion of E. coli, synthesis of bacterial components 
ceases abruptly and the cellular machinery is devoted 
completely to the synthesis of viral protein and viral 
DNA. Since the virus lacks RNA, there is (or rather,
there was thought to be) no RNA synthesis. Recent
tracer studies by Volkin and Astrachan! have shown
a burst of RNA synthesis, very small in amount
but distinctive in composition. One interpretation of
these findings is that this RNA synthesis is in re-
sponse to the unique instructions carried in the viral
DNA, and is a necessary prerequisite for the formation
of the proteins essential to virus formation. There is
here a highly simplified situation in which to study a
form of RNA biosynthesis that is relevant to the link 
between DNA and protein and that is inviting to 
biochemical study. 


DNA BIOSYNTHESIS 


To review briefly the chemical structures which serve 
as the basic building units of the DNA molecule (Fig. 2): 
Deoxyadenylate (deoxyadenosine 5’-phosphate) is com- 
posed of a purine base linked by a glycosidic bond to 
the sugar that is unique to DNA, 2-deoxyribose. It is 
in the form of a furanose ring and lacks an oxygen at 
the 2-carbon position; esterified at carbon-5 is a phos- 
phate residue. Thymidine 5’-phosphate, or thymidylate, 
is composed of a pyrimidine, thymine, linked as in the 
case of adenine to a 2-deoxyribose 5′-phosphate. The
methyl group on the 5 carbon distinguishes thymine 
from uracil, also a pyrimidine, which is a component of
ribonucleic acid and of coenzymes prominent in carbo-
hydrate metabolism. These and two other deoxy-
nucleotides (a purine one, deoxyguanylate, and a
pyrimidine one, deoxycytidylate) comprise the four de-
oxynucleotides which commonly occur in samples of
DNA from bacterial, plant, and animal cells, and from
some of the viruses. How are these units assembled?

Fig. 2. Structures of a purine and a pyrimidine deoxynucleotide (deoxyadenosine 5′-phosphate and thymidine 5′-phosphate).

Fig. 3. Schema of enzymatic synthesis of purine ribonucleotides [from J. M. Buchanan, J. G. Flaks, S. C. Hartman, B. Levenberg, L. N. Lukens, and L. Warren, Chemistry and Biology of Purines, Ciba Foundation Symposium (J. and A. Churchill, Ltd., London, 1957), p. 233].

This type of biosynthetic question has been ap-
proached in several ways. Earlier, Calvin (p. 147) and
Roberts (p. 170) described the use of isotopic tracers
to chart biosynthetic pathways. A suspected inter-
mediate containing an isotopic marker is administered
in the medium, and the cellular constituents which in-
corporate this marker are identified after brief or ex-
tended time exposures. A variation of this technique is
to label the exclusive carbon or nitrogen source of the
cell (e.g., C¹⁴ glucose or N¹⁵ ammonia), and then to
determine whether or not certain (unlabeled) com-
pounds added to the medium can reduce the specific radio-
activity of the molecules in the cell under study.


Quite another approach involves the use of a mutant 
which accumulates intermediates because of an en- 
zymatic deficiency at some point in the biosynthetic 
process. The facility with which mutants can be se-
lected for service in a particular situation has made
this technique extremely versatile and effective. These 
and other quite diverse methods have been used with 
considerable success in mapping pathways in the intact 
cell. But no combination of these methods can establish 
unequivocally a biosynthetic sequence! The walls and 
membranes of the cell form an “iron curtain” which 
either prevents certain molecules from entering the 
cell or alters them upon their entry or exit. It is a 
curtain which obscures the details to the point where a 
spur from a pathway is indistinguishable from the 
main line. Hence, a study of the enzymatic apparatus 
of the disrupted cell is an indispensable approach to 
the problem of biosynthesis. The objective of such an 
enzymatic attack on the problem is the isolation from 
cells of separate enzymes, each of which effects a single,
chemically rational step. These steps arranged in proper
sequence should lead to the total synthesis of complex
molecules. To assume real significance, the rates and 
conditions under which a reconstructed biosynthetic 
pathway operates ultimately must be reconciled with 
all of the isotopic, nutritional, and other observations 
made with the intact cells. 

Turning now to what is known about the details of 
the enzymatic synthesis of the nucleotides (Fig. 3), 
one sees an outline of the discrete steps in the synthesis 
of adenosine 5’-phosphate starting with ribose 5-phos- 
phate, glycine, glutamine, and, as an energy source, 
adenosine triphosphate (ATP). This knowledge is 
largely the result of the work of Buchanan!® and of 
Greenberg!® and their colleagues. A comparable scheme 
could be shown for the pyrimidine nucleotides. The
deoxyribonucleotides would appear, from
current indications, to derive from the ribonucleotides
by a direct reduction step.

Fig. 4. Nucleophilic attack of nicotinamide mononucleotide on ATP in DPN synthesis.

Several years ago, experimental work was begun by
our laboratory to determine the biochemical mechanism
for the replication of DNA chains. One might suggest
many permutations for how such chains might be
assembled. We were guided by the knowledge of how the
simplest and best known of the complex nucleotides, 
the coenzymes, are synthesized by the cell." For 
example, the enzymatic synthesis of diphosphopyridine 
nucleotide (DPN) (Fig. 4) involves a condensation of
ATP with nicotinamide mononucleotide and a re- 
sultant elimination of inorganic pyrophosphate (PP). 
Similarly, in the synthesis of flavin adenine dinucleotide 
(FAD), or coenzyme-A, there is a reaction of flavin 
mononucleotide or pantetheine phosphate, respectively, 
with ATP, and again inorganic pyrophosphate is elim-
inated (Fig. 5).²⁰ In each case, the adenyl coenzyme is
produced by a reaction with an activated adenyl de- 
rivative such as ATP. More recently, the enzymatic 
synthesis of the adenyl derivatives of the fatty acids
and amino acids has been shown to proceed by a
similar mechanism. What is shown in Fig. 5 to apply 
to the synthesis of adenyl derivatives applies also to 
the synthesis of uridyl, cytidyl, and guanosyl coen- 
zymes.“ Along lines proposed by Koshland,” one 
may visualize (Fig. 4) a nucleophilic attack on the
activated adenyl derivative, in the case of DPN syn-
thesis, by nicotinamide mononucleotide, to form an
anhydride bridge with the adenyl group and to displace
inorganic pyrophosphate.

1. ARPPP (ATP) + NRP (nicotinamide mononucleotide) ⇌ ARPPRN (DPN) + PP
2. ARPPP + FRP (riboflavin-P) ⇌ ARPPRF (FAD) + PP
3. ARPPP + pantetheine-P ⇌ ARPP-pantetheine + PP; ARPP-pantetheine + ATP → phosphorylated ARPP-pantetheine (CoA)
4. ARPPP + RCOOH (or RCHNH₂COOH) ⇌ the corresponding adenyl derivative of the acid + PP

Fig. 5. Synthesis of adenine-nucleotide coenzymes.

By analogy, it can be con-
jectured that chains of nucleotides might be formed 
by a reaction of the end of a chain with an activated 
nucleotidyl molecule, such as a nucleoside triphosphate. 
In other words, it is assumed that the basic building 
block is a 5’-phosphate ester of a deoxynucleoside, 
activated perhaps by linkage to a pyrophosphate group. 
As shown in Fig. 6, the end of a DNA chain might be
involved in a nucleophilic attack on a deoxynucleoside
triphosphate, thus extending the chain by one unit
and displacing pyrophosphate.

Fig. 6. Postulated mechanism for extending a DNA chain.

We approached the problem of nucleic-acid synthesis
experimentally by mixing together and incubating the
following: a deoxynucleoside labeled very intensely in
one of the carbon atoms of its base, an extract made
from an exponentially growing culture of E. coli
(generation time 20 min), ATP as an energy source,
and Mg ions. To determine whether any nucleic-acid
synthesis had taken place, one relied upon the very
simple and fortunate chemical property of DNA that
it is precipitated at acid pH, while the deoxynucleotides
which were the substrates are completely soluble under
these conditions. It was ascertained many times that
thorough washing of the DNA precipitate removed even
the smallest traces of the deoxynucleoside substrates.
Thus, in our earliest experiments, traces of radioactive
thymidine (thymine deoxyriboside) were found in-
corporated into an acid-precipitable substance that
could be rendered nonprecipitable by treatment with
purified deoxyribonuclease (DNase).'® In this crude
system, it was possible to detect 50 counts in the pre-
cipitate out of 5 000 000 counts in 1 µmole of the starting
substrate. This indicated an incorporation of only 10
µµmoles or about 1/10 000 of the amount detectable by
the most sensitive microchemical methods.

Using the incorporation of C¹⁴-thymidine into a
DNase-sensitive acid-precipitable material as the assay,
purification was begun of the components of the E. coli
extract essential for this reaction. It soon was recog-
nized that one of the enzymatic functions of the crude
fraction was the conversion of the thymidine through
thymidylate to the triphosphate level, and thymidine
triphosphate was used thereafter as the substrate.”

The enzyme-purification procedure at present” in-
volves first the preparation of an extract of E. coli by
sonic treatment, then several fractionation steps
(Table I) resulting in a purification of about 2000-fold
or more over the crude extract. With this preparation,
certain interesting features of the reaction have become
apparent and are discussed presently.

TABLE I. Enzyme purification.

                                        Units(a)   Total    Protein   Specific activity
Fraction  Step                          per ml     units    mg/ml     units/mg protein
...       ...                           ...        ...      ...       ...
...       Concentration of gel eluate   ...        ...      ...       ...
VII       DEAE resin                    120.       3600     0.60      200.-400.

(a) A unit is defined as the amount of enzyme causing the incorporation of
10 mµmoles of thymidine triphosphate into an acid-insoluble fraction during
the assay period of 30 min at 37°. See Table II for further details about the
assay.

TABLE II. Requirements for deoxynucleotide incorporation into DNA.

                                   mµmoles incorporated
Complete system(a)                 0.50
Omit dCTP, dGTP, dATP              ...
Omit dCTP                          <0.01
Omit dGTP                          ...
Omit Mg⁺⁺                          ...
Omit DNA                           <0.01
DNA pretreated with DNase          <0.01

(a) The complete system contained 5 mµmoles of TP³²PP (1.5 × 10⁶ cpm
per µmole), dATP, dCTP, and dGTP, 1 µmole of MgCl₂, 20 µmoles of
glycine buffer (pH 9.2), 10 γ of calf-thymus DNA, and 3 γ of enzyme
Fraction V, in a final volume of 0.30 ml. The incubation was carried out
at 37° for 30 min.

DNase treatment of DNA was carried out in an incubation mixture con-
taining 60 γ of thymus DNA, 50 µmoles of Tris buffer, pH 7.5, 5 µmoles of
MgCl₂, and 5 γ of pancreatic DNase, in a final volume of 0.50 ml. After
30 min at 37°, 0.02 ml of 5% bovine serum albumin and 0.05 ml of 5N
perchloric acid were added. The precipitate was centrifuged and dissolved
in dilute alkali and neutralized. A control incubation from which DNase ...

* For the triphosphates of thymidine, deoxycytidine, deoxyguanosine, and
deoxyadenosine, respectively, the following abbreviations are used: TTP,
dCTP, dGTP, and dATP. The symbol TP³²PP indicates that the triphosphate
is labeled with P³² in the innermost phosphate group.

Even at an intermediate stage of purification, it
became clear that, in order for one of the triphosphates
to be incorporated into DNA, the triphosphates of all
four of the deoxynucleosides which commonly occur in
DNA, and also DNA itself, must be present.” This is
illustrated in Table II. The incubation mixture
contained the triphosphates* of thymidine, deoxy-
cytidine, deoxyguanosine, and deoxyadenosine, calf-
thymus DNA, Mg⁺⁺, and enzyme (with the most
purified of the enzyme preparations, about 0.1 µg of
protein was used). The isotopic marker used has been
either C¹⁴ in carbon-2 of thymine or P³² in the innermost
phosphate group of one of the triphosphates. Under
these conditions, 0.5 mµmole of the labeled nucleotide
was incorporated. Omission of any one of the tri-
phosphates, of the DNA, or of Mg⁺⁺, or pretreatment
of the DNA with crystalline pancreatic DNase reduced
the reaction to a level below the limits of detectability.
Replacement of the triphosphates by the corresponding
diphosphates likewise reduced the reaction, nor was
there any detectable reaction when ribose analogs such
as ATP or CTP were used in place of dATP or dCTP,
respectively. Also, there was no reaction when DNA
was replaced by RNA or by DNA degraded by acid
treatment or sonic irradiation. However, as is men-
tioned later, DNA from a variety of sources, including
plant, animal, and virus, served in the reaction.

The incorporation of deoxynucleotide into
a DNA fraction using incubation conditions similar to
those in Table II also has been observed with the use
of sonic extracts of other bacterial species (Hemophilus
influenzae, Aerobacter aerogenes), and with extracts of
several types of animal cells (HeLa cell cultures, lymph-
gland and leukemic cells). While assays for the DNA-
synthesizing system cannot be regarded as accurate
in the crude extracts, it is significant that the values for
the bacterial extracts are of the order of 20 to 50 times 
greater than those for the animal-cell extracts. Bollum 
and Potter’! have reported the incorporation of H³-
thymidine into DNA by extracts of regenerating rat
liver. 
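
The sensitivity of the tracer assay described above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function and variable names are hypothetical, and the figures are the ones quoted in the text (50 counts recovered from substrate carrying 5 000 000 counts per µmole) and in Table II (0.50 mµmole incorporated out of 5 mµmoles of labeled triphosphate supplied).

```python
# Sketch of the tracer arithmetic behind the DNA-synthesis assay.
# All names are illustrative assumptions; only the numbers come from the text.

def micromoles_incorporated(counts_in_precipitate, counts_per_micromole):
    """Convert counts found in the acid precipitate into micromoles of substrate."""
    return counts_in_precipitate / counts_per_micromole


# Crude extract: 50 counts in the precipitate, 5 000 000 counts per micromole of substrate.
crude_micromoles = micromoles_incorporated(50, 5_000_000)
crude_micromicromoles = crude_micromoles * 1e6  # 1 micromole = 10**6 micromicromoles

# Purified system (Table II): 0.50 millimicromole incorporated out of 5 supplied.
fraction_of_label_used = 0.50 / 5.0

print(f"crude extract: {crude_micromicromoles:.1f} micromicromoles incorporated")
print(f"Table II complete system: {fraction_of_label_used:.0%} of the labeled triphosphate incorporated")
```

On these figures the crude system incorporated about one part in 100 000 of the label offered, whereas the complete system of Table II incorporated roughly a tenth of it.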

With the most purified E. coli enzyme preparation, 
it has become possible to demonstrate net synthesis 
by the use of more-direct chemical methods. Several 
experiments are illustrated in Table III. In experiment
1, there was an increase in DNA of a little over twofold, 
measured by spectrophotometry, deoxypentose assay, 
or by incorporation of radioactive tracer. In the other 
experiments, increases of DNA by a factor of 10 to 20 
were obtained, and 90 to 95%, therefore, of isolated 
DNA was derived from the deoxynucleotide substrates 
supplied. The factors responsible for cessation of the 
reaction are currently under study. 
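
The relation between the fold increase in DNA and the share of the isolated DNA that is newly made can be worked through with the optical-density figures of Table III. This is a minimal sketch with hypothetical names; it assumes that the no-enzyme control approximates the amount of primer DNA present in the complete incubation, which the text implies but does not state in these terms.

```python
# Optical-density estimates from Table III, in micromoles of nucleotide:
# experiment number -> (control without enzyme, complete system).
TABLE_III_OPTICAL_DENSITY = {
    1: (0.19, 0.46),
    2: (0.06, 0.63),
    3: (0.05, 0.58),
    4: (0.05, 0.64),
    5: (0.04, 0.89),
}


def fold_increase(control, complete):
    """Ratio of DNA found after incubation to DNA in the control."""
    return complete / control


def fraction_newly_synthesized(control, complete):
    """Share of the final DNA not accounted for by the control (assumed primer)."""
    return (complete - control) / complete


for expt, (control, complete) in sorted(TABLE_III_OPTICAL_DENSITY.items()):
    print(
        f"expt {expt}: {fold_increase(control, complete):.1f}-fold, "
        f"{fraction_newly_synthesized(control, complete):.0%} newly synthesized"
    )
```

A tenfold increase leaves one-tenth of the product as primer, hence 90% new material; a twentyfold increase corresponds to 95%.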

On the basis of the foregoing results, one can consider 
the following over-all equation for the enzymatic syn- 
thesis of DNA: 


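\[
\begin{matrix} n\,\mathrm{TPPP} \\ n\,\mathrm{dGPPP} \\ n\,\mathrm{dAPPP} \\ n\,\mathrm{dCPPP} \end{matrix}
\;+\; \mathrm{DNA} \;\rightleftharpoons\; \mathrm{DNA}\text{-}
\begin{bmatrix} \mathrm{TP} \\ \mathrm{dGP} \\ \mathrm{dAP} \\ \mathrm{dCP} \end{bmatrix}_{n}
\;+\; 4(n)\,\mathrm{PP}
\]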


The four triphosphates + DNA yield a product which
contains the four nucleotidyl residues linked in some
covalent and hydrogen-bonded fashion with the DNA.
It is apparent from the equation that inorganic pyro-
phosphate (PP) must be split out during the reaction
and in quantities equimolar to the amounts of deoxy-
nucleotide incorporated into the DNA. It might be
considered also, especially in the light of the Ochoa
studies³ of the reversibility of synthesis of ribonucleic-
acid polymers, that there might be a pyrophosphorolytic
reversal of the reaction. It has been shown!’ that, with
the incorporation of C¹⁴-thymidylate into DNA, equi-
molar amounts of PP were released; this was isolated
and identified by ion-exchange chromatography.

TABLE III. Net synthesis of DNA.(a)

Expt.                          Control
No.     Estimation             (no enzyme)    Complete    Δ
                               µmoles         µmoles      µmoles
1       P³² incorporation      0.00           0.28        0.28
        Optical density        0.19           0.46        0.27
        Deoxypentose           0.19           0.40        0.21
2       Optical density        0.06           0.63        0.57
3       Optical density        0.05           0.58        0.53
4       Optical density        0.05           0.64        0.59
5       Optical density        0.04           0.89        0.85

(a) In experiment 1, the incubation mixture (3.0 ml) contained 0.15 µmole
of dAP³²PP (1.3 × 10⁶ cpm per µmole), 0.3 µmole of dGTP, 0.15 µmole of
dCTP, 0.15 µmole of dTTP, 200 µmoles of potassium-phosphate buffer
(pH 7.4), 20 µmoles of MgCl₂, 0.1 mg of calf-thymus DNA, and 12 γ of
enzyme Fraction VII. The mixture was incubated at 37° for 180 min. DNA
was precipitated, washed, taken up in 1.2 ml of 0.5N perchloric acid, and
heated for 15 min in a boiling water bath. Optical-density measurements
were made at 260 mµ and converted to nucleotide equivalents using a
molar extinction coefficient of 8960 (derived from the calculated values for
an acid hydrolyzate of calf-thymus DNA). In the P³² estimation of DNA
synthesis, incorporation of deoxyadenylate was multiplied by a factor
based on its percentage in calf-thymus DNA. The radioactivity actually
observed for the controls did not exceed the background count. In experi-
ments 2, 3, 4, and 5, the reaction mixture (1.0 ml) contained 0.32 µmole of
each of the four triphosphates, 30 γ of calf-thymus DNA, 60 µmoles of
potassium-phosphate buffer (pH 7.4), 6 µmoles of MgCl₂, and 8 γ of enzyme
Fraction VII. The mixture was incubated for 250 min at 37°. 2M NaCl
then was added to give a final concentration of 0.2M and the mixture was
heated for 5 min at 70°. Unreacted triphosphates were removed by ex-
haustive dialysis against 0.2M NaCl. The product contained no acid-
soluble material. Optical density at 260 mµ was determined and converted
to nucleotide equivalents using a molar extinction for DNA of 6900.

TABLE IV. Physical properties of DNA.

                                                    Heated at 100°C, 15 min
                             Primer     Product     Primer      Product
Sedimentation coefficient    25         20-25       20          14
Intrinsic viscosity          50         15-35       <1          <1
Molecular weight             8×10⁶      4-6×10⁶     <1×10⁶      <0.7×10⁶

Evidence for reversal of the reaction has been ob-
tained.” When PP is present in concentrations of
2 × 10⁻³ M (i.e., about 100 times the concentration of
the deoxynucleoside triphosphates), the synthetic re-
action is inhibited by about 50%. Under such condi-
tions, when PP³² is used, its incorporation into the
terminal PP groups of the four triphosphates has been 
observed. The rate of the reaction is comparable to the 
synthetic rate. The reaction is absolutely dependent 
upon the presence of DNA. DNA degraded with DNase 
is inert. Inorganic orthophosphate fails to replace PP. 
A distinctive feature of the reaction is that triphos- 
phates are required. When they are omitted, there is 
only a small reaction, which may be interpreted as a 
very limited pyrophosphorolysis of DNA. It is neces- 
sary to have only one triphosphate to augment the 
pyrophosphorolysis reaction considerably. 

Up to this point, the discussion has been concerned 
with how a purified enzyme preparation can synthesize, 
in the presence of four deoxynucleoside triphosphates 
and of DNA, presumably acting as primer, a product 
that is insoluble at acid pH and which is nondialyzable 
and may be degraded to acid-soluble fragments by the 
action of pancreatic DNase. It has been established 
with more-sensitive techniques that even a single tri- 
phosphate may react, but to an extent so limited as to 
suggest the addition of one or very few residues to the 
ends of the DNA chains added as acceptor. 

A more detailed consideration is presented below of 
the physical and chemical nature of the enzymatic 
product, DNA, prepared under conditions where 90%
or more of the polymerized material is newly
synthesized.

Fig. 7. Sedimentation of DNA primer (calf-thymus DNA) and the enzymatically synthesized product.

In physicochemical studies, we have been able to
show that the enzymatically synthesized product has
essentially the same physical characteristics as DNA
carefully prepared from calf thymus. The sedimenta-
tion behavior (Fig. 7) of the calf-thymus and enzy-
matically synthesized samples was quite similar, al-
though the enzymatically synthesized sample showed
a greater polydispersity. The latter may be the result 
of the action of contaminating DNases in the enzyme 
preparation. The sedimentation constants in this and 
in a great many other runs have been in the neighbor- 
hood of 25 to 30. Viscosimetric determinations also 
have yielded values of 15 to 35 deciliters/gram which 
are comparable to those obtained for calf-thymus DNA 
(40 to 50); from these, average molecular weights of 
4 to 6 million for several samples of enzymatically 
synthesized DNA were calculated (Table IV). It may 
be inferred from the sedimentation and viscosity char- 
acteristics of the product that it consists of highly 
ordered rigid structures with effective volumes greater 
than would be expected from single polynucleotide 
chains with freedom of rotation at each link in the 
backbone. In support of this view was the collapse of
the macromolecular structure when the DNA was
heated for 15 min at 100°C; as with thymus DNA, the
viscosity decreased to immeasurably low levels, while
the sedimentation rate decreased only slightly. Further-
more, there was found a typical hyperchromic effect?"
upon degradation of the product with DNase. The
curves in Fig. 8 show that just as there is an increase in
optical density upon digestion of calf-thymus DNA
with pancreatic DNase, so is there a kinetically similar
increase with the enzymatic product and to the same
extent of about 30% above the starting values.

Fig. 8. Increase in ultraviolet absorption of DNA upon digestion with pancreatic DNase (percent increase plotted against time in hours; × thymus DNA, ● enzymatic product).

One now comes to a consideration of the chemical 
structure of the product. To begin with, it can be 
affirmed that the substrates are linked in the DNA 
product by typical 3’-5’ diester bonds.” What can be 
said of the base composition and the ratios of bases of 
the product? Do they bear any relation to the DNA 
primer added to the reaction? 

In Table V are the base-composition data on en- 
zymatically synthesized DNA. A close correspondence 
exists in the enzymatically synthesized DNA’s between 
the contents of adenine and thymine on the one hand 
and of guanine and cytosine on the other, so that the 
ratio of purines to pyrimidines, (A+G)/(C+T), is in each
case nearly unity. Furthermore, good agreement exists
for the ratio (A+T)/(G+C) between the enzymatic
product and its corresponding primer, ranging from 
values of 0.59 for Mycobacterium phlei to greater than 
40 for an enzymatically synthesized copolymer of de- 
oxyadenylate and thymidylate. The latter is formed by 
the DNA-synthesizing enzyme in the absence of primer 
under rather specialized, but poorly understood, con- 
ditions: specifically, after lag periods of from 3 to 6 hr. 
Once formed, such a polymer then can be replicated 
without a lag and will consist solely of adenine and 
thymine, although all four of the deoxynucleoside tri- 
phosphates are provided in the reaction mixture. This 
polymer, therefore, represents an extreme case in which 
the base composition of the enzymatic product reflects 
that of the primer.

The somewhat higher values of (A+T)/(G+C) ob-
served for some of the products most probably can be
attributed to contamination of these DNA’s with traces 
of this deoxyadenylate-thymidylate copolymer. The 
obvious implications of these early results are that the 
added DNA is serving as a template for an enzymatic 
replication of DNA, but it is evident too that more- 
extensive documentation is necessary before this con- 
clusion can be considered to be established. 
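
The two ratios used in this comparison can be recomputed from the mean base contents listed in Table V. The sketch below uses hypothetical names throughout. Because it works from the rounded means, its results may differ in the second decimal place from the ratio columns of the table, which presumably were derived from the individual analyses.

```python
# Mean base contents (A, T, G, C) from Table V for each primer and its enzymatic product.
TABLE_V = {
    "M. phlei": {"primer": (0.65, 0.66, 1.35, 1.34), "product": (0.66, 0.80, 1.17, 1.34)},
    "A. aerogenes": {"primer": (0.90, 0.90, 1.10, 1.10), "product": (1.02, 1.00, 0.97, 1.01)},
    "E. coli": {"primer": (1.00, 0.97, 0.98, 1.05), "product": (1.04, 1.00, 0.97, 0.98)},
    "Calf thymus": {"primer": (1.14, 1.05, 0.90, 0.85), "product": (1.19, 1.19, 0.81, 0.83)},
    "T2 phage": {"primer": (1.31, 1.32, 0.67, 0.70), "product": (1.33, 1.29, 0.69, 0.70)},
}


def at_to_gc(a, t, g, c):
    """Ratio (A+T)/(G+C), which is characteristic of the DNA source."""
    return (a + t) / (g + c)


def purine_to_pyrimidine(a, t, g, c):
    """Ratio (A+G)/(T+C), expected to be near unity."""
    return (a + g) / (t + c)


for source, samples in TABLE_V.items():
    for role in ("primer", "product"):
        bases = samples[role]
        print(
            f"{source:13s} {role:8s} "
            f"(A+T)/(G+C) = {at_to_gc(*bases):.2f}  "
            f"(A+G)/(T+C) = {purine_to_pyrimidine(*bases):.2f}"
        )
```

Set side by side in this way, the (A+T)/(G+C) value of each product can be compared directly with that of its primer, while (A+G)/(T+C) stays close to unity throughout.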

TABLE V. Purine and pyrimidine composition of enzymatically synthesized DNA.(a,b)

DNA                                   Analyses   A      T      G       C       (A+T)/(G+C)         (A+G)/(T+C)
M. phlei                  primer      3          0.65   0.66   1.35    1.34    0.49 (0.48-0.49)    1.01 (0.98-1.04)
                          product     3          0.66   0.80   1.17    1.34    0.59 (0.57-0.63)    0.85 (0.78-0.88)
A. aerogenes              primer      1          0.90   0.90   1.10    1.10    0.82                1.00
                          product     3          1.02   1.00   0.97    1.01    1.03 (0.96-1.13)    0.99 (0.95-1.01)
E. coli                   primer      2          1.00   0.97   0.98    1.05    0.97 (0.96-0.99)    0.98 (0.97-0.99)
                          product     2          1.04   1.00   0.97    0.98    1.02 (0.96-1.07)    1.01 (0.96-1.06)
Calf thymus               primer      2          1.14   1.05   0.90    0.85    1.25 (1.24-1.26)    1.05 (1.03-1.08)
                          product     6          1.19   1.19   0.81    0.83    1.46 (1.22-1.67)    0.99 (0.82-1.04)
T2 phage                  primer      2          1.31   1.32   0.67    0.70    1.92 (1.86-1.97)    0.98 (0.95-1.01)
                          product     2          1.33   1.29   0.69    0.70    1.90 (1.82-1.98)    1.01 (1.01-1.03)
"Synthetic A-T copolymer"             1          1.99   1.93   <0.05   <0.05   >40                 1.05

(a) A, T, G, and C refer, respectively, to adenine, thymine, guanine, and
cytosine, except that C in the case of T2-phage primer refers to
hydroxymethylcytosine.
(b) The figures in parentheses represent the range of values obtained.

In reviewing the specificity of this DNA-synthesizing
system, the results have indicated that samples of DNA
from a variety of origins can serve as primers. It has
been mentioned also that only the triphosphates of the
deoxynucleosides are reactive. What can be said of the
specificity of the substrates with respect to the structure
of the pyrimidine and purine bases? From the many
interesting reports on the incorporation of bromo-
uracil,37-** of azaguanine,*® and of other analogs into
bacterial and viral DNA, it might be surmised that
some latitude in the structures of the bases can be
tolerated provided there is no interference with their
hydrogen bonding. It would be well to reiterate at once 
what Rich mentioned earlier (p. 191). Analysis of the 
composition of samples of DNA from a great variety 
of sources and by many investigators (reviewed by 
Chargaff“) reveals the remarkable fact that the purine 
content always equals the pyrimidine content. Among 
the purines, the adenine content may differ consider- 
ably from the guanine, and among the pyrimidines, the 
thymine from the cytosine. There is an invariable 
equivalence, however, between bases with an amino 
group in the 6 position of the ring, such as in adenine 
and cytosine, and the bases with a keto group in the 6 
position of the ring, such as in guanine and thymine. 
These facts! were interpreted on the basis of hydrogen 
bonding by Watson and Crick*® in their masterful 
hypothesis on the structure of DNA. In a given species, 
the DNA composition of all of the cells is characterized 
by a distinctive ratio of the number of adenine-thymine 
pairs to the number of guanine-cytosine pairs. 
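
The two equivalences just stated are sufficient, taken together, to give the pairwise equalities on which the hydrogen-bonding interpretation rests. A short derivation follows, with A, G, T, and C denoting the molar amounts of the four bases; the algebra is added here as an illustration and is not part of the original argument.

```latex
\begin{align*}
A + G &= T + C && \text{(purines = pyrimidines)} \\
A + C &= G + T && \text{(6-amino bases = 6-keto bases)} \\
\text{Subtracting:}\quad G - C &= C - G \;\Longrightarrow\; G = C, \\
\text{and hence}\quad A &= T .
\end{align*}
```

The ratio of A to G, by contrast, is left unconstrained, which is why it can differ from one species to another.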


TABLE VI. Replacement of natural bases by analogs in enzymatic synthesis of DNA.

       Control                                    Natural base omitted
Expt   values                                     (percent of control)b
No.    (mµmoles)a  Base analog used     Thymine   Adenine   Guanine   Cytosine
1      0.50        Uracil               54        4         6
1a     0.88        Uracil                                             3
2      0.43        5-bromouracil        97        2         4
2a     0.42        5-bromouracil                                      4
3      0.51        5-bromocytosine                4         4         118
3a     0.40        5-bromocytosine      4
4      0.58        5-methylcytosine               2         3         185
4a     0.52        5-methylcytosine     2
5      0.37        Hypoxanthine                   3         25        5
5a     0.27        Hypoxanthine         4

a Control values are mµmoles of radioactive deoxynucleotide incorporated into DNA in the absence of analog. Incubation mixtures contained 5 mµmoles each of TTP, dATP, dCTP, and dGTP; 2 µmoles of MgCl2; 20 µmoles of potassium phosphate (pH 7.4); 10 µg of calf-thymus DNA; and 1 µg of enzyme fraction VII-R. Experiments were performed at 37° for 30 min. Labeled substrates were: dCP32PP in Expts. 1, 2, 5a; TP32PP in Expts. 1a, 3, 4, 5; and dGP32PP in Expts. 2a, 3a, 4a.
b The percentage value represents the fraction of the labeled substrate incorporated when the analog (5 mµmoles) was used instead of the natural base. Bases, natural or analog, were supplied as the deoxynucleoside triphosphates. Values of 5% or below are near the experimental limit and are of questionable significance.
As can be seen from Table VI, deoxyuridine triphosphate
used in place of thymidine triphosphate supported
DNA synthesis at 54% of the rate of the control
value but failed to support synthesis when used in
place of the triphosphates of deoxyadenosine,
deoxyguanosine, or deoxycytidine. 5-bromodeoxyuridine tri-

support for the base-pairing relationships in the double
helix proposed by Watson and Crick for the structure
of DNA.

The current status of knowledge of the biochemical
aspects of RNA and DNA synthesis has been sketched
in the foregoing. With respect to RNA, the enzymatic
information available is inadequate to explain the
metabolic behavior of cells and tissues. In the case of
DNA, the enzymatic studies of replication can be reconciled
with genetic phenomena, but much remains to be
clarified in the mechanism of the reaction. While the
biological implications of these problems are exciting
and pressing, the most immediate obstacles are the
limited resources available to separate and characterize
these macromolecules. New techniques are desperately
needed now in the nucleic-acid field to cope with such
problems as were solved in the protein field over the
last 50 years.

It is evident that, in considering protein synthesis,
one must think about amino acids. Figure 1 provides 
a very general summary of what is known about the 
metabolic transformation of the amino acids. The figure 
shows that several hundred separate metabolic steps 
that do not lead to protein synthesis have been recog- 
nized. Each of these steps probably requires at least one 
specific protein. These enzymes, many other enzymes, 
and still other proteins that do not exhibit specific cata- 
lytic properties, are formed by processes about which 
very little is known—these are represented by a single 
large upward arrow in Fig. 1. All of the other arrows 
represent “pathways of metabolism,” a series of step- 
wise processes. It is known that protein molecules of 
enormous complexity are synthesized by living cells at 
relatively rapid rates. Partly for this reason, there is a 
tendency to regard protein synthesis as a rapid spontane- 
ous condensation of amino acids on a template. It is 
reasonable to believe that some type of model or tem- 
plate is needed for the synthesis of a specifically organ- 
ized macromolecule composed of about 20 different 
building blocks. It is also clear that the specificity of 
proteins must ultimately be determined by genetic in- 
formation. Nevertheless, it is highly improbable that 
the amino acids are brought together simultaneously by 
a process in which there are no intermediates. Failure 
to detect intermediates (peptide or other) in protein 
synthesis—or in intracellular protein degradation (which 
is often thought to be a stepwise process)—may prob- 
ably be ascribed to inadequacies in the experimental 
approaches employed. 
A complete description of the process of protein syn- 
thesis must include the mechanism of peptide-bond 
synthesis, and also the manner in which the amino-acid 
building blocks are arranged in specific sequences. Such 
a description should also explain the processes responsi- 
ble for the formation of the specific configurations and 
linkages of the peptide chains of protein and for the 
binding to proteins of a variety of low molecular-weight 
compounds. In short, it is necessary to understand 
how essentially the same building blocks are used by 
living cells to yield such widely different structures as 
the contractile protein of muscle and the proteins of the 
blood, milk, silk, or hair. An attempt is made here to 
review some approaches to the problem. 


STUDIES ON INTACT ORGANISMS, 
TISSUES, AND CELLS 


Several general aspects of the mechanism of protein
synthesis are apparent from studies on intact organisms.
Nutritional studies on rats showed that growth was
greater when all of the dietary essential
amino acids were available to the animal at the same 
time.!? If some of the essential amino acids were fed 
several hours after the others had been fed, no growth 
or less growth was observed, as compared with animals 
that received all of the essential amino acids at the same 
time. The necessity for the simultaneous availability of 
the amino acids for growth leads to at least two impor- 
tant conclusions concerning protein synthesis. First, it 
is evident that the amino acids are not stored in the 
body to an appreciable extent, but are removed by 
degradative reactions or by excretion. Second, it is clear 
that protein synthesis takes place relatively rapidly. If 
an adult animal is deprived of even one of the essential 
building blocks of proteins, there is prompt loss of ap- 
petite, decrease of dietary intake, and development of 
negative nitrogen balance, with eventual loss of weight. 
These effects are promptly reversed by administration 
of the missing amino acid.’ These observations suggest 
that there is continual synthesis of protein in an animal 
that is not growing, and also that, in such an animal, 
there must be a balance between protein breakdown 
and synthesis. 

Studies in which isotopic amino acids were adminis- 
tered to animals demonstrated that such amino acids 
were incorporated into the tissue proteins of growing 
and nongrowing animals.‘ These experiments proved the 
existence of an over-all dynamic equilibrium in which 
there was a continuous synthesis and breakdown of 
body protein, and made it possible to determine rates 
of turnover. Thus, it was observed that certain proteins 
(e.g., those of intestinal mucosa, liver, kidney, spleen) 
turn over rapidly, while in others (skin, muscle, brain)
the turnover of protein amino acids is relatively slow. 
In a recent study, Thompson and Ballou administered 
tritium oxide to rats daily from conception to 6 months
of age; the animals were then sacrificed at various times
over a 300-day period. Also, tritium oxide was administered
for 124 days to a group of mature rats, which were
then sacrificed at various times over a period of 360 
days. It was found that components with half-lives of 
several days constituted a very small proportion of the 
total animal, and that about half of the body com- 
ponents exhibited half-lives of greater than 100 days. 
About three-quarters of the collagen fraction exhibited 
an apparent half-life of 1000 days, in general agreement 
with other studies. Collagen and the muscle proteins 
represent a major fraction of the total body proteins; 
therefore, a relatively small fraction of the body protein 
is extensively involved in the dynamic state. However, 
this fraction includes proteins that are being degraded 
and synthesized at remarkably rapid rates. The ob- 
served turnover phenomena may be explained in part in 
terms of cell destruction and cellular protein secretion. 
Thus, the intracellular protein molecules may be stable 
until they are secreted by the cell or until the cell is 
destroyed. The latter interpretation applies to the hemo- 
globin of the red blood cells, which does not turn over 
until cellular destruction occurs.7 
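
To give a sense of scale for the half-lives quoted above, the following sketch computes the fraction of a labeled component that would remain after a given interval. It assumes simple first-order (exponential) loss of a single homogeneous component, which is an idealization introduced here for illustration; the function names are likewise assumptions of this example.

```python
import math


def fraction_remaining(elapsed_days, half_life_days):
    """Fraction of label left after elapsed_days, assuming first-order loss."""
    return 0.5 ** (elapsed_days / half_life_days)


def rate_constant_per_day(half_life_days):
    """First-order rate constant (per day) equivalent to a given half-life."""
    return math.log(2) / half_life_days


if __name__ == "__main__":
    # 360 days is the observation period used for the mature rats.
    for half_life in (100, 1000):
        left = fraction_remaining(360, half_life)
        print(f"half-life {half_life} d: {left:.2f} of the label left after 360 d")
```

Under this assumption, a component with a half-life of 1000 days would retain roughly three-quarters of its label over the 360-day observation period, whereas one with a half-life of 100 days would retain less than one-tenth.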

In E. coli cells synthesizing adaptive enzymes, the
new proteins are formed almost entirely from free amino
acid in the medium, and there is practically no utilization
of the amino acids of the preformed proteins.3—”
Synthesis of induced enzymes in P. saccharophila, however,
was accompanied by some utilization of preformed
protein in growing cells, and a larger proportion was
used in resting cells. In nongrowing E. coli cells, considerable
(4 to 5% per hour) degradation and synthesis
of protein has been demonstrated.” It appears that, in
growing E. coli cells, the rate of protein synthesis is very
rapid as compared with the rate of degradation and,
therefore, that the synthesis of new protein does not
involve the utilization of significant quantities of amino
acids derived from the degradation of other cellular
proteins. However, in E. coli cells that are not carrying
out net synthesis of protein, considerable degradation
and synthesis of protein can be demonstrated.
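
As a rough worked figure, suppose that the quoted 4 to 5% per hour can be treated as a first-order rate constant k. This is an assumption made here for illustration and is not a claim of the studies cited. The corresponding half-life of protein in nongrowing cells would then be well under one day:

```latex
\[
t_{1/2} = \frac{\ln 2}{k}, \qquad
k = 0.04\text{--}0.05\ \mathrm{h^{-1}}
\;\Longrightarrow\;
t_{1/2} \approx \frac{0.693}{0.05}\ \text{to}\ \frac{0.693}{0.04}\ \mathrm{h}
\approx 14\ \text{to}\ 17\ \mathrm{h}.
\]
```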


In studies on Ehrlich’s ascites carcinoma cells, 
Moldave! demonstrated an uptake of labeled amino 
acids into the cellular proteins. Incubation of labeled 
cells in a medium containing C-amino acids resulted 
in intracellular release of radioactive amino acids with- 
out concomitant net loss of protein. Similar results were 
obtained by Piez and Eagle in experiments on HeLa 
cells grown in tissue culture. It was shown that there 
was no appreciable reutilization of degraded cells or of 
secreted proteins. The available data do not indicate 
whether all of the individual protein molecules of the 
cell turn over or whether they turn over at similar rates. 
It is also difficult to establish or to exclude the existence
of exchange, i.e., replacement, of a single amino acid or
a group of amino acids of a peptide chain without complete
synthesis of the protein. Such exchange might occur
at an intermediate stage of protein synthesis. Exchange 
of an amino acid with an amino-acid residue in the 
interior of a protein would require opening two peptide 
bonds and a mechanism that would bring the two por- 
tions of the peptide chain together after acceptance of 
the new amino-acid molecule. That some of the steps of 
protein synthesis may be reversible must be considered 
in connection with the turnover and exchange phenomena 
considered above. Studies by Simpson'® showed that the 
release of amino acids from rat-liver slices is depressed 
by conditions (anaerobiosis, 2,4-dinitrophenol, cyanide) 
that inhibit utilization of energy. Amino-acid incorpora- 
tion is also reduced under these conditions. The proc- 
esses of degradation and synthesis may be closely related 
and could perhaps represent separate aspects of the 
same mechanism. 

It is evident that information concerning the chemical 
reactions leading to protein synthesis will be needed to 
interpret the observed exchange phenomena. However, 
experimental work bearing on this question has arisen 
from studies designed to determine whether protein 
synthesis takes place by a stepwise process (in which 
free peptides or other peptide intermediates are formed), 
or by a mechanism in which the amino acids are arranged 
at one time on a template corresponding to the sequence 
of the protein to be synthesized and then dislodged from 
the template in a single step. The “template” idea is 
attractive; it provides a mechanism for a specific amino-acid
sequence and is consistent with the “all-or-none”
aspects of nutritional studies. Experimental studies relating
to this problem fall in two categories: those which
show that labeled amino acids are incorporated to the 
same extent into different positions of the peptide chains 
of proteins, and those which show that the labeling is 
unequal, i.e., that the specific activities of amino acids 
obtained from different portions of the protein molecule 
are not the same. The studies of Muir et al." fall into 
the first category. They found that the N-terminal 
valine and the total of the nonterminal valine residues 
of the hemoglobin of rats previously injected with 
C14-valine exhibited the same specific activity. Askonas
et al. injected a lactating goat with labeled amino acids 
and found that the specific activities of amino acids ob- 
tained from different portions of the casein molecule 
were, within experimental error, the same. On the other 
hand, Steinberg et al.!° labeled the ovalbumin of hen’s 
oviduct by incubating this tissue with C14O2, and found
that the specific activity of the aspartate of the hexapeptide
(cleaved by enzymatic conversion of the ovalbumin
to plakalbumin) was significantly greater than
that of the plakalbumin. Similar results were obtained
in in vivo experiments, and with other proteins. The 
finding of unequal labeling is consistent with the exist- 
ence of peptide intermediates, but other interpretations 
are not excluded. Uniform labeling would be the expected 
result if amino acids were added at one time to a template.
Uniform labeling might result with peptide intermediates
also, if these intermediates equilibrated rapidly
with the available free amino acids. The recent discovery 
of separate amino-acid activating enzymes and ribo- 
nucleic-acid acceptors (see p. 217) makes it possible to 
conceive of mechanisms by which amino acids may be 
incorporated into different types of intermediates for 
protein synthesis. On the other hand, it is possible that 
templates exist for certain portions of protein molecules 
and that, in the final stages of protein synthesis, one or 
more large peptides prepared on different templates are 
combined. For example, the A- and B-peptide chains of 
insulin might require separate templates, and, if the two 
chains were not synthesized at the same rates, or if the 
templates were filled by amino acids from different 
pools, unequal labeling of the final molecule would 
result. Yet, in studies in which insulin was synthesized 
by calf-pancreas slices in the presence of labeled amino 
acids, unequal labeling within both of the A- and B- 
chains was observed; also, unequal labeling of ribo- 
nuclease was observed. 

The finding of unequal labeling is clearly more sig- 
nificant than that of uniform labeling. Although unequal 
labeling might occur if a newly synthesized, uniformly 
labeled protein participated in subsequent “exchange”
reactions, the evidence indicates that unequal labeling 
is observed in studies where the interval of time between 
introduction of free, labeled amino acid and isolation of 
the protein is relatively short. With longer time inter- 
vals, in the same system, labeling becomes uniform. 
Although the available data do not permit unequivocal 
conclusions, it is clear that the unequal labeling phe- 
nomenon, which has now been observed with several 
proteins, must be taken into account in any complete 
explanation of the process of protein synthesis. 

As discussed above, the studies in which labeled 
amino acids were administered to intact animals re- 
vealed that relatively high concentrations of isotope 
were found in those organs known to be capable of very 
active protein synthesis, such as the liver and pancreas. 
When these organs were separated by differential cen- 
trifugation into subcellular fractions (nuclei, mitochon- 
dria, microsomes), the highest concentrations of isotope 
were found in the microsomal fraction. Thus, Hultin 
found that injected N15-glycine was taken up most
rapidly by the microsomal fraction of the liver of 
chicks.2° Similar results were observed in experiments 
with mammalian tissues.”!.* The concept has, therefore,
developed that the microsomal particles of these tissues
represent the most active fraction of the cell in protein
synthesis. The microsome fraction (as usually obtained
by differential centrifugation of liver homogenates) includes
ribonucleoprotein particles which are attached
to membranous material, and perhaps also to other
cellular matter. When the microsomal particles are 
treated with sodium deoxycholate, most of the lipid and 
lipoprotein portions of the microsome are removed, 
leaving insoluble ribonucleoprotein particles which contain
about equal weights of protein and ribonucleic acid.
These particles contain most of the ribonucleic acid of 
the microsomal fraction and about one-sixth of the total 
protein of this fraction. When such particles were iso- 
lated from the livers of animals previously injected with 
labeled amino acids, they were found to have incorpo- 
rated isotope more rapidly than the lipoprotein portion 
of the microsome and the other subcellular fractions.*® 
Similar findings have been made in different labora- 
tories, on several tissues and in various species. The 
findings, which are remarkably similar, lead to the con- 
clusion that the most rapid uptake of amino acids takes 
place in the ribonucleoprotein particle of the microsomal 
fraction. However, as indicated below, mitochondria 
and nuclei also incorporate amino acids. 

A close relationship between nucleic acids and protein 
synthesis has become increasingly evident.??"5 It has 
been suggested that nucleic-acid synthesis must pro- 
ceed concomitantly with the synthesis of protein. This 
conclusion is based on experiments which have shown a 
close correlation between the two synthetic processes (cf. 
Spiegelman et al.®). Nevertheless, it must be emphasized
that it is not proven that the two syntheses are inter- 
dependent, and there is evidence inconsistent with this 
hypothesis. For example, in studies on phosphate- 
starved yeast, synthesis of protein was observed, whereas 
there was no synthesis of RNA.*® Synthesis of amylase 
in pigeon-pancreas slices was not accompanied by net 
RNA synthesis.*! In view of these findings, it is difficult 
to accept the view that protein and nucleic-acid syn- 
thesis are obligatorily linked, although there is much 
evidence for the belief that RNA plays a role in protein 
synthesis. 


CELL-FREE SYSTEMS 


Although important concepts concerning protein syn- 
thesis have arisen from studies on intact organisms and 
tissue slices, a detailed explanation of protein synthesis 
requires isolation and study of the cellular catalytic 
components. Several approaches have been made to the 
study of protein synthesis in cell-free systems. One of
these concerns enzyme systems, initially recognized be- 
cause of their ability to catalyze hydrolysis of peptide 
bonds. Another line of investigation has been directed 
toward study of certain “model” systems that synthe- 
size peptide or pseudo-peptide bonds. Finally, attempts 
have been made to study incorporation of isotopically 
labeled amino acids into proteins in cell-free systems. 
This work has been facilitated by the studies on model 
systems and by information obtained from cytological 
investigations. 


Formation of New Peptide Bonds Catalyzed by 
“Hydrolytic” Enzymes


The ability of proteolytic enzymes to catalyze the
synthesis of peptide bonds was observed first more than
60 years ago. In these early experiments and in later
ones, partial hydrolyzates of proteins were incubated
with proteolytic enzymes; the formation of insoluble, 
high molecular-weight (2000 to 400000) polymers 
(plasteins) occurred.’ Although the free energy change 
associated with the formation of plasteins must be rela- 
tively small, the hydrolysis of a dipeptide to free amino 
acids proceeds spontaneously and virtually to comple- 
tion. Significant reversal of such a reaction could proba- 
bly not occur at the concentrations of amino acids 
usually present in the cell, unless a mechanism existed 
for continuous removal of the peptide from solution. 
The free energy change for the formation of glycylgly- 
cylglycylglycine from glycylglycine is about half of that 
for the formation of glycylglycine from glycine.** Thus, 
it appears that the free energy change for the formation 
of small peptides from amino acids is much greater than 
that for condensation of peptides to form larger pep- 
tides. If this type of reaction plays a role in the synthesis 
of protein, it would be expected to be significant later 
in the process when relatively large molecules have been 
formed by other reactions. 
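
The argument above can be made concrete by converting a free energy change into an equilibrium concentration. The sketch below is an illustration only. The value of +3.0 kcal per mole for dipeptide formation and the 1 mM amino-acid concentration are hypothetical round numbers chosen for this example and are not taken from the text. The temperature of 37° and the treatment of water as having unit activity are likewise assumptions.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal per mole per kelvin


def equilibrium_constant(delta_g_kcal, temp_k=310.15):
    """K = exp(-dG/RT) for a reaction with standard free energy change dG."""
    return math.exp(-delta_g_kcal / (R_KCAL * temp_k))


def dipeptide_at_equilibrium(amino_acid_molar, delta_g_kcal, temp_k=310.15):
    """Equilibrium dipeptide concentration for 2 AA <-> dipeptide + H2O.

    Water activity is taken as 1, so K = [dipeptide] / [AA]^2.
    """
    return equilibrium_constant(delta_g_kcal, temp_k) * amino_acid_molar ** 2


if __name__ == "__main__":
    dg_small = 3.0             # hypothetical: amino acids -> dipeptide
    dg_large = dg_small / 2.0  # "about half" for peptide + peptide, per the text
    print("K (amino acids -> dipeptide):", equilibrium_constant(dg_small))
    print("K (peptide + peptide):       ", equilibrium_constant(dg_large))
    print("dipeptide at 1 mM amino acid:", dipeptide_at_equilibrium(1e-3, dg_small), "M")
```

With these illustrative numbers the equilibrium dipeptide concentration is many orders of magnitude below that of the free amino acids. Halving the free energy change raises the equilibrium constant to its square root, which for constants below unity is a considerably larger value and is consistent with the suggestion that condensation is more favorable between larger peptides.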

The recognition that proteolytic enzymes could cata- 
lyze replacement reactions and thus form new peptide 
bonds has led to a number of studies on transpeptidation 
reactions.*°~*7 Fruton and collaborators*®** discovered 
that cathepsin-C (which hydrolyzes certain dipeptides 
at pH values near 5) catalyzes polymerization reactions 
at pH 7.5. Thus, glycyl-L-phenylalaninamide reacts to
form an octapeptide amide as follows: 


gly-phe-NH2 + gly-phe-NH2 → gly-phe-gly-phe-NH2 + NH3

gly-phe-gly-phe-NH2 + gly-phe-NH2 → gly-phe-gly-phe-gly-phe-NH2 + NH3

gly-phe-gly-phe-gly-phe-NH2 + gly-phe-NH2 → gly-phe-gly-phe-gly-phe-gly-phe-NH2 + NH3


The reaction is similar in some respects to the formation 
of amylose from glucose-1-phosphate and also to the 
enzymatic synthesis of polynucleotides from nucleoside 
diphosphates. 
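
The stepwise character of this polymerization can be followed with a small bookkeeping sketch. It tracks only the composition of the successive products and the ammonia released. The function name and its parameters are assumptions of this example, and nothing about the enzymatic mechanism itself is modeled.

```python
def cathepsin_c_polymerization(unit=("gly", "phe"), final_units=4):
    """Return the successive amide products and the total NH3 released.

    Each step adds one gly-phe unit to the growing peptide amide and
    liberates one molecule of ammonia, as in the scheme above.
    """
    if final_units < 2:
        raise ValueError("at least two units are needed for one transfer step")
    chain = list(unit)
    products = []
    ammonia_released = 0
    for _ in range(final_units - 1):
        chain.extend(unit)       # one more gly-phe unit joins the chain
        ammonia_released += 1    # one NH3 is liberated per new peptide bond
        products.append("-".join(chain) + "-NH2")
    return products, ammonia_released


if __name__ == "__main__":
    products, nh3 = cathepsin_c_polymerization()
    for product in products:
        print(product)
    print("NH3 released:", nh3)
```

Three transfer steps, each releasing one molecule of ammonia, take the dipeptide amide to the octapeptide amide.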

γ-Glutamyl transpeptidation reactions may be involved
in the synthesis of the polyglutamic acid produced
by certain bacteria.**~® An enzyme preparation
from B. subtilis catalyzes a reaction in which the
γ-glutamyl group of glutamine is transferred to D-glutamic
acid or to α-D-glutamyl-D-glutamic acid to yield di- and
tri-peptides. Continuation of this process could yield 
polyglutamic-acid molecules of considerable size. Ac- 
cordingly, the glutamine-synthesis system and the 
transpeptidation enzyme would be the major catalytic 
components required for the synthesis of polyglutamic 
acid from glutamic acid. 


It has recently been shown that transpeptidation reactions
involving proteins can take place.“ Thus, when
insulin and C14-glycyl-L-tyrosinamide were incubated
with cathepsin-C, labeled insulin could be recovered,
and it was found that the labeling occurred by substitution
of an N-terminal amino-acid residue (probably
mainly the α-amino group of the N-terminal glycyl
residue). Similarly, incubation of rat-liver mitochondria
with C14-tyrosinamide resulted in incorporation of isotope
into the mitochondria.


Incorporation of Amino Acids into Proteins 


It was shown about ten years ago that homogenates 
of animal tissues could incorporate labeled L-amino 
acids into proteins.” The general procedure employed 
involved incubation of a tissue preparation with a 
labeled amino acid followed by precipitation of the pro- 
tein with trichloroacetic acid. Subsequent treatment of 
the precipitate was designed to remove nonprotein com- 
ponents and labeled amino acid not bound to protein. 
It is obvious that this approach to the study of protein 
synthesis has a number of inherent difficulties. For ex- 
ample, types of binding other than those involving pep- 
tide linkage may occur. Furthermore, the relatively 
small quantities of isotope incorporated and the limited 
amounts of labeled material available are factors that 
have thus far prevented isolation of pure proteins. It 
was subsequently recognized that incorporation was in- 
hibited by anaerobiosis and by inhibitors of oxidation 
and phosphorylation, and that ATP could supply the 
energy for amino-acid incorporation.“—® As in the in 
vivo experiments, incorporation studies with liver homog- 
enates led to greater labeling of the microsomes than 
of the other subcellular fractions.* Incorporation of 
labeled amino acids into protein was shown to take place 
in a system obtained from rat-liver homogenates con- 
sisting of a microsome-rich fraction, a soluble non- 
dialyzable fraction, and an ATP-generating system.*® 
The incorporated amino acids were not released from 
the microsomes by subsequent incubation with unlabeled 
amino acids. When the soluble nondialyzable fraction 
required for the incorporation of amino acids into micro- 
somes was precipitated at pH 5, it was less active in 
incorporating amino acids, but activity was restored by 
the addition of either guanosine diphosphate or guano- 
sine triphosphate.“ These studies were extended to 
rat-hepatoma and mouse-ascites tumor cells.‘8 It is of 
interest that the microsomes obtained from one source 
were labeled by incubation in a system containing 
soluble fraction obtained from another. The incorporation
of amino acids was additive rather than competitive;
the incorporation of one amino acid was not stimulated
by the addition of 17 other amino acids. It seems
probable that the preparations employed contain a pool 
of amino acids, perhaps complete, although this does 
not seem to have been specifically investigated. 

Although there can be little doubt that the microsomes
actively incorporate amino acids, it appears that
other subcellular fractions can also carry out this activ- 
ity. Thus, Mirsky and collaborators*° observed in- 
corporation of amino acids into isolated calf-thymus 
nuclei. In certain respects, incorporation was similar to 
that observed with microsomal preparations. Thus, 
anaerobiosis and dinitrophenol inhibited incorporation. 
L-Alanine was incorporated, whereas D-alanine was not.
Nuclei labeled by incubation with a C14-amino acid did
not lose significant radioactivity on subsequent incubation
in a medium containing the corresponding unlabeled
amino acid. Incubation with a complement of
L-amino acids did not stimulate C14-alanine incorporation.
Treatment of the nuclei with deoxyribonuclease
led to reduced incorporation; the reduction in incorporation
did, in fact, parallel the removal of DNA. The
decrease in incorporation observed after removal of the 
DNA appears to be associated with a decrease in the 
ability of the nuclei to synthesize ATP; thus, recent 
work has shown that addition of polynucleotides re- 
stores ATP synthesis.*! Separation of the nuclei into 
different fractions was carried out after incorporation, 
and the greatest specific activity was found in a non- 
histone protein closely associated with the DNA; the 
incorporation into histones was relatively low. A ribo- 
nucleoprotein complex that was easily extractable in 
pH 7.1 buffer was also highly labeled. 

Incorporation of amino acids into the ribonucleo- 
protein particles of plants has been observed.5 The 
features of the particulate amino-acid incorporating 
system of plants are similar to those of the animal- 
tissue systems. 

Gale and collaborators have carried out an extensive 
study of amino-acid incorporation in disrupted bacterial 
cells.°*—®> In the presence of the necessary amino acids, 
development of certain enzyme activities in disrupted 
cell preparations of Staphylococcus aureus has been re- 
ported. Gale has concluded that when disrupted prepa- 
rations of Staphylococcus aureus are incubated with a 
single amino acid and a source of energy (ATP and 
hexose diphosphate), incorporation occurs by an ex- 
change between the added amino acid and protein- 
bound amino acids. In these experiments, incubation of 
the labeled preparation with the corresponding un- 
labeled amino acid and a source of energy led to release 
of the radioactive amino acid from the disrupted cell 
preparation. Removal of nucleic acid from the disrupted 
cell preparations reduced the extent of incorporation. 
Reactivation of the system was accomplished by the 
addition of DNA or RNA. Digestion of staphylococcal
RNA by ribonuclease was reported to increase the
ability of the RNA preparation to stimulate amino-acid
incorporation. A number of active components were
obtained from ribonuclease digests of staphylococcal or
[...]. The nature of these factors is not yet established,
but it appears that certain of them are of
low molecular weight and can be separated from
polynucleotide material.

Simpson et al.56,57 found that the rate of amino-acid
incorporation into the mitochondria of muscle was of 
the same order of magnitude as that into the micro- 
somes, and that isolated muscle and liver mitochondria 
were able to incorporate amino acids into protein.58
Incorporation into liver mitochondria required ATP, 
and was increased by addition of magnesium ions and a 
soluble fraction obtained from rat liver. The mito- 
chondrial-incorporation system appears similar to those 
for incorporation of amino acids into microsomes and 
thymus nuclei. One major and somewhat surprising 
difference, however, is that the incorporation of amino 
acids is increased by treatment of the mitochondria with 
ribonuclease. Inhibition by the nucleases is a charac- 
teristic feature of the incorporation systems of nuclei 
and microsomes. 

Ochoa and Beljanski®® have reported that a particu- 
late fraction of sonically disrupted Alcaligenes faecalis 
incorporated C14-L-amino acids into protein. Incorporation
was dependent upon oxidative phosphorylation for
the generation of ATP; however, the system was reported
not to contain amino-acid activating enzymes
(see below) as measured by the absence of
amino-acid-stimulated pyrophosphate incorporation into ATP. The
incorporation of one C14-amino acid was stimulated by
the addition of 18 other amino acids. They also observed 
small net increases of protein-nitrogen. Incorporation 
was decreased by treatment of the particles with M 
sodium chloride; reactivation could be obtained by 
addition of RNA or of a soluble enzyme (“amino-acid 
incorporation enzyme”) that was purified from the 
supernatant. Incorporation was not reversible, nor was 
it inhibited by chloramphenicol. It is of considerable 
interest that incorporation did not appear to require 
activating enzymes of the type found in other systems. 

The studies on disrupted bacterial systems have led 
to results somewhat different from those observed with 
the animal microsomal, mitochondrial, and nuclear 
systems. The major differences include the reversibility 
of incorporation of amino acids, and the observation of
net protein synthesis. These phenomena are, of course,
characteristic of intact cell systems. That some of the 
incorporation observed in studies with bacteria may 
represent synthesis of cell-wall material is suggested 
by recent work. Mandelstam and Rogers® incubated 
Staphylococcus aureus cells with radioactive glycine, 
glutamic acid, or lysine in the presence of chlorampheni- 
col. The total precipitate obtained after treatment with 
trichloroacetic acid or organic solvents, and (in separate
experiments) the cell-wall material and protein were
isolated. The data obtained indicate that the amino
acids were incorporated into the cell-wall material, and
that the “protein synthesis” observed in the total precipitate was, in fact, an increase in the mass of the cell
wall. Incorporation into cell-wall material is not significantly affected by chloramphenicol; some incorporation into preparations of disrupted staphylococci can
also occur in the presence of chloramphenicol. It is obviously important to determine how much of the observed incorporation by bacterial preparations represents cell-wall or cell-membrane synthesis.


Possible Intermediates in Protein Synthesis 


Clues to the mode of formation of the peptide bonds 
of proteins might be expected to come from studies of 
the synthesis of smaller and chemically characterized 
compounds possessing peptide or pseudo-peptide link- 
ages, e.g., benzoylglycine (hippuric acid), glutathione, 
glutamine, pantothenic acid. The major outlines of the 
enzymatic reactions leading to the synthesis of these 
compounds are now known. Study of these and of simi- 
lar systems has contributed significantly to the under- 
standing of amino-acid activation and the intermediates 
in protein synthesis. It may, therefore, be valuable to 
review briefly several activation phenomena leading to 
formation of peptide or pseudopeptide bonds. 

Studies on the enzymatic synthesis of glutathione 
demonstrated that the synthesis from the three amino- 
acid components takes place by a stepwise process :%: 


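$$\text{L-glutamic acid} + \text{L-cysteine} + \text{ATP} \overset{\mathrm{Mg^{++}}}{\rightleftharpoons} \text{L-}\gamma\text{-glutamylcysteine} + \text{ADP} + \mathrm{P_i}, \tag{1}$$

$$\text{L-}\gamma\text{-glutamylcysteine} + \text{glycine} + \text{ATP} \overset{\mathrm{Mg^{++}}}{\rightleftharpoons} \text{L-glutathione} + \text{ADP} + \mathrm{P_i}. \tag{2}$$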


Both reactions have been separately demonstrated. As 
yet, there is no evidence for the formation of a free 
intermediate in either reaction. 

The first reaction in glutathione synthesis is similar to 
that catalyzed by the glutamine synthesis enzyme :*~” 


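$$\text{glutamic acid} + \mathrm{NH_3} + \text{ATP} \overset{\mathrm{Mg^{++}}}{\rightleftharpoons} \text{glutamine} + \text{ADP} + \mathrm{P_i}. \tag{3}$$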


The purified glutamine synthesis enzyme also catalyzes
synthesis of γ-glutamylhydroxamic acid when hydroxylamine is substituted for ammonia, as well as a
transfer reaction which requires Mg++ (or Mn++) and
catalytic quantities of ADP and inorganic phosphate:


$$\text{glutamine} + \mathrm{NH_2OH} \rightleftharpoons \gamma\text{-glutamylhydroxamic acid} + \mathrm{NH_3}. \tag{4}$$


No free intermediates have been isolated. Incubation
of O18-labeled glutamic acid in this system leads to the
formation of O18-inorganic phosphate and glutamine in
stoichiometric quantities.”:” It is possible that the
transfer of O18 occurs via γ-glutamylphosphate, or
perhaps by a more complex mechanism. Experiments
with synthetic γ-glutamylphosphate revealed that, although this compound reacted with very low concentrations of ammonia to form glutamine, there was no
significant acceleration of the rate of glutamine synthesis in the presence of enzyme.” The possibility remains that enzyme-bound γ-glutamylphosphate is the
active intermediate, according to the following scheme:


$$\text{enzyme} + \text{glutamate} + \text{ATP} \rightleftharpoons \text{enzyme-}\gamma\text{-glutamyl phosphate} + \text{ADP}, \tag{5}$$


$$\text{enzyme-}\gamma\text{-glutamyl phosphate} + \mathrm{NH_3} \rightleftharpoons \text{enzyme} + \text{glutamine} + \text{phosphate}. \tag{6}$$


Although the glutamine-synthesis enzyme catalyzes
a transfer reaction [reaction (4)], there is thus far no
evidence for the existence of a natural acceptor. Several
other enzymes catalyze transfer reactions involving the
γ-carboxyl group of glutamic acid, e.g., γ-glutamyl
transpeptidases that act on glutathione,®? various glutamine transferases.: With the possible exception of
the γ-glutamyl transpeptidase of B. subtilis 5—0 there
is as yet no clue as to the physiological roles of these
enzymes that act on the ω-amide or ω-carboxyl groups
of the dicarboxylic amino acids. There is no evidence
for γ-glutamyl or β-aspartyl linkages in proteins, although this type of linkage occurs in certain peptide
antibiotics.” It is conceivable that activation of the
dicarboxylic acids for protein synthesis occurs by primary attack on the ω-carboxyl group followed by rearrangement to an α-carboxyl-activated molecule. This
type of rearrangement has been observed nonenzymatically with derivatives of glutamic acid and aspartic
acid.” Another possible mechanism for dicarboxylic
amino-acid activation is cyclic anhydride formation.
Such derivatives might react enzymatically to give α-carboxyl substituted products; similar reactions occur
nonenzymatically.76

In contrast to the activation of glutamic acid and of
γ-glutamylcysteine, which are associated with a cleavage of ATP to ADP, the synthesis of pantothenic acid77
from pantoic acid and β-alanine and of benzoylglycine
from benzoic acid and glycine involves formation of
pyrophosphate from ATP. The enzyme system responsible for pantothenic-acid synthesis catalyzes an
incorporation of PP into ATP in the presence of pantoic
acid. A similar exchange is catalyzed by the acetate-activating enzyme:


$$\text{Acetate} + \text{ATP} + \text{CoA} \rightleftharpoons \text{acetyl CoA} + \text{AMP} + \text{PP}. \tag{7}$$


Evidence for the existence of an acyl-adenylate intermediate was first obtained by Berg78,79 in studies on the
acetate-activating system. Thus, it was shown that
addition of synthetic acetyl adenylate to the enzyme
system could yield either ATP or acetyl CoA:


$$\text{Acetate} + \text{ATP} \rightleftharpoons \text{acetyl AMP} + \text{PP}, \tag{8}$$


$$\text{Acetyl AMP} + \text{CoASH} \rightleftharpoons \text{acetyl CoA} + \text{AMP}. \tag{9}$$
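
As a consistency check on this scheme, reactions (8) and (9) can be added and the acetyl AMP that appears on both sides cancelled; the result is the overall reaction (7). The worked sum below is only bookkeeping on the equations as written, taking CoASH and CoA to denote the same compound. It says nothing about whether the intermediate is free or enzyme-bound.

```latex
\begin{align*}
\text{Acetate} + \text{ATP} &\rightleftharpoons \text{acetyl AMP} + \text{PP} && (8)\\
\text{Acetyl AMP} + \text{CoASH} &\rightleftharpoons \text{acetyl CoA} + \text{AMP} && (9)\\
\intertext{Sum, with acetyl AMP cancelling:}
\text{Acetate} + \text{ATP} + \text{CoASH} &\rightleftharpoons \text{acetyl CoA} + \text{AMP} + \text{PP} && (7)
\end{align*}
```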


Evidence suggesting participation of acyl adenylates in
activation of fatty acids was subsequently obtained.80
Earlier, it had been shown by Chantrenne81 that CoA
was required for benzoylglycine formation. The synthesis of benzoylglycine appears to take place as follows:82-84


$$\text{Benzoic acid} + \text{ATP} \rightleftharpoons \text{benzoyl AMP} + \text{PP}, \tag{10}$$

$$\text{Benzoyl AMP} + \text{CoA-SH} \rightleftharpoons \text{benzoyl CoA} + \text{AMP}, \tag{11}$$

$$\text{Benzoyl CoA} + \text{glycine} \rightleftharpoons \text{benzoylglycine} + \text{CoA-SH}. \tag{12}$$


Although formation of benzoyl AMP was not demon- 
strated, when added to the system it was active (in the 
presence but not in the absence of CoA-SH) in forming 
benzoylglycine in human liver and kidney preparations. 
The synthesis of phenylacetyl-L-glutamine by human
tissues appears to occur by an analogous reaction.85,86
It thus appears that acyl adenylates are intermediates 
in the enzymatic activation of acetate and fatty acids, 
and in the synthesis of benzoylglycine and phenylacetyl- 
glutamine. The evidence for these intermediates is in- 
complete, however, for the enzymatic formation of these 
acyl adenylates has not been demonstrated. It is proba- 
ble that the actual intermediates are enzyme-bound 
and, therefore, that the number of molecules of the 
intermediate present at a given time is no greater than 
the number of acceptor sites on the enzyme. 

Amino-acid dependent, enzymatic PP-ATP exchange
was independently observed by Hoagland et al.87,88 and
by Berg,’*.”-® and several features of the reaction suggested that the activated amino-acid intermediate was
an aminoacyl adenylate (Fig. 2). The phenomenon has
now been observed and studied in a number of laboratories; several specific amino-acid activating enzymes have been purified, and evidence has been obtained for the formation of an enzyme-bound aminoacyl
adenylate intermediate. The general reaction may be
written as follows:


$$\text{Amino acid} + \text{enzyme} + \text{ATP} \overset{\mathrm{Mg^{++}}}{\rightleftharpoons} \text{enzyme-aminoacyl adenylate} + \text{PP}. \tag{13}$$


In the presence of the specific amino acid, the enzyme
catalyzes incorporation of PP into ATP. When the reaction is carried out in the presence of enzyme, ATP,
Mg++, amino acid, and high concentrations of hydroxylamine, the corresponding aminoacyl hydroxamate is
formed. That the intermediate formed is an aminoacyl
adenylate is suggested, in analogy with acetate activation, by the formation of pyrophosphate rather than
orthophosphate. DeMoss et al. found that leucyl
adenylate reacted enzymatically with pyrophosphate to
give ATP [reversal of reaction (13)]. Further evidence for
formation of an anhydride linkage between the phosphoric-acid group of adenylic acid and the carboxyl
group of amino acids was obtained in studies in which
transfer of O18 from the carboxyl group of an amino
acid to adenylic acid was observed during enzymatic
activation.” More direct evidence for the aminoacyl-adenylate intermediate was obtained by Karasek et al.,%
who extracted a compound that exhibited the properties
of synthetic tryptophanyl adenylate from reaction mixtures containing large amounts of pancreatic tryptophan-activating enzyme.’ Also, it was found that the
tryptophan-activating enzyme could catalyze ATP synthesis from pyrophosphate and a wide variety of α-aminoacyl adenylates including those of D-amino
acids.°®.88 (β-Alanyl adenylate, acetyl adenylate, and
carbobenzoxy α-aminoacyl adenylates were not active.)
On the other hand, only L-tryptophan was active in the
enzymatic hydroxamate-forming reaction and in the
pyrophosphate-ATP exchange. Similar results were observed with a yeast methionine-activating enzyme.”
The explanation for this phenomenon is not yet clear.
The findings with the D- and L-amino-acid derivatives
are reminiscent of the observation that D-glutamic acid
is activated almost as rapidly as L-glutamic acid by the
glutamine-synthesis enzyme, whereas only the L-isomer
of glutamine is significantly active in the transfer
reaction. *”

It would appear that the aminoacyl adenylate is
stabilized by its binding to the enzyme. Such stabilization may be accomplished by a linkage involving the
α-amino group; it has been found that N-substituted
aminoacyl adenylates (e.g., carbobenzoxy-aminoacyl
adenylates) are much more stable than are the corresponding free amino compounds.!™ The high reactivity
of the aminoacyl adenylates is indicated by their rapid
nonenzymatic reaction with hydroxylamine to form
aminoacyl hydroxamates, with ammonia to yield
amino-acid amides, and with amino acids to form peptides. Synthetic aminoacyl adenylates can react nonenzymatically with nucleic acid’ and with microsomal!
and mitochondrial proteins. The incorporation of
amino acids from synthetic aminoacyl adenylates into
the particulate proteins and into soluble proteins represents mainly acylation of the available free amino groups
of the protein. Heated mitochondrial and microsomal
preparations are acylated to a much greater extent than
are the corresponding unheated materials. Presumably,
heat denaturation increases the number of available free
amino groups. It is evident from the observations which
demonstrate the high reactivity of the aminoacyl adenylates that they must be selectively transferred, and
that their participation in other reactions must involve
a mechanism which prevents the occurrence of nonspecific acylation reactions.

It appears that very little (if any) free aminoacyl 
adenylate is formed enzymatically. However, com- 
pounds other than hydroxylamine can react with en- 
zymatically synthesized aminoacyl adenylates. Thus, 
incubation of tryptophan-C14 with ATP, Mg++, activating enzyme, and various protein preparations (rat-liver
microsomes, serum albumin, ovalbumin) gave
labeled protein preparations, and similar studies with 
soluble RNA preparations yielded RNA that was labeled 
with amino acids. The reaction of enzyme-bound amino- 
acyl adenylates with RNA appears to be more specific, 
since there is evidence for the existence on soluble RNA 
molecules of specific amino-acid acceptor sites (see fol- 
lowing). Aminoacyl adenylates, whether produced en- 
zymatically or by chemical synthesis, can react with 
nucleotides to yield products which appear to be bound 
by ester linkage to the 2’ (or 3’) hydroxyl groups of the 
ribose moieties. 

A number of specific amino-acid activating enzymes 
have been found. Thus, enzymes specific for methio- 
nine,™™ tryptophan,” tyrosine,%?"'® valine,! and leu- 
cine! have been described. Novelli®® has studied the 
activation of amino acids by preparations of micro- 
organisms, plants, and animal tissues. With most of 
these preparations, only 8 to 10 amino acids stimulated 
pyrophosphate-ATP exchange, although it is possible 
that the background exchange (owing to endogenous 
amino acids) may have obscured the effects of the added 
amino acids. With guinea-pig tissue extracts, all of the 
amino acids appeared to catalyze at least a small amount
of exchange.” It is obviously of importance to determine 
whether or not this type of activating mechanism exists 
for all of the amino acids. It is curious that the trypto- 
phan-activating enzyme should be so prominent in beef 
pancreas and in other sources, since this amino acid is 
present in very low concentrations in proteins and is 
absent from some proteins. However, it is possible that 
improved methods of isolation will be needed to obtain
all of the individual enzymes.

There are difficulties associated with the study of the
α-activation of glutamic acid and aspartic acid; thus,
these amino acids (and their ω-amides) may form ω-carboxyl-linked intermediates. ATP exchange reactions
and the hydroxamic-acid test system may, therefore, be
unreliable guides as to the occurrence of α-activation.
It is conceivable that these amino acids are activated by
a different mechanism, as suggested above.

Evidence for the accumulation of labeled intermediates during the incorporation of C14-leucine into rat-liver
microsomes was reported by Hultin®° and Beskow.!% 
Hoagland, Stephenson, and Zamecnik!!% provided 
evidence for the transfer of labeled amino acids to ribonucleic acid of the soluble enzyme fraction of liver
homogenate (which also contains amino-acid-activating
enzyme activity). They also obtained evidence for the
further transfer of the amino-acid moiety to microsomal 
protein. The transfer to protein was dependent upon the 
presence of guanosine triphosphate. Other studies have 
provided evidence for the hypothesis that soluble ribo- 
nucleic acid serves as an acceptor for activated amino 
acids.101,102,106—108 The reaction, which appears to be 
analogous to other activation reactions [Eqs. (7) to
(12)], but with ribonucleic acid instead of CoA as the
acceptor, may be formulated as follows: 


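$$\text{Amino acid} + \text{ATP} \rightleftharpoons \text{aminoacyl adenylate} + \text{PP}, \tag{14}$$


$$\text{Aminoacyl adenylate} + \text{RNA} \rightleftharpoons \text{amino acid-RNA} + \text{AMP}. \tag{15}$$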
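
To make the analogy with the CoA-dependent activations explicit, reactions (14) and (15) can be added and the aminoacyl adenylate cancelled. The sum below is a sketch derived only from the two equations as written; the overall reaction it gives is not numbered in the text, and RNA here stands for the soluble-RNA acceptor.

```latex
\begin{align*}
\text{Amino acid} + \text{ATP} &\rightleftharpoons \text{aminoacyl adenylate} + \text{PP} && (14)\\
\text{Aminoacyl adenylate} + \text{RNA} &\rightleftharpoons \text{amino acid-RNA} + \text{AMP} && (15)\\
\intertext{Sum, with the aminoacyl adenylate cancelling:}
\text{Amino acid} + \text{ATP} + \text{RNA} &\rightleftharpoons \text{amino acid-RNA} + \text{AMP} + \text{PP}
\end{align*}
```

Read from right to left, the summed reaction is consistent with the observation that incubating the RNA amino-acid complex with AMP and pyrophosphate yields ATP.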


It appears possible that both reactions are catalyzed by 
the same enzyme. When the total fraction of soluble 
RNA is incubated with a specific amino-acid activating 
enzyme and the corresponding amino acid, the RNA 
becomes labeled. A separate activating enzyme appears 
necessary for linking each amino acid to RNA. There 
may be specific RNA sites for each amino acid; thus, 
labeling by several amino acids is approximately the 
sum of maximum labeling with each. There are also 
preliminary indications!!! that there may be separate 
RNA molecules for each amino acid. Formation of the 
RNA amino-acid complex is reversible according to
Eqs. (14) and (15); thus, incubation of the RNA amino-acid complex with AMP and pyrophosphate yields ATP.101,102,109
The RNA amino-acid complex is alkali-labile,
but considerably more stable than are aminoacyl ade- 
nylates. This suggests that the linkage of amino acid to 
RNA does not involve an anhydride linkage. Potter and 
Dounce!” isolated alkali-stable polynucleotide fractions 
from calf pancreas that contained amino acids and pep- 
tides; they suggested that the amino acids were bound 
to the nucleotides by phosphoamide bonds. The stability 
of this linkage to alkali effectively excludes it as the 
binding mechanism in the enzymatically formed RNA/ 
amino-acid complexes. An attractive possibility, consis- 
tent with the alkali-lability of the bond, is an ester 
linkage involving the 2’ or 3’ hydroxyl group of ribose. 
Recent work by Hecht, Stephenson, and Zamecnik,!™! 
has shown that the soluble RNA of ascites-cell tumors 
incorporates adenine nucleotide into the terminal posi- 
tion in the presence of a soluble enzyme and ATP. Such 
incorporation is increased by adding cytidine triphos- 
phate. Their evidence was interpreted to mean that 
cytosine and adenine nucleotides are added to RNA in 
this sequence. It is of interest that the incorporation of
a great many amino acids was enhanced by addition of 
these nucleotides, suggesting that these end groups 
participate in the attachment of activated amino acids 
to RNA. 

The further transfer of amino-acid moieties from 
RNA to microsomal protein has been described by 
Hoagland et al., as mentioned above. The work of Hultin
and Beskow,!* which demonstrated a two-step incorporation of leucine into rat-liver microsomal protein, now
may be interpreted in terms of the formation of an
intermediate RNA-leucine complex followed by transfer 
to protein. The mode of transfer and the relationship of 
the incorporated amino acids to specific characterized 
proteins are major problems that await solution. 


LATER STAGES OF PROTEIN SYNTHESIS 


The incorporation of amino acids into the microsomal 
particles of animal cells has thus far not been shown to 
be associated with the formation of recognizable specific 
proteins. It is possible that the synthesis of protein 
molecules is completed in the ribonucleoprotein particle. 
Failure to detect it at this stage may be ascribed to the 
very small quantity of protein present, or to binding of 
the newly formed protein to the particle. If the latter 
interpretation is correct, existence of additional reac- 
tions that remove the protein and transport it to other 
parts of the cell must be postulated. Another explana- 
tion is that the newly formed peptide material incorpo- 
rated by the microsomal particle undergoes additional 
transformation before becoming a recognizable protein 
molecule. The existence of such a protein precursor has 
been considered by several authors." Such transfor- 
mations might involve reactions that link relatively 
large peptide chains together, introduce disulfide bonds, 
and add smaller nonpeptide molecules. Transpeptida- 
tion reactions may function at this stage of protein 
synthesis. Experimental findings that appear to bear on 
this question have arisen from studies on the synthesis 
of pancreatic enzymes by Daly and Mirsky." This work
showed that there was no significant change in the total
protein content of pancreas during secretion and synthesis; however, there were significant and marked
changes in enzyme concentration. There appears, therefore, to be a distinction between rapidly formed precursor protein and specific enzyme protein. The transformation of precursor protein to specific proteins associated with secretion may be a slower process than the
formation of the precursor protein. Similar conclusions
are suggested by experiments of Campbell and Work™
on incorporation of amino acids by the rabbit mammary
gland, by Green and Anker" on the synthesis of a-globulin, and by Putnam et al.''° on Bence-Jones protein
synthesis. Peters!”""* has reported that labeled carbon
appears more slowly in serum albumin than in the total 
protein fraction, when chicken-liver slices are incubated 
with radioactive carbon dioxide or glycine. The appear- 
ance of isotope in the total liver-protein fraction pre- 
ceded the labeling of albumin by 15 or 20 min. Further 
studies indicate that serum albumin, isolated by an 
immunological procedure, accumulates in the cyto- 
plasmic granules before its appearance in the soluble 
fraction. These findings suggest that an antigenically
reactive serum-albumin molecule is formed on the ribonucleoprotein particle or the associated lipoprotein
membrane.

The ribonucleoprotein particles obtained from various
sources have been found to contain about equal weights 
of RNA and protein. Such particles have been obtained 
from microorganisms, higher plants and animal tissues. 
Recent work indicates that particles obtained from the 
same source are not homogeneous, structurally or bio- 
chemically. Furthermore, Simkin and Work"® found 
that amino acids are incorporated in vitro into the pro- 
teins of different microsomal subfractions at different 
rates, and that the pattern of in vitro incorporation is 
not the same as that observed in vivo.

The amino acids incorporated into isolated thymus 
nuclei become linked to a specific protein closely as- 
sociated with the DNA.*:* In the mitochondrial system, 
incorporation of labeled valine into cytochrome-c was 
observed.°® These findings are consistent with the belief 
that specific proteins may be synthesized in nuclei and 
mitochondria. 


DISCUSSION 


The available experimental data indicate that virtu- 
ally all cells can incorporate amino acids, and that 
amino-acid incorporation (and perhaps also protein syn- 
thesis) can take place in the microsomal, nuclear, and 
mitochondrial fractions of certain animal cells. The 
general features of amino-acid incorporation into these 
structures are remarkably similar. The need for amino- 
acid activation is apparent in these systems and also in 
those obtained from plants and microorganisms. Addi- 
tional data are required before it can be concluded that 
all of the amino acids are activated by formation of 
aminoacyl adenylates; for example, the evidence concerning the dicarboxylic amino acids and the corresponding ω-amides is incomplete, as observed above.
Although acyl adenylates appear to be the reactive 
intermediates in acetate, fatty acid, benzoate, and 
phenylacetate activation, another type of activation, 
perhaps involving carboxyl phosphate anhydrides, ap- 
pears to be involved in both steps of glutathione syn- 
thesis and in glutamine synthesis. The studies of Ochoa 
and Beljanski*® lend support to the possibility that other 
types of amino-acid activation may occur. That phos- 
phorylated RNA may serve as a source of energy for 
protein synthesis has been considered and a phos- 
phorylated template has also been suggested. The con- 
version of amino acids to aminoacyl adenylates and to 
RNA/amino-acid complexes may form part of the 
mechanism responsible for transport of amino acids into 
the cell. Amino-acid activation may play a role in both 
transport and protein synthesis. 

There have been many published speculations con- 
cerning the existence and nature of templates for protein 
synthesis. The concept of a template or model for pro- 
tein formation seems almost indispensable in order to 
explain the high degree of specificity apparently associ- 
ated with the arrangement of amino acids in proteins. 
A system of separate and specific activating enzymes 
and RNA acceptor sites could provide some specificity
in preparing amino acids for synthesis. Yet it appears
that neither the activating systems nor the complete
synthesizing system is absolutely specific, for such
unnatural amino acids as ethionine and tryptazan can 
be activated and incorporated into protein. The in- 
corporation of a number of “foreign” amino acids into 
“false” protein has now been observed. If ethionine can 
occasionally replace methionine in the synthesis of pro- 
tein, and if fluorophenylalanine can sometimes be in- 
corporated in place of phenylalanine or tyrosine, is it 
possible that the naturally occurring amino acids may 
themselves occasionally be “misplaced” on the peptide
chains? Thus, valine might “accidentally” replace iso- 
leucine or vice versa, and so forth. It seems possible that 
the specificity of amino-acid sequences may not be 
absolute. 

The participation of soluble and particulate RNA in 
the process of amino-acid incorporation is consistent with 
the belief that RNA functions as a template. However, 
there is no evidence that unequivocally excludes the 
possibility that protein may serve as a template. Specific 
enzymes are apparently needed throughout the process 
of protein synthesis, and it appears very probable that 
both protein and RNA function in the formation of 
specific amino-acid sequences. The nature of the process 
by which genetic information is supplied to the protein- 
synthesizing mechanisms, and discussions of possible 
relationships between the base sequences of nucleic acids 
and the amino-acid sequences of proteins are considered 
elsewhere in this volume (see p. 227). Information re- 
lating to these important problems may arise from 
studies on the structure of the soluble RNA molecules 
that accept the amino-acid moieties of aminoacyl 
adenylates. 

The observation that incorporation of amino acids 
into microsomes from one source could be carried out 
with soluble enzymes obtained from another is very 
interesting. If incorporation represents synthesis of a 
specific protein, then it would appear that the specificity 
must reside in the microsomal particle; alternatively, a 
“mixed” system of this type synthesizes a “hybrid” 
protein. It is possible that the microsomal-incorporation 
system represents only a portion of the cellular-protein 
synthetic system, and that additional components pres- 
ent in the intact cell must be added in order to obtain 
net protein synthesis. As yet, there is no unequivocal 
demonstration of net synthesis of a specific protein in 
a cell-free system. Such an achievement would be an 
important step toward the goal of understanding the 
process of protein synthesis. 


INTRODUCTION


The term “information theory” is used in at least three
senses. In the narrowest of these senses, it denotes
a class of problems concerning the generation, storage, 
transmission, and processing of information, in which a 
particular measure of information is used. This area is 
also called “coding theory” and, especially in Britain,
“the mathematical theory of communication,” which is
the title of Shannon’s original paper1 from which the
field is derived. This is the usage in the titles of the
books by Khinchin2 and Feinstein,3 which are rather
abstract mathematical presentations.

In a broader sense, information theory has been 
taken to include any analysis of communications prob- 
lems, including statistical problems of the detection of 
signals in the presence of noise, that make no use of an 
information measure. Woodward4 has shown the relationship of information measure to some of these problems. A book by Wiener,5 an article by Rice6,7 (reprinted
in a book edited by Wax8), and a recent textbook by
Davenport and Root9 discuss the use of statistical
techniques in problems concerning analysis of signals
and noise, and books by Blanc-LaPierre and Fortet,10
Loève,11 and Doob12 provide the (abstract) mathematical background. It is in this broader sense that the
word is used, for example, in the title of the Profes- 
sional Group on Information Theory of the Institute 
of Radio Engineers, whose Transactions contains arti- 
cles both on coding and on signal-noise problems. 

In a still broader sense, information theory is used as 
a synonym for the term “cybernetics” introduced by 
Wiener13 to denote, in addition to the areas listed in the
foregoing, the theory of servomechanisms, the theory of
automata, and the application of these and related
disciplines to the study of communication, control, and
other kinds of behavior in organisms and machines.
This is the usage in the titles of three meetings held in
London (two of which have published proceedings14,15)
and two held at the Massachusetts Institute of Technology.16,17 Information and Control publishes articles in
this broad area. 

Only the area covered by the narrowest definition is 
discussed here—not because it is necessarily the most 
important for biological applications, but because of the 
limitations of space. 


* The work of this Laboratory is supported in part by the U. S.
Army (Signal Corps), the U. S. Air Force (Office of Scientific Research, Air Research and Development Command), and the U. S.
Navy (Office of Naval Research).


INFORMATION MEASURE. SOURCES 


The first problem is to assign a measure to informa- 
tion. Figure 1 shows two representations of a message 
and the code book which connects them. At first one 
might suppose that a long message has more informa- 
tion than a short one, but, as the figure shows, the 
message length—or any other characteristic of the 
representation itself—can be changed drastically by
coding at the transmitter. If the receiver decodes cor- 
rectly, the message has been transmitted successfully 
by using a very brief form. In a communications system, 
any message has a variety of different representations 
in different places. One may want to say that it con- 
tains the same amount of information. An amount of 
information, then, cannot depend upon the form of 
representation. It seems reasonable, however, to make 
it depend upon how many messages there are. If the 
number of messages is large, longer code words will have 
to be used in order to distinguish between them. One 
starts, therefore, with the hypothesis that the informa- 
tion in a message is some function of the number of 
messages in the set. One would like to say that two 
successive, independent selections from the same set 
have twice the information value of a single selection, 
if the two messages are equally probable. This demands a 
function f(m) of the number m of messages in the set, 
for which f(m²) = 2f(m), and the logarithm is the only
respectable function with this property. For a selection
between two messages, this gives [log 2] units of information. This can be rewritten as [−log (1/2)] or, in
general, as the negative of the logarithm of the message
probability. This, in fact, is what one chooses to generalize as the information associated with a message. The
choice of a logarithmic base determines the unit of
information, and the base 2 is chosen, the unit being
the “bit.” Thus, a selection between two equiprobable
alternatives requires one bit of information.

FIG. 1. Two representations of a message and the code book which
connects them. [Coded message: ABBA ...; the uncoded message is the corresponding sequence of full sentences. Code book: A = THE QUICK BROWN FOX JUMPED OVER THE LAZY DOG; B = NOW IS THE TIME FOR ALL GOOD MEN TO COME TO THE AID OF THE PARTY. I(A) = log 2 = −log (1/2) = 1 bit/symbol.]

FIG. 2. Distribution of information for a two-symbol source whose
symbols have equal probabilities: p1 = p2 = 1/2, I1 = I2 = −log (1/2) = 1 bit, I = H = 1/2 + 1/2 = 1 bit. [Horizontal axis: information, bits.]

This definition is plausible, but it does not justify the 
information measure. The adopted measure is justified, 
in a practical sense, on the average, because a source 
that generates information more rapidly than another, 
also requires more communications facilities—more 
band width, time, signal-to-noise ratio—to transmit its 
output successfully. 
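
The defining property of the measure can be checked numerically. The following is a minimal sketch, not part of the original lecture; the function names `information_bits` and `selection_information` are illustrative assumptions. It evaluates the negative logarithm (base 2) of a message probability and confirms that two independent selections from a set of m equiprobable messages carry twice the information of one selection.

```python
import math


def information_bits(probability: float) -> float:
    """Information, in bits, associated with a message of the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(probability)


def selection_information(m: int) -> float:
    """Information of one selection among m equally probable messages."""
    return information_bits(1.0 / m)


if __name__ == "__main__":
    m = 8
    # Two independent selections from a set of m messages are equivalent to
    # one selection from the set of m * m ordered pairs, so f(m^2) = 2 f(m).
    print(selection_information(m), selection_information(m * m))
    # A selection between two equiprobable alternatives.
    print(information_bits(0.5))
```

The check uses m = 8 only for convenience; any m for which the messages are equiprobable satisfies the same relation.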

The information source, Fig. 1, is characterized by 
the distribution of information, Fig. 2. Since this source 
generates one bit of information per selection regardless 
of which message is selected, its information distribu- 
tion is degenerate: all of the probability is piled up at 
one bit. Each symbol has a probability given by the
value of the exponential at the information value of 
the symbol, as the plot is just probability vs minus log 
probability. Figure 3 shows the information distribution 
for another two-symbol source whose symbols have the 
unequal probabilities 3/4 and 1/4. This source generates
only 0.42 bits when it selects an A and 2.0 bits when it
selects a B. Its average rate is 0.81 bit per symbol, which
is smaller than in Fig. 2 with equiprobable symbols.
This is a general characteristic of sources. Any con-
straint on the number of symbols or sequences that a
source may generate and any shift away from equal
probabilities will reduce the average source rate. A
source constrained to generate sequences of letters
which spell out sentences in English has a lower average
rate than a source which selects successive letters of the
alphabet with statistical independence.

Fig. 3. Distribution of information for a two-symbol source whose
symbols have the unequal probabilities 3/4 and 1/4. [I = H = 0.31 + 0.50
= 0.81 bit. Horizontal axis: information, bits.]
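
The average rates quoted for the two-symbol sources can be reproduced directly. This is a small sketch under the assumption that successive symbols are selected independently; the names `symbol_information` and `average_rate` are hypothetical and chosen only for this illustration.

```python
import math


def symbol_information(p: float) -> float:
    """Information in bits generated when a symbol of probability p is selected."""
    return -math.log2(p)


def average_rate(probabilities) -> float:
    """Average information per symbol: each symbol's information weighted by its probability."""
    probabilities = list(probabilities)
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * symbol_information(p) for p in probabilities if p > 0.0)


if __name__ == "__main__":
    # Equiprobable two-symbol source.
    print(average_rate([0.5, 0.5]))
    # Two-symbol source with probabilities 3/4 and 1/4.
    print(symbol_information(0.75), symbol_information(0.25))
    print(average_rate([0.75, 0.25]))
```

Any shift away from equal probabilities lowers the value returned by `average_rate`, which is the behavior the text describes.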

There are, of course, many questions which arise in 
connection with coding and with the description of 
sources having sequential constraints. What has been 
covered here is a mathematical theory, and problems 
arise in its application. One of these is illustrated in 
Fig. 4. The problem consists in identifying the alphabet 
which is relevant in a given situation. Suppose that, in 
observing the output of a neuron, the eight wave forms 
shown in the figure are seen with equal frequency. 
This would give three bits of information per wave 
form. However, these signals may not all be distin- 
guishable by the system being observed. The system 
may act only as a pulse counter over this time interval, 
and may recognize only four different signals: no pulse,
one pulse, two pulses, and three pulses. Then the aver-
age rate is only 1.82 bits per signal. There is no way of
telling from the outside which of these alphabets is
actually in use, if, in fact, either is. It is necessary to
determine that the system does have different responses
to two signals before they can be defined as being dis-
tinct letters of an alphabet. The problem of recognizing
the relevant alphabet always is present and shows up
in many ways, including the selection of scales of resolu-
tion to be used in amplitude and time to distinguish
different signals. In Fig. 4, the average rates for the
two alphabets are not too far apart, but in a train of
100 pulses, if all observed wave forms are equiprobable,
there is a factor of about 20 between the rates obtained
for the two corresponding alphabets.

Fig. 4. Dependence of source rate on alphabet. [Eight signals, or four
messages (zero, one, two, or three pulses) with I = 1.82 bits.]
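
The dependence of the rate on the alphabet can be illustrated numerically. The sketch below rests on an assumption not stated explicitly in the text: each wave form is taken to consist of a fixed number of time slots, each of which either contains a pulse or does not, so that three slots give the eight wave forms of Fig. 4. The function names are illustrative.

```python
import math


def entropy_bits(probabilities) -> float:
    """Average information in bits for the given probability distribution."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0.0)


def waveform_rate(slots: int) -> float:
    """Rate when all 2**slots equiprobable wave forms are distinguishable."""
    return float(slots)


def pulse_count_rate(slots: int) -> float:
    """Rate when the system recognizes only the number of pulses."""
    total = 2 ** slots
    # With equiprobable wave forms, k pulses occur in comb(slots, k) of them.
    return entropy_bits(math.comb(slots, k) / total for k in range(slots + 1))


if __name__ == "__main__":
    for slots in (3, 100):
        full = waveform_rate(slots)
        counted = pulse_count_rate(slots)
        print(slots, full, round(counted, 2), round(full / counted, 1))
```

Under this assumption the three-slot case gives 3 bits against roughly 1.8 bits, and the 100-slot case gives a ratio somewhat above 20, in line with the order of magnitude quoted in the text.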


NOISY CHANNELS 


There are, then, sources to select symbols, and chan- 
nels are needed to transmit them. Figure 5 shows 
one, a noisy channel. In communications, a noisy chan-
nel usually is a medium which separates transmitter
and receiver. In information storage, a noisy channel 
may model the action of the environment which, 
through thermal agitation or other forces, may cause 
changes in the stored information. In Fig. 5, the channel 
eliminates some of the distinctions present in its input. 
The source selects from among four equiprobable sym-
bols a, b, c, and d. The receiver on reception of A or B
can eliminate two input possibilities, but it cannot 
choose between the other two. 

To analyze informationally what happens in the 
channel, consider a particular transmission event: the 
selection of a at the transmitter and the reception of A 
at the receiver. It is assumed that the receiver knows 
the input-letter probabilities and the channel. Thus, 
before reception of A, the estimate by the receiver of 
the probability that a will be transmitted is 1/4. After
reception of A, the receiver knows that only a or
b could have been transmitted, and both are equally
likely. Thus, a posteriori, the probability of a (on the
evidence A) is 1/2.

The receiver a priori needs log 4=2 bits of informa-
tion to select a as the transmitted letter. A posteriori,
it still needs log 2=1 bit to select a after receiving A.
The channel is credited with the 1-bit difference, which
is defined as the amount of information which the
receipt of A gives about the transmission of a. The
quantity expressed in Fig. 5 as I(a; A) is called the
mutual information of A about a, or the transmitted
information.

Fig. 5. The mutual information between a transmitted symbol
and a received symbol. [Pr(a) = 1/4, Pr(a|A) = 1/2; I(a) = 2 bits,
I(a|A) = 1 bit; ΔI = -log Pr(a) - [-log Pr(a|A)] = 1 bit
= log [Pr(a|A)/Pr(a)] = log [Pr(a,A)/(Pr(a)Pr(A))] = I(a; A).]

In general, for each possible pair of transmitted and
received symbols, x_i and y_j, there is a probability of
occurrence Pr(x_i, y_j) [which may be zero as in Fig. 5
for Pr(a, B)] and an information value I(x_i; y_j)
= log [Pr(x_i, y_j)/(Pr(x_i)Pr(y_j))], which measures the
change in the logarithm of the probability of x_i owing
to knowledge of y_j. This quantity is positive if y_j makes
x_i more probable than it was, and is negative if y_j makes
x_i less probable than it was. A plot can be made of the
mutual-information distribution of a channel-source com-
bination. For the channel of Fig. 5, this plot is exactly
like the plot of Fig. 2: one bit of information is always
transmitted regardless of which input-output pair
happens.

Fig. 6. The binary erasure channel, the binary symmetric
channel, and their mutual information distribution. [BEC: average
rate = 3/4 bit/symbol. BSC: average rate = 0.18 bit/symbol.
Horizontal axes: mutual information, bits.]
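
The information value of each input-output pair can be tabulated from the joint probabilities. The following sketch assumes, as the description of Fig. 5 implies, that inputs a and b are always received as A and inputs c and d as B; the function name `pair_information` is a hypothetical choice for this illustration.

```python
import math


def pair_information(joint: dict) -> dict:
    """Map each (x, y) pair of nonzero probability to log2 Pr(x, y) / (Pr(x) Pr(y))."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return {
        (x, y): math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0.0
    }


# Joint probabilities for four equiprobable inputs on the channel of Fig. 5.
FIG5_JOINT = {("a", "A"): 0.25, ("b", "A"): 0.25, ("c", "B"): 0.25, ("d", "B"): 0.25}

if __name__ == "__main__":
    values = pair_information(FIG5_JOINT)
    for pair, bits in sorted(values.items()):
        print(pair, bits)
    # The average over pairs, weighted by their probabilities.
    print(sum(FIG5_JOINT[pair] * bits for pair, bits in values.items()))
```

Every pair that can occur carries the same value, so the distribution is degenerate in the same way as that of Fig. 2.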

Two more-complicated channels and their mutual- 
information distributions are shown in Fig. 6. In the 
binary erasure channel (BEC), one of two symbols is 
selected at the transmitter. The input symbol may be 
received correctly, with probability q taken here as 3/4,
or it may be erased with probability p = 1/4. The receiver
then receives an X. Evaluating mutual information
gives the distribution shown. The channel sends one
bit per symbol three-quarters of the time, when no
erasure occurs; one-quarter of the time, when the
transmitted symbol is erased, no information is trans-
mitted. The average rate is just 3/4 bit per symbol, if the
input symbols are equiprobable.

In the binary symmetric channel (BSC), there are 
only two input and two output symbols, and true 
errors occur. Here less than one bit is transmitted when 
there is no error, and a negative amount of mutual 
information is transmitted if there is an error. One 
takes error probability p = 1/4. The average rate of trans-
mission here is only 0.18 bit per symbol with equi-
probable inputs. 
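
The two distributions of Fig. 6 can be computed from the channel transition probabilities. This is a sketch only: the dictionaries `BEC` and `BSC` encode the erasure probability 1/4 and the error probability 1/4 used in the text, and the function names are assumptions made for the illustration.

```python
import math


def information_distribution(input_probs: dict, transition: dict) -> list:
    """Return (probability, mutual information in bits) for each possible input-output pair."""
    joint = {
        (x, y): px * pyx
        for x, px in input_probs.items()
        for y, pyx in transition[x].items()
    }
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return [
        (p, math.log2(p / (input_probs[x] * p_y[y])))
        for (x, y), p in joint.items()
        if p > 0.0
    ]


def average_rate(distribution: list) -> float:
    """Average mutual information in bits per symbol."""
    return sum(p * bits for p, bits in distribution)


INPUTS = {"A": 0.5, "B": 0.5}
BEC = {"A": {"A": 0.75, "X": 0.25}, "B": {"B": 0.75, "X": 0.25}}
BSC = {"A": {"A": 0.75, "B": 0.25}, "B": {"B": 0.75, "A": 0.25}}

if __name__ == "__main__":
    for name, channel in (("BEC", BEC), ("BSC", BSC)):
        dist = information_distribution(INPUTS, channel)
        print(name, sorted({round(bits, 3) for _, bits in dist}), round(average_rate(dist), 3))
```

For the BSC the computed values are about 0.585 bit when no error occurs and -1 bit when an error occurs, which is the negative mutual information mentioned in the text.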

Returning to Fig. 5, one interprets the average rate of 
transmission of 1 bit per symbol as the average rate at 
which the receiver’s ignorance is reduced—from 2 bits 
to 1 bit for each letter transmitted. However, the 
receiver’s ignorance is not reduced to zero. There is an 
apparent difference between this channel and another 
with the same rate which transmits two equiprobable 
symbols without error. In the latter case, the ignorance 
of the receiver is reduced from 1 bit to zero bits for 
each transmission. One would like to be able to say
that the average rate over a channel is the significant
parameter. This requires that, for the channel of Fig.
5, one finds some method for which the rate of putting
information in is reduced in such a way as to reduce
both the initial and final ignorance of the receiver, keeping
their difference fixed. In this case, it is easy to do.
The transmitter agrees to send only two symbols, a
and c, with equal probabilities. The receiver then can
decode unambiguously, receiving one bit of mutual
information per symbol with no residual ignorance.

Fig. 7. Transmission over the binary erasure channel uncoded and
coded by iteration. [Uncoded: input ...BABB..., noise ...OOXO...,
output ...BAXB.... Coded by iteration: input ...BABB..., coded signal
...BB AA BB BB..., noise ...OOXOXOOO..., noisy signal
...BB XA XB BB..., decoder output ...BABB....]
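
The receiver's ignorance before and after reception can be computed for both ways of using the channel of Fig. 5. The sketch assumes the channel maps a and b to A and c and d to B without further noise; `FIG5_CHANNEL` and the function names are illustrative assumptions.

```python
import math


def entropy_bits(probabilities) -> float:
    """Average information in bits for the given probability distribution."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0.0)


def ignorance_before_and_after(input_probs: dict, channel: dict) -> tuple:
    """Return (initial ignorance, residual ignorance) in bits for a merging channel."""
    initial = entropy_bits(input_probs.values())
    p_y = {}
    for x, p in input_probs.items():
        p_y[channel[x]] = p_y.get(channel[x], 0.0) + p
    residual = 0.0
    for y, py in p_y.items():
        # Probabilities of the inputs that could have produced the output y.
        posterior = [p / py for x, p in input_probs.items() if channel[x] == y]
        residual += py * entropy_bits(posterior)
    return initial, residual


FIG5_CHANNEL = {"a": "A", "b": "A", "c": "B", "d": "B"}

if __name__ == "__main__":
    all_four = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
    a_and_c = {"a": 0.5, "c": 0.5}
    for inputs in (all_four, a_and_c):
        before, after = ignorance_before_and_after(inputs, FIG5_CHANNEL)
        print(before, after, before - after)
```

In both cases the difference is one bit per symbol; restricting the input to a and c lowers the initial and the residual ignorance by the same amount.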


CODING FOR NOISY CHANNELS 


The foregoing result is, in fact, general for a broad 
class of noisy channels, although the way in which the 
input information rate is reduced is usually more com- 
plicated. It is not possible in the BEC or BSC to find 
a set of input symbols which will leave the receiver 
with no ambiguity. All input symbols are used, there- 
fore, but they are used in sequences, and only a fraction 
of the possible sequences occur. 

Figure 7 shows the BEC, first used in ordinary, 
uncoded transmission and then used with a simple 
coding and decoding scheme. In the upper picture, one 
bit per symbol is put into the channel, 1/4 of the output
symbols are erased, and an average of 3/4 bit per symbol
comes out. Below, a coder is added, which duplicates
each input symbol. For a fixed channel, input symbols
can be accepted now only half as often, since two sym-
bols go into the channel for each input symbol to the
coder. The input rate is then 1/2 bit per channel symbol.
Again 1/4 of the output symbols are erased, but both of
the copies of an input symbol are erased 1/16 of the time
only; so reliability has improved.


Fig. 8. ... for iteration coding of the BEC. [Axes: output, reliable
bits/sec; p, erasure probability of channel.]

Of course, each digit could be sent three or more
times instead of twice. This would reduce further both 
the rate of transmission and the probability of total 
erasure, or of ambiguity after decoding. The relation- 
ship is shown in Fig. 8. On the left plot, it can be seen 
that sufficient reduction of rate reduces residual prob- 
ability of total erasure as low as is desired, but, to get 
arbitrarily low probability, an arbitrarily low rate of 
transmission is necessary. If one looks at the channel 
in a different way, demanding transmission with arbi- 
trarily low total-erasure probability, and asking how 
average rate varies as the channel-erasure probability 
p is varied, one sees the plot on the right in Fig. 8. If
p=0, one can send one reliable bit per symbol, but if
p>0, none can be sent. Time must be spent on repeating
the first input symbol to make its reliability arbitrarily 
good, and one never can get around to sending any- 
thing else. 
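
Coding by iteration is simple enough to sketch in full. The code below is an illustration under stated assumptions: symbols are the characters A and B, an erasure is written X as in Fig. 7, and erasures are taken to occur independently with probability p. The function names are hypothetical.

```python
def encode_by_iteration(message: str, copies: int) -> str:
    """Send each input symbol the given number of times in succession."""
    return "".join(symbol * copies for symbol in message)


def decode_by_iteration(received: str, copies: int, erasure: str = "X") -> str:
    """Recover each input symbol from any surviving copy; X marks a total erasure."""
    decoded = []
    for start in range(0, len(received), copies):
        survivors = [s for s in received[start:start + copies] if s != erasure]
        decoded.append(survivors[0] if survivors else erasure)
    return "".join(decoded)


def rate_and_total_erasure(p: float, copies: int) -> tuple:
    """Input rate in bits per channel symbol, and probability that every copy is erased."""
    return 1.0 / copies, p ** copies


if __name__ == "__main__":
    print(encode_by_iteration("BABB", 2))
    print(decode_by_iteration("BBXAXBBB", 2))
    for copies in (1, 2, 3, 4):
        print(copies, rate_and_total_erasure(0.25, copies))
```

The table printed at the end shows the trade described in the text: the probability of total erasure falls geometrically, but only because the rate falls toward zero.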

Certainly, one way of avoiding misunderstanding is 
never to say anything new, but it would be discouraging
if this were the only way. An alternative is shown in
Fig. 9.

Fig. 9. The transmission erasure channel, the binary symmetric
channel, and their mutual information distribution with check
digits. [Panels: BEC, uncoded transmission (input BABB, noise OOXO,
output BAXB); BEC, check digit coding (coded signal ...BABB|B...,
noisy signal ...BAXB|B...).]


Here one starts with uncoded transmission again.
Now, however, instead of duplicating each input sym-
bol, a general purpose replacement, called a check digit
or a parity check, is inserted so that a replacement will
be available in case it is erased. The digit enclosed in
the dotted box is selected to make the total number of
B’s in the sequence of 5 symbols an even number. At the
receiver, if only one symbol has been erased, the knowl-
edge that an even number of B’s should be present
makes the decoding unique. The rate of transmission
has been reduced to 4/5 bit per symbol and increased
reliability is obtained. There is still a danger, however,
that two or more erasures may occur in the same block
of five digits, in which case decoding still would be
ambiguous.
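
A single check digit can be sketched as follows. The example assumes blocks of four information symbols, the characters A and B, and X for an erasure; the function names are illustrative and not taken from the original.

```python
from typing import Optional


def add_check_digit(block: str) -> str:
    """Append the symbol that makes the total number of B's in the block even."""
    return block + ("B" if block.count("B") % 2 else "A")


def restore_single_erasure(received: str, erasure: str = "X") -> Optional[str]:
    """Fill in one erased symbol; return None if decoding would be ambiguous."""
    erased = received.count(erasure)
    if erased == 0:
        return received
    if erased > 1:
        return None
    # The known symbols fix the missing one: the total count of B's must be even.
    missing = "B" if received.count("B") % 2 else "A"
    return received.replace(erasure, missing)


if __name__ == "__main__":
    coded = add_check_digit("BABB")
    print(coded)
    print(restore_single_erasure("BAXBB"))
    print(restore_single_erasure("BXXBB"))
```

Four of every five transmitted symbols carry information, which is the rate of 4/5 bit per symbol; the last call shows the remaining danger of two erasures in one block.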
The situation can be improved by adding further
check digits as shown in Fig. 10. Here the sequence of
symbols is shown above. Added check digits have bars 
over them. The check digits are computed as shown 
below: each symbol to the right of a row is selected to
make the total number of B’s in the row even; each 
symbol at the foot of a column is selected to make the 
total number of B’s in the column even. First, those 
rows with single erasures are decoded, and then the 
columns having only single erasures remaining after 
row correction. In the case illustrated, this corrects all 
erasures. Further high-order check digits can be added 
indefinitely to give a total-erasure probability which is 
arbitrarily small without reducing the transmission rate 
to zero. Figure 11 shows the kind of relationship be- 
tween rate and residual-erasure probability which re- 
sults. Here an erasure probability p of 1/20 and an 
initial check group of ten, rather than of five, symbols 
have been used. Rate still goes down as reliability in- 
creases, but it now has a positive asymptote at 0.80 bits 
per symbol, at which rate it is possible to get arbitrary 
reliability. The plot of reliable rate vs the channel- 
erasure probability p is now continuous, as shown by
the solid curve. This kind of iterated check-symbol
coding can be used for the BSC (reference 18) as well as
for the BEC (reference 19).

Fig. 10. Correction of erasures by iteration of parity-check digits.
[Panels: transmitted, received, and decoded arrays of A’s and B’s with
row and column check digits; erased symbols appear as X in the
received array.]
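
The row-and-column scheme can be sketched for a small array. This is an illustration under assumptions, not a reconstruction of Fig. 10: the array size is arbitrary, erasures are written X, and the decoder below repeats the row pass and the column pass until neither restores a symbol, which is a slight generalization of the single row pass followed by a single column pass described in the text. All names are hypothetical.

```python
ERASED = "X"


def parity_symbol(symbols) -> str:
    """Symbol that makes the total number of B's even."""
    return "B" if symbols.count("B") % 2 else "A"


def encode_array(rows: list) -> list:
    """Append a check digit to each row, then a final row of column check digits."""
    coded = [list(row) + [parity_symbol(row)] for row in rows]
    width = len(coded[0])
    coded.append([parity_symbol([row[c] for row in coded]) for c in range(width)])
    return coded


def restore_line(line: list) -> bool:
    """Fill in a single erasure in a row or column; report whether one was restored."""
    if line.count(ERASED) != 1:
        return False
    line[line.index(ERASED)] = parity_symbol(line)
    return True


def decode_array(received: list) -> list:
    """Alternate row and column corrections until no further symbol is restored."""
    array = [list(row) for row in received]
    progress = True
    while progress:
        progress = False
        for row in array:
            progress |= restore_line(row)
        for c in range(len(array[0])):
            column = [row[c] for row in array]
            if restore_line(column):
                for r, row in enumerate(array):
                    row[c] = column[r]
                progress = True
    return array


if __name__ == "__main__":
    sent = encode_array(["BABB", "ABAB", "AABA", "BBAB"])
    got = [list(row) for row in sent]
    for r, c in ((0, 1), (0, 2), (2, 2)):
        got[r][c] = ERASED
    for row in decode_array(got):
        print("".join(row))
```

In the example the first row has two erasures and cannot be decoded by its own check digit; the column checks recover it once the row pass has restored the other erased symbol.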

Although a very special case of noisy-channel coding 
has been shown, it illustrates a general result. Given 
any noisy channel and source, with an average rate of 
transmission, it is possible to code the input in long 
blocks and to reduce both the input rate and the re- 
ceiver’s residual uncertainty, while keeping the average 
transmission rate fixed. The ideal behavior is illustrated 
in the dotted line on Fig. 11 for the BEC; to approach 
it, coding in large blocks is required. 

One other point should be noted. The code we con- 
structed was engineered carefully, and its reliability 
may appear to be atypical. In fact, by random selection 
of long sequences of symbols, it is possible to get the 
same results; the “ideal behavior” of Fig. 11 is obtained
by just such random coding. It is necessary only
not to select too many possible input sequences. Thus,
a code suitable for reliable transmission over a noisy
channel might occur quite accidentally. However, no
accident seems likely to account for the necessary de-
coding equipment and its organization.

Fig. 11. Rate and error probability for iterated parity checking.


CONCLUSION 


In conclusion, it might be appropriate to point out 
what applications have been made to biological prob- 
lems. The most successful ones, I think, have been to 
experiments in human communication where the chan- 
nel capacity of a human being for handling information 
has proved a useful concept in a number of experi- 
mental situations [see e.g., Rosenblith (p. 485) and 
reference 20]. There also have been applications at the 
neurophysiological level, but in terms of statistical 
signal analysis rather than of information theory per se; 
some of these are referred to in the second paper by 
Rosenblith (p. 532). Computations of neuron channel 
capacity have been made, but they are of dubious value 
in view of the alphabet problem illustrated in Fig. 4. 
It seems likely that real applications are forthcoming 
at this level. 

There have been applications to chemical specificity, 
etc., in biological systems (see, e.g., references 21 and 
22). My feeling is that these use information measure 
either as a language for the discussion of purely com- 
binatorial problems or as a useful statistic, but they do 
not use it in any coding sense which would imply that 
the informational treatment was at all necessary or 
unique. I think that the only other immediate applica- 
tion might arise in connection with the genetic-coding 
problem. Here, the most urgent need obviously is more 
data about nucleotide-amino group correspondences and 
the statistics of series of each. Although informational 
ideas may be useful here, it seems unlikely that they are 
essential. That is, it seems unlikely that high orders of 
redundancy and error-correction are being used. Some 
data on how local the coding is would be useful in 
reaching a decision on this point. 


BIBLIOGRAPHY


1 C. E. Shannon, Bell System Tech. J. 27, 379, 623 (1948);
reprinted in C. E. Shannon and W. Weaver’s The Mathematical
Theory of Communication (The University of Illinois Press, Ur-
bana, Illinois, 1949).

2 A. I. Khinchin, Mathematical Foundations of Information
Theory (Dover Publications, New York, 1957).




3 A. Feinstein, Foundations of Information Theory (McGraw-
Hill Book Company, Inc., New York, 1958).

4 P. M. Woodward, Probability and Information Theory, with
Applications to Radar (McGraw-Hill Book Company, Inc., New
York, 1953).

5 N. Wiener, The Extrapolation, Interpolation, and Smoothing of
Stationary Time Series (Technology Press, Cambridge, Massachusetts
and John Wiley and Sons, Inc., New York, 1949).

6 S. O. Rice, Bell System Tech. J. 23, 282 (1944).

7 S. O. Rice, Bell System Tech. J. 24, 46 (1945).

8 N. Wax, editor, Noise and Stochastic Processes (Dover Publi-
cations, New York, 1954).

9 W. B. Davenport, Jr., and W. L. Root, An Introduction to the
Theory of Random Signals and Noise (McGraw-Hill Book Com-
pany, Inc., New York, 1958).

10 A. Blanc-Lapierre and R. Fortet, Théorie des fonctions aléa-
toires (Masson et Cie., Paris, 1953).

11 M. Loève, Probability Theory (D. Van Nostrand Company,
Inc., New York, 1955).

12 J. L. Doob, Stochastic Processes (John Wiley and Sons, Inc.,
New York, 1953).

PETER ELIAS 


14 W. Jackson, editor, Proceedings of a Symposium on Applica-
tion of Information Theory, London, 1952 (Butterworth Scientific
Publications, London, 1953).

15 C. Cherry, editor, Information Theory, Proceedings of a Sym-
posium (Butterworth Scientific Publications, London, 1956).

16 IRE Trans. Professional Group on Information Theory
PGIT-4 (1954).

17 IRE Trans. on Information Theory IT-2, No. 3 (1956).

18 P. Elias, IRE Trans. Professional Group on Information
Theory PGIT-4, 29 (1954).

19 P. Elias, in Handbook of Automation, Computation and Control,
E. Grabbe, editor (John Wiley and Sons, Inc., New York, 1958),
Vol. I, p. 16-01.

20 H. Quastler, editor, Information Theory in Psychology (Free
Press, Glencoe, Illinois, 1955).

21 H. Quastler, editor, Essays on the Use of Information Theory
in Biology (The University of Illinois Press, Urbana, Illinois,
1953).

22 H. P. Yockey, R. L. Platzman, and H. Quastler, editors,
Symposium on Information Theory in Biology (Pergamon Press,
New York, 1958).




REVIEWS OF MODERN PHYSICS 


The purpose of this article is to review some of the
basic ideas in genetics and to set the stage for a 
discussion of certain specialized problems. The first 
question to be considered is the nature of the genetic 
map: its construction and meaning. The genetic map 
is an abstract entity defined in terms of the probabilities 
with which certain types of offspring are produced in 
matings. In a sense, it is related to a physical structure 
of the cell; however, this relationship must be estab- 
lished experimentally, and it is not a logical consequence 
of the definition of the map. 

For any organism which can duplicate without a 
sexual process, one finds that progeny organisms are, 
in general, identical with the parental. Occasionally, an 
alteration occurs. If this alteration persists—that is, if 
all of the descendants of the altered organism have the 
same alteration—then, the change is called a mutation. 
In an operational sense, mutations form the basic 
entities of genetics. Mutations may occur which lead 
to red hair, to six toes, or to any of a number of other 
attributes in a higher organism. Or for bacteria, a mu- 
tation may result in a small colony or in a colony which 
is red in a certain medium. For the moment, assume 
that one can deal with several such mutations which 
are recognizable and recognizably different from one 
another. Furthermore, assume that an organism which 
contains several such mutations can be detected, and 
that the total number and types of mutations it contains 
can be enumerated. The discussion of the possible 
physicochemical bases of such mutations is undertaken 
in a later article (p. 249). For the moment, it is only 
necessary that the mutations give a recognizable result. 
The analysis begins with the definition of some organism 
as the standard or “wild” type. Suppose that one has 
three different, recognizable mutations which arise from 
the wild type; call them A, B, and C. The standard 
wild-type organism is called, by definition, “plus” with 
respect to each of these characters. The organism with 
the first mutation, having lost the property A, is desig-
nated A- B+ C+; that is, it has lost the property A,
but maintains the properties B+ and C+. Organisms 
of this type also may be designated simply by the 
letter “A”. 

A group of organisms may be selected, one of which 
has the mutation A, one B, and one C. If one works 
with microorganisms, then from a single individual of 
each of these types, stocks containing many millions of 
microorganisms can be grown. Except for secondary 
mutations, each organism within a stock will have lost 
a specific property. 

Any two of these mutant types may be allowed to
mate with one another. The mechanism of the mating
is not of concern at the moment. It will suffice to say
that, in general, the experiments with microorganisms
involve mixing of the two types under suitable condi-
tions and allowing the interactions to occur at random.¹
If a stock of organisms of one type A is allowed to
mate with a stock of another type B, the experiment
is called a cross between A and B and is described by
the symbol A × B.

The organisms discussed here are called haploid, 
meaning that they have only one set of genetic deter- 
minants. In many higher organisms, complications can 
arise which result from the double set of genetic deter- 
minants which they contain. With regard to the dis- 
cussion of a genetic map, none of these complications 
represents an alteration to the general conceptual 
scheme. Thus, for the purpose of simplification, they are 
left out entirely and only haploid organisms are 
considered. 

Among the progeny arising from the cross A × B are
found organisms of the type A and organisms of the
type B which are identical with either of the parents.
Progeny organisms also are found which lack both the
character A and the character B. Still others are found
which are like the wild type in having both A+ and
B+. A and B are parental in their genetic composition.
The type A-B- has the property B (i.e., the
loss of B+) and the property A (i.e., the loss of A+),
which it must have received, one from one parent and
one from the other parent. And the organism A+B+
must have received the property A+ from one parent
and the property B+ from the other parent. The new
types, A-B- and A+B+, are called recombinants.

The fraction of the descendants which are of the 
recombinant types is called the recombinant frequency 
and is the only number which is needed to describe the 
results of the cross. This is true, however, only if there 
has been no selection of one of the types in preference 
to the others. If a cross has been carried out involving 
equal numbers of A and B parents and if the analysis 
takes into account any possible difference in the mating 
type (maleness or femaleness), then one finds that the 
two recombinant types A-B- and A+B+ occur with
equal frequency. Again, this is true only if the presence 
of the property A+ or the property B- in the organism 
does not give it a selective advantage in the growth 
which occurs before the final progeny are selected and 
scored. If there were any such selective advantage in 
the growth, it would constitute a physiological effect 
which is not relevant to the genetic phenomenon being 
discussed. Therefore, only those cases are considered in 
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which the reciprocal recombinants arising from a cross can be demonstrated even if additivity does not apply. 
are, on the average, equal to each other. In practice, Three mutations can be ordered with respect to each 
this is not a serious limitation, since in most cases such other in two different ways. As in the foregoing, one 
selection easily is eliminated. can measure the recombination frequencies when each 

With a given pair of mutations, two crosses can be of the three pair-wise crosses are performed. The pair 
carried out. The two mutant types can be crossed with which shows the highest recombination frequency be- 
each other; that is, A B+XA-+B, or the double mutant tween them can be taken as the outside pair, while the 
A B can be crossed with the wild type A+-B+-. In the one which shows a lower recombination frequency with 
first case, the recombinants are A B and A+B-+ while each of the other two can be placed between them. A 
in the second they are A B+ and A+B. For the sys- more precise method of ordering three mutations is by 
tems discussed here in which none of the possible types means of a three-factor cross. This method utilizes the 
has any growth advantage over the others, the resulting fact that if the probability is low for a switching event 
recombinant frequency found in these two different which produces a recombination, the occurrence of two 
crosses is the same. Thus, one may conclude that thére such events is much less likely than the occurrence of a 


is only one number needed to describe the results of single one. Three-factor crosses are done if an organism ` 
crosses between any pair of mutants—namely, the re- has, for example, both mutations 4 and B crossed with 
combinant frequency in the progeny. an organism which has mutation C. The other two 2 
In dealing with three different mutants, three differ- crosses of this type also can be done—namely, A CXB 
ent crosses can be carried out between pairs of them: and BC XA. In these crosses, one can examine the 


A with B, A with C, and B with C. The recombinant progeny for recombinants which are like the wild type 
frequency obtained in A X Bis designated A: B, andthe in that they contain all three properties, A+, B+, and 
same notation is used for the recombinant frequency in C+. One of the three crosses will yield, in general, fewer 
BXC and A XC. If the markers are neither too closenor recombinants of this type than the other two. Suppose, 
too distant from each other, it is found that the follow- for example, the cross A CX B produces the smallest 


ing relationship holds as a good first approximation: number of recombinants. This would imply that A and 
Aae C are on the outside and B between them. 
A:-B+B-C=4-C. This ordering of three elements in an operationally 


unique way still implies nothing about the nature of the 
genetic map, since the ordering, by definition, is always 
possible. But suppose that there is a large number of 
mutations and all possible triplets are ordered in the 
same way. Then, it is possible to test whether or not 
the data are consistent with a one-dimensional map. In 
all cases so far studied, it has been found that the 
mutants can be arranged uniquely along one or more 
linear elements. In some cases, one set of mutants will 
assort randomly with respect to another set of mutants 
in crosses. For extremely close markers (those which 


That is, two of the three recombination frequencies, when added
together, equal a third. This additivity obtains if the recombination
frequencies are not too large. If the recombination frequency is
large, then, since recombination can be assumed to result from some
kind of switching event, there is a high probability of more than one
switch taking place in the same mating. If an even number of switches
occurs between two genetic markers, no recombinant is observed.
Correcting for these multiple events by calculating the total
probability of switches, under the assumption that they occur at
random, one finds that the results of most crosses are now consistent
with the additivity relationship except when extremely closely linked
markers are involved. To the extent that this relationship holds,
mutants can be arranged as points along a line, the distances between
the points being proportional to the recombination frequencies. This
line with the mutants arranged along it is called the genetic map.

The genetic map defined in this way has two important characteristics
which must be distinguished. The first is that the mutants can be
arranged in unique order along a line without requiring any branch
points or any other configuration which implies a two-dimensional
surface. The second is that the distances along the line are
proportional to the recombination frequencies. For the latter to be
true, strict additivity must hold. If it does hold, then it is true
also that a one-dimensional map will be demonstrated.

However, the one-dimensional character of the map [...] give very
little recombination in crosses between them), one finds a marked
deviation from additivity. This is related to the phenomenon of
interference in which the occurrence of one switch has an influence on
the probability of another occurring immediately adjacent to it. In
some systems, the probability of the second switch is reduced, in
which case the effect is called interference. In other systems, the
probability of multiple switches in the same region is very high, in
which case the effect is called negative interference. In the instance
of bacteriophage, strong negative interference is found,3 only for
sets of mutants which are very closely linked. Benzer has developed an
extremely elegant and essentially new method of establishing the
one-dimensionality of the genetic map by making use of mutants which
act as though they extend over a finite length of the genetic map.
Thus, the conclusion from many types of genetic analyses is that
independently arising mutations can be ordered along one-dimensional
structures whether strict additivity applies or not.
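
The correction for multiple switches mentioned earlier can be illustrated
numerically. The sketch below is not taken from the text: it assumes that
the number of switches between two markers follows a Poisson distribution
(one way of expressing "at random"), which leads to the relation usually
called Haldane's mapping function. The marker distances and the function
names are hypothetical and chosen only for illustration.

```python
import math

def recombination_fraction(mean_switches):
    """Probability of an odd number of switches between two markers,
    assuming the number of switches is Poisson with the given mean."""
    return 0.5 * (1.0 - math.exp(-2.0 * mean_switches))

def map_distance(recomb_fraction):
    """Invert the relation above: the mean number of switches implied by
    an observed recombination fraction (which must be below 0.5)."""
    if not 0.0 <= recomb_fraction < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * recomb_fraction)

# Hypothetical markers in the order A - B - C; distances in mean switches.
d_ab, d_bc = 0.20, 0.30
r_ab = recombination_fraction(d_ab)
r_bc = recombination_fraction(d_bc)
r_ac = recombination_fraction(d_ab + d_bc)

# Raw frequencies fall short of additivity; corrected distances do not.
raw_gap = (r_ab + r_bc) - r_ac
corrected_gap = (map_distance(r_ab) + map_distance(r_bc)) - map_distance(r_ac)
print(f"raw: {r_ab:.4f} + {r_bc:.4f} vs {r_ac:.4f} (gap {raw_gap:.4f})")
print(f"gap after correction: {corrected_gap:.2e}")
```

Under this assumption the raw frequencies for the outer markers are smaller
than the sum of the two inner intervals, while the corrected distances add
up to within rounding error. Interference of either sign would violate the
Poisson assumption, which is where the text reports the deviations.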

So far, the recombinational event itself has been dis- 
cussed in very generalized terms. It is an event which 
occurs with constant probability per unit length along 
the genetic map and which results in the formation of 
recombinants. For higher organisms, there is good 
reason to believe that recombinants are formed by 
crossing over. In this process, the two homologous struc- 
tures which correspond physically to the genetic map 
pair each with the other, with a break occurring at the 
corresponding points. The broken ends reunite in such 
a way as to form two new structures, each of which 
contains parts of the genetic structure of the parents. 
One of the characteristics of the crossing-over model is 
that recombinants always are formed in reciprocal pairs; 
that is, if the cross A × B is done, the wild-type and the
type AB recombinants always are formed in the same
event. This is a much stronger statement than that 
made earlier about the equality of these reciprocal 
recombinants on the average. 

Only by special tricks can one determine whether or 
not reciprocal recombinants are formed in the same act. 
However, in those higher organisms where the phe- 
nomenon can be studied—and this is primarily with 
the molds, which, for present purposes, are considered 
as “higher’’—one can show that the predictions of the 
crossing-over model are borne out. Where the corres- 
ponding experiments are possible with viruses and bac- 
teria, it is generally found that reciprocal recombinants 
are not formed in the same event. In those cases where 
recombinants are formed without associated formation
of their reciprocals, there is also evidence which suggests
that the parental structures are not broken during the 
recombination act. One is then led to a model of re- 
combinant formation which involves the copying of the 
genetic information along one structure for part of its 
length, followed by a shift after which the copying 
continues along another parent. This mechanism has 
been called copy-choice (Fig. 1). 

It should be kept in mind that the generalizations 
about the genetic map are independent of the mecha- 
nism of recombination. Whether recombinants arise by 
crossing-over or by copy-choice, the one-dimensionality
of the map and the roughly additive distances along it 
can be established. The event which produces the re- 
combinants, regardless of its nature, must, by definition, 
occur with a constant probability per unit length of 
the map. 

FIG. 1. Two models by which recombinants could be formed,
considering the genetic map only: (a) is the crossing-over model
and (b) is copy-choice.

The next question concerns the relationship between
the genetic map and the physical constituents of the
cell. In the case of a few higher organisms, it has been
possible to show by very direct methods that there is a
one-to-one correspondence between the position of a
genetic marker along the map and the position of an
observable alteration on the chromosome of the organism.4
The chromosomes are cellular components originally
characterized by the color they demonstrate with
particular stains. The chromosomes of the Drosophila 
salivary gland are abnormally large and, because of 
characteristic banding, individual portions can be iden- 
tified by direct microscopic observation. Inverted or 
missing segments of a chromosome arise in some mutant 
organisms and can be recognized morphologically. 
Corresponding to the inversion in the visible chromo- 
some is an inversion in the mapping of a short region 
of the genetic map. The same holds true when deletions 
are observed cytologically. In this case, a small region 
of the genetic map acts as though it had been de- 
leted. By experiments with mutants of this type, it 
has been possible to establish a one-to-one correspond- 
ence between the position along the genetic map and 
the position along a chromosome (Fig. 2). This corres- 
pondence, however, has been made in detail only with 
the giant chromosomes of Drosophila which contain 
approximately a thousandfold more material than 
normal chromosomes of Drosophila and are in cells which 
cannot undergo further division. 

In all of the organisms where it is technically possible 
to do the combined cytological and genetic studies, 
the correspondence between the genetic map and the 
chromosomes can be demonstrated. The chromosome, 
however, is a complicated structure which contains 
several chemical components. Of these, only deoxy- 
ribonucleic acid (DNA) is constant in amount in 
all normal cells of a given organism. Those cells which 
have only one set of genetic determinants contain only 
half as much DNA as the amount contained in the 
majority of cells known to possess a double set of 
chromosomes. 

There is a considerable body of evidence indicating 
that the chemical structure which contains the genetic 
information and which, in fact, corresponds to the 
genetic map is nucleic acid. Since this conclusion has
come to be very widely accepted, it is important to
examine in some detail the experimental basis for it.
Four types of experiments are reviewed which lend
strong support to the view that nucleic acid can carry
genetic information. The acceptance of the generalization,
however, that nucleic acids do carry the genetic
information in all organisms is still very much a matter 
of personal taste. Most of the direct experimental 
demonstrations have come from studies with micro- 
organisms, and, in general, these demonstrations have 
consisted of showing that genetic information, either of 
bacteria or viruses, could be transferred to a recipient
cell by the sole means of nucleic acid.

FIG. 2. The genetic map (upper line) and the X-chromosome of the
salivary gland of Drosophila melanogaster [after J. L. Bridges,
J. Heredity 29, Suppl. 1 (1938)].

The most complete analyses have been done in experiments
with bacterial transformation.5 Pure DNA is extracted
from a bacterial suspension of one genetic type
and added to bacteria having different genetic properties.
Genetic characters are transferred in this way from
the original cells, which donated the pure DNA, to the
recipient cells. The material which produces the transformation
has been purified to the point where no bacterial
substances other than DNA are detectable, and
the amount of protein which might be in these preparations
and escape detection is less than 0.02%. That any
undetected component might have biological
activity is made unlikely by the fact that, as the purification
proceeds, the amount of transforming activity
in the preparation is proportional to the quantity of 
DNA and independent of the quantity of any other 
components. Furthermore, it has been shown that the 
number of transformed cells is proportional to the amount
of DNA actually absorbed by the recipient cells. Addi- 
tional evidence has been provided by experiments in 
which enzymatic degradation is used specifically to de- 
grade various components. Only the enzymes which 
degrade DNA eliminate the ability of the purified 
material to transform cells genetically. 

The results of the experiments, which seem to be quite 
unambiguous, show that one can transfer genetic infor- 
mation from one cell to another by means of purified 
DNA. Thus, it can be concluded that, at least at some 
stage in the life cycle of the bacteria, the genetic infor- 
mation must be contained in DNA. However, the next 
question arises when the duplication of the cells is con- 
sidered. Under these conditions, it is clear that the 
genetic material also must duplicate; the simplest hy- 
pothesis for this would seem to be that DNA of a specific 
type makes more of its own kind. One cannot eliminate
the possibility that, in the process of duplication, the 
information must be passed to some other substance 
before it is passed back to new DNA. Although there
has been a good deal of speculation along these lines,6
there are no experiments which definitively show that
the information is transferred to anything but DNA. 

Other experimental results which indicate that the 
genetic information is contained in nucleic acids have 
been obtained with viruses and with genetic recombina- 
tion in bacteria. The first evidence that only the DNA 
of viruses carries genetic information came from the 
experiments by Hershey and Chase7 using the bacteriophage
T2. This is a virus which can duplicate only after
it has attached to a bacterial cell. The virus particle 
consists of a head and a tail. The tail is composed of 
proteins exclusively and the head of a protein mem- 
brane surrounding DNA. When the virus attaches to a 
bacterial cell, it uses the tip of its tail, and after attach- 
ing, the DNA which was contained in the head of the 
virus is transferred into the bacterial cell, leaving the 
head membrane and most of the tail on the outside of 
the bacterium. This residual protein which remains on 
the outside can be sheared off by stirring the infected 
cell in a Waring Blendor and the bacterium, with the 
DNA of the virus inside it, will develop then and pro- 
duce a large number of virus particles identical with 
the one used for infection. Thus, most of the viral pro- 
tein can be eliminated, indicating that it plays no role 
in the formation of the new virus particle. 

The most obvious interpretation of these experi- 
ments is that the protein acts as a microsyringe in- 
jecting the DNA into the host cell, and that this DNA 
carries the genetic information necessary for the forma- 
tion of the new virus particles. On the other hand, 
there is a measurable amount of protein which cannot 
be eliminated and which does enter into the cell with 
the injected nucleic acid. The fact that the atoms of 
the associated protein do not appear in the progeny 
virus, while more than 50% of the atoms of the injected 
DNA do, also suggests here that it is solely the DNA 
which is the genetic material. It must be kept in mind, 
however, that the associated protein may play an es- 
sential role in initiating the process of growth in the 
invaded cell. 

In the case of tobacco mosaic virus (TMV), similar 
results have been obtained, but a more definitive analy- 
sis has been possible. In this case, one can completely 
separate the protein and the nucleic acid of the virus 
by chemical means. It is found, here, that the nucleic 
acid is infective and that the protein is not.8 If one
reconstitutes the nucleic acid from one strain of TMV 
and the protein from another into a single particle, 
then the reconstituted particles have the genetic proper- 
ties of the strain from which the nucleic acid was taken 
but not of the strain from which the protein was taken.9
It is important to note, however, that, in the case of 
TMV, the nucleic acid involved is ribonucleic acid 
(RNA). The same kinds of analytical and enzymatic 
demonstration of the purity of the RNA have been 
carried out as were discussed in the foregoing for the 
transforming factor. Here, the results indicate that the
genetic information is carried in the RNA exclusively. 
Again, as in the case of DNA, one can argue about the 
lower limits of detectability of some odd component,
but this is also a case where one has to push the evidence 
rather far to escape the conclusion that it is nucleic 
acid which contains the genetic information. 

Genetic recombination takes place between fertile 
strains of bacteria by direct contact of the donor and 
recipient cells.10 It has been shown that a cytoplasmic
bridge is formed between these two cells and that DNA 
passes from the donor to the recipient. It has been 
demonstrated that the amount of DNA transferred is 
proportional to the amount of the genetic material as 
measured along the genetic map. These experiments 
have not yet been done in such a way as to eliminate 
the possibility of small amounts of other components 
also being transferred, but the apparent proportionality 
between the amount of DNA and the length of the 
genetic map again suggests that the genetic information 
is contained in DNA. 

Since the structure of RNA has not been worked out 
yet, it is difficult to say very much about the significance 
of those cases where RNA carries genetic information. 
In the case of the genetic information contained by the 
DNA, however, it is clear that the information must 
reside in the order of the four bases along the DNA 
chain. This is evident from the fact that, as far as the 
sugars and the phosphate-ester linkages are concerned, 
the DNA molecule is extremely monotonous with every 
part of it identical to every other part. One can look, 
therefore, at the three essential properties of any genetic 
material in terms of the base sequence of the DNA 
molecule containing the genetic information. The first 
activity that a genetic entity must be able to perform 
is reproduction and, in this regard, the suggestion made 
by Watson and Crick11 as to the possible mechanism for
the duplication of DNA is extremely hopeful. At present, 
there are no experiments to contradict this hypothesis 
and there are a large number which seem to be con- 
sistent with it. In addition to duplicating its own kind, 
genetic material must be such that recombinants can 
be formed. Although a great deal already is known about 
the way in which recombinants are formed, it will 
probably be necessary to know much more about the 
physicochemical properties of DNA in solution before 
a self-consistent model can be constructed with any
confidence. Specifically, one must find out whether the 
DNA in cells has all of its possible hydrogen bonds 
intact or whether some regions of the molecule have 
two unpaired chains available for pairing with hom- 
ologous parts of other DNA molecules. If this is so, then 
it is possible12 to construct a model for the recombinational
event consistent with all of the available genetic
data. Finally, the genetic material must be able to
exercise its control; that is, it must be capable of
controlling the formation of proteins and perhaps other
components with specific composition and configuration.
This subject is taken up in a later article (p. 249).

Some of the evidence for the belief that genetic
information resides in the structure of nucleic acid 

has been presented by Levinthal (p. 227). A key ques- 
tion in molecular biology is the mechanism by which 
the replication of such genetic information takes place. 
It is certainly agreed that nucleic-acid replication takes 
place within living cells, and, in a few cases mentioned 
later, some information bearing on the mechanism of 
replication has come directly from observations on 
growing, normal cells. But it is also possible to observe
phenomena related to nucleic-acid replication in cases
where the nucleic acid of interest has been allowed to
enter the cell from the outside. Three distinct systems 
of this nature are known: (1) Some bacterial cells re- 
lease a product, called the “transforming principle” and 
now known to be pure DNA, that is capable of changing 
in an hereditary manner certain characteristics of other 
cells which have been exposed to the transforming 
material.1 (2) Infection of bacterial cells by DNA-containing
viruses (bacteriophages) causes the viral DNA
to be replicated. (3) Purified RNA, notably from to- 
bacco mosaic virus (TMV), can be mechanically inocu- 
lated into plant cells, thereby initiating the replication 
of more TMV-RNA encased within whole virus par- 
ticles.2 The greater part of this paper deals with experi- 
ments with bacteriophages, since it is with these 
systems that the greatest amount of quantitative and 
precise information is available. 


GROWTH OF BACTERIOPHAGE WITHIN 
INFECTED CELLS 


Although dozens of different bacteriophages are 
known, the group of the “T” phages that infect and 
lyse cells of E. coli has been studied by far the most
extensively. Within the T-group itself, there are three
closely related strains, T2, T4, and T6 (the T-even 
phages), which have furnished much of the biological 
and biochemical information that bears on nucleic-acid 
replication. 

There are several important observed facts about the 
intracellular growth of the T-even phages.’ In the 
extracellular form, the infective phage particle consists 
of a head and tail. The head consists of a thin, protein 
membrane, within which is stuffed all the phage DNA, 
amounting to about 300000 nucleotides. The tail is 
solely protein and serves as the attachment organ, 
having a high degree of host-range specificity, and 
through which the DNA is injected into the cell upon
infection. Bacteria usually can be infected with several
phage particles, with the likelihood of success diminishing
with the length of time elapsing since infection by
the first particle. Except for a small amount of soluble
protein, polypeptides, and polyamines, the only phage
material transferred to the interior of the cell is the
DNA.

Shortly after infection the host cell knows that it is 
in for trouble. The synthesis of host RNA drops almost 
immediately to about 3% of the normal rate. Although 
the synthesis of total protein is unaffected, it is not 
long before most of it is directed toward making phage- 
specific structures. Total DNA synthesis is held up at 
first, and after about 10 min at 37°C the pre-existing 
host DNA begins to be transformed into phage DNA. 
Just before lysis, most of it has disappeared in this 
manner. In the case of the T-even phages, this depletion 
is made evident by the complete loss early in infection 
of the structural integrity of the cell nucleus. 

In the meantime, there is activity in the synthesis of 
phage-specific materials. For about 15 min, no infec- 
tivity can be demonstrated in the contents of artificially 
disrupted cells,’ this observation defining the so-called 
eclipse period. At the end of only about 6 min, some 
RNA, related to phage infection, begins to be synthe- 
sized. Since it does not appear in the mature phage, it 
is not a precursor material, but it does differ in its base 
composition from normal host RNA. At about 8 min, 
phage-precursor DNA begins to be synthesized (and 
also to be converted from host DNA) so rapidly that 50 
phage units of it have accumulated at the end of the 
eclipse period. At about the same time, various protein 
fractions that are phage-specific antigens begin to be 
made. Some of this material is not incorporated into 
infective phage and remains as surplus antigen. After 
about 15 min, it is possible to detect infective phage 
units in the contents of disrupted bacteria. Their number 
increases rapidly at first and then more slowly as the 
average lysis time is approached.!? The average burst 
size is something like 100 phage particles, although this 
number varies greatly from cell to cell. Even at the time 
of lysis there exist within the cell about 20 phage units 
of excess phage DNA, the same amount of surplus phage 
antigen, and about 10 units of nonhost RNA. Evidently 
lysis does not result from a completion of phage matura- 
tion, but represents rather a catastrophic collapse of 
the integrity of the cell wall (see Fig. 1 for a summary 
of this paragraph). 
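
The approximate onset times quoted in this paragraph and the one before it
can be collected into a small lookup, which makes the ordering of events
during the eclipse period easier to see. This is only a sketch: the times
are the rounded values given in the text, and the names `ONSETS`,
`events_started`, and `in_eclipse` are hypothetical.

```python
# Approximate onset times (minutes after infection at 37 degrees C),
# as quoted in the text; all values are rounded.
ONSETS = [
    (0, "host RNA synthesis falls sharply"),
    (6, "phage-related RNA synthesis begins"),
    (8, "phage-precursor DNA and phage-specific antigens begin to be made"),
    (10, "pre-existing host DNA begins to be converted to phage DNA"),
    (15, "infective phage detectable (end of eclipse period)"),
]

def events_started(minutes):
    """Return the events whose quoted onset time has been reached."""
    return [label for onset, label in ONSETS if onset <= minutes]

def in_eclipse(minutes, eclipse_end=15):
    """True while no infective phage can yet be recovered from the cell."""
    return 0 <= minutes < eclipse_end

for t in (5, 9, 16):
    print(t, "min:", "eclipse" if in_eclipse(t) else "maturation",
          "|", "; ".join(events_started(t)))
```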


There are two distinguishable periods during the infective
cycle. During the eclipse period, the components
of the mature phage are being synthesized and are
formed into a precursor pool. At this stage, DNA replication
and interchange begin to take place. Subsequently,
preformed structural units begin to be assembled
into whole particles, although during this time
some synthetic activity still persists. For our purposes,
this period of maturation is of less interest than the one
that precedes it, the so-called vegetative stage.

Inquiry into the origin of the DNA which ends up in
the progeny phage is greatly aided by isotopic labeling,
and by the fact that the DNA of the T-even phages can
be distinguished from host DNA by the presence of the
pyrimidine, 5-hydroxymethylcytosine (5-HMC)." It is 
found that T2-phage DNA comes mostly from material
that is present in the growth medium at the time of 
infection; pre-existing host DNA contributes less than 
one-third of the total finally assembled into phages. 
This host DNA is evidently broken down into low 
molecular-weight fragments since it contains 5-HMC 
when repolymerized into phage DNA. As might be ex- 
pected, the earliest phage to be matured are relatively 
rich in DNA from the host. The last phage to be matured
contain some host DNA, further evidence that a
large pool is formed, consisting of host DNA, newly
synthesized DNA, and DNA contributed by the parent
particles.

FIG. 1. A semischematic representation of the time course of the
intracellular development of certain protein and nucleic-acid
components following the infection of E. coli with T2 bacteriophage
in a one-step growth experiment.

In contrast to the DNA, the phage protein is found
to be synthesized solely from the growth medium. The
protein that is made within 5 min of infection is found
[...] understanding of phage DNA replication. It might be expected
that the formation and fate of RNA within a cell 
would be unaffected by infection, since no RNA is in 
mature phages. But it is found that the RNA assembled 
after infection is unlike normal host RNA in two re- 
spects:® (1) its base composition is different, and (2) it 
suffers a rapid metabolic turnover, in contrast to the 
RNA of uninfected cells. This observation may be im- 
portant in later considering DNA replication. 

As already mentioned, two kinds of protein appear to 
be synthesized in infected cells: a precursor protein, 
and a nonprecursor one that is formed preferentially 
immediately after infection. Its synthesis is closely re- 
lated to the formation of phage DNA, for, if protein 
synthesis in general is blocked by chloramphenicol 
during the first 5 min, no phage DNA is formed. But 
if blocking is delayed, phage DNA can be formed in 
amounts that are greater, the longer the delay. There 
is apparently a stage at which the assembly of protein 
and DNA become independent of each other, although 
they are severely dependent at the start. Experiments 
also have shown the inverse relation: that, if the as- 
sembly of phage DNA has started, its subsequent block- 
ing by ultraviolet irradiation will not halt the synthesis 
of phage-precursor protein. 


FATE OF PARENTAL DNA 


After attachment of a phage particle to a bacterium,
almost all of the phage DNA enters the cell. Experiments
using P32 and C14 have been carried out to determine
how much of the parentally injected DNA is found in
the first and subsequent progeny generations. The
experimental results with T-even phages are these: at
each generation about one-half of the P32-labeled
nucleotides find their way into the progeny particles.
(Whether or not the transfer is as inefficient as this is
somewhat in doubt because of technical limitations in
the recovery of all of the labeled material.) It is unlikely 
that the loss of the parental P32 atoms is due to a drastic
breakdown of the phage DNA resulting in a redistribution
of the P32 atoms among both the phage-precursor
and the host DNA. Experiments of both biochemical 
and genetic nature strongly indicate that fairly large 
nucleotide sequences are transferred intact from parent 
to progeny. It also appears that phage DNA is not made 
up of a portion that is transferred with 100% efficiency, 
and a portion transferred not at all. If this were so, the 
efficiency of transfer from first-generation progeny to 
second-, and from second- to third-, would approach 
100%, as distinct from the observed constancy of the 
efficiency factor. The most likely explanation is that 
there is a combination of apparent losses owing to technical
reasons, and of random losses of fairly large phage
nucleotide sequences during the process of infection,
replication, and maturation. 
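
The argument against a "two-portion" structure can be made concrete with a
small calculation. The sketch below is an illustration only: it assumes a
uniformly labeled parent and takes the conserved portion to be one-half of
the DNA, so that the first-generation efficiency matches the observed value
of about 50%. The function names are hypothetical.

```python
def two_portion_model(conserved_fraction, generations):
    """Per-generation transfer efficiency if a fixed portion of the DNA is
    always transferred and the remainder is never transferred."""
    efficiencies = []
    label_conserved = conserved_fraction
    label_other = 1.0 - conserved_fraction
    for _ in range(generations):
        total_before = label_conserved + label_other
        label_other = 0.0  # label in this portion is never passed on
        efficiencies.append(label_conserved / total_before)
    return efficiencies

def uniform_loss_model(efficiency, generations):
    """Per-generation efficiency if every labeled sequence has the same
    chance of transfer at each generation."""
    return [efficiency] * generations

print("two-portion:", two_portion_model(0.5, 3))
print("uniform loss:", uniform_loss_model(0.5, 3))
```

In the two-portion case the efficiency jumps to 100% after the first
generation, because all surviving label then sits in the conserved portion.
The constant efficiency reported in the text corresponds to the second model.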

The kinetics of DNA transfer can be followed by 
breaking open the infected cell at various times and 
examining the state of aggregation of the parental 
DNA." During the eclipse period, such DNA is found 
partly in free polymers and partly in lower molecular- 
weight material. After the eclipse period is over, the 
earlier phages to mature contain more of the parental 
DNA, with the latest to mature containing practically 
none. These results are in agreement with all others 
which suggest that all phage-precursor substances lie 
around in a pool until the onset of maturation begins to 
withdraw them from it, apparently at random. 

So far, only the content of parental DNA among the 
average of the progeny particles has been considered. 
It might be expected, however, that the distribution of 
the parental DNA from one individual progeny particle 
to another would throw more light on replication mecha- 
nisms. Two extremes are possible: (1) all of the trans- 
ferred DNA winds up in one particle; (2) it is equally 
divided among all progeny particles. The techniques 
used to determine this distribution are the counting 
method of Levinthal and Thomas,15,16 in which the
P32 content of the phage particles is assayed by counting
β-decay “stars” on nuclear emulsions, and the “suicide”
method of Stent,17 in which the content of P32 is assayed
by its inactivation of those phage particles that harbor
it. The former method has high precision in assessing
the size of stars (the P32 content of the phages), but a
low precision in assessing the number of phages per
infected bacterium that produce detectable stars. The
relative sensitivities are in the inverse order in the
suicide experiments: the P32 content of highly labeled
phage is poorly determined, while the number of phages
containing at least some minimal amount of P32 is
precisely determined.

Both methods agree in showing that the distribution 
of parental DNA is neither all-or-none, nor is it uniform 
from particle to particle. Rather, the experiments show
that a very few of the first-generation progeny particles,
per bacterium, individually contain as much as 20 to
25% of the parental P32. The remainder of the progeny
particles contain far less, with some containing no
detectable amount. About half of the parental DNA goes
into one first-generation progeny particle. Since Levinthal
and Thomas have found that phage particles,
uniformly labeled in their P32, when osmotically shocked,
disclose a nondispersed DNA piece containing about
40% of the original label, they have been inclined to
believe that just half of this initial “big piece” stays
intact and goes into the highly labeled phage particle
of the first-generation progeny; i.e., it is one-half of the
phage chromosome, presumably a DNA duplex. When
the first-generation progeny are caused to infect, the P32
label continues to be transferred to the second generation,
but the quantitative measurement of the results
is less reliable. Levinthal finds that the star size is
undiminished (i.e., that the big piece stays intact), but
he cannot be certain as to the number of such stars per
bacterium. Stent finds that the number of highly labeled
phage per bacterium is the same as with the first generation,
but he cannot be sure that the P32 content of these
has remained the same. However, it appears certain
that there is a highly nonuniform transfer of parental 
DNA; whether or not it is a single piece that could be 
called the phage chromosome awaits extensive and con- 
vincing genetic investigation. The idea that the big piece 
is the genetic template does have the merit of simplicity. 
It can be visualized that upon infection the big piece of 
the parental phage remains intact until at some stage 
it comes apart exactly into two halves. One half may 
be broken down into smaller units like the DNA units 
of the parental phage outside the big piece. The other 
half serves as a template for the assembly of other 
“half big pieces,” two of which enter every newly 
matured phage. The original half big piece is preserved 
intact to enter one of the progeny phage (and to result 
in a star, if the parental phage is P32 labeled). About
one-half of the DNA in each progeny phage will be in 
the form of smaller units which had as their templates 
either a portion of the original half big piece or the 
smaller DNA units from the parental phage. 
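
The arithmetic behind the "half big piece" picture can be checked against
the figures quoted above. This is a minimal sketch using only the rounded
values in the text (a big piece holding about 40% of the label, and 20 to
25% of the parental label in the most heavily labeled progeny particles).
The names are hypothetical.

```python
# Figures quoted in the text, as fractions of the parental P32 label.
BIG_PIECE = 0.40               # label in the nondispersed piece after shock
OBSERVED_RANGE = (0.20, 0.25)  # label in the most heavily labeled progeny

def half_big_piece(big_piece=BIG_PIECE):
    """Label carried by one half of the big piece, if it splits evenly."""
    return big_piece / 2.0

predicted = half_big_piece()
low, high = OBSERVED_RANGE
# A small tolerance guards against floating-point rounding at the boundary.
consistent = (low - 1e-9) <= predicted <= (high + 1e-9)
print(f"predicted label in one progeny particle: {predicted:.0%}")
print("within the observed 20-25% range:", consistent)
```

The prediction falls at the lower edge of the observed range, which is the
basis for the interpretation offered in the text. It does not by itself show
that the piece is a single genetic unit.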


DIFFERENCES AMONG PHAGE TYPES 


So far, the biological and biochemical events associ- 
ated with infection by only the T-even phages have 
been discussed, but it would be incorrect to conclude 
that they are examples of all phage types that have 
been studied. No comparable amount of information
respecting DNA synthesis is known for the other phages,
for the reason that they do not contain a DNA species
distinguishable from the host DNA. Only the T-even
phages have been found to have 5-hydroxymethylcytosine
(5-HMC).

TABLE I. Characteristics associated with infection by phages of
different types. Most phages are found to fall into one of two
classes, here called class A and class B.

Characteristic                        Class A (T-even)     Class B (λ, P22)
uv sensitivity                        High                 Low
Multiplicity reactivation             High                 Low
Capacity of uv cells, normal phage    Resistant            Sensitive
Capacity of uv cells, uv phage        Resistant            Less resistant
Cell nucleus                          Lost                 Intact
Vegetative radiosensitivity           Decreases with time  Constant with time
Recombination                         Frequent             Rare
Lysogeny                              Never                Sometimes

Distinct differences do exist between infection with
various phages, at any rate in a biological sense, and
they are worth recounting cursorily for the insight they
lend toward a proper feeling of uncertainty about phage
development. In a general way, phages may be put into
two categories, which can be termed A and B. The
T-even phages are examples of A; phages T1, T7, λ,
p8, and P22 are among the B list. These may be
contrasted in several different ways (Table I):

(1) The host-cell nucleus is destroyed upon infection
with A; preserved with B.

(2) Upon multiple infection with homologous phages
with different genetic markers, it is found that genetic
exchange (recombination) is frequent among the A
types and infrequent among B.

(3) Phages in the A category are found never to enter
into a lysogenic18 relation with the cell (i.e., to enter
into a nonvirulent stage within the host cell, conferring
upon it the hereditary property of later producing phage
either spontaneously or by induction). B phages sometimes
do become lysogenic.

(4) In experiments with ultraviolet-light inactivation
of virulent phages, it is found that the phages of A are
relatively more sensitive than those of B.

(5) When phages of category A are inactivated with
ultraviolet light and then are allowed to infect cells
multiply, much of their infectious activity can be
restored, a phenomenon called “multiplicity reactivation.”
Phages of B are not nearly as capable of multiplicity
reactivation.

(6) Phages A and B also differ with respect to the
“capacity” of their host cells, defined as the ability of
cells to reproduce infective phage, particularly after the
cells have been damaged with agents such as ultraviolet
light and x-rays. For phages of class A, the capacity of cells
is much more resistant to ultraviolet irradiation than
it is for class B.

(7) Cells may be irradiated also while the phage is
in the vegetative state, allowing measurements to be
made of the radiosensitivity of the infected host to the
production of infective virus. The phages of class A
(and their host cells) have a radiosensitivity that rapidly
decreases with time after infection, while those of class
B very nearly retain their initial sensitivity.

The simplest generalized conclusion that can be drawn
from the foregoing set of experimental facts is that the
two classes of phages differ in the degree to which,
during the vegetative stage, the genome of the infecting
phage is associated with that of the host cell. It appears 
likely that the genome of phages T2 and T4, as examples 
of class A, replicates with considerable independence of 
the host, while the phages of class B multiply only as a 
consequence of genomic interaction with the host cell. 
A virtue of this unitary explanation of what appear to 
be different phenomena is that it allows the preservation 
of a concept that has proven fruitful in the cases of the 
T-even phages. This is the notion, given quantitative 
elucidation by Visconti and Delbrück, that mating is
a frequent activity in the DNA pool of the vegetative 
state. Mating is operationally defined as having occurred 
if two vegetative phages have interacted in a manner 
such that the probability is 0.5 that they have inter- 
changed two distantly linked genetic loci, an occurrence 
detected by the production of recombinant phages. 
Conceptually, mating may be thought of as the pairwise 
physical interaction of two (or more) homologous 
genomic structures. The arithmetical conclusion of the 
Visconti-Delbrück theory is that each act of genomic
replication is associated with a mating act, the number
of such double acts being about 5 for the T2 phages
during the vegetative phase. Strength is accorded this 
notion by the observation that, if lysis is inhibited and 
the infected bacteria opened at various times, the pro- 
portion of recombinants in the matured phages in- 
creases with time after infection—i.e., the longer the 
time spent in the mating pool, the greater the chances 
of recombination. But the experiments showing a low 
frequency of recombinants with such phages as λ
(class B) have cast doubt on the correctness of the
Visconti-Delbrück theory. The theory can be rescued
qualitatively by assuming that mating still accompanies
replication, but in the case of the class B phages the
mating is frequently between the phage and the host
genomes.

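The dependence of recombinant frequency on the number of rounds of mating can be illustrated with a minimal sketch. It is not the Visconti-Delbrück calculation itself but a simplified model under assumptions that are not in the text: the number of matings per lineage is taken as Poisson-distributed, partners are drawn at random from the pool, and the function name and parameters are hypothetical. The default exchange probability of 0.5 follows the operational definition of mating for distantly linked loci given above.

```python
import math


def recombinant_fraction(rounds, p_exchange=0.5, input_fraction=0.5):
    """Expected fraction of recombinant progeny for two marked loci.

    Assumptions (illustrative only): matings per lineage are Poisson with
    mean `rounds`; each mating is with a random partner from the pool; each
    mating separates the two loci with probability `p_exchange`;
    `input_fraction` is the share of one parental type in the pool.
    """
    # Probability that at least one mating exchanged the two loci.
    at_least_one_exchange = 1.0 - math.exp(-rounds * p_exchange)
    # After an exchange the two loci derive from independently chosen
    # parents, which differ in type with probability 2 f (1 - f).
    mixed_parentage = 2.0 * input_fraction * (1.0 - input_fraction)
    return mixed_parentage * at_least_one_exchange


for m in (1, 2, 5, 10):
    print(m, round(recombinant_fraction(m), 3))
```

In this sketch the recombinant fraction rises with the mean number of matings and saturates at one-half for equal parental input, which is the qualitative behavior described for lysis-inhibited infections.
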
MECHANISMS OF DNA REPLICATION 


The paper by Levinthal (p. 227) shows that there is 
good evidence that DNA can be the bearer of genetic 
information; that the detailed structure of one chain 
determines that of the other; and that presumably the 
structure of newly formed DNA is determined by that of 
the parental DNA. (It should not be inferred, however,
that only DNA is capable of transmitting such information,
since there are examples of the apparent genetic
stability of the RNA-containing viruses.) The problem
to consider now is the mechanical detail of how the 
DNA that is in the process of being formed is so con- 
trolled in its sequence of nucleotides that it usually 
will be genetically identical (and presumably struc- 
turally identical) with the DNA that is acting as a 
template. A correlative problem is how the template 
DNA, either directly or through intermediaries, con- 
trols the specific assembly of amino acids into proteins, 
to confer upon progeny structures some of their pheno- 
typic characteristics. 

It is an attractive consequence of the two-chain 
structure of DNA to assume that the obvious is true: 
that replication is preceded or accompanied by a separa- 
tion of the two chains, each of which then acts as a 
template for the production of its mate. This could be
done by allowing the chains initially to separate at an 
end of the helix, forming a short-pronged Y. New 
nucleotides then would start assembling in proper se- 
quence along the branches of the Y, the residual helix 
untwisting as the polymerization of the two new chains 
proceeded. A difficulty with any scheme of this or 
similar nature is the unwinding process; how can two 
chains involving several hundred turns separate by 
untwisting, both in consideration of the energy required 
and the hazards of hopeless tangling? To avoid this 
dilemma, several schemes involving rather ad hoc 
notions of mechanical processes have been proposed, 
such as backbone breakages occurring at every turn of 
the helix during replication, but all of the schemes seem 
to introduce more difficulties than they remove. Actu- 
ally, the energy requirements of unwinding the helix 
are not as great as might be imagined, if the helix is 
restricted to turning on its own axis. If it always turns 
on that axis, there is also a minimal chance of tangling. 
Levinthal and Crane showed that the energy required
per turn of the helix is very small as compared with the 
energy of formation of the phosphate-diester linkages 
per turn. It seems inevitable that the transfer of infor- 
mation from parental DNA to new DNA and to new, 
specific structures of other kinds, such as RNA and 
protein, involves unwinding at one or more stages. 

As there seems to be no likelihood in the near future 
of directly determining the mechanics of replication, it 
is profitable to examine phenomena that indirectly 
might indicate what kind of mechanism is the most 
probable. Here some of the phage experiments are 
relevant. Unfortunately, the transfer experiments shed 
very little light on the problem; they indicate that there 
is some conservation of large pieces of P³²-containing
material and some dispersal of smaller pieces among
several progeny particles. A suggestion by Stent, which
appears attractive, formulates a sequence of events
during the vegetative stage whose interpretation can
point toward a mechanism of replication not mentioned
above. These events are characterized chiefly by three
facts: (1) phage-directed RNA synthesis begins immediately
after infection; (2) the blockage of protein
synthesis stops phage-DNA synthesis if it occurs soon 
enough, otherwise DNA synthesis proceeds in any 
event; (3) with T-even phage infection, the ultraviolet 
light and P³² sensitivity decreases with time during the
vegetative phase. These facts can fit a replication 
scheme wherein the parental DNA structure is con- 
served initially, at any rate, and serves as template in 
the grooves of which is assembled a single chain of 
ribonucleoprotein. This chain subsequently unwinds off 
the template, leaving the latter to serve in this capacity 
again. The RNA-protein chain wanders off in search of 
a mate, which may be a chain copied from another 
phage-parental DNA helix or from a DNA helix of the 
host itself (class A vs class B). Mating consists in form- 
ing a duplex helix of two reasonably identical or homol- 
ogous RNA-protein chains. The new DNA is then 
assembled by a process the inverse of the formation of 
the RNA-protein chains, wherein the two chains of the 
DNA are helically assembled around one of the chains 
of the RNA-protein duplex, the other chain disengaging 
during the DNA assembly. The final, RNA-free DNA 
is then disengaged from the template RNA-protein 
chain by unwinding. The RNA-protein intermediaries 
may be used also to serve as templates for the assembly 
of phage-specific protein. The handing on of the guid- 
ance of DNA synthesis to a ribonucleoprotein template 
would account for the production of RNA shortly after 
infection, and for the blocking phenomena respecting 
protein and new DNA, since DNA synthesis could 
readily proceed after a few RNA-protein templates 
became assembled. Further, since the secondary tem- 
plate is now made of all-new RNA and protein, one 
might expect it to be relatively resistant to damage by 
ultraviolet light and to P³² decay. The scheme has the
further merit of predicting that the effects of slightly 
damaged parental (or host) DNA, such as single-chain 
breakage by ultraviolet light or x-rays, could be effec- 
tively overcome since all that is required is that the 
DNA helix hold together as a duplex. It is hard to see 
how single-chain breaks, spotted here and there along 
both chains of the helix, could lead to anything other 
than abortion if the original Watson-Crick replicating 
scheme is correct. The Stent hypothesis also provides 
for mating at each replication, although other schemes 
arrange for this too. 

It does not seem that the observed facts of genetic 
recombination can point unequivocally to either a 
replicating scheme that involves DNA assembly directly 
upon existing DNA templates or to one involving an
intermediary helical structure. It is grossly inferred 
by light-microscopic observations of cellular organisms 
that recombination is due to the existence of breaks and 
subsequent crossing-over in the dividing chromosomes. 
With phage, however, the genetic and biological evidence
is contrary to the expectation that crossing-over
among genetic structures exists. The alternate possibility
is that recombinants are produced by a mechanism
of “copy choice,” in which a daughter genetic
structure switches from one template to a closely
adjoining but slightly differing one during its stepwise
assembly. Such switches may be multiple, incorporating 
several genetic loci from the two mated structures 
serving as templates. But a hypothesis of copy-choice 
used in interpreting genetic recombination does not re- 
sult in any unequivocal answers with respect to the 
mechanism of replication, except to enhance the notion
that replication is associated with mating.

Replication schemes can be categorically distinguished
(on paper) with respect to the resulting distribution of
the parental DNA. If the parental DNA remains intact
during all replication processes, resulting in only one
parental structure and numerous other DNA’s containing
no parental material, a conservative system is said to
exist. A semiconservative system is one in which the
single DNA chains of the parental duplex retain their
integrity but are physically and permanently separated
during replication. The original Watson-Crick replicating
scheme is a semiconservative one. A dispersive
mechanism is one in which the atoms of the parental
DNA are dispersed by some kind of fragmentation
among all the daughter DNA structures. It might be
hoped that experimental determinations of the kind of
mechanisms so categorized would lead at least to the
elimination of some replication schemes.
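
The three categories just defined can be made concrete with a small bookkeeping sketch. It is an illustration only: each duplex is represented as a pair of numbers giving the parental share of its two chains, the dispersive case is idealized as an even split of parental atoms between the two daughters, and the function names are hypothetical.

```python
def replicate(duplex, scheme):
    """Return the two daughter duplexes of one duplex under a given scheme.

    A duplex is a pair (a, b): the parental fraction of each of its chains.
    """
    a, b = duplex
    if scheme == "conservative":
        # The parental duplex stays whole; the copy is entirely new.
        return [(a, b), (0.0, 0.0)]
    if scheme == "semiconservative":
        # The chains separate intact; each pairs with an all-new chain.
        return [(a, 0.0), (b, 0.0)]
    if scheme == "dispersive":
        # Idealization: parental atoms are shared evenly by both daughters.
        return [(a / 2, b / 2), (a / 2, b / 2)]
    raise ValueError(f"unknown scheme: {scheme}")


def parental_fractions(scheme, rounds):
    """Parental share of every duplex after `rounds` doublings of one parent."""
    duplexes = [(1.0, 1.0)]
    for _ in range(rounds):
        duplexes = [d for dup in duplexes for d in replicate(dup, scheme)]
    return sorted(((a + b) / 2 for a, b in duplexes), reverse=True)


for scheme in ("conservative", "semiconservative", "dispersive"):
    print(scheme, parental_fractions(scheme, 2))
```

After two doublings the sketch gives one fully parental duplex among four for the conservative case, two half-parental duplexes for the semiconservative case, and four quarter-parental duplexes for the dispersive case, which is the distinction an experiment on the distribution of parental material would have to resolve.
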


REPLICATION MECHANISMS AT THE 
CELLULAR LEVEL


Some autoradiographic work done by Taylor and colleagues
on the dividing chromosomes of root cells of
the English broad bean is next to be considered. The
growing cells are first fed tritium-labeled thymidine,
which is known to be incorporated only in the cellular
DNA, and are then observed after one or two divisions
in a normal medium. Colchicine is used to preserve both
generations of chromosome division within one nuclear
structure. The upshot of the experiment is that the
chromatid structures of these cells multiply in a way
that can be interpreted as semiconservative at the level
of the DNA duplex. That is, if each chromatid is thought
of as consisting of numerous DNA helices (arranged in
presumably some regular fashion), the helices associated
with any one chromatid separate into two strands with
each strand preserving its identity. At the half-chromatid
level, the mechanism is conservative: the DNA
chains that are associated together in any one
half-chromatid stay together indefinitely.

In experiments with growing E. coli, Meselson and
Stahl have been able to investigate the type of replicating
mechanism at the level of the DNA polymers.
E. coli that have been grown on N¹⁵ are subsequently
allowed to grow for a few divisions on N¹⁴, and their
DNA is examined by density equilibrium centrifugation,
using a caesium-chloride gradient. A gradient can
be established with a sufficiently gentle density change
to distinguish structures such as DNA by their relative
content of N¹⁵ and N¹⁴. DNA containing solely N¹⁵ will
form a band in the gradient cell that is distinctly more
centrifugal than will a DNA containing only N¹⁴. It is
found that at first, of course, all of the E. coli DNA is
N¹⁵-containing. As growth goes on, a band midway
between N¹⁵ and N¹⁴ appears, followed later by a band
at N¹⁴. At no time does material appear at the interband
region, thus establishing that dispersive mechanisms do
not operate in this system. The N¹⁴:N¹⁵ band shows
that the initial N¹⁵-containing DNA helices are diluted
just one-half by N¹⁴ during the growth cycle. The most
obvious interpretation of the results is that DNA replication
is semiconservative at the level of the DNA
duplex. This is an elegant technique indeed and it might
be hoped that it would give straight answers to the
mechanism of replication of phage DNA. Although only
a preliminary report of such investigation is available
at present, it appears that the answer will not come
easily. When phage previously grown on N¹⁴ are allowed
to infect N¹⁵-containing cells, in an N¹⁵ medium, the
progeny phages can be purified and the DNA examined
for N¹⁴:N¹⁵ content. What is then seen in the gradient
centrifuge is a single N¹⁵ band. If the cells are disrupted
prematurely to expose the DNA vegetative pool, a very
small amount of what appears to be an N¹⁴:N¹⁵ hybrid
DNA is found. The experiments are very likely inconclusive.
If one takes the results with the purified progeny
phage as they come, one would conclude that there is
conservative replication. But it is easy to calculate that
any N¹⁴:N¹⁵ hybrid band which might exist would be
just on the limit of detectability. So the experiments
prove nothing, so far.
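
The band pattern expected in the density gradient, and the reason a hybrid band from phage would be hard to see, can be sketched numerically. This is an illustration under stated assumptions, not a reconstruction of the published calculations: the function names are hypothetical, and the burst size of 200 progeny per infecting phage is an assumed round figure that does not appear in the text.

```python
def band_fractions(generations, scheme="semiconservative"):
    """Share of duplexes in the heavy (N15), hybrid, and light (N14) bands
    after `generations` doublings in N14 medium, starting fully N15."""
    if generations == 0:
        return {"heavy": 1.0, "hybrid": 0.0, "light": 0.0}
    total = 2 ** generations
    if scheme == "semiconservative":
        # The two original chains each end up in one hybrid duplex.
        heavy, hybrid = 0.0, 2 / total
    elif scheme == "conservative":
        # The original duplex persists intact; no hybrid ever forms.
        heavy, hybrid = 1 / total, 0.0
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return {"heavy": heavy, "hybrid": hybrid, "light": 1.0 - heavy - hybrid}


def hybrid_share_in_progeny(burst_size, infecting_phages=1):
    """Share of progeny duplexes that would be hybrid if phage DNA replicated
    semiconservatively and every parental chain reached a progeny particle."""
    return min(1.0, 2 * infecting_phages / burst_size)


for g in range(4):
    print(g, band_fractions(g))

# Assumed burst size of 200 (illustrative, not from the text).
print(hybrid_share_in_progeny(200))
```

With bacteria, the hybrid band holds all of the DNA after one doubling and half after two, so it is conspicuous. With phage, a single infecting particle contributes at most two hybrid duplexes to the whole burst, so under the assumed burst size the hybrid band would hold only about one percent of the progeny DNA.
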


CRITERIA FOR ACCEPTABLE REPLICATION SCHEMES 


Let us return again to the problem raised by the 
existence of the big piece of phage DNA, along with 
the implied existence of small pieces. All that is known 
about these structures is that a P³²-containing big piece
is held intact during osmotic shock, and that very likely 
half of this big piece is passed on to individual particles 
of progeny phage. But it is not known how the halving 
of the big piece comes about during replication. It could 
conceivably (although improbably) be two separable 
half-pieces, and one of these could remain conservatively 
integrated as a DNA duplex during replication. And 
what is the role of the small pieces? Can it be that the 
big piece and the small pieces have different roles? Until 
the significance of the apparent bipartite character of 
phage DNA is known, it is unlikely that any replication 
scheme will win complete adherence. 

It is tantalizing to find that something as apparently
straightforward as the problem of synthesizing more
DNA on a template of pre-existing DNA should present
so many puzzles. Let us recapitulate some of the experimental
conclusions which any scheme of DNA replication
should either explain or, at any rate, be consistent
with. “Experimental conclusions” include presently
available facts, quite reasonable inferences, and conclusions
from experiments yet to be conclusively
performed.


(1) Replication schemes must agree with any observations
on the densities of N¹⁴ and N¹⁵ DNA, since these
may well indicate whether or not replication is conserva- 
tive, semiconservative, or dispersive. A conclusion based 
on experiments with a certain type of replicating 
system, such as bacterial cells, is not necessarily subject 
to generalization to all systems, such as phages. 

(2) Provision must be made for information transfer 
from DNA to the synthesis of non-DNA structures, 
such as specific proteins. 

(3) With respect to phage replication there are these 
specific points: 

(a) The numerical equivalence of rounds of mating 
with rounds of replication seems to be established, 
especially with the T-even phages. It is an attractive 
corollary that mating and replication are not indepen- 
dent events, and a replicating scheme might be expected 
to provide for this dependence. 

(b) Genetic recombination is experimentally inferred 
to take place by some sort of copy-choice mechanism, 
rather than by the occurrence of breaks and crossing-over.
Copy-choice recombination is reasonably believed
to take place during a replicating act that occurs
under the influence of mated structures. Provision must 
be made to have homologous templates arranged in 
one-to-one structural opposition, thus implying the 
existence of highly specific medium-range forces of 
interaction. 

(c) If copy-choice recombination takes place in the 
presence of mated structures, the latter must be so 
mutually arranged that the newly assembled DNA can 
become disengaged from both mated structures without 
tangling and without a high degree of fragmentation 
of the templates, if these are the parental DNA duplexes.

(d) The phenomena of “marker rescue” by multiplicity
reactivation and cross-reactivation should be
accounted for. If it turns out that parental DNA may 
be rescued (in its ability to be replicated) even after 
scission of one of the two chains, this phenomenon is to 
be accounted for by any replicating scheme that en- 
visions the unwinding of parental DNA. 

(e) The observed heterogeneous, but nondispersive, 
distribution of parental DNA in the progeny must be 
accounted for. 

(f) Biochemical events during the vegetative phase 
are to be made consistent with the replication scheme. 
In particular, the observation is important that there 
is an early synthesis of phage-specific RNA and that
only early blocking of protein synthesis will prevent
DNA synthesis.

No present scheme of replication of phage DNA
satisfies all of these requirements, and only for phages 
is enough known to allow schemes to be anything other 
than exercises in model building. Any scheme involving 
the disengaging of DNA chains by breaking and re- 
joining them is unlikely on the grounds that it implies 
a large degree of dispersion of the parental DNA, and 
it meets with difficulty in mating and recombination. 
The obvious semiconservative scheme based on the 
Watson-Crick model satisfies many of the criteria but 
has difficulty in handling single-chain scissions, and, in 
its simplified form, does not account for synthesis of 
phage-specific proteins. It also seems not to be sensitive 
to the effect of early blockage of protein synthesis. 
Copy-choice replication in mated DNA structures also 
gives difficulty, for the reasons that there seems to be 
no reason why complete DNA duplexes should mate 
and that the stereo problems of disengagement are large. 
The scheme of invoking a ribonucleoprotein interme- 
diary template is consistent with many of the experi- 
mental conclusions mentioned in the foregoing. It suffers 
from the uncertainty as to whether it is stereochemically 
possible in atomic detail, nor does it make any specific 
predictions as to the fate of the integrated parental 
DNA duplex. If this remains intact throughout phage 
replication and maturation, a conservative type of 
replication is demanded. But if experiments show that 
phage replication is precisely semiconservative, only 
ad hoc disposition of the parental DNA structure would 
save this model at the moment. 


REPLICATION IN RNA VIRUSES 


We have concentrated so far on the problems of 
replication of DNA that has been introduced into cells, 
the best examples of which are the DNA of the trans- 
forming factor and of bacteriophage. But there exist 
many cases in which the genetic message must have 
been introduced through the medium of viruses that 
contain no demonstrable DNA. Actually, most known 
viruses, both plant and animal, contain no DNA. Al- 
though none of these RNA viruses can be observed with 
the quantitative precision inherent in phage work, it is 
known that the general facts of replication are similar. 
They possess genetic continuity, are found occasionally 
to mutate, and in one instance it appears that the
character of the virus genome is controlled by the type 
of host in which the virus is multiplied. Their general 
structure is similar to that of phage, with a protein 
exterior and some kind of core containing the RNA. 

Interest in the replicating potentialities of RNA has 
been enhanced by the discovery that a highly purified 
RNA obtained from tobacco mosaic virus (TMV) is 
infectious. If a solution of this material is rubbed
upon tobacco leaves in the same way as TMV is applied
for assay purposes, local lesions are developed, or systemic
infection is initiated. At present, pure RNA can
be prepared whose infectivity per unit weight is about
one percent of the same weight of RNA as it exists in
native virus particles. The infectivity can be greatly
increased by reconstituting the virus, i.e., combining
the RNA with protein fragments of the virus in a way
such as to re-form virus particles indistinguishable from
native ones. The infectivity may then rise to a figure as
high as 60%. Although reconstitution has not been
achieved for any other virus, it has so far been found
that the RNA from another plant virus, turnip yellow
mosaic, is infectious, as is the RNA from polio virus.

The findings just described allow RNA to be studied
in a way that has some aspects of both phage and
transformation experimentation. Like the transforming
factor, RNA can be biologically assayed after removal
from its natural host, and after purification and laboratory
manipulation. Like phage DNA, its activity is
demonstrated by the production of copies of the virus
originally containing the nucleic acid. It is frustrating,
however, that plant viruses cannot be studied with a
quantitative precision even approaching that enjoyed
by the phage experiments.

The work with infectious RNA has largely taken the
lines of finding out what happens when artificial
“hybrid” viruses are reconstituted, and of deducing
the physicochemical characteristics of the infectious
unit. As might be expected, when RNA from a virus of
one strain is reconstituted with protein from another
strain, the progeny particles are copies of the virus that
furnished the nucleic acid. The protein evidently serves
primarily as a protecting coat to preserve the integrity
of the RNA during inoculation. It is also possible to
make what might be called artificial recombinants:
reconstituted TMV particles that are believed to contain
nucleic acid from two parent strains. When such
virus particles are assayed on local-lesion hosts, most of
the lesions found resemble either those normally caused
by one or the other nucleic acids, but some have a mixed
character. When the latter type of lesion is subinoculated
by using it as inoculum for systemic infection,
the symptoms developed and the character of the
progeny virus from the systemic infection are wholly
those normally associated with either one strain of
RNA or the other. If the reconstituted particles have
two kinds of RNA packaged within them, the process
of infection evidently allows each kind of RNA to be
independently replicated, a not surprising result.

The experiments designed to find the minimum
weight of RNA that is infectious are obviously important
in their bearing on the question as to whether
or not there is redundancy in the amount of
information-carrying material in a viral genome. Great interest
would attach to the finding, for example, that a structural
unit of considerably less mass than all of the RNA
of the virus is sufficient to act as the replicating template.
Is there a TMV-RNA big piece and are there little
pieces that are infectious? The answers to these questions,
if they are positive, should come more directly
with the RNA from TMV than they are likely to come
with the phages. So far, the results are equivocal; RNA
is not easy to fractionate into differing sizes nor is it
easy to characterize physically. At present, it appears
that RNA pieces smaller than the RNA content of one
TMV particle are infectious.

It seems that, at present, the importance of the find- 
ing that purified RNA is virogenic is a conceptual one. 
Even though RNA viruses have been recognized for a 
long time, there has been no compelling reason to believe 
that their infectious proclivity is localized in their 
RNA, in analogy with the phage-injected DNA. Now 
it appears that the biochemical aspects of the earliest 
stage of infection are similar, in that in both cases reasonably
pure nucleic acid can act in the initiation of infection
and as the messenger of genetic information. Any
master plan of replication, therefore, must show how 
RNA can direct the formation of its copies, as well as 
how DNA can do this. It certainly will not be surprising
to find that some kind of protein synthesis, acting in 
specific and intimate contact with the nucleic acids, is 
a necessary correlate in the whole replication process. 
Nor will it be surprising to find that RNA is involved 
in DNA replication, and that DNA is involved in 
RNA replication. 

Where have these extended but inconclusive remarks 
on the replication of nucleic acids as evidenced by the 
viruses led us? My own feeling is that it is currently 
premature and naive to expect that the geometrical 
nicety implied in the two-strand helix of DNA will 
allow us to get out our poppet beads and come up with 
a jigsaw puzzle answer to the way such structures are 
assembled within cells. It is not known even that DNA 
and RNA are replicated, in the sense, for example, that 
the newly formed nucleic acids of genetically identical 
progeny viruses are identical on atomic dimensions with 
the parental nucleic acids (among which, of course, 
there may be diversity from parent to parent). Model 
building is fun, and its results are provocative and sug- 
gestive of what experiments to do next, but we are a 
long way from being able to identify the model with 
reality. We do not know how nucleic acids replicate, 
even in terms of quite nebulous mechanisms. The danger 
of premature model building is that it tends to direct 
our experimentation along subjectively controlled lines, 
perhaps to blind us to the key experiments that do not 
currently fit our preconceptions.


THE question of concern here is: To what extent
can properties of the genetic map be made to
correspond to some physical structure in the organism?
Since the bulk of the evidence points to deoxyribonucleic
acid (DNA) as the molecule involved, this becomes a
question of refining genetic experiments to the point
where they can critically examine properties of DNA
which are involved in its acting as genetic material.
The experiments discussed herein are those involving
genetic measurements only: essentially, recombination
frequencies in certain standard kinds of crosses (see
Levinthal, p. 227). Experiments attempting to correlate
genetic properties with synthetic capacities of cells are
discussed in the second paper by Levinthal (p. 249).

One of the most fruitful areas of investigation
regarding this problem has been that of detailed
fine-structure analysis. In essence, the method has been
that of pushing recombination frequencies to such a
limit that sizes of genetic units may be compared with
the molecular subunits of nucleic acid, the nucleotides.
It is a comparison between the fine-grain structure of
the genetic map and the fine-grain structure of DNA.
Much of the following discussion of fine-structure
analysis is based upon experiments by Benzer on the
bacteriophage T4. Discussed also are experiments with
bacteria bearing on the question posed above.

The reasons for choosing microbial systems for these 
kinds of studies are several. Mutants are available in 
large numbers and varieties, and selective techniques 
make it possible to screen large numbers of organisms 
in the search for a desired type. For example, in the T4 
system (discussed in detail in the following), one
wild-type bacteriophage particle can be detected among
10⁶ to 10⁷ mutant types, if the proper bacterial strain
is used as an indicator. This allows examining recombination
frequencies of the order 1 in 10⁶ to 10⁷.
Another reason for the choice of microbial systems 
is that homogeneous preparations can be made with 
relative ease for chemical studies. 

The work of Benzer made it necessary to define
carefully the genetic units in terms of genetic measurements.
The classical unit of heredity, the gene, has
several operational definitions which had, in general,
not been distinguishable experimentally in Zea mays
and in Drosophila, the principal organisms of research
genetics. The definitions are as follows: (1) The unit
of recombination, the recon, is defined as the smallest
element interchangeable in recombination experiments.
In such experiments, one uses easily-recognizable
characters which enter the crosses in the genetic [...]
type or recombination-type progeny. This element is not
necessarily the same as that defined by another set of 
operations—isolation of mutant types (either spon- 
taneous or induced) and mapping of their sizes. This 
yields another unit. (2) The unit of mutation—the 
muton—is the smallest element, alteration of which 
gives a type recognizably different from the standard 
form chosen as wild type. What is recognized as a 
mutant depends on the method of observation. It is, 
of course, conceivable that different mutant characters 
will be of different sizes. (3) The last unit that needs
definition is the unit of function, the cistron. This
unit is more difficult to define because what is recognized
as the function of the genetic material depends upon
the level at which observation is made between the
primary synthetic act of the genetic material and an 
observable result of this act. If, for example, wing shape 
in Drosophila is the character being scored, then a large 
number of steps leading to the mutant form may be 
involved. One needs a way of deciding that two mutants 
are blocked in the same function, in the absence of 
knowing what the precise function is in terms of primary 
genetic product. A definition of a functional unit by 
genetic experiments is provided by the (cis-)(trans-) test
devised originally for Drosophila by Lewis,’ and used
so elegantly by Benzer in his study of T4 bacteriophage.
One asks the question: Are two given mutants with 
the same outward appearance affected in the same 
functional unit? This is answered by allowing the 
genetic material of both mutants simultaneous expres- 
sion in the same cell. In Drosophila, this is accomplished 
by introducing one mutant into the egg and the other 
into the sperm. The resulting cells then have the two 
mutants’ markers in homologous but not identical 
sets of genetic material. In the bacteriophage system, 
the test is performed by simultaneous infection of the 
host bacterium with two mutant particles. In each 
case, one wants to know whether the resulting combination
(the trans-configuration) is wild type or mutant
in its functioning. The control experiment is the similar 
situation in which both mutants (as a double mutant) 
are in one complement of genetic material, the other 
complement being wild type (cis-configuration). What 
are the expected results of such experiments? First 
the control: this must function as wild type if the other 
tests are to be revealing. If, in the trans-test, both
mutants are in the same functional unit, the combination
is not expected to be competent. If, on the other
hand, they are mutant in different functional regions,
then the combination should be competent, since each
can make up for the deficiency of the other. Note that
the competence does not depend upon recombination
to give wild type, for the genetic material would in
subsequent tests still be revealed as mutant. Two
mutants then are said to belong to the same functional
unit, i.e., the same cistron, when in the trans-configuration
they function as mutants.
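
The logic of the (cis-)(trans-) test can be put in a few lines of code. The following is only a minimal sketch: it assumes a hypothetical organism with two required functional units labeled "A" and "B", and it describes each genome solely by the set of units it has lost. The names REQUIRED, is_competent, trans_test, cis_control, and same_cistron are illustrative and do not come from Benzer or Lewis.

```python
# Hypothetical set of functional units (cistrons) that a cell needs.
REQUIRED = frozenset({"A", "B"})


def is_competent(genomes):
    """A cell functions as wild type if every required unit is intact
    in at least one of the genomes present in that cell."""
    return all(
        any(unit not in damaged for damaged in genomes)
        for unit in REQUIRED
    )


def trans_test(m, n):
    """Two single mutants (damaged in units m and n) on separate genomes."""
    return is_competent([{m}, {n}])


def cis_control(m, n):
    """Both mutations on one genome, the partner genome being wild type."""
    return is_competent([{m, n}, set()])


def same_cistron(m, n):
    """Interpret the pair of results as described in the text."""
    if not cis_control(m, n):
        raise ValueError("control is not wild type; the test is uninformative")
    # Mutant behavior in trans means neither genome supplies the lost unit.
    return not trans_test(m, n)


if __name__ == "__main__":
    print(same_cistron("A", "A"))  # both mutants hit the same unit
    print(same_cistron("A", "B"))  # each genome covers the other's defect
```

Note that no recombination enters this sketch, in keeping with the remark that competence in the trans-configuration does not depend on forming wild-type genetic material.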
Consider in some detail the system exploited by
Benzer. The bacteriophage mutants used were many
independent isolates from wild-type parents which
show different plaque morphology (here designated r-type)
on the host strain E. coli B. On other bacterial
strains, these mutants have properties which are
compared with wild type in Fig. 1.

                 Bacterial host strain
T4 strain        B        S        K
wild             wild     wild     wild
rI               r        r        r
rII              r        wild     (no plaques)
rIII             r        wild     wild

Fig. 1. Plaque morphology of T4 strains (isolated in B) plated
on various hosts [from S. Benzer in The Chemical Basis of Heredity,
W. D. McElroy and B. Glass, editors (The Johns Hopkins Press,
Baltimore, Maryland, 1957), p. 70].

The classification of mutants into groups rI, rII, and
rIII is done on the basis of the characteristics indicated
in Fig. 1. The group of mutants rII is especially
interesting in not giving plaques on strain K. This
allows in a cross selecting against two parental rII
mutants and selecting for rare wild-type recombinants.
Figure 2 shows how the appearance of the genetic
map depends upon the choice of the bacterial host for
examining the bacteriophage mutants. The designation
of rII on K as m comes from the fact that an infrequent
rII mutant gives a minute turbid plaque on this
host; most of them, however, give no plaques. In the
physiological effect of the mutational event, these
mutants vary, then, from lethal to unnoticeable,
depending on the host used for phage assay.

Fig. 2. Dependence of the genetic map of T4 upon the choice
of host ... [from S. Benzer in The Chemical Basis of Heredity,
W. D. McElroy and B. Glass, editors (The Johns Hopkins Press,
Baltimore, Maryland, 1957), p. 70].

Turn now to the problems of measuring, in terms
of recombination frequencies, the sizes of the genetic
units defined in the foregoing.

To measure the unit of recombination, the appropriate
bacterial cell (here, E. coli B) is infected with
two parental rII mutants. One measures the frequency
with which this pair of mutants recombine to give wild
type (assayed on K). By searching for many rII
mutants and by mapping each against the others, one
sees if there is a limiting distance between positions at
which two mutational events occurred.

To measure the muton length, one observes whether,
with three closely linked markers, one can find aberrations
in recombinant frequencies owing to the size
of the mutation (Fig. 3). This size is reflected by a
nonadditivity of distance 1 to 2 and 2 to 3 as compared
with 1 to 3.

Fig. 3. Method for determining the “length” of a mutation.
The discrepancy between the long distance and the sum of the
short distances measures the length of the central mutation
[from S. Benzer in The Chemical Basis of Heredity, W. D. McElroy
and B. Glass, editors (The Johns Hopkins Press, Baltimore,
Maryland, 1957), p. 70].

To determine a cistron length, many isolates of
rII-type mutants are tested in pairs in a trans-configuration;
e.g., rII mutant m and rII mutant n
are put into the same cell, and this combination scored
as wild type (i.e., yields phage) or as mutant (i.e., no
phage yielded). This establishes whether the two
mutants are in the same cistron. Two mutants in the
same cistron are then mapped relative to each other
and to other members of the cistron. The largest
distance between two mutational markers within the
same functional unit gives a lower bound to the size
of this unit.

The amount of labor involved in mapping the
numerous mutants necessary for such studies is considerably
reduced by mapping each newly discovered
mutant against a set of deletions that span the regions
to be mapped. Mutants labeled as deletions behave as
if a portion of the genetic map were missing. They do
not revert to wild type at a measurable frequency.
One can find for any deletion a group of reverting
mutants which among themselves yield recombinant
types, but none of which do so with the given deletion.
The extent of the deletion covers this group of reverting
mutants. The reverting mutants themselves are mapped
as points on the map.

Fig. 4. The method of overlapping deletions. Three mutants are
shown, each differing from wild type in the deletion of a portion
of the genetic material. Mutants No. 1 and No. 3 can recombine
with each other to produce wild type, but neither of them can
produce wild recombinants when crossed to mutant No. 2. The
matrix (b) represents the results obtained by crossing three such
mutants in pairs and testing for wild recombinants; the results
uniquely determine the order of the mutations on the map (a)
[from S. Benzer in The Chemical Basis of Heredity, W. D. McElroy
and B. Glass, editors (The Johns Hopkins Press, Baltimore,
Maryland, 1957), p. 70].

The enormous advantage of using these deletions is
that they span blocks of the map, allowing any new
mutant to be located quickly without an excessive
number of crosses. Detailed mapping of a newly found
mutant is then done only with regard to those previously
discovered ones that lie within the same
deletion.
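
The saving in crosses obtained by screening against deletions can be illustrated with a short sketch. Everything in it is hypothetical: the deletions are represented as intervals on an arbitrary map scale, the point mutants as single coordinates, and the names DELETIONS, KNOWN_POINTS, cross_gives_wild_type, and candidates are invented for the illustration. The only rule taken from the text is that a reverting (point) mutant yields wild-type recombinants with a deletion only when its site lies outside that deletion.

```python
# Hypothetical deletions, as (start, end) intervals on an arbitrary map scale.
DELETIONS = {"d1": (0, 40), "d2": (30, 70), "d3": (60, 100)}

# Hypothetical, previously mapped point mutants and their map positions.
KNOWN_POINTS = {"p1": 10, "p2": 35, "p3": 50, "p4": 65, "p5": 90}


def cross_gives_wild_type(deletion, site):
    """A point mutant recombines with a deletion only if its site
    lies outside the deleted interval."""
    start, end = DELETIONS[deletion]
    return not (start <= site < end)


def candidates(cross_results):
    """Known point mutants showing the same pattern of results against
    the deletions as the new mutant; only these need detailed crosses."""
    return sorted(
        name
        for name, site in KNOWN_POINTS.items()
        if all(
            cross_gives_wild_type(d, site) == outcome
            for d, outcome in cross_results.items()
        )
    )


if __name__ == "__main__":
    # Simulate a new mutant at site 33; the experimenter sees only the
    # outcomes of the three crosses against the deletions.
    observed = {d: cross_gives_wild_type(d, 33) for d in DELETIONS}
    print(observed)
    print(candidates(observed))
```

In this toy case three crosses against deletions narrow five known mutants down to one, so four crosses replace five; with hundreds of mutants and nested subdeletions the economy becomes far larger.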

Mapping of the deletions with regard to each other is
also done relatively simply (Fig. 4). In the figure,
mutants 1 and 2, for example, will not give wild type
in a mixed infection and are defined as overlapping
deletions. On the other hand, mutants 1 and 3 will
give wild-type recombinants and are nonoverlapping
deletions.
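
How a table of pairwise results fixes the order of the deletions can be sketched as a small search. This is an illustration only and not Benzer's procedure: it assumes that no deletion lies wholly inside another, so that any deletion placed between two overlapping ones must overlap both, and it simply tries every order. The function name consistent_orders and the input format are hypothetical.

```python
from itertools import permutations


def consistent_orders(names, overlapping):
    """Return the linear orders compatible with the overlap data.

    `overlapping` is a set of frozenset pairs that gave NO wild-type
    recombinants. Assumption (not from the source): no deletion is nested
    inside another, so whatever lies between two overlapping deletions
    must overlap both of them.
    """
    def overlap(a, b):
        return frozenset((a, b)) in overlapping

    orders = []
    for order in permutations(names):
        ok = all(
            overlap(order[i], order[j]) and overlap(order[j], order[k])
            for i in range(len(order))
            for k in range(i + 2, len(order))
            if overlap(order[i], order[k])
            for j in range(i + 1, k)
        )
        # A map and its mirror image are equivalent; keep one of the two.
        if ok and order <= order[::-1]:
            orders.append(order)
    return orders


if __name__ == "__main__":
    # The situation of Fig. 4: 1 and 3 recombine; neither recombines with 2.
    data = {frozenset((1, 2)), frozenset((2, 3))}
    print(consistent_orders([1, 2, 3], data))
```

For the three mutants of Fig. 4 the search leaves a single order, with mutant 2 in the middle, which is the sense in which the matrix of results determines the map.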

By these methods, Benzer was able to focus his
attention on a relatively small subregion of the rII
region of the T4 map. By mapping each new mutant
against a fixed standard deletion, it was screened either 
as one to be ignored for the moment (outside the 
deletion) or as one to be mapped further—first, against 
smaller subdeletions, and, finally, with the mutants in 
each subblock defined by the smallest deletions. In this 
way, refined genetic analyses were performed making 
possible measurements of the genetic units defined 
earlier. 

One of the first results to come out of such a program 
of mapping is that the rII region defined originally in
terms of plaque-forming characteristics on the bacterial 
strains B, K, and S consists of two functional regions, 
the A-cistron and the B-cistron. 

Restricting attention further to one cistron allows
more-refined mapping. Figure 5 indicates the results of
such detailed mapping in a restricted region of the A-cistron.
The number indicative of the size of the recon
is about 0.02%, the closest distance between any two
mutants in this region. It is seen in the following how
this recombination frequency can indicate a size of the
recon in molecular terms.

Fig. 5. Map of the mutants in the r164 segment. The numbers
give the percentage of recombination observed in standard crosses
between pairs of mutants. The arrangement on this map is that
suggested by these recombination values; it has not yet been
verified by three-point tests. Stable mutants are represented as
bars above the axis; the span of the bar covers those mutants
with which the stable mutant produces no detectable wild
recombinants. The stable mutant r928 appears to be a double mutant
having one mutation at the highly mutable r131 location and a
second mutation at a point in the B-cistron. Mutants r131 and
r973, ... so that the data for each ... indicated. Some of the data
here given differ from (and supersede) previously ... upon
unconventional ... out to be incorrect [from S. Benzer in The
Chemical Basis of Heredity, W. D. McElroy and B. Glass, editors
(The Johns Hopkins Press, Baltimore, Maryland, 1957), p. 70].

Fig. 6. Preliminary locations of stable rII mutants. Mutants
producing no wild recombinants with each other are drawn in
overlapping configuration. Pairs which produce small amounts are
placed near each other. Since there remain some gaps, the order
shown depends upon that established by Doermann (in personal
communication with S. Benzer) for the mutants shown on the axis.
The scale is somewhat distorted in order to show the overlap
relationships clearly. Brackets indicate groups, the internal order
of which is not established. Ten stable mutants of the A class and
six of the B class were not sufficiently close to any others to
permit them to be placed on the map. (A class equals A-cistron;
B class equals B-cistron) [from S. Benzer in The Chemical Basis of
Heredity, W. D. McElroy and B. Glass, editors (The Johns Hopkins
Press, Baltimore, Maryland, 1957), p. 70].

The recombination map as shown in Fig. 5 is far 
from filled, in the sense that very little of the total span 
has mutants crowded together as close as the above- 
indicated apparent limiting distance. Verification of 
this as a limiting distance will depend upon finding 
many more mutants to fill up the map. 

The next step was to see if the map could be saturated 
using the deletions already discussed. Can the whole 
region of the A-cistron, concerned apparently with a 
single function, be spanned by the deletions? Also, 
can these deletions be ordered into a linear array that 
does cover region A? The overlapping of two deletions
is indicated by the fact that each may yield recombinants
with a third (nonoverlapping) deletion, but
not with each other. The ordering of a triplet of deletions
is done in the standard way by comparing the frequencies
of wild-type recombinants of each pair of the
triplet. Figure 6 shows the result of the ordering of a
large number of such deletions in the A- and B-cistrons.
Two facts stand out. Firstly, each deletion, with one
exception which is yet to be explained, falls clearly
into either the A- or the B-cistron; i.e., control of the 
same function falls into a well-bounded map region. 
There were no a priori reasons why A- and B-mutants 
should not have been scattered among each other in the 
same region. Secondly, the array of mutants can be 
arranged in a unique sequence along a line. 

Both of these facts and the one referred to in the 
foregoing, that there seems to be a smallest separation 
found between two mutants, are the heart of the 
interpretation which correlates the functioning of the 
genetic material with a molecular structure. 

The lengths (in terms of recombination frequencies)
of genetic units were found to be (1) the recon ~0.01%,
(2) the muton ~0.05%, and (3) the cistron ~4%.
These are not precise figures, for clearly the map has
not been saturated sufficiently with markers to be
confident of them. Probably the one in largest error is
the muton.

In a rough calculation, Benzer translated these units 
into numbers of nucleotide pairs. The total amount 
of DNA in a phage particle has been well measured. 
Knowing roughly the total recombination frequency 
from genetic experiments and with the following 
assumptions, one can make the appropriate calculations:
(1) All of the DNA is genetic and corresponds to one
copy of the genetic information. 

(2) The map is uniform; i.e., a given recombination 
frequency anywhere on the map corresponds to the 
same amount of DNA. 

With these assumptions, scaling the total number of
nucleotide pairs by the ratio of recon size to total map
length gives a recon of the order of magnitude of one
nucleotide pair. The limitation and expected variation
in this calculation are discussed in the paper by Benzer.1
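
The form of this rough calculation can be sketched as follows. The two inputs marked as placeholders, the total number of nucleotide pairs and the total map length, are illustrative values chosen only to show the arithmetic; they are not quoted from this article, and the function name nucleotide_pairs is hypothetical. The three recombination lengths are those given in the text.

```python
# Placeholder inputs (assumptions for illustration, not values from the text).
TOTAL_NUCLEOTIDE_PAIRS = 200_000   # assumed DNA content of one phage particle
TOTAL_MAP_PERCENT = 800.0          # assumed total map length, % recombination

# Lengths of the genetic units in % recombination, as given in the text.
UNITS_PERCENT = {"recon": 0.01, "muton": 0.05, "cistron": 4.0}


def nucleotide_pairs(recomb_percent,
                     total_pairs=TOTAL_NUCLEOTIDE_PAIRS,
                     total_map_percent=TOTAL_MAP_PERCENT):
    """Convert a map length to nucleotide pairs under the two stated
    assumptions: all DNA is genetic (one copy), and the map is uniform."""
    return total_pairs * recomb_percent / total_map_percent


if __name__ == "__main__":
    for name, percent in UNITS_PERCENT.items():
        print(f"{name}: about {nucleotide_pairs(percent):.1f} nucleotide pairs")
```

Because the result scales linearly with both placeholders, an uncertainty of a factor of a few in either one changes the answers by the same factor, which is why only orders of magnitude are claimed.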

Consider next experiments with bacteria, investi- 
gating the correspondence between molecular properties 
of DNA and the genetic map. The question here is the 
same as for the bacteriophage system: Is there a 
physical structure in bacteria whose properties corre- 
spond to the genetic structure? 

Bacterial strains of E. coli exhibit two primary mating
types which contribute asymmetrically to the formation
of zygotes from which recombinants are derived.* One
type, Hfr, acts as a donor of material and the other
type, F-, acts as the acceptor. In general, the characteristics
of the recombinants formed by a mating event
are derived principally from the F- parent. The Hfr
parent contributes only a part of the total genetic map.

Conjugation of these two mating types is accompanied
by formation of a tubule between the
bacteria. The sequential character of genetic transfer 
is revealed by experiments in which mating bacteria 
are subjected to high shearing forces in a Waring 
Blendor to interrupt the mating process. This interrup- 
tion, at different intervals after initiation of mating, 
allows the formation of zygotes with varying amounts 
of the genetic map from the Hfr parent. This process 
reveals a sequence of genetic markers and distances 
among them closely corresponding to those found by 
the more usual genetic crosses. Other experiments 
indicate that the effect of the Blendor is on the genetic-transfer
process rather than at later stages of incorporation
of the genetic material.

To study how this genetic transfer is correlated with
a material transfer, Fuerst, Jacob, and Wollman
(referred to in reference 5) used P³² labeling in the Hfr
parent. Earlier experiments® had shown that bacteria
having high specific-activity P³² lose with time their
ability to multiply. This loss varies with the number of
P³² decays that have occurred in such a way as to
indicate that each decay has a constant probability of
rendering the cell incapable of survival. Two related
experiments were done by Fuerst, Jacob, and Wollman.
(1) Hfr bacteria were grown with high levels of P³²,
washed free of excess P³², frozen to stop metabolic
activity, and stored frozen. At regular intervals,
samples were thawed, mixed under standard conditions
with an F- strain, and mating was allowed to occur.
They measured the probability that a given genetic
... of chromosome be incorporated in the recombinants
as a function of the total amount of P³²
... parent. (2) Hfr bacteria were grown
... F- parent to form zygotes which are then frozen and
stored as above. These are thawed at regular intervals, 
and the number of recombinants determined. 

The two types of experiments differ in that, in the
latter, injection occurs before P³² damage, whereas,
in the former, P³² decay acts in destroying the capacity
of a given map length to be transferred and to be 
incorporated in the recombinants. In the second type 
of experiment, the effect on recombination frequency 
is only on the probability of incorporation of a given 
segment of map. 

In these experiments, two results were clear: 

(1) The rate of inactivation of the genetic map 
varies with the length of map selected. 

(2) The transfer probability varies according to the 
length of map. 

In both of these results, the kinetics of decay indicate
a constant probability of destruction of both the
transfer probability and the incorporation per given
length of genetic segment, the length being measured
by the previous timed-injection Blendor experiments.
This correlation of rate of inactivation with genetic
length fits very well with the assignment of the
phosphorus-containing DNA as the genetic molecule.
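
The kinetics just described can be sketched numerically. The model below is an assumed, simplified reading of the text: each P³² decay within the relevant map segment destroys it with a constant probability, and the number of labeled atoms in a segment is proportional to its map length. The parameter names atoms_per_unit_length and kill_per_decay, and the values used in the example, are hypothetical; the half-life of P³² is taken as about 14.3 days.

```python
import math

P32_HALF_LIFE_DAYS = 14.3                     # approximate half-life of P-32
DECAY_CONSTANT = math.log(2) / P32_HALF_LIFE_DAYS


def fraction_decayed(t_days):
    """Fraction of the original P-32 atoms that have decayed after t days."""
    return 1.0 - math.exp(-DECAY_CONSTANT * t_days)


def marker_survival(t_days, map_length, atoms_per_unit_length, kill_per_decay):
    """Probability that a segment of the given map length is still transferable
    (or incorporable) after t days of storage.

    Assumed model: decays are independent, each one inactivates the segment
    with probability kill_per_decay, and the number of labeled atoms in the
    segment is proportional to its map length.
    """
    decays = atoms_per_unit_length * map_length * fraction_decayed(t_days)
    return math.exp(-kill_per_decay * decays)


if __name__ == "__main__":
    for length in (10, 20, 40):               # hypothetical map lengths
        s = marker_survival(7.0, length, atoms_per_unit_length=5.0,
                            kill_per_decay=0.1)
        print(length, round(s, 4))
```

In this model the logarithm of the surviving fraction is proportional to map length, so markers transferred later in a timed-injection experiment should be inactivated proportionally faster, which is the correlation reported.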

There is still another kind of genetic experiment 
performed with bacteria that bears on the correlation 
of genetic proximity and functional relatedness. This 
is concerned with the mapping of genetic units control- 
ling related biochemical functions. 

The bacterial systems investigated have the ad- 
vantage over the bacteriophage system in that it is 
easy to select for mutations controlling production 
of a specific enzyme catalyzing a single chemical step. 
With the bacteriophage system, it is easy to ascertain 
whether two mutants belong to the same functional 
region [by the (cis-)(trans-) test], but it has, so far, not 
been possible to ascribe synthesis of a specific protein 
as the function of such a region. With bacteria, finding 
the function is easy, but confirming that this function 
corresponds to a cistron as defined by Benzer has been 
more difficult. 

In these experiments, a group of biochemically 
deficient mutants is isolated from a standard wild 
type, characterized with regard to the specific nature 
of the biochemical block, and each mutant is mapped. 

Consider as a specific example the mapping of 
histidine-deficient mutants in Salmonella typhimurium." 
The wild-type organism can grow on a synthetic 
medium using glucose as a source of carbon and 
ammonia as a source of nitrogen. From this wild-type 
parent, Hartman’ isolated a large number of mutants 
having in common a requirement for histidine, in 
addition to the medium supporting growth of the 
wild-type organism. The synthesis of histidine consists
of many known steps.* These mutants were characterized
as to which particular steps were blocked,
presumably through failure to make the necessary
protein for catalysis of this step, or through making
protein with a defect, rendering it incapable of catalyzing
the reaction step.

Fig. 7. Genetic maps showing the locations of spontaneous and
5-bromouracil-induced mutational alterations in the rII region of
phage T4 divided into cistron A and B. Each mutant is represented
by a box, placed in the proper segment of the map and shaded
to indicate its reversion index. The relative order (but not the
lengths) of segments 1, 2, 3, 4, 5, 6, 8, and 10 has been established.
Segments 7, 9, and 11 are lumped together and considered as one
segment. The order and position of different mutations within one
segment have not been determined [from S. Benzer and E. Freese,
Proc. Natl. Acad. Sci. U. S. 44, 112 (1958)].

Comparing two histidine-requiring mutants biochemically
reveals whether they are blocked in the same
or in different biochemical steps. Mapping of these mutants
revealed two facts:

(1) All of the biochemical blocks in histidine synthesis 
are close to each other on the map. 

(2) Those markers involved in blocking at the same 
step are in clusters; the markers from different steps 
seem to belong to separate but nearby clusters. 

It is, of course, natural to speculate that such 
clusters are the cistrons defined by Benzer, though 
his defining (cis-)(trans-) test is lacking for this system.

There are many other cases in the mapping of bio- 
chemical blocks in bacteria where similar evidence has 
accumulated." In all of these, the clear definition 
of a cistron is lacking. The experiments of Jacob and
Wollman on the formation of partial zygotes by
interrupted mating make such tests possible.

To summarize, experiments mapping functional
blocks in bacteria have strengthened the idea that a
unit function is under the control of a small region of
the genetic material.

The last experiments discussed here return attention 
to T4 bacteriophage where mapping had been carried 
to a degree of fineness that the recombination and 
mutation units could be defined in terms of relatively 
few nucleotides.1 This opened a new avenue to the
study of mutation and to bringing into correspondence 
the properties of DNA with those of the genetic map. 

Benzer had made a profile of the mutation frequency 
in various portions of the A- and B-cistrons. The 
spontaneous mutation frequency is not uniform 
throughout the map, but does have local “hot spots”
of high mutation frequency. Benzer and Freese” then 
investigated whether a mutagenic agent which could 
be expected to interfere with the incorporation of a 
particular nucleotide into DNA would alter the profile 
of mutation. For this purpose, they used 5-bromouracil,
which had been shown to be incorporated into the DNA
of phage replacing thymine“ and to be mutagenic
in bacteriophage.!®

Since the spontaneous mutational events probably
arise from alterations at more than one kind of base
site, and since the mutagenic agent 5-bromouracil
would be expected to interfere principally by replacing
thymine, then the profile of induced mutation would
be expected to be very different from the spontaneous
one.

A comparison is shown in Fig. 7, which plots the
mutational frequency in various segments of the A-
and B-cistrons. The profile of spontaneous mutants
is markedly different from that of the
5-bromouracil-induced ones.

Such studies, extended to other mutagenic agents, may 
form a basis for studies of mutagenesis on a molecular 
level. 

The experiments discussed here, on bacteriophage
and on bacteria, have yielded the following observations:


(1) The genetic map is linear down to its smallest 
dimensions. 

(2) Portions of the map controlling the same function 
are compact and not intermixed with portions control- 
ling other functions. 

(3) The units of recombination and mutation have
sizes of the order of magnitude of a few nucleotide pairs;
the size of the unit of function is of the order of magnitude
of 10² to 10³ pairs.
(4) There is a correspondence between map distance
and phosphorus content of the genetic material.
here, to specify at a molecular level the detailed
structure of DNA in its acting as genetic memory and
in its role of making genetic products.

Two of the foregoing articles have dealt with certain
aspects of formal genetics and others have
dealt with the physicochemical studies of the structure
of nucleic acids and protein. This article presents some
of the current ideas about the relationship between
genetics and the structure of these molecules. Many of
these ideas are highly speculative and should be thought
of only as current working hypotheses (or dogmas)
which are being subjected to experimental tests.

Evidence indicating that genetic information can be 
contained in DNA was presented in previous articles 
and it is assumed then that there is a one-to-one corre- 
spondence between the position on the genetic map and 
the position along a DNA molecule. If DNA itself con- 
tains genetic information, it must be contained in the 
order of the bases along the two chains. But since only 
the base pairs adenine-thymine and guanine-cytosine 
are possible in the DNA molecule, the information is 
contained twice, once along each of the two chains. If 
the sequence of the bases along one chain is known, then 
that along the other chain is determined. This fact is, 
of course, the basis of the Watson and Crick hypothesis 
of the self-complementary nature of the duplication of 
the DNA molecule. However, for the moment, only the 
base sequence along one of the two chains is considered 
as being the genetic determinant. 
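
The statement that one chain determines the other can be made concrete in a few lines. The sketch below only applies the two pairing rules named in the text (adenine with thymine, guanine with cytosine); the function name complementary_chain and the example sequence are illustrative. It returns the partner base at each position and leaves aside the fact that the two chains run in opposite chemical directions.

```python
# Pairing rules stated in the text: adenine-thymine and guanine-cytosine.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_chain(sequence):
    """Return, position by position, the bases that must lie opposite the
    given chain. Chain direction (the strands are antiparallel) is ignored."""
    return "".join(PAIR[base] for base in sequence)


if __name__ == "__main__":
    chain = "ATGCCGTA"                      # arbitrary example sequence
    partner = complementary_chain(chain)
    print(partner)
    # Applying the rule twice recovers the original chain, which is the
    # sense in which the information is contained twice.
    print(complementary_chain(partner) == chain)
```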

From the fine-structure genetic analysis of Benzer,1
Demerec,2 and Pontecorvo,3 it is known that functionally
related mutations are near each other on the genetic
map. As discussed by Lennox (p. 242), the cistron is a 
well-defined region of the map which is responsible for 
the control of a particular function. In interpreting this 
in chemical terms, Benzer made the hypothesis that a 
cistron corresponds to a length of DNA which controls 
the formation of a single polypeptide chain in a protein 
molecule. Both the cistron and the single polypeptide 
chain are operationally well-defined entities, and the 
definition of one does not depend upon the definition 
of the other. Thus, although this hypothesis is yet to 
be tested, it is subject to direct experimental verification. 

There is another hypothesis which is central to any 
discussion of the genetic control of protein structure. It 
states that the genetic information determines only the 
sequence of the amino acids within a protein and that 
the three-dimensional configuration of the formed pro- 
tein molecule is a direct consequence of the amino-acid 
sequence. The main reason for accepting such a hy- 
pothesis is that the genetic map and the base sequence 
along the DNA molecule form one-dimensional struc- 
tures, and apparently the only corresponding linear 
order in a protein molecule is the order of the amino
acids along the polypeptide chains. It is evident that,
in a formed protein molecule, the three-dimensional 
configuration is maintained by secondary bonds which 
exist between amino acids, linking either different poly- 
peptide chains or widely spaced amino acids on the 
same chain. It is assumed, however, that the informa- 
tion specifying how those secondary bonds are formed 
is itself contained in the order of the amino acids. For 
example, covalent sulfur-sulfur bonds play an important
role in maintaining the structure of many proteins,
but it is assumed that their position in the molecule is 
determined primarily by the position along the peptide 
chain of the two cysteine molecules required for S—S 
bond formation. It should be stressed, however, that 
there is no direct evidence in support of this assumption, 
nor can a precise mechanism be formulated at present 
which might explain how the required bonds would be 
made. One only can speculate that the folding of a poly- 
peptide chain which was formed along a linear template 
may require the sequential unpeeling of the chain from 
its template beginning from some specified place. One 
can imagine, too, that the correct folding could be in- 
fluenced by some small molecule with which the finished 
protein would interact; as, for example, the inducer 
molecule in the case of induced enzyme synthesis. From 
the point of view of the over-all information transfer, 
such speculation as to the details of the mechanism is, 
of course, irrelevant, and is mentioned, at this point, 
only to indicate that the hypothesis is not entirely 
unreasonable. 

In most of the cases where they have been studied, 
the active sites, or the business regions, of enzyme mole- 
cules are found to contain only a few amino acids. 
However, for the enzyme to carry out its function,
the amino acids in its active region must
be arranged in a very precise, three-dimensional con- 
figuration. One can suppose that the information con- 
tained in the order of the other amino acids in the protein 
is necessary to specify the three-dimensional position of 
each of the amino acids in the active region relative to 
each other. Many of the amino acids away from the active
site may be necessary for the correct folding to occur.
Once this has been accomplished, some of them could 
be eliminated without altering the active site. If this 
interpretation is correct, it follows that, in some por- 
tions of an enzyme molecule, one amino acid can replace 
another without producing any observable change in 
the enzymatic function of the molecule. For example, in 
some regions, the polypeptide chain may be folded in
such a way that it forms an α-helix for a short region.
Since most amino acids will do equally well as the components
of an α-helix, there could be many amino-acid
replacements in such a region which would not affect the 
enzymatic function unless the particular amino acids 
were involved with secondary bonds to other regions of 
the molecule. On the other hand, it is likely that any 
amino-acid replacement in the active site of the molecule 
would cause a change in the enzymatic activity. 

If it is assumed that the genetic information controls 
only the sequence of the amino acids along a polypeptide 
chain, then certain ideas suggest themselves which are 
independent of the detailed chemical mechanism by 
which proteins are synthesized. These arise from the 
fact that, although there are only four different kinds 
of bases in the DNA molecule, their sequence must 
uniquely determine the sequence of approximately 
twenty amino acids along the polypeptide chain. Thus,
it would require at least three bases to select uniquely
from among the twenty amino acids (4² is only 16,
whereas 4³ is 64). But the distance required for three
nucleotide pairs along the DNA molecule is about 10 Å,
whereas the distance between the amino acids on an
extended polypeptide chain is only about 3.7 Å. The
first solution for getting around this apparent difficulty
was suggested by Gamow. He imagined a coding
scheme in which n amino acids would be determined by
a sequence of n+2 bases. Each amino acid would be
determined by a sequence of three bases, but two adjacent
amino acids would have two bases in common.
Thus, two amino acids would be determined by a sequence
of four bases, the first three members of the
sequence determining the first amino acid and the last
three determining the second amino acid. This type of
overlapping code has two immediate consequences:
First, there will be many mutations affecting two or 
even three amino acids; to date, no mutations of this 
type have been observed. Second—and this leads to a 
more serious objection to an overlapping code—not all 
pairs of amino acids can lie next to each other. In fact, 
the restrictions on the possible sequences of amino acids 
are so severe in the Gamow code that it seems to be 
impossible to use it to code some of the known amino- 
acid sequences. Brenner® has analyzed the known se- 
quences of amino acids in proteins and has found that 
they are not consistent with any type of overlapping 
code in which two letters are used in common for ad- 
jacent amino acids. Brenner’s analysis does rest on one 
further assumption, that the same code is used in all 
biological systems and for all proteins. This was a 
necessary assumption because, so far, only relatively 
small proteins have been analyzed as to the sequence 
of their amino acids and because it is not yet possible 
to obtain a sufficient number of adjacent pairs from 
any one protein. These two arguments, however, make 
it very unlikely that any variant of a double overlap- 
ping code is used in the synthesis of proteins. 
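
The two counting arguments in this paragraph can be checked numerically. The sketch below is only an illustration and is not part of the original analysis; the function names and default values are assumptions chosen for this example. It finds the shortest word length with which four bases can distinguish twenty amino acids, and it compares the number of four-base sequences available to a triplet code sharing two bases between neighbors with the number of ordered pairs that twenty amino acids could form.

```python
# Illustrative sketch; names and defaults are assumptions made for this example.

def min_word_length(n_bases: int = 4, n_amino_acids: int = 20) -> int:
    """Smallest k such that n_bases**k >= n_amino_acids."""
    k = 1
    while n_bases ** k < n_amino_acids:
        k += 1
    return k


def overlapping_pair_counts(n_bases: int = 4, n_amino_acids: int = 20):
    """Return (codable pairs, conceivable pairs) for a doubly overlapping triplet code.

    When adjacent triplets share two bases, two neighboring amino acids are
    fixed by four bases, so at most n_bases**4 ordered pairs can be written.
    """
    codable = n_bases ** 4
    conceivable = n_amino_acids ** 2
    return codable, conceivable


if __name__ == "__main__":
    print(min_word_length())          # 3
    print(overlapping_pair_counts())  # (256, 400)
```

Because 256 is smaller than 400, some neighboring pairs could never occur under such a code, which is the kind of restriction that can be tested against known amino-acid sequences.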

Crick et al.’ have avoided the spatial difficulty by 
the chemically more plausible assumption that, 
when either DNA or RNA acts as a template for the 
formation of a protein, it is not the amino acid itself 
which fits on the template. They assumed that there 
could be some type of small adapter molecule which 
would attach to a specific amino acid on one end and 
to three nucleotides on the other. From what now is 
known of the chemistry of protein synthesis, one might 
suppose that the genetic information in the DNA is 
transferred first to a large RNA molecule which would 
act as the template for protein synthesis, and that the 
soluble RNA to which specific amino acids are attached 
could then be the adapter molecules. There is, as yet, 
no direct evidence that this model is correct, but it is 
interesting to note that the adapter hypothesis was 
made two years before there was any chemical informa- 
tion on the attachment of amino acids to soluble RNA. 
Having made this assumption, one is freed from the 
spatial argument and only need consider the problem 
from the point of view of coding. Crick et al.” have 
pointed out that the synthesis of proteins would be very 
much more efficient or, at least, more rapid, if each 
amino acid, presumably attached to its adapter, could 
place itself along the template independent of the other 
amino acids. Thus, a given position on the template 
should be such that it would accept a particular amino 
acid whether or not the adjacent spots were filled. In order 
to satisfy this condition and to use every three nucleo- 
tides to specify an amino acid, it is necessary to impose 
the condition that certain triplets (called sense triplets) 
of the nucleotides correspond to amino acids and that 
others (called nonsense triplets) do not. The code must 
be such that the six nucleotides required for two adjacent 
sense triplets should not contain any additional sense 
triplets among the overlap triplets. Thus, if both xyz 
and uvw correspond to amino acids, then yzu, zuv, wxy, 
and vwx must not correspond to an amino acid. Such a 
code is called commaless, since no additional informa- 
tion is required to indicate where amino acids are to be 
placed along the nucleotide sequence. In the language 
used by Elias (p. 221), this is a “jitter-free” code be- 
cause it is independent of the ability to read correctly 
from one end as to the distance from that end. 

Most of the sixty-four words which can be written 
with four letters taken three at a time can be eliminated 
by the comma-free condition. The set of three-letter 
words which can be used in a comma-free message is 
called a dictionary. The word “AAA” cannot be in- 
cluded in such a dictionary, because, if the word were 
repeated twice, the sequence “AAA” would appear in 
the overlap and the position of the amino acid would 
not be well defined. This condition eliminates the four 
possible words which have three identical letters and 
reduces the number of possible words in the dictionary 
from sixty-four to sixty. Furthermore, no cyclic permu- 
tation of a word can be in the dictionary. If “ABC” is 
included, then neither “BCA” nor “CAB” can be, be- 
cause if either one of them were included and repeated 
twice, “ABC” would occur in the overlap. Since, from 
any word, two others can be derived by cyclic permuta- 
tion of the letters, the maximum size of the dictionary 
is reduced further from sixty to twenty. 
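
The counting argument above can be reproduced in a few lines. The following is a minimal sketch, not taken from the original paper: the letters A to D stand for the four bases, the helper names are assumptions, and the example dictionary (all words xyz with x < y and y >= z in alphabetical order) is simply one construction that is easy to state. No claim is made here about which of the published dictionaries it corresponds to; the code itself checks that it satisfies the comma-free condition.

```python
from itertools import product

LETTERS = "ABCD"  # four abstract letters standing for the four bases


def all_words():
    """All 64 three-letter words over the four letters."""
    return ["".join(p) for p in product(LETTERS, repeat=3)]


def cyclic_classes():
    """Group the words that are not of the form XXX by cyclic permutation."""
    classes = set()
    for w in all_words():
        if len(set(w)) == 1:  # AAA, BBB, CCC, DDD are excluded outright
            continue
        classes.add(frozenset({w, w[1:] + w[0], w[2:] + w[:2]}))
    return classes


def is_comma_free(dictionary):
    """True if no overlap triplet of two adjacent sense words is a sense word."""
    words = set(dictionary)
    for first, second in product(words, repeat=2):
        six = first + second
        if six[1:4] in words or six[2:5] in words:
            return False
    return True


def example_dictionary():
    """Assumed example: words xyz with x < y >= z in alphabetical order."""
    return [w for w in all_words() if w[0] < w[1] >= w[2]]


if __name__ == "__main__":
    print(len(all_words()))                     # 64
    print(len(cyclic_classes()))                # 20
    print(len(example_dictionary()))            # 20
    print(is_comma_free(example_dictionary()))  # True
    print(is_comma_free(["ABC", "BCA"]))        # False
```

At most one word can be taken from each of the twenty classes of cyclic permutations, so a dictionary of twenty words that passes `is_comma_free` is as large as the argument allows.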

Two twenty-word dictionaries which are essentially 
different from each other were presented by Crick et al., 
and since no comma-free code can contain more 
than twenty words, this completes the proof that 
one is dealing with a maximal comma-free dictionary. 
Freudenthal and Golomb et al. have shown that 
a total of five, and only five, basic types of code can be 
constructed which satisfy the comma-free condition 
and contain twenty words (Fig. 1). 

There is an additional difficulty with this type of 
code which is related to the fact that the information is 
contained twice in the DNA chain. Chemically, the 
two chains are antiparallel so that the over-all features 
of the molecule are invariant with respect to an end-to- 
end inversion. However, the informational content or 
the sequence of the nucleotide pairs clearly is not in- 
variant with respect to this inversion. Or stating the 
problem differently, the information as to the protein 
to be formed by a given region of DNA will be different 
depending on which of the two chains is read. Delbrück’ 
proposed that an additional requirement be added to the 
code such that, if the message were to read correctly 
along one chain, the complementary sequence which 
existed on the opposite chain would not contain any 
sense triplets. With this additional restriction, only ten 
words can be written with four letters taken three at a 
time, and it requires a sequence of four letters to supply 
enough information to specify twenty amino acids sub- 
ject to this restriction. However, this additional require- 
ment imposed by Delbrück is not absolutely necessary. 
One can, for instance, suppose that the information is 
transferred from a double-chain DNA molecule to a 
single-chain RNA molecule and that the information, 
specifying which of the two DNA chains will be used in 
forming the RNA molecule, is not present everywhere 
along the DNA, but only in short regions (for example, 
between cistrons). Thus, in the making of the RNA, one 
need not impose the commaless condition, whereas it 
still is required in the making of the protein from 
the RNA. 

FIG. 1. The five possible types of commaless codes using four 
different letters taken three at a time (after Freudenthal®). In 
each group with a given middle letter, all combinations of the 
listed first and third letters are sense words. 

In the absence of experimental data, it is clear that 
such speculation can go on more or less indefinitely, 
and that very elaborate codes can be imagined. For 
example, there is no particular reason to assume that 
the information specifying a given amino acid neces- 
sarily comes from contiguous nucleotides. One could 
imagine a mechanism in which every tenth nucleotide 
(that is, one every turn of the DNA molecule) would 
specify an amino acid so that one might be specified by 
nucleotides 1, 11, and 21 and the next by 2, 12, and 
22, etc. At the moment, there is very little experimental 
evidence relevant to these questions. However, a great 
deal of work is being done in an attempt to obtain some 
data. In the hope of providing some further understand- 
ing of the current status of the field, the experiments 
are discussed in terms of the kind of question which 
might be asked and of the kind of information which, 
in principle at least, might be obtained. 

It seems unlikely that, in the near future, it will be 
possible to analyze the sequence of the nucleotides in a 
piece of DNA which is responsible for a particular 
protein molecule. In addition to the difficulty of carry- 
ing out the sequence analysis of the bases along the 
DNA, there is another problem which seems even more 
formidable. Except in the case of some of the small 
viruses, the method is not clear for obtaining a prepara- 
tion of DNA wherein each molecule of the material 
controls the same protein. And even in the smallest 
viruses, the amount of nucleic acid is very large for an 
analysis of this type. The work that is being done in- 
volves the analysis of how protein molecules are altered 
as a result of specific mutations. Most of the currently 
available experimental results relevant to this problem 
have come from the work on human hemoglobin done 
by Ingram.” He analyzed the hemoglobin molecule by 
treating it with the proteolytic enzyme, trypsin, and 
then subjecting the resultant digest to a two-dimen- 
sional paper electrophoresis and chromatography. Tryp- 
sin splits the polypeptide chain at every arginine and 
lysine residue. In the case of hemoglobin, this diges- 
tion yielded 28 different peptides. Hemoglobin made 
by different individuals can be compared by noting 
the various movements of different peptides on the 
paper. Sickle-cell anemia, a disease which is presum- 
ably due to a single mutation, results in an altered, 
but not totally inactive, hemoglobin. This alteration 
was shown to correspond to the change of a single amino 
acid. A total of six altered hemoglobins have been 
examined by Ingram and, in three of these, it has been 
shown that only one amino acid is changed. Two of 
the changes involve the same amino acid and one in- 
volves a change in a different region of the molecule. 
The other three alterations differ from these, but it has 
not been proved that only one amino acid is altered. 
In these studies with human hemoglobin, one selects 
for those mutations which produce a minor alteration 
in the functioning of the molecule. It is much less likely 
that an alteration which produces no physiological 
change would be examined. This is true also for any 
mutation resulting in the absence of a hemoglobin 
molecule which would appear as a recessive, lethal 
mutation, and, thus, would probably not be identified 
as affecting hemoglobin. However, it is important to 
note that no one of the cases so far examined represents 
changes in more than one of the peptides and of the 
three which have been totally characterized, none re- 
sults in a change of more than one amino acid. 
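
The logic of the peptide comparison can be sketched in code. This is a simplified illustration under stated assumptions, not Ingram's procedure: the two sequences are invented toy strings in one-letter amino-acid notation (they are not hemoglobin), the function names are hypothetical, and cleavage is taken to occur after every lysine (K) and arginine (R), as the text describes. The electrophoresis and chromatography step is replaced by a direct comparison of the peptide lists.

```python
def tryptic_peptides(sequence: str):
    """Cut a one-letter amino-acid sequence after every K (lysine) or R (arginine)."""
    peptides, current = [], ""
    for residue in sequence:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:  # trailing piece that does not end in K or R
        peptides.append(current)
    return peptides


def altered_peptides(normal: str, mutant: str):
    """Peptides found in only one of the two digests."""
    a = tryptic_peptides(normal)
    b = tryptic_peptides(mutant)
    only_normal = [p for p in a if p not in b]
    only_mutant = [p for p in b if p not in a]
    return only_normal, only_mutant


if __name__ == "__main__":
    normal = "MTEKALGRSVEKLL"  # hypothetical toy sequence
    mutant = "MTEKALGRSVVKLL"  # same sequence with one E replaced by V
    print(tryptic_peptides(normal))          # ['MTEK', 'ALGR', 'SVEK', 'LL']
    print(altered_peptides(normal, mutant))  # (['SVEK'], ['SVVK'])
```

A single replacement that does not create or remove a lysine or arginine changes exactly one peptide in the digest, which is why a difference confined to one peptide is consistent with the change of a single amino acid.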

In several laboratories, work is directed toward ob- 
taining a system with which fine-structure genetic 
analysis can be carried out and combined with chemical 
analysis of the protein controlled by the genetic region. 
In order to make such studies possible, one must work 
with a conditionally necessary protein whose absence 
prevents the organism from growing in some conditions 
but not in all. The substance made by the rII genetic 
region studied by Benzer! obviously satisfies this re- 
quirement. Although an intact protein is necessary if 
the virus is to grow in some bacterial hosts, it is not 
necessary for it to grow in other hosts. Likewise, there 
are many enzymes made by bacteria which are necessary 
for the growth of the bacteria only under certain en- 
vironmental conditions. For example, fine-structure 
analysis has been done in bacteria by using the enzymes 
responsible for the synthesis of a certain amino acid. 
If the amino acid is supplied in the medium, the absence 
of the enzyme does not affect the growth of the organ- 
ism. On the other hand, by placing the progeny organ- 
isms of crosses in a medium lacking the required amino 
acid, a very small number of recombinants can be 
selected which have the enzyme. 

Under such conditions one can consider several classes 
of mutation. If the general concept of a commaless code 
is correct, then there will be many sequences of nucleo- 
tides which do not correspond to an amino acid. A 
mutation could occur which would result in such a 
nonsense sequence in the nucleotide chain. If “ABA” 
corresponded to a certain amino acid, then a mutation 
of the letter “B” to the letter “A” would result in the 
sequence “AAA” which is not in the dictionary. If this 
were to happen, it is possible that no protein molecule 
could be formed, since, presumably, the peptide chain 
no longer would be continuous at this point. This type 
of sense-to-nonsense mutation could result plausibly in 
the absence of an active protein, regardless of where it 
occurred in the molecule. In addition, there would be a 
class of sense-to-mis-sense mutations which could result 
in a totally inactive molecule. A replacement of valine 
by proline in an active site probably would result in a 
protein with no enzymatic activity. On the other hand, 
such a protein might still be formed and be recognized 
by its antigenic similarity to the normal enzyme. Also, 
the fact that the protein may be induced by the same 
inducing agent as caused the formation of the normal 
enzyme would be another factor in recognizing an alter- 
ation. Sense-to-mis-sense mutations are those which, 
in principle, may lead to a protein which could be 
analyzed chemically and whose difference from the 
normal enzyme could be determined. However, the 
sense-to-nonsense mutation would lead to the absence of 
a protein and, therefore, no information about the alter- 
ation of the amino-acid sequence. It may be possible, 
however, to obtain considerable information by study- 
ing the reversions to activity from the nonsense muta- 
tions. This is illustrated in Fig. 2(b) where the sequence 
equivalent to amino acid α first mutates so that it corre- 
sponds to a nonsense word. Then, reverse mutations are 
selected, some of which restore the amino acid α and 
others of which insert the amino acid β or γ. In this sort 
of mutation-reversion study, one expects to find, in some 
instances, that several different reversions could be ob- 
tained which produce an enzyme differing from the 
normal. However, each of the reversions would be ex- 
pected to differ from the normal at the same position 
in the molecule. It would be possible to test this conclu- 
sion despite an inability to carry out fine-structure 
genetic analysis as long as the system is such that the 
reversions with some activity have a selective advan- 
tage. It has been shown" that, for all of the three-letter 
commaless codes, the number of different sense rever- 
sions from a nonsense mutation is at most four. This 
prediction could also be subject to experimental tests. 

(a) α = ADA → BDA = β;  α = ADA → CDA = γ;  BDA = β → CDA = γ 

(b) α = ADA → DDA = nonsense;  DDA → BDA = β;  DDA → CDA = γ 

FIG. 2. (a) The conversion group α, β, and γ corresponds in all 
of the codes to changes in the first letter of the word, ADA. The 
amino acid α=ADA would belong to two other conversion groups 
also whose size depends on which code was used. (b) ADA could 
mutate to the nonsense word, DDA, also, and in this special case 
all of the possible reversions to sense words would be in the same 
conversion group. However, it is not generally true that the re- 
versions will be in the same conversion group. 
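
The bound on sense reversions can be explored for one particular dictionary. The sketch below is an illustration only: it checks a single assumed dictionary (words xyz with x < y and y >= z over the letters A to D), not all commaless codes, and the function names are hypothetical. A reversion is modeled as a change of exactly one letter, in line with the assumption that a mutation alters one nucleotide at a time.

```python
from itertools import product

LETTERS = "ABCD"  # four abstract letters standing for the four bases


def example_dictionary():
    """Assumed example dictionary: words xyz with x < y >= z."""
    return {"".join(p) for p in product(LETTERS, repeat=3) if p[0] < p[1] >= p[2]}


def single_letter_neighbors(word):
    """All words differing from `word` in exactly one position."""
    return [
        word[:i] + new + word[i + 1:]
        for i, old in enumerate(word)
        for new in LETTERS
        if new != old
    ]


def sense_reversions(nonsense_word, dictionary):
    """Sense words reachable from a nonsense word by changing one letter."""
    return sorted(w for w in single_letter_neighbors(nonsense_word) if w in dictionary)


def max_sense_reversions(dictionary):
    """Largest number of sense reversions over all nonsense words."""
    nonsense = ("".join(p) for p in product(LETTERS, repeat=3))
    return max(
        len(sense_reversions(w, dictionary)) for w in nonsense if w not in dictionary
    )


if __name__ == "__main__":
    d = example_dictionary()
    print(sense_reversions("DDA", d))  # ['ADA', 'BDA', 'CDA']
    print(max_sense_reversions(d))     # 4
```

For this dictionary the nonsense word DDA reverts only to ADA, BDA, and CDA, and no nonsense word has more than four sense reversions.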

There are other classes of mutation which, although 
extremely useful in the genetic analysis, are not likely 
to be very helpful in this problem. These are the dele- 
tions or inversions which have been discussed earlier. 
In this case, one expects to find mutations which never 
revert to produce an active protein, and, as such, there 
could never be any analysis of an altered protein. 

Another question which can be studied experimentally 
was mentioned earlier with reference to the work of 
Ingram. It is concerned with whether or not a single 
mutation can affect more than one amino acid. In con- 
nection with this, mention should be made of one as- 
sumption which is implicit to all of these discussions. 
It is assumed that the probability of mutation is 
sufficiently low so that two letters in the nucleotide 
code will not be altered simultaneously. This assumption 
will be justified, if it is found that all mutations do 
affect a single amino acid only. 

Another kind of analysis would become possible, if 
a sufficiently large number of transitions from one amino 
acid to another could be accumulated. Suppose [Fig. 
2(a)] that a mutation occurs which carries amino acid α 
into β and another also occurs which carries α into γ. 
One can ask if β can mutate to γ. If it does, then the set 
of three amino acids is called a conversion group. A con- 
version group is defined as a set of amino acids, each mem- 
ber of which can be changed to every other member of 
the group by a single mutation. With the assumption 
that mutations change only a single nucleotide at a time, 
a conversion group must be produced by changing only 
one letter of the word which corresponds to an amino 
acid. Under these assumptions, the first prediction 
which would apply to any code is that no conversion 
group could contain more than four elements, corre- 
sponding to the four possible bases which could be 
inserted in a given position. If there are limitations on 
what alterations can occur in the nucleotide chain, the 
number of elements in a conversion group would be less. 
For example, if purines could be changed only to other 
purines and pyrimidines to other pyrimidines, no con- 
version group could contain more than two elements. 
A single amino acid could, of course, belong to several 
different conversion groups. The number of conversion 
groups to which a given amino acid belonged would be 
equal to the number of nucleotide letters which corre- 
spond to a single amino acid. Consequently, in any of 
the triplet codes, each amino acid would belong to three 
conversion groups. As can be seen from the five dic- 
tionaries indicated in Fig. 1, for the triplet commaless 
codes, some of the conversion groups would contain four 
elements, and others, three, two, or one. If a complete 
analysis of the conversion groups were possible, one 
could be led, unambiguously, to one of the five triplet 
codes. Conversely, if any code of more than three letters 
is written, the set of these conversion groups could be 
enumerated and, therefore, in principle at least, any 
particular code could be tested with this type of analy- 
sis. However, in order to determine the conversion 
groups, a great deal of experimental data would be 
necessary. 

All of the experiments mentioned so far require de- 
tailed chemical analysis of altered proteins but do 
not necessarily require fine-structure genetic analysis. 
However, if fine-structure genetic analysis can be com- 
bined with information as to the chemical alterations, 
two of the most fundamental questions can be investi- 
gated. The first involves the basic premise on which 
much of the speculation depends; that is, does the order 
and relative position of mutations along the genetic map 
correspond to the order and relative position of the 
corresponding altered amino acids along the polypeptide 
chain? If several different mutations can be mapped 
relative to each other and the corresponding altered 
proteins examined, one can investigate whether or not 
the relative positions are the same on the genetic map 
and on the polypeptide chain. One way of accomplishing 
this analysis would be to carry out the fine-structure 
mapping on the nonsense mutations which, since they 
produced no active protein, could be mapped with high 
resolution, and the amino-acid analysis on the reversions 
which have some activity and whose protein alteration 
could, in principle, be determined. Another problem 
which can be investigated, if the genetics could be com- 
bined with the chemical analysis, is to find whether or 
not the genetic determinants of a particular amino acid 
are adjacent to each other. If, for example, the infor- 
mation specifying one amino acid is interspersed along 
the nucleotide chain with the information controlling 
another amino acid, it would be evident from the finding 
that the mutants affecting one were interspersed among 
the mutants affecting another. In this way, the question 
as to whether or not the genetic code is truly a local 
code could be determined experimentally. 

So far, this paper has dealt with much speculation, 
many questions, and a few experimental facts. The re- 
mainder of it is a brief review of the work going on in 
several laboratories which is directed toward obtaining 
some answers to questions raised. Ingram’s studies with 
human hemoglobin are continuing, and the list of pos- 
sible transitions from one amino acid to another is being 
extended. Even if the combined genetic and chemical 
analysis should become possible with proteins made by 
microorganisms, the work with hemoglobin will remain 
of prime importance because it can help to answer the 
question of whether or not the code is the same for 
different organisms. 

Several systems are being investigated with micro- 
organisms. The first is the enzyme tryptophane syn- 
thetase which is being studied by Yanofsky. He has 
found” that this enzyme, under some conditions, can 
catalyze two sequential reactions which ultimately lead 
to the formation of tryptophane. He has found that 
this enzyme can be altered by mutation so that either 
one of the two enzymatic activities can be eliminated, 
leaving the other intact. Those mutations affecting one 
of the activities are localized and are in a region of the 
genetic map which is adjacent to the region in which the 
second set of mutations can occur. In many of these 
cases, the bacteria form a material which is antigenically 
related to the enzyme produced by the normal cell. 
The effects of many of the mutations affecting this 
enzyme can be reversed by suppressor mutations which 
occur elsewhere on the genetic map in regions not ad- 
jacent to the original mutations.’ This finding of un- 
linked suppressors is, at first sight at least, in complete 
contradiction to many of the assumptions used in this 
analysis. However, the detailed investigation of the 
chemical alterations associated with these mutations 
and suppressors has been undertaken only recently, 
and, until it is completed, several interpretations of the 
results are possible. Some of the mutations leading to a 
loss of enzymatic activity possibly may be owing to a 
certain alteration of the enzyme so that it can be in- 
activated by some normal cellular component. The 
suppressor mutations could then affect the amount of 
this component present in the cells. At least one such 
case, involving tryptophane synthetase, has been found 
by Suskind.“ However, the resolution of this question 
must await further chemical analysis of the altered 
proteins. 

The proteins which make up the T-even bacterial 
viruses have been investigated also, with regard to 
finding a system for doing combined genetic and chemi- 
cal studies. A great deal of genetic analysis has been 
carried out with these phages, and several proteins have 
been isolated from them. Unfortunately, the rII region, 
which Benzer has studied so extensively, does not con- 
trol a protein which is contained in the phage itself. 
Rather, the control seems to be over one of the several 
proteins required in the infected cell for the formation 
of the phage. Two of the proteins which are in the phage 
particle have been purified and characterized, and, 
although differences can be observed between the 
strains, T2 and T4, it has not yet been possible to obtain 
different proteins by mutation of any one of the strains. 
There are, however, several types of mutations of these 
phages and it is very likely that some of them will be 
found to control these proteins. 

In addition to the coding problem already discussed, 
another type of question can be approached by using 
the phage proteins. The membrane around the DNA of 
viruses is composed of a large number of identical 
or not any progress toward the solution of this question 
is possible using the phage, it is clear that, in order to 
understand genetic control of an organism, it is neces- 
sary to know how organized structures, as well as pro- 
tein molecules, are controlled. 

Another attempt to obtain a system in which a con- 
ditionally necessary protein can be studied chemically 
and genetically is being undertaken by Garen and 
Levinthal using the bacterial enzyme, alkaline phos- 
phatase. This enzyme splits the phosphate-ester linkage 
of organic phosphates and is necessary for the bac- 
terial growth only if there is organic phosphate, but 
no inorganic phosphate, supplied in the medium. Alka- 
line phosphatase is made in high concentration by bac- 
teria only if they are starved for phosphorus. Under 
these conditions, it is made in very large amounts. The 
ability to make a single enzyme in this high concentra- 
tion seems to be a characteristic of bacteria, which, 
although they have the genetic capability of making 
thousands of different enzymes, also have additional 
mechanisms which govern the amount of each of these 
enzymes actually made under given culture conditions. 
Approximately 5% of the total cell protein occurs as 
this one enzyme. When the bacteria are producing 
alkaline phosphatase at the 5% level, it is compara- 
tively simple to purify the protein. Mutants have been 
isolated after irradiation with ultraviolet light which 
showed no enzymatic activity, and others have been 
isolated which have reduced activity. Several of the 
latter type have been analyzed and found to produce 
an altered protein in the same high concentration which 
was observed in the “wild” type. From the mutants 
which lack any enzymatic activity, reversions have been 
obtained, some of which are like the “wild” type, and 
others, like the mutants with reduced activity. Fine- 
structure genetics can be done with these mutants, but 
as yet no information has been obtained concerning the 
alterations in the amino-acid sequence in the enzymes 
made by the mutant organisms. 

It can be seen from this discussion that, although 
there is some evidence that a code exists which trans- 
lates genetic information into amino-acid sequence, 
there is no information as to the nature of this code. 
On the other hand, a number of well-defined questions 
which can be answered experimentally have been formu- 
lated, and several different systems which are likely to 
yield some answers are being studied. 
I. INTRODUCTION 


A USEFUL hypothesis of modern genetics and that 
forming the basis of the preceding papers is that 
the genes (specifically, the cistrons) within a cell de- 
termine which proteins among all possible proteins a 
given cell can conceivably synthesize. Such genes are 
thus considered as the primary determinants of a given 
cell’s allowance of proteins. 

There are, however, other determinants of this allow- 
ance which, although secondary to genetic control, are 
of interest in any consideration of the mechanism of 
protein synthesis in that they specifically affect the syn- 
thesis of a single protein. Three phenomena have been 
sufficiently delineated to exemplify such secondary 
specific determinants. These are: 


(1) Specific antibody synthesis resulting from ex- 
posure of cells to a given antigen. 

(2) Induced enzyme synthesis. 

(3) Repression of enzyme synthesis. 

This paper is concerned solely with the last two phe- 
nomena, which quite possibly represent two forms of 
the same basic event. 


Induced enzyme synthesis can formally be defined as 
the increase in the ratio of the rate of synthesis of a 
given enzyme to the rate of synthesis of total cell protein 
resulting from exposure of cells to compounds (inducers) 
which are identical or structurally related to the sub- 
strates of the given enzyme. The vast majority of known 
instances of induced enzyme synthesis is derived from 
microorganisms, particularly the bacteria. This dispro- 
portionate representation is probably not real, for we 
have some reason to suspect that enzyme induction is 
operative in most cell types. Rather, it would seem that 
our observations are selected by the experimental tech- 
niques available for the manipulation of bacterial and 
fungal cell populations which have not, until very re- 
cently, been available for the culture of other cell types. 
A second consequence of this precision of bacterial 
manipulation is that the best-characterized and inter- 
pretable induced enzyme systems are found in the 
bacteria. 
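
The formal definition above can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not a procedure from the paper: the function name is hypothetical and the sample values are invented. The ratio of the two rates of synthesis is estimated between successive samples as the increment of enzyme divided by the increment of total protein, and induction appears as an increase in that ratio.

```python
def differential_rates(enzyme_units, total_protein):
    """Increment of enzyme per increment of total protein between successive samples."""
    if len(enzyme_units) != len(total_protein):
        raise ValueError("the two series must have the same length")
    rates = []
    for i in range(1, len(enzyme_units)):
        d_enzyme = enzyme_units[i] - enzyme_units[i - 1]
        d_protein = total_protein[i] - total_protein[i - 1]
        rates.append(d_enzyme / d_protein)
    return rates


if __name__ == "__main__":
    # Invented measurements; inducer assumed to be added after the third sample.
    protein = [100, 120, 140, 160, 180]  # total cell protein, arbitrary units
    enzyme = [10, 12, 14, 114, 214]      # enzyme activity, arbitrary units
    print(differential_rates(enzyme, protein))  # [0.1, 0.1, 5.0, 5.0]
```

Expressing enzyme synthesis per unit of new protein, rather than per unit of time, keeps the measure independent of how fast the culture happens to be growing.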

Induced enzyme synthesis in bacterial populations 
has been known for approximately seventy years, al- 
though, during a majority of its observed lifetime, this 
phenomenon has been shrouded in a teleological disguise 


z by being named “enzymatic adaptation.” It is only in 


the past decade that quantitative studies have allowed 
one to define this phenomenon as an induced enzyme 
not necessarily related to any increase in 
Rites 


Even in bacterial populations in which there are the 
most numerous examples of enzyme induction, it is quite 
clear that the majority of known enzyme-forming 
systems cannot be classified as inducible in that the 
enzymes are formed at considerable rates in the absence 
of exogenous inducers, and this rate of synthesis cannot 
be specifically increased by exposure of cells to their 
substrates or such structural analogs of these substrates 
as have been tested. Such enzymes are often referred 
to as constitutive enzymes in order to differentiate them 
from the induced variety. It must be stressed that the 
terms induced and constitutive do not describe the 
properties of an enzyme per se, but rather describe the 
properties of an enzyme-forming system. Thus, the 
same protein molecule can result from an induced 
enzyme synthesis in one bacterial population while 
being the resultant of constitutive enzyme synthesis in 
another, genetically distinct, bacterial population. 

There is then an apparent dichotomy among enzyme- 
forming systems, being either inducible or constitutive. 
While at present the data are not sufficient for a unique 
explanation of this dichotomy, an attempt is made here 
to correlate these two systems to a common working 
hypothesis, thereby making the questions to be asked 
more specific and, it is hoped, experimentally answer- 
able. It would not be profitable for the purpose of this 
paper to review the many known systems of enzyme 
induction or, in fact, even one of these systems in all of 
its detail. (For this purpose, see references 2-11). 
Rather, emphasis is placed on the salient features of one 
system that are relevant to a precise experimental illus- 
tration of the definition of induced enzyme synthesis 
given above and to the apparent dichotomy between 
induced and constitutive synthesis. The system of choice 
for this purpose is the β-galactosidase of E. coli, since 
it has been analyzed in perhaps the most detail as 
regards the enzyme protein itself, its induced synthesis, 
and the genetic and physiological relationship between 
its induced and constitutive synthesis. 


II. β-GALACTOSIDASE INDUCTION 
(A) Characteristics of the Enzyme Protein’ 


Since studies of the induction process depend upon 
the measurement of enzyme activity as a measure of 
the amount of enzyme protein present at any given 
time, it is obviously essential that a direct correlation 
between these two quantities can be made for the va- 
riety of conditions employed. This, in turn, demands 
that the catalytic and structural parameters of the enzyme 
in question be sufficiently determined. The β-galactosidase 
of E. coli has been subjected to such determinations, and it 
is useful in the discussion of its induced synthesis if some 
of its basic properties are briefly described. 

(a) The characteristic reaction catalyzed by β-galactosidase 
is represented in Fig. 1, namely, the hydrolysis 
of β-D-galactosides. Many glycosides have been tested 
as substrates or competitive inhibitors of this enzyme, 
yielding the conclusion that the minimum requirement 
for affinity for the active site on the enzyme is the 
existence of the β-D-galactopyranosidic ring. Whereas 
all of the β-D-galactosides tested are substrates, an 
interesting class of compounds, the β-D-thiogalactosides, 
in which sulfur replaces the oxygen atom of the galactosidic 
linkage, function only as competitive inhibitors. 
The interest in these compounds lies in their capacity 
to function as inducers of β-galactosidase synthesis, 
without at the same time being hydrolyzed by the 
enzyme whose synthesis they induce. 

(b) Purified preparations of β-galactosidase have 
been obtained which contain at most 1 to 2% contaminating 
protein.® This allows a direct correlation of the 
unit of catalytic activity* with the mass of enzyme, 
yielding the value of 1 catalytic unit per 3.0×10⁻⁹ g of 
protein. Since Cohn’ has determined by equilibrium-dialysis 
methods that there is one active site per molecular 
weight 1.3×10⁵, activity measurements can yield a 
determination of the number of active sites. Thus, 
one catalytic unit equals 1.4×10¹⁰ such sites. 
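
As a quick check on the arithmetic in (b), the following Python sketch converts the mass of enzyme per catalytic unit into a number of active sites. It assumes the values 3.0×10⁻⁹ g of protein per unit and a weight of 1.3×10⁵ per active site, as read from the text, together with a modern value of Avogadro's number; the constant and function names are illustrative only.

```python
# Illustrative check: active sites per catalytic unit.
AVOGADRO = 6.022e23        # sites per mole (modern value, an assumption here)
MASS_PER_UNIT_G = 3.0e-9   # g of enzyme protein per catalytic unit (from the text)
WEIGHT_PER_SITE = 1.3e5    # g of protein per mole of active sites (from the text)


def active_sites_per_unit(mass_g=MASS_PER_UNIT_G, weight=WEIGHT_PER_SITE):
    """Number of active sites contained in one catalytic unit of enzyme."""
    moles_of_sites = mass_g / weight
    return moles_of_sites * AVOGADRO


sites = active_sites_per_unit()
print(f"{sites:.2e} active sites per catalytic unit")
```

The result is close to the figure of 1.4×10¹⁰ sites per unit quoted above.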


(B) Induction Phenomenon 


A culture of E. coli growing in a medium of inorganic 
salts with a nongalactosidic carbon source, such as succinic 
acid, produces only trace amounts of β-galactosidase. 
The addition of a suitable galactoside to such 
a growing culture is immediately followed by a sharp 
increase of over 1000-fold in the rate of synthesis of 
this enzyme. This high rate of synthesis is maintained 
as long as the bacteria grow in the presence of the in- 
ducing galactoside (inducer). However, if the inducer is 
removed, the rate of synthesis falls directly to the origi- 
nal small value. The quantitative aspects of this situa- 
tion are presented in Fig. 2. As shown in the left-hand 
graph of this figure, from 0 to 120 min the bacteria are 
growing in the absence of the inducer and the amount of 
enzyme per unit weight of bacteria is quite low, approxi- 
mately 7 units per mg dry weight of bacteria, which is 
equivalent to 20 active sites per bacterium. However, 
immediately upon adding the inducer, one sees that the 
amount of enzyme in the culture increases rapidly 
against a background of constant growth rate. Thus, 
while the bacteria have only doubled in amount, the 
β-galactosidase activity has increased by a factor of 
approximately 1200. Plotted in this way, one observes 
the transition of the culture from one steady-state condition 
in which the inducer is absent (noninduced state) 
to a second steady-state condition with inducer present 
(fully induced state). The bacteria in the fully induced 
state contain 8.5×10³ catalytic units per mg dry weight 
of bacteria, or approximately 24×10³ active sites per 
bacterium. This increase of about 1200-fold in the 
amount of enzyme per bacterium in the fully induced 
state over that in the noninduced state means that the 
rate of synthesis of β-galactosidase relative to the rate of 
synthesis of bacterial mass (i.e., the differential rate of 
β-galactosidase synthesis!) has increased via the induction 
process by the same factor of 1200. 

* One unit of catalytic activity is defined as that amount of 
enzyme which will cause the hydrolysis of o-nitrophenyl-β-D-galactoside 
to occur at the rate of 1 mμmole/min at 28°C and 
pH 7.1 in 1.0×10[...] M sodium phosphate and 2.7×10⁻³ M 
o-nitrophenyl-β-D-galactoside. 

Fig. 1. Hydrolysis of a β-D-galactoside 
catalyzed by β-galactosidase. 
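
The per-bacterium figures quoted in this paragraph can be cross-checked with a short Python sketch. It assumes 1.4×10¹⁰ active sites per catalytic unit and infers the number of bacteria per mg dry weight from the noninduced values (7 units per mg, 20 sites per bacterium); that inferred cell count is not stated in the text, and all names are illustrative.

```python
# Illustrative consistency check of the noninduced and fully induced levels.
SITES_PER_UNIT = 1.4e10  # active sites per catalytic unit (assumed from the text)


def cells_per_mg(units_per_mg, sites_per_cell):
    """Bacteria per mg dry weight implied by a units/mg and sites/cell pair."""
    return units_per_mg * SITES_PER_UNIT / sites_per_cell


def sites_per_cell(units_per_mg, n_cells_per_mg):
    """Active sites per bacterium for a given specific activity."""
    return units_per_mg * SITES_PER_UNIT / n_cells_per_mg


n_cells = cells_per_mg(7.0, 20.0)            # inferred from the noninduced state
induced_sites = sites_per_cell(8.5e3, n_cells)
fold_increase = 8.5e3 / 7.0

print(f"bacteria per mg dry weight (inferred): {n_cells:.1e}")
print(f"sites per bacterium, fully induced:    {induced_sites:.0f}")
print(f"fold increase on induction:            {fold_increase:.0f}")
```

Under these assumptions the induced level comes out near 24×10³ sites per bacterium and the ratio near 1200, in agreement with the values in the text.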

It is this differential rate of enzyme synthesis which 
is of particular interest, since it is a more direct measure 
of the specific effect of the inducer on the rate of enzyme 
synthesis. One, therefore, wants to ask the question: 
how does this differential rate of β-galactosidase synthesis 
change during the induction process? This question 
is best answered by plotting the β-galactosidase 
activity of the culture vs the bacterial mass, a procedure 
which yields the right-hand graph of Fig. 2. Here, it is 
seen that the differential rate of β-galactosidase synthesis 
(i.e., the slope of the curve, P) changes from that 
of the noninduced state (7 units per mg dry weight) to 
that of the induced state (8.5×10³ units per mg dry 
weight) almost immediately after the introduction of 
the inducer, the transition taking place in less than 2 
min, which is the minimum time detectable by the 
experimental techniques employed. 
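
A minimal Python sketch may help to show why a constant differential rate P, established at once, is consistent with the roughly 1200-fold rise in activity over one doubling of bacterial mass. It assumes a starting culture of 1 mg dry weight at the noninduced level of 7 units per mg, and an induced slope of 8.5×10³ units per mg of new mass; the function name and the 1 mg starting mass are hypothetical.

```python
# Illustrative model: enzyme accumulates linearly with newly formed bacterial mass.
def enzyme_units(mass_mg, start_mass_mg=1.0, basal=7.0, p_induced=8.5e3):
    """Total enzyme (units) once the culture has grown from start_mass_mg to mass_mg.

    basal      -- units per mg already present when the inducer is added
    p_induced  -- differential rate of synthesis (units per mg of new mass)
    """
    return basal * start_mass_mg + p_induced * (mass_mg - start_mass_mg)


z_start = enzyme_units(1.0)      # at the moment of induction
z_doubled = enzyme_units(2.0)    # after one doubling of bacterial mass

print(f"fold increase in total activity: {z_doubled / z_start:.0f}")
print(f"units per mg after one doubling: {z_doubled / 2.0:.0f}")
```

In this sketch the total activity rises about 1200-fold while the mass merely doubles, and the enzyme content per mg approaches P only as the pre-induction mass is diluted out by further growth.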

This high differential rate of synthesis remains con- 
stant as long as the inducer is present in the medium. 
However, as is shown in both graphs of Fig. 2, the in- 
ducer effect is readily reversible, since removal of the 
inducer immediately restores the differential rate of syn- 
thesis characteristic of the noninduced state, again with 
no appreciable change in the growth rate. Thus, in the 
case of β-galactosidase induction, it would appear that 
one has a system in which the synthesis of a given pro- 
tein can be initiated or stopped by the simple addition 
or removal of a compound which, because it cannot 
function as a carbon or energy source, apparently does 
not influence the rate of synthesis of the vast majority 
of other cell constituents. 


(C) Enzyme Induction as de novo Synthesis 


Consider what is meant by synthesis as used in the 
preceding paragraphs. While it is true that the physical 
and catalytic parameters of the active enzyme are sufficiently 
well known that one can calculate the mass of 
the active enzyme from the measurements of the cata- 
lytic activity present in the culture, it does not neces- 
sarily follow from this knowledge alone that an increase 
in such activity (and, therefore, of the mass of active 
enzyme) results from the de novo synthesis of the active 
enzyme. Within the context of the information so far 
presented, it is conceivable that the increase in active 
enzyme involved in the induction process results from 
the formation of a new protein structure (active enzyme) 
derived from inactive proteins present in the noninduced 
state. This transformation from inactive to active ma- 
terial could result (a) from a zymogen- to enzyme-type 
reaction typical of protease activations, (b) from the 
formation of a complex active protein made up of inac- 
tive subunits, or (c) from the addition of a few amino 
acids in peptide linkage to an inactive protein. In all of 
these cases, the induction process would have to be [...] 
sense that protein precursors of active β-galactosidase 
would be present in the noninduced state, and, consequently, 
that the total protein potentially or actually [...] 
one determine whether the induction phenomenon 
corresponds to the activation of protein precursors or 
to de novo synthesis, before it is possible to assess the 
[...] of this phenomenon as a tool for the study of 
[...] of protein synthesis. 

[...] proof of this correspondence.** The direct proof of this 
correspondence came from experiments carried out in 
collaboration with Cohn and Monod", and simultaneously 
from the work of Rotman and Spiegelman." 

The experiments consist of determining whether or 
not proteins present in noninduced cells are ever incorporated 
in or associated with the induced β-galactosidase 
protein by simply labeling the proteins in the noninduced 
cells with a suitable radioactive isotope and 
measuring the amount of radioactivity present in the 
enzyme when induced in nonradioactive media. Thus, in 
our experiments, E. coli were first grown in a simple salts 
medium in which the sole source of sulfur was S³⁵-labeled 
sulfate. The amount of sulfate was so adjusted relative 
to the other components of the media that the bacteria 
stopped growing for lack of sulfate. In this starved con- 
dition, all of the radioactivity associated with the cells is 
contained within the trichloracetic-acid precipitable 
protein, and since no inducer was added to the medium, 
we have the condition of a unique radioactive labeling 
of the proteins of noninduced cells. This constitutes 
Step I (Table I) of the experiment. 

In Step II, nonradioactive sulfate is added to the 
medium simultaneously with the inducer, methyl-β-D-thiogalactoside. 
This allows growth and induction to 
occur simultaneously. When the mass of the cells had 
increased so that 16% (aliquot A), 28% (aliquot B), 
and 43% (aliquot C) of the total growth had occurred 
in the nonradioactive inducing medium, aliquots of the 
bacterial population were removed, extracted, and, after 
purification of the β-galactosidase in these extracts, the 
amount of radioactivity associated with the enzyme was 
determined. 

[...] enzyme from less than one gram of cells of which only 
0.16% was β-galactosidase. That this was possible is 
demonstrated by the data in Table I relevant to the 
isolation control in which unlabeled enzyme was mixed 
with a labeled extract of noninduced cells, such that the 
ratio of radioactivity to enzyme activity (K) in the 
mixed extract was the same as that for the crude extract 
of aliquot A. By comparing the K-ratio of the enzyme 
isolated from this reconstruction mixture with that isolated 
from bacteria grown and induced in the S³⁵-labeled 
medium (fully labeled enzyme control), it is clear that 
the purification procedure yields an enzyme preparation 
which contains less than one percent bacterial protein 
contamination (0.4% in this case). 

The enzyme purified from aliquots A, B, and C con- 
tained, respectively, 0.1, 0.8, and 0.1% of the radio- 
activity associated with the fully labeled enzyme. Since 
this is within the limits of the purification procedure as 
indicated by the isolation control data, and since these 
values do not progressively decrease with increase in the 
fraction of total growth occurring in Step II, it was con- 
cluded that less than 0.8% of the sulfur of the enzyme 
formed in Step II was derived from proteins present in 
the noninduced state (Step I). This result, taken in 
conjunction with the fact that, in aliquot A, the enzyme 
level was only 5% of that found in fully induced bacteria, 
indicates that if any protein precursor of β-galactosidase 
exists in the noninduced bacteria, its level must be less 
than 0.04% of that for β-galactosidase in fully induced 
bacteria, or less than 10 precursor molecules of molecular 
weight 1.3×10⁵ (i.e., the molecular weight per active 
site of β-galactosidase) per bacterium, assuming of 
course that such precursors contain the same percent 
sulfur as does β-galactosidase. This conclusion is reinforced 
by the fact that Rotman and Spiegelman" found 
essentially the same results using C¹⁴ as the labeling 
material. 
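
The upper bound on precursor molecules follows from two numbers in the preceding paragraph, and the short Python sketch below retraces the step. It assumes that at most 0.8% of the enzyme sulfur came from pre-existing protein, that the enzyme level in aliquot A was 5% of the fully induced level, and that a fully induced bacterium holds about 24×10³ active sites; the variable names are illustrative.

```python
# Illustrative bound on protein precursors of the enzyme in noninduced cells.
MAX_LABEL_FRACTION = 0.008   # at most 0.8% of enzyme sulfur from old protein
ALIQUOT_A_LEVEL = 0.05       # enzyme level in aliquot A relative to full induction
SITES_FULLY_INDUCED = 24e3   # active sites per fully induced bacterium (assumed)

# Precursor level relative to the enzyme content of fully induced bacteria.
precursor_fraction = MAX_LABEL_FRACTION * ALIQUOT_A_LEVEL

# The same bound expressed as molecules (one active site each) per bacterium.
precursor_molecules = precursor_fraction * SITES_FULLY_INDUCED

print(f"precursor level: less than {precursor_fraction * 100:.2f}% of full induction")
print(f"precursor molecules per bacterium: fewer than about {precursor_molecules:.1f}")
```

This reproduces the limits of 0.04% and fewer than 10 precursor molecules per bacterium, under the same assumption of equal sulfur content made in the text.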

The results of these experiments allow one to conclude 
that, if there are any protein precursors existent in 
noninduced cells, they cannot form an appreciable contribution 
to the increase in active enzyme found upon 
the addition of the inducer, since it would take only a 
few seconds to form 10 new active sites per bacterium 
under the conditions of induction indicated in Fig. 2. 
Thus, when cells are exposed to inducer molecules the 
over-all rate of conversion of amino acids into a specific 
protein, β-galactosidase, is drastically and immediately 
increased, and consequently induced enzyme formation 
is equivalent to induced de novo synthesis. It would 
appear then, that the induction phenomenon should be 
accounted for in any working hypothesis that attempts 
an explanation of the general mechanism of protein 
synthesis. 

TABLE I. Incorporation of sulfur into β-galactosidase synthesized 
by labeled E. coli cells in nonradioactive medium. 

                          Percent maximal   K = radioactivity/   Percent of K for
Experiment                enzyme level      enzyme activityᵃ     fully labeled enzymeᵇ
Step I                    0.06              (0.45)               (100)
Step II, aliquot A        4.8               0.0050               0.1
         aliquot B        32                0.0043               0.8
         aliquot C        58                0.00072              0.1
Controls
  Fully labeled enzyme    100               0.45                 100
  Isolation controlᶜ      (4.8)             0.0018               0.4

ᵃ Corrected for basal activity. 
ᵇ Basal level assumed to be equal to fully labeled enzyme. 
ᶜ Reconstruction of Extract A by mixing unlabeled enzyme with a 
labeled extract of noninduced cells. 

Two further conclusions resulted from our labeling 
experiments. The first stems from the preceding experiment 
and states that, in exponentially growing E. coli 
cells, the vast majority of non-β-galactosidase proteins 
are stable in the sense that they are not broken down 
to their constituent amino acids at rates that are appreciable 
in relation to the rates of synthesis. If the 
state of proteins within these cells consists of a continual 
synthesis from and breakdown to their constituent 
amino acids (i.e., state of “dynamic equilibrium”), then 
one would expect the β-galactosidase synthesized in 
Step II of the above experiment to be labeled with S³⁵ 
as a result of the breakdown of the radioactive proteins. 
A simple calculation from the data of this experiment 
yields the conclusion that the average rate of breakdown 
of the non-β-galactosidase proteins must be less than 
one percent of their average rate of synthesis. 

It is, therefore, not surprising to recall that, when the 
inducer is removed from an exponentially growing culture 
of E. coli cells (Fig. 2), the β-galactosidase formed 
while inducer was present remains constant, simply 
being diluted out in the exponentially increasing population. 
This would indicate that β-galactosidase is also 
stable once formed. Because of the possibility that this 
constancy of enzyme activity results from an equal rate 
of synthesis and degradation of β-galactosidase, and to 
test whether or not β-galactosidase is also stable in the 
presence of the inducer, we carried out further S³⁵ 
labeling experiments. These experiments yielded the 
expected conclusion that the β-galactosidase molecules 
in exponentially growing cells are stable in both the 
absence and the presence of inducers. Thus, it can be 
concluded that the induced de novo synthesis of β-galactosidase 
is, like the synthesis of other proteins in E. coli, 
essentially an irreversible process, and the so-called 
“dynamic state” is not a concept that need be involved 
in an explanation of the mechanism of such synthesis. 
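
The dilution of a stable enzyme after removal of the inducer can be illustrated with a small Python sketch. It assumes a fully induced culture at 8.5×10³ units per mg, no breakdown of existing enzyme, and continued synthesis at the noninduced differential rate of 7 units per mg of new mass; the function name and parameter names are hypothetical.

```python
# Illustrative model: stable enzyme is diluted by growth once the inducer is removed.
def units_per_mg_after_removal(generations, induced_level=8.5e3, basal_rate=7.0):
    """Enzyme per mg dry weight after a number of mass doublings without inducer.

    Existing enzyme is conserved (no breakdown); new mass carries only the
    basal differential rate of synthesis.
    """
    growth = 2.0 ** generations                    # relative increase in bacterial mass
    total_enzyme = induced_level + basal_rate * (growth - 1.0)
    return total_enzyme / growth


for g in (0, 1, 2, 5, 10):
    print(g, round(units_per_mg_after_removal(g), 1))
```

In this sketch the enzyme content per mg halves with each generation and tends toward the noninduced level, while the total enzyme formed during induction is never lost.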


(D) Induction at the Cellular Level 


While it is clear from the foregoing discussion that 
one may interpret the kinetics of formation of β-galactosidase 
activity during the induction of a bacterial 
population as the kinetics of de novo synthesis of this 
enzyme in that population, one can only extrapolate 
these kinetics to the cellular level if one makes the assumption 
that all cells participate equally and simultaneously 
in this induced synthesis. Even though one 
can demonstrate a high degree of genetic homogeneity 
for the bacterial population being induced, there is no 
a priori promise that this assumption of equal and 
simultaneous participation of all cells is justified. It is, 
therefore, critical to the correct interpretation of the 
kinetics of induced enzyme synthesis in cultures that 
one have a knowledge of the distribution of rates of 
induced synthesis among the cells of the bacterial 
population. 

There are two known types of conditions that can 
cause a heterogeneous response of a bacterial population 
when exposed to inducer molecules. The simplest 
of these types is a condition whereby the catalytic 
activity of the first enzyme molecules synthesized by 
a given cell increases the probability of synthesis of 
future enzyme molecules by that cell. As an example of 
this condition, consider a culture of E. coli whose growth 
has stopped because of exhaustion of glucose in the 
medium. If lactose (4-glucose-β-D-galactoside) is added 
to such a culture, it will have two functions: to induce 
the synthesis of β-galactosidase and, as a substrate of 
this enzyme, to provide the only available carbon and 
energy source necessary for cell synthesis. Under these 
circumstances, the induced synthesis of the first enzyme 
molecules must depend upon traces of enzyme already 
present or upon internal bacterial reserves. Since synthesis 
of the first enzyme molecules by any cell will 
increase the availability of carbon and energy for further 
synthesis by that cell, the cells which have a head start 
will increase their advantage. Thus, any initial heterogeneity 
in the population with regard to internal reserves 
or initial amount of β-galactosidase would be 
expected to result in an exaggerated heterogeneity in 
respect to enzyme content during the initial stages of 
growth on lactose. This expectation has been verified 
experimentally by Benzer.!® 

Elimination of this factor favoring heterogeneous response 
is quite simple and consists of using an inducer 
which is not a substrate of β-galactosidase (e.g., the 
β-D-thiogalactosides), or alternatively of employing mutant 
bacteria which cannot further metabolize the products 
of hydrolysis of the inducer, while at the same time 
providing a nongalactosidic carbon and energy source 
(e.g., succinic or lactic acid) for cell synthesis. These 
conditions have been defined as “conditions of gratuity” 
by Monod and Cohn? and are the conditions employed 
in the experiment illustrated in Fig. 2. Benzer!® found 
that, in the induced synthesis of β-galactosidase under 
conditions of gratuity and at saturating levels of inducer 
concentration, the bacterial population exhibits a homogeneous 
[...] the same degree. Hence, under these conditions, the 
kinetics of induction of a culture represent the kinetics 
of induction of single cells. 
[...] most readers of Benzer’s work concluded 
[...] of gratuity were a sufficient guarantee 
for the assumption of population homogeneity during 
induction, Benzer himself cautioned against applying 
this assumption to conditions of less than saturating 
concentrations of inducer even though the induction was 
“gratuitous.” Indeed, a second more subtle factor caus- 
ing heterogeneous response at less than saturating con- 
centrations of inducer has been discovered by Ricken- 
berg, Cohen, Buttin, and Monod! (also referred to in 
reference 9). Thus, these workers found that, in wild-type 
E. coli, the β-galactosidase induction can be divided 
into two separate processes. The first of these involves 
an active transport of the inducer into the cell such that 
inducer concentrations inside the cell are much greater 
than those in the medium. The second is the actual induction 
of β-galactosidase synthesis inside the cell at a 
rate determined by the internal concentration of inducer. 
The active transport of inducer is accomplished 
by a unit, called galactoside-permease, which has been 
demonstrated to have most of the characteristics usually 
associated with enzymes and whose synthesis is induced 
by many of the same compounds which induce β-galactosidase 
synthesis, again at a rate determined by the 
internal concentration of inducer. 

This transport mechanism obviously introduces an- 
other factor that would be predicted to encourage a 
heterogeneous response of a population upon exposure 
to inducer, providing the probability that a given cell 
will produce its first permease molecule per smallest 
detectable time unit is small. Thus, those cells which do 
synthesize their first permease unit will have an in- 
creased internal concentration of inducer which will, in 
turn, increase the probability of synthesizing the second 
permease unit above that for the first. Thus, those cells 
which by chance are the first to synthesize permease 
units will increase their advantage and rapidly pass to 
the stage of saturating internal concentrations of inducer 
and maximum rate of synthesis of both the permease 
and β-galactosidase. The distribution of rate of enzyme 
synthesis in the bacterial population at any given time 
after addition of the inducer will thus depend upon the 
probability that a given cell will synthesize its first 
permease unit and upon the time interval between this 
event and when maximum rate of synthesis is reached. 

Novick and Weiner!” and Cohn® have analyzed the 
kinetics of β-galactosidase induction in the wild-type 
E. coli, which is inducible for both the galactoside-permease 
and β-galactosidase. They have found that 
these kinetics are consistent with the foregoing theory 
with the addition that the probability of synthesis of the 
first permease units is determined by the inducer con- 
centration, and that the time interval between this 
event and when maximum synthetic rate is achieved is 
very small, even at the lowest inducer concentrations 
studied. This means that, at inducer concentrations well 
below saturation, the distribution of rates of enzyme 
synthesis at any given time is essentially an all-or-none 
distribution; i.e., cells are synthesizing enzyme either at 
the maximum rate or at the minimum noninduced rate. 

Fig. 3. Kinetics of β-galactosidase induction in normal E. coli 
and in the cryptic mutant. Inducer is isopropyl-β-D-thiogalactoside 
at the concentrations indicated above [from L. A. Herzenberg, 
“Studies on the induction of β-galactosidase in a cryptic strain 
of Escherichia coli,” Biochim. et Biophys. Acta (to be published)]. 

As this induction at low inducer concentrations continues, 
a larger and larger fraction of the bacterial population 
is synthesizing enzyme at the maximum rate, and, 
consequently, a plot of enzyme activity vs increase in 
bacterial mass during the induction of such a culture will 
yield a curve with a slope (differential rate of synthesis) 
that increases with increase in growth to some final 
constant value (left-hand graph of Fig. 3). As the ex- 
ternal inducer concentration is raised, the probability 
of synthesis of the first permease unit increases, and, 
consequently, the time duration of the phase of increase 
in differential rate of synthesis, or what may be termed 
the heterogeneous phase, decreases until it is no longer 
detectable. Under these conditions of saturating inducer 
concentration, for all intents and purposes, the distribution 
of rates of β-galactosidase synthesis in the population 
at any given time after addition of inducer is 
uniform, and, consequently, the kinetics of induction 
of the culture represents the kinetics of the individual 
cells. 
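
The all-or-none picture described above can be sketched as a simple expectation model in Python. It assumes that each cell switches irreversibly from the noninduced to the maximum rate with a constant probability per minute, that this probability rises with external inducer concentration, and that switched and unswitched cells grow equally fast; the rate constants, times, and function names are hypothetical and are not taken from the experiments cited.

```python
import math

P_MAX = 8.5e3    # differential rate in fully induced cells (units per mg)
P_BASAL = 7.0    # differential rate in noninduced cells (units per mg)


def induced_fraction(t_min, k_per_min):
    """Expected fraction of cells that have switched on by time t_min."""
    return 1.0 - math.exp(-k_per_min * t_min)


def culture_differential_rate(t_min, k_per_min):
    """Population-averaged differential rate for an all-or-none distribution."""
    f = induced_fraction(t_min, k_per_min)
    return f * P_MAX + (1.0 - f) * P_BASAL


# Low, intermediate, and effectively saturating switching probabilities.
for k in (0.01, 0.1, 10.0):
    rates = [round(culture_differential_rate(t, k)) for t in (2, 20, 60)]
    print(f"k = {k:>5} per min -> differential rate at 2, 20, 60 min: {rates}")
```

With a small switching probability the slope of the culture rises gradually, as in the heterogeneous phase; with a large one the slope is already at its maximum within the first two minutes, so the culture behaves like the individual cell.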

However, it is quite clear that wild-type E. coli, in 
which both the galactoside-permease and the β-galactosidase 
are inducible, cannot be used to define the 
kinetics of induction of single cells at less than saturating 
concentrations of inducer. This could be accomplished 
if the galactoside-permease could be eliminated, thereby 
removing the last contributing factor toward heterogeneous 
response. The most convenient and complete 
method of such elimination is by the isolation and use of 
E. coli mutants which have lost the ability to form 
the permease, but retain the property of being 
inducible for β-galactosidase. Such mutants have [...] 
are commonly called inducible cryptics, [...] 
while β-galactosidase can be induced in these 
strains, because they lack the permease, the 
enzyme is essentially hidden from external substrate 
(e.g., lactose), the rate of hydrolysis in whole cells being 
limited by passive diffusion of the substrate. 

One of these inducible cryptics (E. coli ML3) has 
been used extensively by Herzenberg!® to define the 
kinetics of induction throughout the range of inducer 
concentrations that produce measurable increases in the 
rate of β-galactosidase synthesis. It is also the strain 
used in the experiment depicted in Fig. 2. An interesting 
result of the experiments of Herzenberg is that, for each 
inducer concentration tested, the differential rate of 
synthesis remains constant from the time of addition of 
the inducer, although the actual value of this rate is 
determined by the inducer concentration (Fig. 3). Under 
the assumption that, by employing conditions of gratuity 
and the inducible cryptic strains of E. coli, all 
factors encouraging a heterogeneous response during in- 
duction have been eliminated, one can interpret these 
kinetics of constant differential rate of synthesis for the 
culture as representing the kinetics of induction of in- 
dividual cells. It must, however, be noted that, while 
this somewhat tortuous path of discovery and elimina- 
tion of the sources of heterogeneous response during the 
induction process in E. coli wild-type makes this as- 
sumption quite reasonable, there is at present no direct 
experimental evidence available that completely justi- 
fies it. 

If one admits this assumption, then the kinetics of 
constant differential rate of β-galactosidase synthesis 
during induction of the inducible cryptic strains (Fig. 3) 
allows one to conclude that the number of enzyme-forming 
units per cell that are activable by the inducer 
remains constant during induction. For, if the number 
of these units per cell should change during induction, 
then, in the presence of a constant subsaturating internal 
concentration of inducer, the differential rate of 
synthesis should change proportionately. The precision 
of the conclusion as to the constancy of β-galactosidase-forming 
units per cell is equivalent to the precision of 
the experimental observation of constant differential 
rate of synthesis. Consequently, if any change in the 
number of these units occurs during induction, it must 
occur in every cell of the population within two minutes 
of the moment the inducer is added. Since this seems 
extremely unlikely, it can be inferred that the same 
number of enzyme-forming units exist in noninduced 
cells as in induced cells, these units being active only 
in the presence of the inducer. 

[...] galactosides demonstrating the independence of inducer and 
substrate functions. 

(E) Independence of Substrate and Inducer 
Functions 


A corollary to this conclusion is that the catalytic 
activity of β-galactosidase is not functional in its own 
synthesis; i.e., the process of enzyme induction is independent 
of enzyme action. The kinetic derivation of 
this postulate is confirmed by the observation that the 
property of being a substrate for β-galactosidase is 
neither a sufficient nor necessary condition for a compound 
to function as an inducer. Thus, phenyl-β-D-galactoside 
is an excellent substrate of β-galactosidase, 
but is not functional as an inducer. Methyl-β-D-thiogalactoside, 
on the other hand, is not a substrate of the 
enzyme, but is an excellent inducer. The inverse relationship 
of these two compounds is demonstrated quite 
clearly by a simple growth experiment shown in Fig. 4. 
An inducible strain of E. coli was grown in a medium 
containing galactose as the sole carbon source until 
growth stopped (ca 60 min) as a result of depletion of 
galactose. Since galactose does not induce β-galactosidase 
synthesis, these starved cells contain only trace 
amounts of the enzyme. After waiting a short time to 
ensure complete starvation, the starved cells were divided 
[...] parts, and the galactosides indicated in Fig. 4 
[...] sole carbon source. The cells did not 
grow in the presence of methyl-β-D-thiogalactoside because, 
although it can induce β-galactosidase synthesis, 
it is not a substrate and, consequently, cannot function 
as the necessary carbon source. No growth is observed 
in the presence of phenyl-β-D-galactoside, because it is 
not an inducer of the enzyme necessary for it to function 
as a carbon source. However, when the two galactosides 
are added together, growth occurs since methyl-β-D-thiogalactoside 
functions as the inducer and phenyl-β-D-galactoside 
serves as the substrate, yielding galactose, 
the necessary carbon source. 
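
The logic of this growth experiment can be written out as a small Python sketch: growth requires that at least one added compound acts as inducer and at least one acts as substrate. The two sets below contain only the compounds named in the paragraph above; the names of the sets and of the function are illustrative.

```python
# Illustrative truth table for the growth experiment with starved inducible cells.
INDUCERS = {"methyl-beta-D-thiogalactoside"}
SUBSTRATES = {"phenyl-beta-D-galactoside"}


def culture_grows(added_compounds):
    """True if the added galactosides both induce the enzyme and feed the cells."""
    added = set(added_compounds)
    enzyme_induced = bool(added & INDUCERS)      # some compound induces synthesis
    carbon_available = bool(added & SUBSTRATES)  # some compound can be hydrolyzed
    return enzyme_induced and carbon_available


for mixture in (
    ["methyl-beta-D-thiogalactoside"],
    ["phenyl-beta-D-galactoside"],
    ["methyl-beta-D-thiogalactoside", "phenyl-beta-D-galactoside"],
):
    print(mixture, "->", "growth" if culture_grows(mixture) else "no growth")
```

Only the mixture of the two galactosides satisfies both conditions, which is the separation of inducer and substrate functions that the experiment is meant to show.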

In addition to the qualitative data as to independence 
of substrate and inducer functions, quantitative measurements 
of the ability of many thiogalactosides to 
induce β-galactosidase synthesis and the ability to complex 
with the active site of this enzyme show no correlation. 
Thus, phenylethyl-β-D-thiogalactoside is an extremely 
good competitive inhibitor of β-galactosidase 
action (K_i = 1.5×10⁻[...] M), but can induce the synthesis 
of β-galactosidase to a rate that is only one-eightieth 
that obtainable with methyl-β-D-thiogalactoside, although 
this latter compound is about 800-fold less efficient 
as a competitive inhibitor (K_i = 1.2×10[...]). On 
the other hand, melibiose, an α-galactoside which is 
neither a substrate nor an effective competitive inhibitor 
of β-galactosidase, is quite effective as an inducer of this 
enzyme. Thus, it would appear that, in activating the 
enzyme-forming units within the cell, the inducer reacts 
with some material having a different specificity than 
that associated with the active site of β-galactosidase. 

There remains, however, one minimum structural re- 
quirement common both to inducers and to substrates 
or competitive inhibitors—namely, that they contain 
the galactopyranosidic ring structure.† The inference, 
then, is that the site at which the inducer reacts to 
activate the enzyme-forming unit is structurally similar 
but not identical to the active site of β-galactosidase. 

In concluding this experimental definition of induced 
enzyme synthesis, the main conclusions that have been 
obtained are listed below. 


1. The induction process involves the complete de 
novo synthesis of the enzyme from its constituent amino 
acids. 

2. Induced enzyme synthesis is a virtually irreversible 
process, the enzyme being stable in the presence or 
absence of inducer. 

3. The number of enzyme-forming units per cell re- 
mains constant during the induction process; that is, 
the inducer does not change the rate of synthesis of 
enzyme-forming units, but simply activates such units. 

4. The site at which the inducer reacts to activate 
the enzyme-forming unit is structurally similar but not 
identical to the active site on the enzyme. 


† This is strictly true only if the C6 carbon is not considered 
part of the ring, since the α-L-arabinosides (which are derivatives 
of β-D-galactosides lacking C6) exhibit the property of being 
weak substrates for β-galactosidase. 

In addition to the above conclusions, one should mention 
another of equal importance but which, at present, 
cannot be derived from data on β-galactosidase induction. 
It is that the inducer acts as a catalyst in the sense 
that one molecule of inducer may cause the formation 
of more than one molecule of enzyme. This observation 
comes from experiments concerning the penicillin-in- 
duced synthesis of the enzyme penicillinase in Bacillus 
cereus.'' It should be noted that each of the foregoing 
conclusions has been experimentally confirmed in only 
one, or, at the most, a very few systems and that, con- 
sequently, their generality remains to be shown. 


III. CONSTITUTIVE SYNTHESIS OF
β-GALACTOSIDASE


Before analyzing some of the hypotheses attempting 
to explain the mechanism of induction, it is useful to 
first describe the phenomenon of the constitutive syn-
thesis of β-galactosidase in E. coli and the origin of the
bacteria responsible for it. 

If one examines a β-galactosidase constitutive strain
of E. coli that is growing in the absence of an inducer,
one will find that β-galactosidase synthesis occurs at a
very high rate, 1000-fold or more greater than is found
for the inducible strains growing under the same con-
ditions. Furthermore, the addition of an inducer to such
a constitutive culture does not augment this already
high rate of synthesis. Thus, the constitutive strains
synthesize β-galactosidase in the absence of inducer at
rates which are apparently maximal for these cells and
which are approximately the same as the maximum rates
of induced β-galactosidase synthesis found in the in-
ducible strains.

The question that is immediately raised in such a
comparison of constitutive and induced β-galactosidase
synthesis is whether one is concerned with the synthesis
of the same protein molecule in each case or whether
two different proteins are included in the above use of
the term β-galactosidase, one being synthesized in in-
ducible strains and the other in constitutive strains.
This question has been answered with some certainty
in favor of identity.* Thus, the titration of catalytic
units precipitable by a given quantity of specific anti-
β-galactosidase sera yielded the same results with en-
zyme preparations derived from constitutive bacteria as
with those derived from induced cells, whether the anti-
body was formed in response to the induced or to the
constitutive enzyme. This not only demonstrates the
antigenic identity of the two enzymes, but also indicates
that their turnover numbers (catalytic units per mole-
cule) are the same. Similarly, an extensive comparison
of the kinetic constants for several substrates, for
sodium- and potassium-ion activation, and for thermal
inactivation of the β-galactosidase in these two prepara-
tions did not reveal any measurable difference.

Taking these observations in good grace as sufficient
evidence for the identity of the β-galactosidase in con-
stitutive and induced cells, one should expect the basic
genetic unit or units necessary for the synthesis of this
protein to be common to both cell types, assuming, of
course, that one accepts the tenets of the DNA-protein
doctrine presented by Levinthal (p. 227). What then
is the genetic difference between inducible and consti-
tutive E. coli?

The following series of mutation sequences indicates 
the relationship between inducible and constitutive 
strains¹⁶,²⁰,²¹:


Wild type: y⁺z⁺i⁺           →           y⁺z⁺i⁻
β-galactosidase inducible             β-galactosidase constitutive
permease inducible                    permease constitutive
        ↓                                     ↓
      y⁺z⁻i⁺                                y⁺z⁻i⁻
β-galactosidase absolute negative     β-galactosidase absolute negative
permease inducible                    permease constitutive


In this scheme, the genetic units have the following 
meaning: 


1. i⁺ → i⁻ designates the mutation involving a change
of state from inducible (i⁺) to constitutive (i⁻) for both
permease and β-galactosidase.

2. z⁺ → z⁻ designates the mutation involving damage
to the z⁺ unit necessary for β-galactosidase synthesis
such that the enzyme cannot be made.

3. y⁺ indicates the unit necessary for the synthesis
of the galactoside-permease, a y⁺ → y⁻ mutation (not
shown) being that involved in formation of the cryptic,
permease negative mutants previously mentioned.


This genotypic interpretation of the mutation data
is consistent with data from recombination studies”
and, as well, with a unique type of cis-trans test demonstrat-
ing that y⁺ and z⁺ involve different cistrons.”

From the fact that the β-galactosidase proteins syn-
thesized by constitutive and induced bacteria are identi-
cal, and from the genetic evidence demonstrating that
the change resultant from mutation of inducibles (i⁺)
to constitutives (i⁻) involves genetic units (cistrons)
different from those involved in the formation of ab-
solute negatives (z⁻), one comes to the conclusion that
the mechanism of synthesis of β-galactosidase in induced
and constitutive cells does not differ in its essentials;
that is, the same amino acids and the same mechanisms
of ordering such amino acids are employed in each case.
Though slightly more specific, this conclusion is equiv-
alent to the general unitary hypothesis for enzyme
synthesis emphasized by Cohn and Monod.‘

Thus, in addition to conforming to the conclusions
drawn directly from the induction phenomenon (Sec.
II), any useful hypothesis for induced enzyme syn-
thesis must also explain the inducible to constitutive
transition via the agency of a single-step mutation not
involving a change in the basic genetic units determining
the structure of the enzyme.
IV. HYPOTHESES FOR INDUCED ENZYME
SYNTHESIS


The point of departure for all hypotheses purporting
to explain enzyme induction within the context of the
unitary principle is the choice between two possibilities:
(1) an inducer is an essential part of the minimum re-
quirements for rapid enzyme synthesis (i.e., the “gen-
eralized induction principle”*); or (2) the converse,
inducers are not part of these minimum requirements.
Since there exists no direct experimental evidence
which determines this choice, it is necessary to analyze
the consequences of both possibilities. A number of
schemes have been developed that invoke the inducer
as an essential element of the minimum requirements
for enzyme synthesis.‘~* It is not proposed to analyze
these here, since they either do not attempt to explain
at what point and in what manner the inducers act in
the over-all conversion of amino acids to enzyme mole-
cules,‡ or they are not consistent with the conclusions
formed in Sec. II from data on β-galactosidase induc-
tion.§ Rather, a model of enzyme synthesis is presented
here in which the necessity of inducer participation can
easily be visualized and which is consistent with the con-
clusions drawn thus far. The initial assumption in this
construction is that the inducer is neither necessary
nor functional in determining the amino-acid order of
the induced enzyme, but instead acts to increase the
rate of formation of its secondary or tertiary structure.
This assumption has its basis in the coding theory of
DNA function in which the amino-acid order in a given
protein is uniquely determined by the nucleotide order
in the DNA of some functional gene (cf. Levinthal,
p. 249).

A diagrammatic representation of how an inducer 
might exert a catalytic function in tertiary structure 
formation is given in Fig. 5. In this diagram, the semi- 
circle of capital letters represents a template of un- 
specified chemical composition (DNA, RNA, or other) 
whose structure is determined by a functional gene and 
which in turn determines the order of amino acids 
(lower-case letters) making up a specific polypeptide. 
Each capital letter then represents a code symbol for 
one amino acid, and it is assumed that the physical 
structure which determines a given code symbol specifi- 
cally binds only one amino acid. Peptide-bond formation 
between amino acids is imagined to occur on this tem- 
plate with the condition that the binding force between 
a given amino acid and its code structure is not lost by
the formation of the polypeptide. Consequently, poly-
peptide and template form a relatively stable complex,
which momentarily may assume any one of many par-
tially dissociated states. Certain of these transitory,
partially dissociated configurations will be favorable for
the formation of the tertiary structure necessary for
enzyme formation (represented here by the formation
of a disulfide bond via the oxidation of two sulfhydryl
residues in the polypeptide chain). The formation of
such tertiary structure is assumed to be concomitant
with dissociation of polypeptide and template, thus
allowing the formation of free enzyme.

‡ The “organizer” hypotheses*’® insist that a derivative of the
inducer (the “organizer”) is an essential in enzyme synthesis but
leave unspecified the mechanism by which such a derivative
[...] enzyme synthesis.

§ The [...] by Yudkin* and its extensions®
[...] induction be a reversible process in which
[...] inducers of enzyme synthesis parallel the speci-
ficity [...] and competitive inhibitors of this enzyme,
[...] inconsistent with the data derived from [...]

With no inducer present, such favorable states are 
considered to be extremely short-lived, so that the prob- 
ability of formation of the disulfide bond pictured in 
Fig. 5 is quite small. As a consequence, the rate of 
enzyme synthesis is very low. 

The assumed function of the inducer, or some product 
derived from it, (X), is to stabilize the favorable con- 
figurations by interacting with them to form a complex 
of the type represented by the middle left-hand struc- 
ture in Fig. 5. Such a stabilization would result in an 
increase in the probability of the critical disulfide- 
bond formation and, therefore, in the rate of enzyme 
formation. 
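
The dependence of the rate of enzyme formation on the lifetime of the favorable configuration can be made concrete with a small numerical sketch. It is not part of the model as published: it assumes, purely for illustration, that bond formation and decay of the favorable state are competing first-order processes, and the names `bond_probability` and `enzyme_rate`, as well as all numerical values, are hypothetical.

```python
def bond_probability(k_bond: float, lifetime: float) -> float:
    """Chance that the disulfide bond forms during one visit to a favorable state.

    Assumption (not from the text): the favorable state decays at rate
    1/lifetime while bond formation competes with it at rate k_bond,
    so P = k_bond / (k_bond + 1/lifetime).
    """
    if k_bond < 0 or lifetime <= 0:
        raise ValueError("k_bond must be >= 0 and lifetime must be > 0")
    return k_bond * lifetime / (1.0 + k_bond * lifetime)


def enzyme_rate(visits_per_min: float, k_bond: float, lifetime: float) -> float:
    """Enzyme molecules released per template per minute under the same assumption."""
    return visits_per_min * bond_probability(k_bond, lifetime)


# Hypothetical numbers: the inducer (or its derivative X) lengthens the
# lifetime of the favorable configuration from 0.001 min to 1 min.
uninduced = enzyme_rate(visits_per_min=60.0, k_bond=1.0, lifetime=0.001)
induced = enzyme_rate(visits_per_min=60.0, k_bond=1.0, lifetime=1.0)
print(f"uninduced: {uninduced:.4f} per min, induced: {induced:.1f} per min")
```

Under these assumed values, stabilization alone raises the rate several hundredfold without any change in the template or in the amino-acid order, which is the qualitative point of the model.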

This mechanism is offered more as an aid in visualiz-
ing how one can imagine the necessity of inducer partici-
pation in protein synthesis, rather than as a unique
representation of this possibility. However, it is con-
sistent with the conclusions developed concerning β-
galactosidase induction. Thus, it accounts for the de
novo synthesis of β-galactosidase during induction, since
the level of polypeptide precursors of the enzyme in
noninduced cells could be no greater than the number
of templates per cell. Indeed, the calculated limit of
ten or less such precursor molecules per noninduced
cell is not an unreasonable limit for the number of
templates per enzyme per cell. The model also allows
for an essentially irreversible synthesis of β-galactosidase
during the induction of growing populations; for the
activation of enzyme-forming sites (templates) by in-
ducers and the constancy of such sites during induction;
and, finally, for the similarity but nonidentity of the
specificity of inducers and substrates (or competitive
inhibitors) of the enzyme. It is perhaps of interest to
note that, insofar as reversal of the synthetic reaction
in the model is limited by competition of activated
amino acids and enzyme for the template (the enzyme
having a very small affinity for the template as com-
pared to the precursor polypeptide), the rate of any
possible reverse reaction would be predicted to be greater
under conditions of low levels of activated amino acids
(e.g., in nitrogen-starved cells) than when these com-
pounds are plentiful (e.g., in fast-growing cells).

FIG. 5. Model for inducer function under the
generalized induction hypothesis.

With respect to this model, or any model that invokes 
an inducer as an essential in the minimum requirements 
for the synthesis of a given enzyme, the two genetic 
states of inducible and constitutive β-galactosidase syn-
thesis can be interpreted by assuming that constitutive
cells contain an endogenous inducer, and that inducible 
cells are unable to synthesize such an inducer for lack 
of a necessary enzyme. On this basis, the mutation of 
an inducible strain to a constitutive can be viewed as a 
repair of a damaged genetic unit responsible for the 
synthesis of the postulated enzyme. 

Turning now to the second possibility—namely, that 
an inducer is not a necessary component of the minimum 
system for enzyme synthesis—one must accept the con- 
stitutive condition as representing this minimum system. 
A model built for such a system could be like that given 
in Fig. 5, with two important differences: (1) The favor- 
able state leading to tertiary structure could be stabi- 
lized by the form of the template itself, thus eliminating 
the need for inducer; and (2) the affinity of the activated 
amino acid for its code structure on the template would 
have to be lost as a result of peptide-bond formation 
with its nearest neighbors. The latter condition could be 
allowed for if one supposes that the activated amino 
acid consists of the amino acid covalently linked to a 
residue, R, which is unique for each amino acid, and 
that the code structure on the template has an affinity 
for R rather than the amino acid itself. If peptide-bond
formation entails the breaking of the R-amino acid
bond, then the polypeptide thus formed is quite free to
separate from the template.

However, regardless of the model one builds for the
synthesis of an enzyme without inducer (endogenous or
exogenous) participation, the induced enzyme synthesis
must differ from constitutive synthesis, either by a
difference in type of template or by the existence of an
inhibitor of normal synthesis (i.e., constitutive type
synthesis) in the inducible strain, an inhibition which is
relieved by the addition of an exogenous inducer. In the
first of these possibilities, one must imagine that muta-
tion from inducible to constitutive involves a change in
template type from one requiring an inducer for syn-
thesis (i.e., as in Fig. 5) to one in which no inducer is
required. Furthermore, since the β-galactosidase mole-
cules synthesized constitutively and by induction are
identical, one must suppose that this change in template 
type does not involve a change in the order of the amino- 
acid code. This seems to be asking a lot from a single 
mutation, but since almost nothing is known of the 
template structure, one cannot say, a priori, that such 
a change is impossible. 

The second possibility—namely, that induction is the 
release of inhibition of enzyme synthesis—is actually 
quite an old idea, but in the past it was generally offered 
facetiously as the bête noire of generalized induction 
theories, since it could not be eliminated. However, 
recent developments demand more positive considera- 
tion of this idea. The source of these developments lies 
in the observation of what Vogel:?6 has termed “en- 
zyme repression.” Repression is the inhibition of the 
differential rate of synthesis of an enzyme resulting from 
exposure of cells to a given substance (‘‘repressor’’). 
The word repression was adopted simply to avoid con- 
fusion with the term “enzyme inhibition,” which by 
long usage implies the inhibition of the catalytic func- 
tion of a given enzyme and not of its synthesis. One of 
the best examples of enzyme repression is observed 
among the constitutive enzymes included in the bio- 
synthetic pathway leading to the synthesis of ornithine, 
citrulline, and finally of arginine in E. coli. Thus, Vogel*® 
found that arginine is a specific and very effective re- 
pressor of the synthesis of acetylornithinase. Similarly, 
Gorini and Maas” have shown that arginine also re- 
presses the synthesis of ornithine transcarbamylase. 
Perhaps the most significant implication of these ob- 
servations is that there exists within the cell a mecha- 
nism of feedback control between the result of the 
catalytic action of the group of enzymes in a biosyn- 
thetic pathway (in this case, the synthesis of arginine) 
and the synthesis of these enzymes. For present pur- 
poses, however, they give plausibility to the supposition 
that the inducible strains could represent a case of re-
pression of the constitutive synthesis of β-galactosidase
by endogenous repressors present in such strains, but
not in the constitutive strains. Indeed, it is known that
the constitutive synthesis of β-galactosidase in E. coli
can be inhibited by exposure of cells to galactose or one
of a variety of β-D-galactosides'; i.e., in the constitutive
strains, these substances act as exogenous repressors of 
enzyme synthesis. 

One may, therefore, term this second possibility the 
repressor hypothesis. It states that in constitutive 
strains enzyme synthesis occurs without the aid of any 
inducer; that such synthesis is inhibited in the inducible 
cells because of repressor substances synthesized by
these cells and not by constitutive cells; and that an
inducer destroys this inhibition of enzyme synthesis
either by combining with the repressor to yield a non-
inhibiting complex, by competing with the repressor for
some site on the template (or template-polypeptide
complex) which, if occupied by the inducer, does not
inactivate the template, but if occupied by a repressor
does so inactivate, or by inhibiting the synthesis of
unstable repressors.

The three alternative explanations of induced and
constitutive enzyme synthesis can be summarized as
follows:

1. Generalized-induction hypothesis. Inducers are
necessary components for enzyme synthesis both in
constitutive and in inducible cells. Endogenous inducers
are synthesized by constitutive cells but not by in-
ducible cells, which must, therefore, receive exogenous
inducers for enzyme synthesis.

2. Different-template hypothesis. Two different types
of template exist for enzyme synthesis. In constitutive
cells, the templates function without the aid of inducers.
In inducible cells, the inducers are necessary to activate
the template (as in Fig. 5). No endogenous inducer or
repressor is assumed. It is assumed that the amino-acid
ordering function of each template is the same.

3. Repressor hypothesis. The templates in inducible
and constitutive cells are the same and do not require
activation by inducers. Substances specifically inhibit-
ing template function (repressors) are synthesized by
inducible cells but not by constitutive cells. Exogenous
inducers function by destroying this inhibition of en-
zyme synthesis.


V. EVALUATION OF THE INDUCTION
HYPOTHESES


The data presented thus far do not allow much more
[...]enous induction as by direct inhibition of template
catalysis not involving an inducer. However, experi-
ments recently reported by Pardee et al. have done
much toward clarifying the relative weight that one can
place on the three alternative explanations of enzyme
[...] pair. This determination is important
[...] evaluation of the hypotheses, since, in the gen-
eralized induction hypothesis as it has been presented,
[...] therefore, dominant function of synthesizing [...]
of synthesizing a repressor is given only to the inducible
type (i⁺). In the hypothesis of two template types, both
inducible (i⁺) and constitutive (i⁻) units are given posi-
tive functions and, therefore, no dominance in the i⁺, i⁻
pair should necessarily exist.

The characteristics of E. coli conjugation allow the
determination of dominance in a very unique manner.
The male (Hfr) member of the mating pair injects its
“chromosome” into the female (F⁻) member through
a small tubule connecting the two cells. The order in
time of entrance into the female of the genetic units
from a given strain of male cells is unique, and ap-
parently no appreciable amount of cytoplasm enters
with these genetic units (cf. Wollman et al.” and the
chapter by Lennox, p. 242). Thus, the initial cytoplasmic
state of the zygote is determined by the cytoplasmic
state of the female before conjugation. As a consequence,
it is possible to form the i⁺z⁺/i⁻z⁻ zygote either by the
injection of the i⁺z⁺ genetic units from an inducible male
into an i⁻z⁻ formed cytoplasm of a constitutive absolute
negative female or, vice versa, by the injection of the i⁻z⁻
genetic units into an i⁺z⁺ formed cytoplasm. The question then
asked is what is the behavior of β-galactosidase synthesis
in the i⁺z⁺/i⁻z⁻ zygote in each case and to what hy-
pothesis does this behavior correspond? In answer to
this question, it is convenient to first describe the be-
havior predicted by each hypothesis.

1. Under the form of the generalized-induction hy-
pothesis presented here, the mating of a male i⁺z⁺ cell
to an i⁻z⁻ female in the absence of any exogenous inducer
should yield the synthesis of β-galactosidase soon after
the i⁺z⁺ units have been injected into the i⁻z⁻ cyto-
plasm, since this cytoplasm should contain endogenous
inducers which can activate the template resulting from
the entering z⁺ unit. It should be noted here that the
z and i units are very closely linked so that, within
the minimum time units employable in conjugation ex-
periments, the two units enter simultaneously. Further-
more, the synthesis of β-galactosidase should continue
in the i⁺z⁺/i⁻z⁻ zygote as long as this heterogenotic
state remains intact.

In the reciprocal conjugation, the prediction would
be the same, except that there might be expected to be
a longer lag between the entrance of the i⁻z⁻ units into
the i⁺z⁺ cytoplasm and the first appearance of enzyme
synthesis owing to the necessity of the entering i⁻ unit
to catalyze the synthesis of the endogenous inducer in
the zygote cytoplasm.

2. Under the hypothesis involving two types of tem-
plates for constitutive (i⁻) and inducible (i⁺) synthesis,
the i⁺z⁺/i⁻z⁻ zygote should yield no β-galactosidase
synthesis in the absence of exogenous inducer, unless
the z⁺ in one “chromosome” and the i⁻ in the other can
cooperate to yield the type of template synthesized in
the constitutive β-galactosidase positive strain (i⁻z⁺).
If this is possible, then the β-galactosidase synthesis
should occur in the zygote with the same behavior,
whether the i⁺z⁺ units are injected into an i⁻z⁻ female or
vice versa. It should be noted that the possibility of
recombination leading to i⁻z⁺ on the same “chromo-
some” is negligible owing to the closeness of the z and
i units.

3. Under the repressor hypothesis for induction, the
injection of an i⁺z⁺ unit into an i⁻z⁻ cytoplasm in the
absence of inducer should yield the synthesis of β-galac-
tosidase soon after the injection, since initially there
should be no repressor in the zygote cytoplasm, it being
of the constitutive (i⁻) type. However, different from
the prediction of the generalized induction hypothesis,
in this case one should expect the synthesis of β-galac-
tosidase in the zygote to cease as soon as the injected i⁺
unit has caused the synthesis of sufficient repressor in
the zygote cytoplasm. At this time, the zygote should
be phenotypically inducible rather than constitutive as
it was immediately after the entrance of the i⁺z⁺ unit.

In the reciprocal mating, the cytoplasm of the zygote,
being i⁺z⁺, should initially contain the repressor and
should continue to synthesize repressor after the i⁺z⁺/
i⁻z⁻ state is established. Thus, no synthesis of β-galac-
tosidase in absence of exogenous inducer would be ex-
pected at any time after conjugation.

Each hypothesis yields a different prediction, and,
consequently, the experiment should uniquely deter-
mine which, if any, is valid. The results are entirely
consistent with the repressor hypothesis. Thus, in the
conjugation of i⁺z⁺ males with i⁻z⁻ females in the ab-
sence of exogenous inducer, β-galactosidase synthesis
can be detected within a few minutes after the entrance
of the i⁺z⁺ genetic units. However, in the reciprocal
conjugation, no enzyme synthesis could be detected in
the absence of inducer, even after several hours. Further-
more, in the i⁺z⁺ male-to-i⁻z⁻ female conjugation the
synthesis of β-galactosidase in absence of inducer ceases
about two hours after zygote formation. This is shown
in Fig. 6. This cessation of synthesis is not the result of
segregation of the zygote, since this event cannot be
detected until two hours after cessation of synthesis.
As is shown in Fig. 6, the zygotes have become inducible
by the time the constitutive synthesis stops. These
results are exactly those predicted under the repressor
hypothesis in which i⁺ was predicted to be dominant
over i⁻, and they are inconsistent with the two other
hypotheses. These results, furthermore, offer the evi-
dence promised earlier that the i and z units involve
different cistrons, i.e., different functional genetic units.
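
The comparison of predictions with results can be tabulated mechanically. The sketch below is only an illustrative encoding of the three hypotheses in the form stated above; the names (`PREDICTIONS`, `OBSERVED`, `agrees`, `surviving`) and the reduction of each outcome to two yes/no features are assumptions introduced here, and `None` marks a feature that the text leaves unspecified or that does not apply.

```python
from typing import Dict, List, Optional, Tuple

# Outcome = (starts, persists) for beta-galactosidase synthesis in the zygote
# without exogenous inducer. None = not specified in the text / not applicable.
Outcome = Tuple[Optional[bool], Optional[bool]]

CROSS_A = "i+z+ male x i-z- female"   # constitutive-type cytoplasm
CROSS_B = "i-z- male x i+z+ female"   # inducible-type cytoplasm

PREDICTIONS: Dict[str, Dict[str, Outcome]] = {
    # Endogenous inducer made by i-: synthesis starts and continues in both crosses.
    "generalized induction": {CROSS_A: (True, True), CROSS_B: (True, True)},
    # Two template types, z+ and i- unable to cooperate: no synthesis at all.
    "different templates (no cooperation)": {CROSS_A: (False, None), CROSS_B: (False, None)},
    # Two template types, z+ and i- cooperating: same behavior in both crosses.
    "different templates (cooperation)": {CROSS_A: (True, None), CROSS_B: (True, None)},
    # Repressor made by i+: synthesis starts, then stops, in cross A; never in cross B.
    "repressor": {CROSS_A: (True, False), CROSS_B: (False, None)},
}

# Results as reported in the text: synthesis within minutes that ceases after
# about two hours in cross A; none detected in cross B.
OBSERVED: Dict[str, Outcome] = {CROSS_A: (True, False), CROSS_B: (False, None)}


def agrees(predicted: Outcome, observed: Outcome) -> bool:
    """A prediction agrees unless some specified feature contradicts the data."""
    return all(p is None or o is None or p == o for p, o in zip(predicted, observed))


def surviving(predictions: Dict[str, Dict[str, Outcome]],
              observed: Dict[str, Outcome]) -> List[str]:
    """Names of the hypotheses not contradicted by any cross."""
    return [name for name, by_cross in predictions.items()
            if all(agrees(by_cross[cross], observed[cross]) for cross in observed)]


print(surviving(PREDICTIONS, OBSERVED))  # prints ['repressor']
```

The table covers only the hypotheses in the form in which they were stated; a hypothesis modified after the fact would need its own row.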

Does this experiment then offer the death knell to the
hypothesis of different templates and to the generalized
induction theory? It quite effectively eliminates the
former of these possibilities, but, unfortunately, the
generalized induction hypothesis can be made viable
again by a very slight alteration. The supposition made
under this hypothesis was that the i⁺ to i⁻ mutation
involved the repair of a genetic unit necessary for the
synthesis of endogenous inducer. This supposition is
clearly eliminated by the experiments of Pardee, Jacob,
and Monod. However, it can be assumed that, in both
i⁺ and i⁻ cells, an endogenous inducer is synthesized,
but that i⁺ cells differ from i⁻ cells in possessing an
enzyme capable of destruction of this endogenous in-
ducer. The i⁺-to-i⁻ mutation would then result in the
loss of ability to synthesize this enzyme, and i⁺ would be
expected to be dominant to i⁻. The results of the above
experiment are equally well in accord with this extension
of the generalized induction hypothesis as with the re-
pressor hypothesis. Thus, this ingenious experiment has
indeed limited the choice of hypotheses drastically; it
has pushed the repressor hypothesis fully into the lime-
light, but it unfortunately does not offer a final unam-
biguous solution.

FIG. 6. Formation of β-galactosidase in zygotes from a ♂ i⁺z⁺
× ♀ i⁻z⁻ conjugation of E. coli. The abscissa indicates the time
after mixing the parental populations. The experiment was per-
formed in quadruplicate. Bacteriophage T6 and streptomycin were
added at the indicated times to stop conjugation and to eliminate
the ♂. The inducer, methyl-β-D-thiogalactoside (10⁻[?] M), was
added at 115 min to two of the four cultures (filled circles),
whereas the other two cultures (open circles) contained no inducer
[from A. B. Pardee, F. Jacob, and J. Monod, Compt. rend. 246,
3125 (1958)].

Nor should one expect such an unambiguous solution 
as long as one is forced to deal with the complexity of an 
entire cell. At this level, a solution will be unique only 
in that it unifies and poses the answer to a larger number 
of problems with a lesser number of assumptions than 
does its alternatives. In this sense, the repressor hy- 
pothesis is the more satisfying. It yields a mechanism 
for feedback control of enzyme synthesis that would 
seem to be necessary if the cell is not to run amuck. 
If the phenomenon of enzyme repression becomes a
general observation, particularly in biosynthetic path-
ways such as that found in arginine synthesis, one can
account for feedback control with the use of one or
less repressor substances per enzyme. However, in
choosing the generalized induction hypothesis, if one 
wishes to maintain the explanation of feedback control, 
one must involve inducers and substances inhibiting 
induction (i.e., repressors) for each enzyme. Thus, if 


both hypotheses are generalized, it would seem that 
THIS paper and the three that follow (Pollard, p.
273; Wood, p. 282; Tobias, p. 289) are concerned
with radiation biology. The potential scope of this
field (effects of all types of radiations on all types of
biological systems) and its relation to other areas of
radiation research are indicated in Fig. 1, where an
energy spectrum of radiations is plotted as abscissae and
various inanimate and biological systems are “plotted”
as ordinates in ascending order of presumed complexity.
At the level of macromolecules, radiation biology shades
imperceptibly into radiation chemistry and physics, the
basic sciences on which it draws for necessary facts and
concepts. At the level of groups of multicellular organ-
isms, it approaches “radiation sociology,” which is
needed to deal with the problems of group behavior
in irradiated populations.

Radiations below about 1 ev produce biological
effects, if any, through heating. These effects usually
are regarded as cognate to, but not included in, radia-
tion biology. The region from 1 to 6 ev, comprising the
near infrared, the visible, and the familiar ultraviolet
wavelengths, is readily available for investigations as
is the region from about 1000 ev upward. These two
accessible portions of the spectrum conveniently are
termed low-energy and high-energy, respectively. The
intervening “transition” region is absorbed in air and
biological material so readily that it can be used only
on very small objects in vacuo. It is of great theoretical
interest, in view of the great differences in basic response
of biological systems to the low- and high-energy spectra.

Figure 1 shows that radiation biology potentially
ramifies throughout biology as a whole. So does bio-
physics. What is the relationship, if any, between the
two? In my personal view, each contains some of the
other. The biophysical content of radiation biology is
probably about as great, both potentially and actually,
as that of general biology.

It is clearly impracticable, in four short papers, to
cover all of radiation biology. Accordingly, the coverage
is narrowed as follows. First, attention is concentrated
on the high-energy radiations, with occasional compara-
tive references to the ultraviolet. Second, specific
samples are selected from the spectrum of biological
systems (Fig. 1); after these brief general remarks,
Pollard describes experiments on certain macromole-
cules and viruses (p. 273), Wood speaks on certain
cellular effects that have been investigated intensively
(p. 282), and Tobias discusses selected studies on cell
populations and multicellular organisms (p. 289).

By and large, the effects of high-energy radiations,
as well as those of the ultraviolet, are injurious to all
or part of the irradiated system. Nevertheless, consider-
able effort goes into the study of these effects, for various
reasons. Some persons find a fascinating field of research
in the general problem of how very small amounts of
radiation energy produce such drastic effects. I believe
that the real attraction here is the peculiar combination 
of various portions of physics and chemistry and of 
many aspects of biology that must be brought to bear 
on any serious investigation of radiobiological mecha- 
nism. This peculiar versatility demands so much of a 
human life span that a radiation biologist is necessarily 
a specialist, much as he may strive to not become too 
differentiated. 

However, most of the interest in radiation effects 
stems from their applications. In many branches of basic 
biology, they have long been used as powerful research 
tools. For example, the development of genetics has 
been accelerated immensely by the use of radiations as 
mutagenic agents; partial body irradiation has found 
fruitful application in embryology; partial cell irradia- 
tion is used to get information about the properties and 
functions of various cell parts. As basic biological tools, 
the various radiations have two properties that, in many 
situations, are of critical advantage: they do not disrupt 
membranes and other structures grossly, and the dosage 
usually is reproducible. 

Radiation biology also has some important practical 
applications. The oldest, and still one of the most im- 
portant, is radiation therapy. Another is its use in deal- 
ing with the widespread and multifarious problem of 
radiation hazards that, within the last decade and a half, 
have increased so explosively that they even figure in 
national and international politics. 

Regardless of one’s motivation to study or use radia- 
tion effects, there is obvious need to know as much as 
possible about their mechanisms. There is nothing basically
unique about current methods of investigating
these mechanisms; they are essentially those of the
physical sciences and of the analytical biological sciences,
with some special variations that stem from the
unique physical properties of the high-energy radiations.

Like any story, a radiation action has a beginning, 
an end, and a middle. The beginning is the act of irradia- 
tion, and the end is the effect observed; there is con- 
siderable information about these, and the prospects of 
getting more are good. The middle, frequently miscalled
the “latent period,” is essentially a domain of ignorance 
wherein most of the problems lie. The remainder of this 
article is devoted to an attempt to indicate some of the 
ways in which current research is directed to reduction 
of this ignorance. 

Clearly, the end effect must be defined as accurately 
as possible. This implies information about the state of 
the investigated system before it is irradiated. The incompleteness
of such information is demonstrated well
by the need for a large conference such as the Study
Program in Biophysical Science and by the remedial
attempts so vividly described by earlier speakers. High-energy
radiation in sufficient amount is capable of pro-
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mplexity of these structures and functions, it is not 


ing that the list of known end effects is long. 
igorously investigated examples are aberrations 


and induction of tumors in animals. 
al to know as much as possible about 
iation and the immediately conse- 
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reas of biological research. 
iffer in two important 
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_ ina biological object is essentially the same as that dis- 
FIG. 1. Scope of radiation biology and its relations to other
areas of radiation research. Abscissae, partial radiation energy
spectrum in electron volts; ordinates, “spectrum” of inanimate
and biological objects in order of presumed complexity.
[Spectral regions marked on the abscissa: low-energy, accessible
(visible, UV, IR); transition, almost inaccessible; high-energy,
accessible.]


is transferred to the individual molecules of the many 
species usually present in a nonselective fashion that 
is chiefly dependent only on the molecular mass. Chro- 
mophores do not enter into consideration, as they do 
in work with low-energy radiations. This circumstance 
presents both advantages and disadvantages to the 
investigator. 

Second, the energy transfers to individual molecules 
do not occur singly in space and time but in more or 
less linear groups. All of the high-energy radiations are 
either charged particles in motion (e.g., electrons or α-particles)
or are agents (neutrons; x-rays and γ-rays)
that set such particles in motion. These charged parti- 
cles typically have kinetic energies of thousands or 
millions of electron volts, whereas the average energy 
that can be accepted by an individual molecule is some 
tens of electron volts. Thus, successive energy transfers 
occur along the path of the high-energy particle. Some 
of these transfers are capable merely of exciting the 
molecules to higher energy states. Others involve enough 
energy to produce ionization, e.g., to eject electrons, 
some of which have kinetic energies sufficient to eject 
electrons from other molecules (secondary ionization). 
The ions produced in gases are well demonstrated by
means of the Wilson cloud chamber. The trail of ions
produced by a single primary charged particle is termed
an ionization track. Similar tracks are recorded on photographic
plates (linear sequences of blackened grains)
and in suitable liquids at low temperatures (trails of
bubbles). This, and other physical evidence, makes it
highly probable that the distribution of energy transfer
in a biological object is essentially the same as that displayed
in the cloud chamber, except that the dimensions
should be reduced by a factor of about eight hundred
because of the difference in density.

These two properties—the high average value of the 
individual energy transfers and the grouping of these 
transfers into tracks—make the high-energy radiations 
unique in their mechanisms of action, even though these 
mechanisms, in some cases, produce end effects that are 
superficially indistinguishable from those of other agents. 

To students of mechanism, probably the most signifi- 
cant single feature of high-energy radiation actions, 
especially on cells and on even more-highly organized 
systems, is the small amount of energy required. Of the 
many end effects that are known, most are produced to 
a significant degree by 10 000 rad (or less). This dose
equals about 0.02 cal/g of biological material.
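
The energy figure quoted above can be checked with a short unit conversion. The sketch below assumes the standard definition of the rad (100 erg absorbed per gram) and the thermochemical calorie (4.184 J); neither constant is given in the text, and the function name is illustrative only.

```python
# Convert an absorbed dose in rad to calories per gram.
# ASSUMPTIONS (not stated in the text): 1 rad = 100 erg per gram,
# and 1 cal = 4.184 J = 4.184e7 erg (thermochemical calorie).
ERG_PER_GRAM_PER_RAD = 100.0
ERG_PER_CALORIE = 4.184e7


def rad_to_cal_per_gram(dose_rad: float) -> float:
    """Return the energy absorbed per gram, in calories, for a dose in rad."""
    return dose_rad * ERG_PER_GRAM_PER_RAD / ERG_PER_CALORIE


dose = 10_000.0  # rad, the dose quoted in the text
print(f"{dose:.0f} rad corresponds to {rad_to_cal_per_gram(dose):.3f} cal/g")
```

Under these assumptions the result is a little over 0.02 cal/g, in line with the value in the text; for material with roughly the heat capacity of water this is a temperature rise of only a few hundredths of a degree.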

For some actions, the dose-effect relations (“kinetics”)
are suggestive. When samples of certain small cells
(bacteria, haploid yeast, etc.) are given graded doses, 
the resulting survival curves are exponential; for more- 
complicated cells and organisms, the curves are sigmoid. 
The exponential curves suggest that the action on each 
cell is owing to some single primary event (formation of 
an ion pair, passage of a single ionizing particle, etc.). 
Sigmoid curves are correspondingly ascribed to multi- 
event types of action. In some cases, as Wood points 
out, the curve is exponential or sigmoid, depending on 
known properties of the cells. This is useful in devising 
models for the action. 

Macromolecules and viruses typically exhibit expo- 
nential curves, although their shapes may be modified 
by various factors. By making simple assumptions con- 
cerning the nature of the hypothetical single event, the 
size of the “target” relevant to the radiation action can
be calculated.¹,² Pollard describes this procedure in
detail (p. 273). 

In dilute inorganic aqueous solutions, the alteration 
of the solute (e.g., oxidation of ferrous ion) typically 
follows zero-order kinetics, which indicates that the 
energy is absorbed principally by the solvent and that 
chemical intermediates, formed from solvent molecules, 
react with the solute. If two or more solutes are present, 
they compete for the intermediates and thus “protect” 
each other. Actions involving intermediates are termed 
“indirect.” “Direct” actions are those in which the
radiation energy must be transferred to the solute molecules
themselves.† Current radiation chemistry identifies
some of the intermediates*®; they are OH and H
radicals, H2O2 molecule, and, in the presence of molecular
oxygen, HO2 radical.

All of these concepts, derived from work on simple 
solutions, ramify throughout radiation biology. Indirect 
action has been demonstrated in many radiation effects 
on macromolecules in solution.’ If a preponderant concentration
of protective substances is present, a residual
effect is observed that is ascribed to direct action. Indirect
action has been invoked in analysis of various
cellular effects®; the concept has been extremely fruitful
in suggesting experiments, although in no case has its
correctness been established for systems as complex
as cells.

† If no solvent is present, the action must be direct. Target
theory, in its strict sense, presupposes direct action.

Many physical, chemical, and biological factors are 
known to modify the amount of absorbed radiation 
energy necessary to produce a given degree of effect. 
A few that have been found to operate in a wide variety 
of systems are mentioned here. 

One is the physical parameter called linear energy 
transfer (LET), which is the amount of energy trans- 
ferred per unit length of track of an ionizing particle 
to the molecules it traverses. This parameter varies with 
the square of the charge on the particle and increases 
as the velocity decreases. With the various kinds of 
radiation currently available, one can obtain LET 
values ranging from 0.025 to 25 ev/Å, corresponding
roughly to average spacings between primary ionizations
that range from 4000 down to 4 Å. Thus, it is possible
to give the same dose to a biological system in various
ways: a few ionizing particles with high LET, many
particles with low LET, and intermediate values. Results
are detailed elsewhere.” Not only does LET significantly
influence the dose-effect relations in practically
all radiobiological actions that have been investigated
thoroughly in this respect, but it also operates in “simple”
chemical actions, such as the formation of H2O2 in pure
water. Thus, LET not only gives some geometrical
notions about mechanism, but also gives some encouragement
to use radiation chemistry as a basis for interpretation
of radiobiological actions.
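
The correspondence between LET and the spacing of primary ionizations quoted above can be reproduced with one division. The sketch below assumes that each primary ionization accounts for roughly 100 ev on average; that round figure is not stated in this passage and is chosen here only because it reproduces the quoted pairs (0.025 ev/Å with 4000 Å, 25 ev/Å with 4 Å).

```python
# Mean spacing between primary ionizations for a given linear energy transfer.
# ASSUMPTION: about 100 ev is spent per primary ionization on average; this
# round figure is inferred from the numbers in the text, not quoted from it.
EV_PER_PRIMARY_IONIZATION = 100.0


def mean_spacing_angstrom(let_ev_per_angstrom: float,
                          ev_per_ionization: float = EV_PER_PRIMARY_IONIZATION) -> float:
    """Average distance, in angstroms, between primary ionizations along a track."""
    if let_ev_per_angstrom <= 0:
        raise ValueError("LET must be positive")
    return ev_per_ionization / let_ev_per_angstrom


for let in (0.025, 0.25, 2.5, 25.0):
    spacing = mean_spacing_angstrom(let)
    print(f"LET {let:6.3f} ev/A -> mean spacing {spacing:7.1f} A")
```

A different assumed energy per ionization shifts every spacing by the same factor, so the thousandfold range between the two extremes does not depend on that choice.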

Another factor quantitatively influencing a wide variety
of actions is molecular oxygen. In all but a few
of these actions, radiosensitivity increases with concentration
of O2 until a plateau is reached, usually at a value
two or three times that observed when the O2 concentration
is zero. The basis of the “oxygen effect” is still
controversial (cf. Wood, p. 282), but I think it significant
that, like LET, it is encountered in radiation
effects on simple aqueous solutions. There is also an
interrelationship between O2 and LET: the greater the
LET produced by the radiation, the less is the influence
of O2.

In the foregoing, I have tried to communicate a 
concept of the present scientific state of basic high- 
energy radiation biology. Many effects on many diverse 
biological systems are known and cataloged; however, 
there is encouraging evidence that the mechanisms lead- 
ing to these diverse effects have some strong resem- 
blances. No one mechanism has been elucidated yet. 
On the other hand, several have been investigated in- 
tensively by means of the general approaches indicated, 
and, in a few cases, observed facts have been used as 
bases for theories which have been successful; i.e., they
have suggested experiments which in turn have yielded
new facts. Good examples of such investigations are
given in the three papers which follow.
INTRODUCTION


It is worthwhile, before the description of experimental
work, to say a word about the motivation
of the researches to be described and also something of 
the people engaged in them. The purpose has been to 
use ionizing radiation as a powerful, localized, and 
penetrating agent to study cell structure in relation to 
cellular function. This purpose was clearly in the mind 
of the late D. E. Lea, and in many ways we have been 
continuing lines which he began. The purpose can be 
usefully directed only if there is some knowledge of the 
actual character of ionizing radiation action on the 
three key elements of cellular systems: proteins, nucleic 
acids, and polysaccharides. Such knowledge is still im- 
perfect, and what is here described is a series of studies, 
which enable one to make some preliminary hypotheses 
as to the action of ionizing radiation. The promise of 
immediate future progress is excellent, and, if research 
effort on the right scale were forthcoming, a year or two 
more would see the major features of the pattern 
properly exposed. 

The people in the Yale Biophysics Department who 
have contributed are Hutchinson, Setlow, Guild, Preiss, 
Woese, Powell, Forro, Fluke, Jagger, Wilson, Till, and 
Whitmore. The author writes as representing this 
group and the work is theirs as much as his in every 
respect. 

In the process of studying the inactivation of these 
key molecules, the basic problems of radiobiology are 
also being studied; and since radiobiology cannot be 
separated from biology, the problems of biology are also 
involved. This will become apparent, as has been 
stressed already by Zirkle (p. 269). 


GENERAL CHARACTER OF RADIATION 
INACTIVATION 


The physical action of radiation is complex. All 
optically allowed molecular transitions presumably are 
capable of occurring, together with many forbidden 
transitions. The gradual dispersal of the intense local 
energy releases, and the accompanying “pre-equipartition”
local high-temperature regions, plus the probable
effect of local high temperatures, conspire to give the 
molecular physicist a choice of almost any personally 
favored mechanism of action. Such choices have been 
made, and it would be foolish to exempt this paper. In 
view of this great complexity, the traditional physical 
introduction is foregone and the experimental results 
on biological macromolecules are discussed directly. 


The sort of experiment which is readily performed 
requires a set of assay tubes for some enzymes and a 
color agent that is dark for active enzymes. Since the 
reaction time is kept constant, the activity of the 
enzyme shows in the relative color density, and, for an 
enzyme heavily irradiated with γ-rays, the activity is
clearly low. Such experiments were carried out first by
Northrop¹ and by Hussey and Thompson.² Similar
experiments on viruses are described later. Interestingly 
enough, the earliest recorded quantitative measure- 
ments on bacteriophage were made by the Sinclair Lewis 
hero, Martin Arrowsmith, which probably reflects the 
active mind of Paul de Kruif. 

The dose-response curve found in these experiments
generally obeys the relation

$$\ln (n/n_0) = -\,\mathrm{const} \times \mathrm{dose}. \tag{1}$$

It can be explained statistically very simply, as suggested
by Dessauer,³ Crowther,⁴ and Condon and Terrill⁵
many years ago, by supposing that I inactivating events
per unit volume are distributed randomly and that
there is a critical sensitive volume V which may intercept
one of these events. Since the average number of
events in a volume V is IV, then by the Poisson formula
the probabilities of 0, 1, 2, 3, etc., events taking place in
the volume are P(0), P(1), P(2), P(3), where

$$P(0)=e^{-IV},\quad P(1)=e^{-IV}\,IV,\quad P(2)=\frac{e^{-IV}(IV)^{2}}{2!},\quad \ldots,\quad P(m)=\frac{e^{-IV}(IV)^{m}}{m!}.$$

These are mutually exclusive events so that ΣP(m) = 1,
and, accordingly, one can reason thus: If the
probability of escape is measured by the ratio of the
number left active, n, to the number at the start, n₀,
then for complete escape, P(0), one has

$$n/n_0 = e^{-IV}. \tag{2}$$

If one “hit” can be withstood,

$$n/n_0 = e^{-IV} + e^{-IV}\,IV. \tag{3}$$

If two hits can be withstood,

$$n/n_0 = e^{-IV} + e^{-IV}\,IV + \frac{e^{-IV}(IV)^{2}}{2!}, \tag{4}$$

and so on.
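
The escape probabilities above are partial sums of the Poisson distribution with mean IV, which is easy to tabulate. The following is a minimal sketch; the function names and the sample values of IV are illustrative choices, not taken from the text.

```python
import math


def poisson_probability(m: int, mean_events: float) -> float:
    """P(m): probability of exactly m events when the mean number is I*V."""
    return math.exp(-mean_events) * mean_events ** m / math.factorial(m)


def surviving_fraction(mean_events: float, hits_withstood: int = 0) -> float:
    """n/n0 when a unit stays active after up to `hits_withstood` events.

    hits_withstood = 0 is the complete-escape case, n/n0 = exp(-I*V).
    """
    return sum(poisson_probability(m, mean_events)
               for m in range(hits_withstood + 1))


# Illustrative values of the mean number of events I*V (hypothetical).
for iv in (0.5, 1.0, 2.0, 4.0):
    row = ", ".join(f"{surviving_fraction(iv, k):.4f}" for k in range(3))
    print(f"IV = {iv:3.1f}: n/n0 for 0, 1, 2 hits withstood = {row}")
```

Plotted logarithmically against IV, the zero-hit column is a straight line, while the columns for one or two tolerated hits show an initial shoulder; this is the distinction between exponential and sigmoid survival curves.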
FIG. 1. The loss of transforming activity in pneumococcus
DNA preparations in dilute solution. This is typical of irradiation
in dilute solution where the majority of the effect is owing to the
migration of radicals. That this is so can be seen from the increased
inactivation in more dilute solution. Data due to DeFilippes
and Guild.⁶ [Ordinate: transforming activity; one curve is
labeled 0.008 mg/ml.]


Usually, for enzymes and viruses, one finds the
simplest, complete escape expression holding, or

$$\ln (n/n_0) = -IV. \tag{5}$$

Thus, I can be identified somehow with the dose and
V with the constant in Eq. (1). The data of Figs. 1 and 2
provide examples of two inactivations which obey this
relation. Figure 1, taken from work by DeFilippes and
Guild,⁶ shows the effect of x-rays on the transforming
principle of pneumococcus, which Hotchkiss has shown
to be pure DNA, and which was referred to by Rich
(p. 191). The ordinate, the activity, is plotted logarithmically
vs the dose in roentgens, and the relation
for a one-hit effect holds. If the inactivation is ascribed
to primary ionizations per unit volume, using a conversion
factor of 5×10¹¹ primary ionizations per cubic
centimeter in water, an absurd value for V is found,
moreover one which depends upon the concentration of
transforming principle (TP). The inactivation is due
clearly to “activated water,” probably free radicals
which can migrate and so make V many times larger
than the volume of the molecule. In principle, if the
radicals in activated water have infinite lifetime, and
if the water and TP are pure, there always should be
one molecule of TP inactivated per radical. In fact,
“cate recombine; Hutchinson and Ross’ and Smith* 
i ted their half-life. In very pure water, it is 

ng -4 sec. In yeast, it is 10~° sec. 
t, Fig. 2 shows some new data, taken by 
| Nancy Barrett, for inactivation of 6- 
one exception, the irradiations were 
wert e done with a cobalt y-ray 
source. The black dots show the effect of radiation on
enzyme extracted from lactose-adapted bacteria. The
relation ln (n/n₀) = −IV is obeyed. If one uses for I the
number of primary ionizations, or clusters of ions, one
finds V = 4.7×10⁻¹⁹ cc. Since one usually is not familiar
with the volumes of molecules, this figure can be converted
into a molecular weight by assuming a protein
density of 1.3 and multiplying the mass of one molecule
by Avogadro’s number to get the molecular weight.
The figure found is 370 000. Estimates for the molecular
weight of β-galactosidase are mostly guesses, but, if one
takes the molecule to be spherical, a sedimentation
constant of 18, which has been quoted, and which is
approximately checked by work in our laboratory by
Langridge, gives a molecular weight of 390 000. So, in
this case, the dry measurements lead to a statistical
analysis in which the volume V may well be the volume
of the molecule.

FIG. 2. The irradiation in the dry state of β-galactosidase grown
under various conditions. The characteristic behavior is that an
essentially uniform method of inactivation is observed dependent
only on the direct hitting of some sensitive region. Irradiation in
the wet state is put in for comparison. It is clear that in the wet
state inside the bacterium the concentration is so high that inactivation
is similar to the dry inactivation. [Ordinate: percent
enzyme remaining. Legend: lactose adapted, galactose adapted,
and glucose grown (dry); lactose adapted, irradiated in cells.]
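
The conversion from an inactivation volume to a “radiation molecular weight” described in the text is a two-step multiplication, sketched below. The density of 1.3 g/cc is the value given in the text; the value V = 4.7 × 10⁻¹⁹ cc is taken here as an assumption, since the exponent is poorly legible in the source and is inferred from the quoted result of 370 000. The Avogadro constant is rounded.

```python
# Convert a radiation-sensitive volume into an equivalent molecular weight.
AVOGADRO = 6.022e23      # molecules per mole (rounded)
PROTEIN_DENSITY = 1.3    # g/cc, the value assumed in the text


def target_molecular_weight(volume_cc: float,
                            density_g_per_cc: float = PROTEIN_DENSITY) -> float:
    """Mass of one target (volume x density) scaled up to one mole."""
    mass_per_molecule_g = volume_cc * density_g_per_cc
    return mass_per_molecule_g * AVOGADRO


# ASSUMPTION: V = 4.7e-19 cc; the exponent is inferred, not read directly.
v = 4.7e-19
print(f"V = {v:.1e} cc -> molecular weight about {target_molecular_weight(v):,.0f}")
```

With these inputs the result falls within about one percent of the 370 000 quoted in the text, which supports the reading of the exponent.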

The “constitutive” enzyme, that small fraction which 
is present in unadapted cells, behaves about the same 
as does enzyme from galactose-adapted cells. The sensi- 
tivity of the enzyme in cells irradiated in C-minimal 
medium is not noticeably different. 

Thus, one may draw radiation action on a molecule
in two ways, as shown in Fig. 3. The shaded part in the
center is possibly identifiable with the macromolecule
itself, while the dotted lines are owing to migration of
active chemical agents. For the purpose of the study of
biological systems, the system can be forced very often
into conditions where only the shaded part is operative.
Sometimes this cannot be done, a familiar restriction
to biologists.
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RADIATION INACTIVATION 


QUANTITATIVE STUDIES ON PROTEINS 


The result that the “inactivation volume” V is closely
related to the volume of the molecule first was suggested 
in work by Lea et al. in 1944. In our laboratory, their 
work has been extended greatly to include many 
enzymes, albumin, and hormones; contributions by 
others, notably Fluke,” also have been made. A graph 
summarizing these, owing to Guild, is shown as Fig. 4. 
A log plot of the “radiation molecular weight” vs the 
accepted molecular weight is shown. There are some 
clear deviations, but the relation hardly can be over- 
looked. It is most remarkable and unexpected, and for 
some time no reasonable explanation could be given. 
In 1955, the suggestion was put forward" that, in a 
covalently bonded structure, there would be no reason 
why an initially positive region produced by ionization 
should be confined to one atom, but rather it should 
migrate and settle possibly in some point of weakness. 
Recent work by Gordy" using paramagnetic-resonance 
hyperfine structure has shown that, in irradiated ma- 
terial as extensive as proteins, there can be excited two 
characteristic patterns, one of which is clearly identified 
as owing to cysteine. Figure 5 shows some of his results. 


FIG. 3. Indication of the distance of migration of radicals in a
cell. It has been estimated by Hutchinson that an increase in
distance of 10 to 30 Å corresponds to the distance of migration
of a radical through a sensitive target. In dilute solution the distance
of migration is much greater. [Labels in figure: 10-30 Å
in cell; much more in dilute solution.]


That irradiation, which hardly can have been so intense 
as to excite the whole set of amino acids, should con- 
sistently give such patterns, strongly suggests that 
migration of the positive charge indeed does occur, 
resulting in its statistically settling in one or two favored 
spots. We propose to call this the primary lesion. 
Subsequently, the protein is exposed to water or to 
water and oxygen, and the primary lesion now exhibits 
a reactivity which is called the chemical action. This 
chemical action, by breaking an —S—S bond, or by 
removing a side chain, gives an altered molecule. We 
thus feel that, as of 1958, the process is to be regarded 
as in Fig. 6. On this view, there should be a possibility 
of modifying the action at two steps, A and B, and a 
lesser possibility at a. It has been shown that invertase 
and catalase are markedly more sensitive at tempera- 
tures just below the thermal inactivation region. This 
is possibly action at a or A. Braams, Hutchinson, and 
Ray! have shown that ribonuclease dried in acetic acid 
is four times as sensitive, and in glucose, twice as sensi- 
tive as normally found. A variety of additives have no 
effect, among them salt and glycine, while glutathione 
FIG. 4. A plot comparing the observed molecular weight with
the target molecular weight, for a wide variety of bombarded 
substances. This graph has been prepared by Guild and shows 
the plausibility of the idea that a primary ionization anywhere 
within the molecule will cause the loss of biological activity. There 
are exceptions and quite clearly there can be means of influence 
which will alter the sensitivity of the target, but by and large 
there is a good relationship. 


reduces the sensitivity by a factor of two. A sensitivity 
modification which is eightfold thus can be achieved. 
Hutchinson" also has confirmed the finding of Alex- 
ander!® that dry trypsin irradiated in air is 2.3 times 
as sensitive as in nitrogen, for γ-radiation. For deuteron
or α-particle irradiation, the effect is far less.

The fact that the relation shown in the log plot of 
Fig. 4 holds, means that most additives which occur in 
ordinary separation procedures are fairly uniform and 
not able to produce the effects described above. 


FIG. 5. Reproduction of data due to Gordy showing the paramagnetic
resonance in a number of irradiated proteins. It is
significant that while a variety of substances are shown quite
similar resonance is observed in many of them. This means that
there are probably regions in the molecule at which the formation
of an unpaired electron is preferred and these can be regions of
weakness for chemical action later on [from W. Gordy, W. B.
Ard, and H. Shields, Proc. Natl. Acad. Sci. U.S. 41, 983 (1955)].
[Labels legible in the figure: horn; protease; bovine albumin.]
Ionization → primary lesion → chemical action → altered molecule


QUANTITATIVE STUDIES ON NUCLEIC ACIDS


The transforming principle is pure DNA, and so
affords a possible (though not very convenient) measure
of the biological activity of DNA. Studies of radiation
action on this DNA have been made by Fluke et al.,
by Marmur and Fluke,” by Ephrussi-Taylor and
Latarjet,! and by Guild and DeFilippes.'* A graph from
the work of these last is shown as Fig. 7. The DNA was
irradiated in the dry state and fast protons were used
to bombard. The logarithmic relation is not simply
obeyed, except at high doses. Two components are
found, one with a very large inactivation volume which,
when re-expressed as a molecular weight, is between 5
and 15 million, and a homogeneous component of molecular
weight equivalent of 300 000. The reason for the
components is not clear. It may be that a fraction of
the long DNA chains are so placed that they can become
crosslinked by radiation action, and so cannot enter
the pneumococcus to cause transformation. The smaller
fraction cannot be crosslinked and is organized as
polymers of an essential unit of 300 000 molecular
weight, which must intercept an ionization to lose
activity.

These studies again show the great sensitivity of
DNA. More physical measurements have been made,
notably early work by Hollaender and co-workers.
More recently, Butler”? has observed the change in
viscosity of thymus DNA when irradiated by x-rays
(Fig. 8). It is not plotted logarithmically, but the inactivation,
as expressed by the fall in viscosity, is
approximately logarithmic, and in the dry state the
inactivation corresponds to an equivalent molecular
weight of one million. Thus, both physical and biological
measures show a strong sensitivity of DNA to
ionizing radiation.

FIG. 7. ... activation of transforming principle in the dry state
... Guild and DeFilippes. There is one very clear ...
which corresponds to a molecular weight in the ... 0 000.
For low doses, the behavior ... crosslinking ...
[Label in figure: 10-Mev protons on transforming principle
(dried from 0.15 M NaCl).]

The sensitivity in aqueous solution is also high. Guild 
estimates that a unit of 13 million molecular weight
can be inactivated by 100 ev of energy employed to 
make radicals in water.® 

No clear proof of the precise nature of radiation action 
on DNA is available. Since it is so highly sensitive, and 
since it is a long thin molecule, the attractive hypothesis, 
which we adopt, is that it may be either broken, or 
crosslinked by ionizing radiation, whether dry or in
aqueous solution. When this occurs, the result is biologically
measurable. It is the biological sensitivity
which is really responsible for the radiation sensitivities
observed. If one chooses as the “measure” of DNA activity
its ability to be digested by DNase, the radiation
“molecular weight” is 4000, as was shown by
Smith. This is in contrast to the figure of 300 000
for transformation.

FIG. 8. A reproduction of data quoted by Butler showing the
effect of 15 000 000-v electrons on solid and aqueous DNA. The
sensitivity of the molecule DNA can be inferred from these data
[from J. A. V. Butler in Ionizing Radiations and Cell Metabolism
(J. and A. Churchill, Ltd., London, 1956, and Little, Brown and
Company, Boston, 1957), p. 59].


POLYSACCHARIDES 


Almost no work has been done on polysaccharides.
A dysentery toxin was studied by Caspar and shown to
have an inactivation volume corresponding to 11 000
molecular weight. Further work is clearly needed.


EXAMPLE OF A RADIATION STUDY 
ON DRY PROTEIN 


The statistical analysis of radiation action on an
enzyme, antigen, or antibody can give more than a
simple estimate of a volume. As an example, it is shown
in outline how an enzyme, not purified particularly, can
be characterized. Since the purification and measurement
by physicochemical means has to be done yet, this
work offers some chance to test the predictions of such
studies. The enzyme is β-galactosidase, and one aspect
of the study was shown in Fig. 2 where cobalt irradiation
was employed. Such radiation is random in volume, and
the product IV can be found from the data, yielding V,
the radiation sensitive volume. With heavy charged
particles, such as deuterons or α-particles, a fraction of
the ionization, generally about 75%, is confined to
narrow tracks. A fraction is spread more widely, as
seen in Fig. 9 where a crude representation of the ionization
produced by a 4-Mev deuteron is shown. If that
fraction of the inactivation of the enzyme due to the
on-track ionization can be found, then two quantities
can be measured: the area S of the molecule exposed to
radiation action, and the thickness t. Estimating the
fractional inactivation due to the track is subject to
some uncertainty, but can be done by using the Bohr
theory of δ-ray production, the Bethe-Bloch energy loss
expression, and some modern range measurements for
slow electrons in protein.

FIG. 9. Representation of the detail of a track of a deuteron.
The black dots represent the ionization and the long spurs are
δ-rays. In considering the part of the ionization which is along a
track, allowance has to be made for the δ-rays which are off the
track. The method of doing this has been worked out and permits
consideration of just the part which is on the track.

The effect of 4-Mev deuterons on β-galactosidase is
shown in Fig. 10. The data can be analyzed by a relation

$$\ln (n/n_0) = -SD, \tag{6}$$


where D is a measure of the number of deuterons per 
square centimeter and S is a cross section, which after 
correction, should bear some relation to the area of the 
molecule. In order to exploit the linear ionization process 
effectively, such cross sections can be found for different 
absorbers in the path of the deuteron beam. There re- 
sults what is known in nuclear physics as a “Bragg 
curve,” as seen in Fig. 11. From such a curve, the 
variation in cross section with the rate of energy loss 
of the deuteron or α-particle can be found. This is shown
in Fig. 12, which also shows the result of applying the 
correction for the off-track ionization. 
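
One way a cross section S might be extracted from survival data under the relation ln (n/n₀) = −SD is a least-squares line through the origin. The sketch below is an illustration only: the fitting procedure is an assumption (the text does not say how S was obtained), and the data points are hypothetical, generated from an assumed S rather than measured.

```python
import math


def fit_cross_section(fluences, surviving_fractions):
    """Least-squares slope, through the origin, of -ln(n/n0) against D.

    fluences: particles per cm^2; surviving_fractions: n/n0 at each fluence.
    Returns S in cm^2.
    """
    y = [-math.log(f) for f in surviving_fractions]
    numerator = sum(d * v for d, v in zip(fluences, y))
    denominator = sum(d * d for d in fluences)
    return numerator / denominator


# HYPOTHETICAL, noise-free data generated from an assumed cross section.
assumed_s = 1.0e-12                       # cm^2, illustrative value only
fluences = [0.5e12, 1.0e12, 2.0e12]       # deuterons per cm^2
fractions = [math.exp(-assumed_s * d) for d in fluences]

print(f"fitted S = {fit_cross_section(fluences, fractions):.2e} cm^2")
```

Forcing the line through the origin reflects the requirement that n/n₀ = 1 at zero dose; with real data, a free intercept would be a useful check on that requirement.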

FIG. 10. The inactivation of β-galactosidase by deuteron bombardment.
The relationship as indicated is obeyed and from it a
target area can be inferred. [Annotations in figure: deuteron
bombardment of β-galactosidase; ln (n/n₀) = −SD;
S = 10.1×10⁻¹³ cm²; single experiment, June 11, 1958.]

From Fig. 12, a molecular area of 8.2×10⁻¹³ cm² is
deduced, and it is noted, in addition, that, for a rate of
energy loss of 300 ev/100 Å, the particles are 63%
efficient. If this loss of efficiency is because of the fact
that the ionizations are not dense enough along the
track and that some “straddling” is occurring, then one
should expect a relation

$$S = S_0\,(1 - e^{-it}), \tag{7}$$

where S₀ is the maximum cross section, i is the ionization
per unit track length, and t is the effective thickness. If
the primary ionizations require on an average 110 ev
one has, for 300 ev/100 Å, it = 1 and i = 2.7 per 100 Å,
so t = 37 Å. Thus, one has the following data about


12 
10 
S 
3 8 
N 
v 
5 3 
3 8-Galactosidase 38 
2 “Bragg curve” for deuteron inactivation a 
m Al g 
v x 
= v 
c 
2 3 
x 
0 
(0) 2 4 6 8 10 12 14 16 


Absorption, cm air equivalent 


Fic. 11. A “Bragg curve” for the deuteron inactivation of 
8-galactosidase. The deuterons were systematically covered with 
foils of different thickness and their effectiveness measured. It can 
be seen that there is a rise as the ionization per unit path increases 
and then an abrupt fall as the end of the Tange is reached, 
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agreement is fairly satisfactory. Thus, the enzyme 
should have a molecular weight 370 000, a length of. 
114 A, and a diameter of 72 A, and so is roughly spheri- 


20 
i cal. Such a prediction should be subject to verification, 


a 


— a-particles 


INACTIVATION OF VIRUSES AS RELATED 
TO THEIR STRUCTURE 


~ 
mo) 


Inactivation of a whole biological system, having a 
structure and a number of functioning parts, presents — 
——_o——— — — Da . . 0.9 
aR a new problem. It is reasonably certain that radiation 
= Corrected z 9 9 
inactivates at least one of the molecular units of the 
system, but now inquiry must be made as to whether ~~ 
or not it has any effect on the biological functioning, or 
rather, any measurable effect. Viruses offer perhaps the 
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Cross section 


a O 200 400 600 800 1000 1200 1400 
very Energy loss, ev/100 A 
Fic. 12. The plot of the cross section vs rate of energy loss for 


. . 
deuteron and a-particle bombardment. A corrected line has been Southern bean mosaic virus 
drawn in which allowance has been made for the 6-rays. Deuteron bombardment 


B-galactosidase: 
volume=4.7X10- cc 
area= 8.2 107! cm? 
effective thickness= 3.7 10-7 cm. 


If it is assumed that the molecule is a cylinder, radius r 
and length /, these can be solved for, from the volume 
= and area, and r=36 A and /=114 A are obtained. The 
= effective thickness should be 4r/r=47 A, which has to 
_ be compared with 37 A, as deduced from Fig. 12. The 


Percent survival 


TABLE I. Virus properties available for cross-section measurement. 


Property Rough results 
Infectivity Close to whole virus 
o 50 100 150 x 10! 
kz: Deuterons/cm2 
Attachment Very small part 


Fic. 13. Inactivation of Southern bean mosaic virus by deu- 
Interference About 1/20 of virus terons as observed by Dimond and the author. From such data 
7 the cross section of the virus can be inferred and comes rather 
; close to the observed cross section in electron micrographs. 


_ Host killing About 1/6 of virus 
ogica Very small indeed the simplest system, yet a virus is a system and this 
must always be remembered. The problem can be 
rendered schematic as follows: 2 
Not tested 
radiation — molecular inactivation > 


reduction of function 


biological consequence. — 


Some animal 


; Thus, when a virus is irradiated and its ability to multi- 
cterial Complement fixation ot tested ply in the host is studied, one investigates the sum _ 
\nimal = total of: (1) ability to survive in outside world, (2) 
abili attach to host cell, (3) ability to invade | he 
) ability to multiply in the cell, and (5) ab 
outside world.. Thus, bland stat 


RADIATION 


about “virus inactivation” should be regarded with 
suspicion. 

Before treating separate inactivation studies, brief
mention is made of two empirical correlates. The first
is that the virus-inactivation cross section measured
with heavy particles as bombarding agents (deuterons,
α-particles) is close to the electron-micrograph cross
section. This correlation fails for the large viruses but
works for influenza and smaller viruses. The second is that
the inactivation volume measured with γ-rays is close to
the volume of the nucleic-acid content. This is less
well established and does not work for polio or influenza,
for example, but has been rather widely adopted as a
simple interpretation of radiation inactivation. It has
some validity, but must be used with caution.

FIG. 14. Differential effect of deuterons on the infectivity and
the serological combining power of tobacco mosaic virus. It can
be seen that very much more radiation is needed to have any
effect on the surface whereas the infectivity is very readily
affected. This shows that there is a clear-cut difference in the
two functions of the virus and enables some estimate of the
relative sizes to be made. [Axes: percent remaining vs
deuterons/cm²; curves labeled Serology and Infectivity.]

FIG. 15. Proof of the reality of irradiation target. Tobacco
mosaic virus which is in long thin rods is irradiated in the manner
shown in the insert of the diagram. In the first case, as the angle
is varied, the target presents different aspects to the beam. This
shows in the measurements as an increase in the cross section
[from E. C. Pollard and G. F. Whitmore, Science 122, 335 (1955)].

A chart is given, in Table I, of the various properties
of viruses which are susceptible of study. It is clear to
anyone who makes such differential radiation studies
that a virus is not a simple molecule. In what follows,
a few examples of virus studies are given, and in
concluding, consideration is given to what can be deduced
about one virus from this kind of work. Figure 13
shows the inactivation of Southern bean mosaic virus
by deuterons as measured by the ability to form local
lesions on beans. The familiar relationship is found and
the cross section, properly analyzed, corresponds to a
spherical particle of radius 150 Å, which agrees with
electron microscopy.
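
The link between a measured cross section and the radius of an equivalent sphere is simply S = πR². A minimal sketch of the conversion in both directions follows; the function names are assumptions of this example, and it ignores the δ-ray and efficiency corrections that the phrase "properly analyzed" implies.

```python
import math

ANGSTROM_IN_CM = 1.0e-8


def sphere_cross_section_cm2(radius_angstrom):
    """Geometric cross section (cm^2) of a sphere of the given radius."""
    radius_cm = radius_angstrom * ANGSTROM_IN_CM
    return math.pi * radius_cm ** 2


def radius_from_cross_section(cross_section_cm2):
    """Radius (in angstroms) of the sphere having this cross section."""
    return math.sqrt(cross_section_cm2 / math.pi) / ANGSTROM_IN_CM


# The 150 A radius quoted for Southern bean mosaic virus.
S_sbmv = sphere_cross_section_cm2(150.0)
print(f"S = {S_sbmv:.2e} cm^2")
print(f"R = {radius_from_cross_section(S_sbmv):.0f} A")
```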

FIG. 16. Effect of deuterons on the cross reactivation of T1
bacterial phage as observed by Till and the author. In case (b),
the bacterium was superinfected with undamaged virus, and the
fact that part of the virus material could be given from the
undamaged virus to the damaged virus is shown in the lower
sensitivity. The part which must be intact for cross reactivation to
occur is approximately 60% of the whole virus [from J. E. Till
and E. C. Pollard, Radiation Research 8, 344 (1958)].
[Axes: percent survival vs deuterons/cm² of LET 230 ev/100 Å.]

FIG. 17. The inactivation cross section of T1 bacteriophage with
electrons of low voltage and penetration as indicated. It can be
seen that, apart from a small inactivation owing presumably to
some effect on the surface of the virus, the majority of the effect
only occurs when the energy of the electron is sufficient to
penetrate something like 250 Å. This means that the sensitive part is
inside the virus and also that none of it resides in the tail [from
M. Davis, Arch. Biochem. Biophys. 49, 417 (1954)].

Figure 14 shows the effect of deuterons on dry tobacco
mosaic virus (TMV) measured in terms of (a) its ability
to produce local lesions, and (b) its ability to precipitate
antibody to TMV. The loss of precipitating ability is
very slight. In fact, it proved that most of the observed
effect was on the virus and not on the combination with
antibody. If one calculates the sensitive volume for the
removal of ability to precipitate with antibody, it
corresponds to a molecular weight of less than 20 000. Similar
experiments on Southern bean mosaic viruses indicate
that there are possibly two active sites of molecular
weights 30 000 and 6000, respectively.

In view of all of this statistical reasoning, one sometimes
needs more real evidence to bolster one’s faith
in numbers and deduction. An experiment by Whitmore
and the author on oriented samples of TMV is shown
in Fig. 15. For samples of TMV held at varying angles
to the beam, the cross section is least when the virus
particles, which are long and thin, are held pointing at
the beam and greatest when they are perpendicular to
the beam. Controls held as at (b) showed no
orientation effect. This experiment confirms the true
physical character of the target.
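
The orientation effect can be illustrated with the shadow area of a rod. For a right circular cylinder of length L and diameter d whose axis makes an angle θ with the beam, the projected area is L·d·sin θ + (πd²/4)·cos θ. The sketch below is a geometric illustration only: the rod dimensions (3000 Å by 150 Å) are assumed round numbers for TMV that the text does not give, and the measured inactivation cross section need not equal the geometric shadow.

```python
import math


def projected_area(length, diameter, angle_deg):
    """Shadow area of a cylinder whose axis makes angle_deg with the beam.

    angle_deg = 0 means the rod points along the beam (end-on);
    angle_deg = 90 means the rod lies perpendicular to the beam.
    """
    theta = math.radians(angle_deg)
    side = length * diameter * abs(math.sin(theta))
    end = math.pi * (diameter / 2.0) ** 2 * abs(math.cos(theta))
    return side + end


ROD_LENGTH_A = 3000.0    # angstroms, assumed for illustration
ROD_DIAMETER_A = 150.0   # angstroms, assumed for illustration

areas = {
    angle: projected_area(ROD_LENGTH_A, ROD_DIAMETER_A, angle)
    for angle in (0, 30, 60, 90)
}
for angle, area in areas.items():
    print(f"{angle:2d} deg: {area:10.0f} square angstroms")
```

The shadow is smallest end-on and largest broadside, which is the trend described for the oriented samples.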


BACTERIOPHAGE EXPERIMENTS 


A series of phage properties has been studied. Two
are selected as examples. The first is by Till and the
author and shows that T1 phage, when irradiated and
allowed to enter a cell which has a second infection by
a genetically different phage, can still “cross reactivate”
the second phage, by which is meant that it can donate
markers necessary for the progeny of both phages to
function on a new host. The cross section for the loss of
this ability is 60% of the single-infection cross section,
and it implies that 60% of the phage DNA is genetic
in character while 40% is not. By quite different
methods, for T2, Levinthal has concluded somewhat
similarly. The data are shown in Fig. 16.

FIG. 19. Effect of low-voltage electrons on the infectivity of
Newcastle disease virus as measured by Wilson and the author. It
can be seen that there is no effect on the virus until a
depth of penetration of about 200 Å is reached. This means that
the outer part of the virus is not concerned with the infectivity
but there is a region inside which is actively concerned [from
D. Wilson and E. C. Pollard, Radiation Research 8, 131 (1958)].

A technique of low-voltage irradiation, using electrons
of known penetrating power, has been developed by
Hutchinson et al. Applied to T1 (Fig. 17), it shows that
only when electrons penetrate 125 Å can they produce
an appreciable effect. That is, the sensitive material is
embedded in the head and, since the tail has a thickness
of only 150 Å, cannot be in the tail.


NEWCASTLE DISEASE AND INFLUENZA VIRUS 


Figure 18 shows an early experiment by Jagger and
the author on influenza virus. The infectivity in eggs
was measured, and the cross section vs energy loss is
shown. One way to analyze the data is indicated, in
terms of an insensitive region surrounding one which is
more sensitive. Figure 19 shows data by Wilson and
the author on Newcastle disease virus (NDV) using
the low-voltage electron technique. Quite independent
evidence for the existence of a protective coat is
apparent.

Figure 20 shows the assembly of conclusions about
inactivation experiments on infectivity, hemagglutination,
and hemolysin ability for Newcastle disease virus.
Such a semispeculative figure is representative of what
can be done on one virus. So far, no very contradictory
evidence has come to the attention of the author which
invalidates this model of the virus.


FURTHER APPLICATION AND OUTLOOK 


FIG. 20. Sketch of Newcastle disease virus, showing the relative
size of several components. Such a drawing is schematic and serves
primarily to indicate relative sizes [from D. Wilson and E. Pollard,
Radiation Research 8, 131 (1958)]. [Labels in sketch: sensitive
unit (infectivity); radiation-resistant coat.]

The study of cellular structure in relation to its
function by ionization inactivation can be carried
beyond viruses. The effect of radiation on amino-acid
uptake in E. coli can be analyzed to indicate that
ribonucleoprotein particles, or ribosomes, are involved in
the uptake. The low-voltage electron technique has
been used by Preiss to determine the location of the
invertase in yeast cells. Currently, β-galactosidase in
E. coli is under study and seems to be in the protoplast
membrane. The technique is not harder than many
others now in use and in the next few years should
contribute appreciably to our knowledge of the cell.
In a preceding paper, Bloom (p. 21) describes
experiments in which a stream of protons was collimated
into a microbeam a few microns in diameter
which could be used selectively to irradiate and
inactivate chosen regions of living cells. Such studies should
become increasingly important in modern biology as
they allow correlations to be made between structure
and function. However, before radiation probes can be
used to give meaningful answers to such questions, it
is necessary that one learn more about those events
intervening between the original radiation insult and
the observable end effect being assayed. In the
complicated cellular systems discussed in the following,
lack of information about these intervening processes
is almost complete and few unambiguous answers are
available.

Considered are two of the primary problems of cellular
radiobiology. The first of these is: With over-all
cellular irradiation, what are the sites of damage as
scored by various end effects? This is a question which, in
many cases at least, is answered. The second problem
depends upon the answer to the first and is: Must the
energy initiating the processes leading to the observable
effect be laid down by the ionizing radiation particles
directly within the target sites or can a portion of this
activation energy migrate from the environment to the
sensitive sites?

With the task at hand defined in rather broad terms,
I should like to limit it by considering almost exclusively
a particular microorganism (the yeast Saccharomyces
cerevisiae) and by dealing with only one end
point of radiation damage, namely, the loss of ability
of an irradiated cell to produce a visible colony.
Although one could use other end points such as particular
genetic mutations, division delay, biochemical
abnormalities, changes in cellular permeability, etc., the loss
of ability to reproduce is, from a teleological point of
view, of greatest importance.

Bloom, in his description of the microbeam experiments,
has already given the clue that, in his test
system, the nucleus is much more sensitive to irradiation
than is the cytoplasm. So, the first question can be
rephrased: What is the relative radiosensitivity of the
nucleus and the cytoplasm in yeast as scored by colony
formation?

It is possible to derive from a single strain of yeast
strains in which the chromosomal material is
present in quantities which are multiples of a basic
unit; such strains are designated as haploid, diploid,
triploid, etc., up to hexaploid.¹ It has been most
profitable to compare the radiosensitivities of related yeasts
of various ploidies. In a typical experiment, yeast cells
are suspended in an aqueous environment under various
conditions and irradiated. Aliquots are then plated on
nutrient agar on which a surviving cell, by our criterion,
will produce a visible colony (about a million cells).
The survival of such an irradiated population can then
be compared with that of an unirradiated one. A
semilogarithmic plot of the surviving fraction of haploid
yeast as a function of dose is shown in Fig. 1.
Exponential survival is obtained, the slope of the survival curve
being a useful index of the radiosensitivity. The
inactivation kinetics are first order, indicating that the
inactivation process is caused by a single event,
presumably an energy transfer to a single molecule. One
might now ask the question: What single molecule is
necessary for the continued functioning of the cell?
One is led almost inevitably to the conclusion that the
pertinent molecules must be molecules of the genetic
apparatus of the cell.

FIG. 1. X-ray survival curve of haploid yeast. Experimental
errors of 2% or less (approximately the size of the
points).

The x-ray survival curve of diploid yeast is shown in
Fig. 2.² Two points are noteworthy: (1) these cells are
much more radioresistant than are haploid cells, and
(2) the inactivation kinetics are of higher order than
first. The first attempt to explain mechanistically the
comparative response of haploid and diploid strains
was that of Zirkle and Tobias. They assumed that
radiation damage in this system is owing to the
induction of recessive lethal mutations. This is
diagrammatically represented in Fig. 3, in which all of the
chromosomal material of the haploid is shown in a
continuous array; this will be duplicated in the diploid
and will be present m-fold in yeast of ploidy m. To
successfully inactivate a multiploid cell, corresponding
sites on homologous chromosomes such as c, c′, c″, etc.,
must be simultaneously inactivated. Such a model can
successfully correlate haploid and diploid survival and
predicts increasing radioresistance for yeast of higher
ploidy.
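
The prediction of the recessive lethal model can be made concrete with a short calculation. The sketch below uses a commonly written form of such a model, S = [1 − (1 − e^(−kD))^m]^n, in which a site is lost only when all m homologous copies are hit. The number of sites n, the rate constant k, and the dose values are assumptions chosen for illustration, not parameters taken from the experiments described here.

```python
import math


def recessive_lethal_survival(dose, k, ploidy, n_sites=1):
    """Surviving fraction under a recessive lethal (multi-copy) model.

    dose    -- radiation dose, arbitrary units
    k       -- inactivation constant per copy of a site (1/dose units)
    ploidy  -- number of homologous copies m of each site
    n_sites -- number of independent essential sites (assumed parameter)
    A site is lethal only if every one of its `ploidy` copies is hit.
    """
    p_copy_hit = 1.0 - math.exp(-k * dose)
    p_site_lost = p_copy_hit ** ploidy
    return (1.0 - p_site_lost) ** n_sites


K_ASSUMED = 1.0   # illustrative value
DOSE = 1.0        # illustrative value

survival_by_ploidy = {
    m: recessive_lethal_survival(DOSE, K_ASSUMED, m) for m in range(1, 7)
}
for m, s in survival_by_ploidy.items():
    print(f"ploidy {m}: surviving fraction {s:.3f}")
```

For ploidy 1 the expression reduces to simple exponential survival, and at a fixed dose the surviving fraction rises steadily with ploidy. That steady rise is the prediction the experiments with higher ploidies were designed to test.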

To test this hypothesis, Mortimer has extended these
studies to yeast of ploidies up to six and finds that the
triploid, tetraploid, pentaploid, and hexaploid are
progressively less radioresistant, in direct contradiction to
the predictions of the recessive lethal model.¹ Figure 4
shows that the doses required to produce the same
percentage inactivation vary inversely with the
ploidy for ploidies above haploid.

FIG. 2. X-ray survival curve of diploid yeast.

In another series of elegant experiments, Mortimer
has shown that x-ray inactivation of haploid yeast may
be owing to the induction of either recessive or
dominant lethal mutations. By micromanipulation,
individual irradiated cells of various ploidies are mated
with unirradiated cells. A diagram of such an
experiment with haploid cells is shown in Fig. 5. Large
numbers of cells have to be mated, as such interpretations
must be based on statistical evidence. Operationally,
recessive lethal mutations in haploid cells are defined
as those not causing loss of ability to divide when
mating with an undamaged haploid cell; dominant
lethals are defined as those causing loss of ability to
divide.

FIG. 3. Diagram for recessive lethal model. [Equations in
diagram: haploid, $S = e^{-nkD}$; m-ploid,
$S = [1 - (1 - e^{-kD})^{m}]^{n}$.]


Dominant lethals would be expected to be propor- 
tional to the total chromosome length, that is, to the 
ploidy. The data of Fig. 4 indicate that with yeast cells 
of higher ploidy the radioresistance is inversely related 
to the ploidy, or, the radiosensitivity is directly pro- 
portional to the ploidy. Thus, with yeasts of ploidy 
higher than diploid, the predominant path of inactiva- 
tion as scored by colony formation is interpreted by 
Mortimer as owing to the induction of dominant lethal 
mutations. 

It is very difficult to relate this dependence of
radiosensitivity on ploidy to cytoplasmic damage
resulting from irradiation. Cell volume is proportional
to ploidy,¹ and the diploid is more radioresistant
than the haploid. Thus, if radiation damage were the
result of cytoplasmic effects, the radioresistance should
continue to increase with ploidies above two, in
contradiction to the observed results.

Additional evidence that the predominant path of 
radiation damage is by means of genetic damage is 
afforded by Tobias’ experiments with defective diploid 
cells in which recessive radiation damage is propagated 
to the daughters of irradiated cells (p. 289). 

In general, then, with yeast, loss of ability to form
colonies following x-irradiation can be explained in
terms of genetic effects. The microbeam experiments of
Zirkle and Bloom,⁶ experiments with eccentrically
located nuclei,⁷,⁸ and many other experiments all indicate
that nuclear inactivation is of primary importance in
many types of cells. The relatively small number of
cases to the contrary that have been reported (e.g.,
Duryee with frog eggs, Bacq et al. with alga) are
interesting exceptions to this rule.

FIG. 4. Radioresistance of yeast as a function of ploidy [from
R. K. Mortimer, University of California Radiation Laboratory
Rept. 3902 (August, 1957)].

FIG. 5. Operational scheme for differentiating between recessive
and dominant lethal mutations induced by x-rays in yeast.

Turn now to the question of whether the radiation
energy causing cellular inactivation is laid down directly
in the molecules whose modification causes cellular
inactivation or whether a portion of this energy can
migrate from the molecular environment to the affected
sites. Pollard (p. 273) has outlined in some detail the
processes of direct-action inactivation in dried materials.
There can be no doubt that such inactivation plays a
role in cellular radiobiology. However, classical
direct-action theory in its simplest form is unable to account
for various experimental results, in particular those
concerned with modification of radiation effects by
various environmental factors such as oxygen
concentration and phase state.¹¹ Furthermore, with the
development of radiation chemistry and the demonstration
of in vitro inactivation of various cellular constituents
by free radicals produced by the irradiation of
water, it became necessary to consider the importance
of such processes in cellular systems. There thus
evolved the indirect-action hypothesis, in which free
radicals produced by the irradiation of water were
postulated to bring about at least some of the initial
radiation damage.


The most successful formulation of these concepts
was in the migration model of Zirkle and Tobias. In
this model, a proper averaging was made of the
contributions to inactivation by various species of free
radicals, produced at various distances from the target
molecules, under various experimental conditions. A
single effective collision between an energy carrier and
a critical molecule was postulated to be effective in
bringing about molecular change, in conformity with
the first-order inactivation kinetics experimentally
observed in many of the simple unicellular systems. This
model also includes as a part of itself the direct-action
type of inactivation.

FIG. 7. X-ray survival curves of haploid yeast irradiated
at -10°C, liquid and solid phases.

One of the most studied of the modifying parameters
is the oxygen concentration in the cellular suspension
during irradiation. Haploid yeast cells irradiated in the
absence of oxygen are approximately twice as resistant
as those irradiated under aerobic conditions (Fig. 6).
The oxygen effect is operative at oxygen tensions as
low as 10 micromolar. A comparable oxygen effect has
been found for a tremendous variety of materials using
a considerable number of end results. Many chemicals
have been found to decrease the radiosensitivity, but
it is not yet certain that such protection acts other than
by modifying the existing oxygen concentration. The
phase state of the test system, whether frozen or liquid,
greatly modifies radiosensitivity. In Fig. 7 are shown
survival curves for haploid yeast irradiated at -10°C
in the liquid phase (supercooled) and in the frozen state.
Until quite recently, the modification of radiation
response by such environmental factors as oxygen
concentration, various chemicals, or phase-state change
was considered as evidence that at least the modifiable
portion of the radiation effect was mediated through
indirect-action mechanisms. A simple formulation of
indirect action is diagrammed in Fig. 8. Oxygen
modification can be explained easily in this model by the
elimination of those free radicals depending on the
presence of oxygen for their formation. Modification by
chemicals could be owing to competition for the free
radicals between the added chemicals and the target
sites, to diminution of the oxygen effect, or to masking
of sensitive molecular side chains (e.g., sulfhydryl
groups). The phase-state change would operate by
either removing water from availability for these
reactions or by greatly decreasing the diffusion of free
radicals. On the other hand, such modifying agents
would not be expected to greatly modify the ionization
levels of the target molecules, and modification is
difficult to explain using the concepts of classical direct
action.

FIG. 8. Diagram of simplified indirect-action mechanisms.

Alper, Howard-Flanders, and Alexander have
questioned the assumption that modification is a priori
evidence that an indirect-action mechanism is
operative.¹²⁻¹⁵ For example, Alexander has shown that
crystalline trypsin also shows an oxygen effect.
However, this evidence is not conclusive, as this system still
contains about 5% water which might play an
important role in the oxygen effect. They suggest that the
oxygen effect is due to reaction of an excited state with
oxygen to produce an irreversibly damaged site (Fig. 9).
Recently, Hutchinson has interpreted his results on the
in vivo inactivation of yeast enzymes by x-rays in terms
of the migration model and has found that a diffusion
distance for the pertinent inactivating free radicals of
about 30 Å is involved, a distance an order of magnitude
less than had previously been estimated.¹⁶ At the lowest
oxygen concentrations at which the oxygen effect is
important, the distance between oxygen molecules is
about 700 Å. It is, therefore, unlikely in Hutchinson’s
system that there would be opportunity of reaction
between oxygen and hydrogen free radicals produced
by irradiation to give rise to the hydroperoxyl free
radical (H + O₂ = HO₂), the generally assumed
mechanism for the oxygen effect. Other objections to the
hydroperoxyl free radical as an important factor in
cellular radiobiology have been discussed elsewhere in
detail. Thus, while modification of radiation damage
in the truly dried state has not been demonstrated, the
generally accepted mechanism for the oxygen effect via
the hydroperoxyl free radical is subject to question.
However, the mechanism diagramed in Fig. 9 could
also be operative for indirect-action events, and at the
present time this seems to be the best operational
hypothesis for the oxygen effect. Recently, estimates
have been made on the lifetime of the excited states
that may be involved in the oxygen effect. At the low
oxygen concentrations necessary for the oxygen effect,
there are approximately 10⁶ collisions/sec between
oxygen molecules (assuming that they are free to diffuse
within the cell) and a hypothetical target having a
molecular weight of 100. Thus, it is necessary to assume that
the excited state produced by the radiation energy must
have a lifetime of at least a microsecond.
Work by Howard-Flanders and Moore with bacteria,
in which cells irradiated under either aerobic or
anaerobic conditions were quickly switched to the
opposite condition, indicates that the lifetime of the
oxygen-sensitive excited state is less than 20 msec.

FIG. 9. Diagram of oxygen modification of radiosensitivity.
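
The quoted spacing between oxygen molecules can be checked to order of magnitude from the concentration alone: the mean spacing is roughly the cube root of the volume available per molecule. In the sketch below, the 10 micromolar figure is the oxygen level quoted earlier in the text, whereas the 5 micromolar figure is a hypothetical value included only to show how the spacing scales. The function name is an assumption of this example.

```python
AVOGADRO = 6.022e23  # molecules per mole


def mean_spacing_angstrom(concentration_molar):
    """Approximate mean distance between solute molecules, in angstroms.

    Takes the cube root of the volume per molecule, assuming the
    molecules are uniformly dispersed in the liquid.
    """
    molecules_per_cm3 = concentration_molar * AVOGADRO / 1000.0
    spacing_cm = molecules_per_cm3 ** (-1.0 / 3.0)
    return spacing_cm * 1.0e8


spacing_10uM = mean_spacing_angstrom(10e-6)
spacing_5uM = mean_spacing_angstrom(5e-6)   # hypothetical lower level

print(f"10 micromolar: about {spacing_10uM:.0f} A between molecules")
print(f" 5 micromolar: about {spacing_5uM:.0f} A between molecules")
```

Both values are several hundred angstroms, an order of magnitude larger than the diffusion distance of about 30 Å, which is the comparison the argument rests on.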


Howard-Flanders has also found that nitric oxide can
be used to mimic oxygen.¹⁸ He suggests that the action
of oxygen and nitric oxide in increasing radiosensitivity
may be associated with the fact that both have
unpaired electrons in orbitals.

The change in radiosensitivity associated with the
phase-state change from liquid to frozen cells has been
mentioned already. Figure 10 shows the radiosensitivity
over a temperature region down to -35°C in both the
liquid and solid phase. The progressive decrease in
radiosensitivity with irradiation temperatures below
0°C suggested to us that the decrease might be
associated with a progressive freezing of cellular water. To
test this hypothesis, microcalorimetric studies were
performed on frozen yeast suspensions. Figure 11 shows
that cellular water is progressively frozen down to about
-20°C, this region of constant freezing correlating
nicely with the region of constant radiosensitivity. A
certain fraction of the cellular water, 9%, is
nonfreezable; this is one definition of bound water.

FIG. 10. Dependence of radiosensitivity on irradiation
temperature: -30°C to 0°C, liquid and solid phases.

A. M. Rosenberg, in our laboratory, has developed
an interesting way to control effectively the amount of
free water within yeast. Very high concentrations of
various chemicals (ethanol, glycerol, glucose, etc.) can
be used either to dehydrate the cell or to tie up cellular
water which may play a role in radiobiological processes.
Figure 12 shows the progressive decrease in
radiosensitivity with increase in glycerol concentration. The
radioresistance increases by about a factor of four with
6.9 M glycerol. Additional protection is afforded by
anoxic conditions, the maximum radioresistance being
about 5 times that typical of aerobic cells in standard
suspensions. We have been able to verify by several
methods that glucose, for example, dehydrates the cell
approximately to the degree expected from simple
theory. Figure 13 summarizes the points mentioned in
the foregoing, namely, that there is a correlation
between the residual fractions of radiosensitivity and of
unfrozen cellular water with respect to freezing
temperature, and between the residual fractions of
radiosensitivity and of cellular volume with respect to
osmotic concentration. Furthermore, these two types
of effective water removal are apparently closely
related.
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Tic. 12. X-ray survival curves for haploid yeast irradiated 
in different concentrations of glycerol. 


From the results in Figs. 10 and 11, one might expect 
that the temperature dependence of yeast would not 
greatly change for irradiation temperatures below 
— 35°C. However, when yeast suspensions are frozen 
at a rate characteristic of the irradiation temperature, 
an apparently anomalous behavior is observed below 
— 35°C, the radiosensitivity progressively increasing in 
the region from —35°C to — 72°C where a sensitivity 
is observed approximately equal to that typical of the 
liquid phase (Fig. 14). This behavior can be explained 
by an examination of the physical processes that occut 
within the cell during freezing. Meryman and Platt” 
have shown that, in mammalian liver cells, the rate of 
freezing (the speed at which the ice-water boundary 
advances) is of great importance in determining whether 
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Fic. 13. Relationship of the radiosensitivity of yeast to the loss of cellular water. Glycerol concentrations are to be read at 
the uppermost abscissa, freezing temperatures at the bottom. The curve labeled “packing fraction in glycerol” shows the rela- 
tive volume occupied by a standard amount of yeast after centrifugation in various concentrations of glycerol. 


the cellular water is frozen within or without the cell. 
At low rates of freezing (less than 1.5 mm/min), the 
increase in external osmotic pressure owing to the 
freezing of the external medium causes water to move 
outward across the cell membrane where it is then 
frozen. Thus, the cell is effectively dehydrated. At high 
rates of freezing (greater than 1.5 mm/min), however, 
the cellular water freezes within the cell before the 
increase in external osmotic pressure is effective in 
dehydrating the cell. In the yeast system used here, 
this critical velocity of freezing (1.5 mm/min) occurs at 
—35°C; thus, for cells frozen at temperatures above 
—35°C, dehydration would occur. For freezing at tem- 
peratures below —35°C, some internal freezing of water 
would occur before dehydration would be complete. 
These mechanisms are diagramed in Fig. 15. 
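
As an illustration only, the freezing rule stated above can be written as a small decision function. This is a sketch and not part of the original work: the function names are hypothetical, the 1.5 mm/min and -35°C thresholds are the values quoted in the text for this yeast system, and the handling of a rate or temperature lying exactly on the threshold (which the text leaves undefined) is an arbitrary assumption.

```python
# Thresholds quoted in the text for the yeast system; everything else
# (names, tie-breaking at the threshold) is an illustrative assumption.
CRITICAL_VELOCITY_MM_PER_MIN = 1.5
CRITICAL_TEMPERATURE_C = -35.0


def regime_from_velocity(velocity_mm_per_min: float) -> str:
    """Classify freezing by the speed of the advancing ice-water boundary."""
    if velocity_mm_per_min < 0:
        raise ValueError("velocity must be non-negative")
    if velocity_mm_per_min < CRITICAL_VELOCITY_MM_PER_MIN:
        # Slow freezing: water leaves the cell and freezes outside it,
        # so the cell is effectively dehydrated.
        return "extracellular"
    # Fast freezing: some water freezes inside the cell before
    # dehydration is complete.
    return "intracellular"


def regime_from_temperature(freezing_temperature_c: float) -> str:
    """Same classification, keyed to the freezing temperature instead."""
    if freezing_temperature_c > CRITICAL_TEMPERATURE_C:
        return "extracellular"
    return "intracellular"
```

On this reading, cells frozen at -10°C fall in the dehydrating ("extracellular") regime, whereas cells frozen at -72°C fall in the regime where internal freezing occurs.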

The results in Fig. 14 are interpretable in light of the 
migration model using the foregoing concepts of cellular 
freezing. On irradiation, free radicals are formed in all 
of the cellular water, whether liquid or frozen, whether 
within or without the cell, approximately in equal 
number per unit volume of water. We further assume 
that free radicals (or their products) produced by the 
irradiation of the ice are largely trapped within the ice 
crystal. The investigations of Stewart and Ghormley* 
give credence to this assumption. Now, if the frozen 
cellular water is largely exterior to the cell, the free 
radicals liberated on thawing will be effectively diluted 
by the external medium and the probability of an 
effective interaction between them and the sensitive 
regions of the cell will materially decrease. On the other 
hand, if a significant amount of the cellular water is
frozen within the cell, the free radicals released on 


thawing will be in much the same positions they would 
have been in on irradiation of the unfrozen cell. Thus,
if the radiobiologically significant cellular water is 
frozen within the cell, the radiosensitivity under these 
conditions would be not dissimilar to that typical of the 
liquid phase. 

There are three critical tests for the foregoing hypoth- 
esis. If radiation sensitivity is primarily dependent on 
the degree of external dehydration rather than on the 
absolute amount of cellular water frozen (i.e., the tem- 
perature), then slow freezing to temperatures lower 
than —35°C by stepwise freezing at —5°C, —10°C, 
— 20°C, etc. should dehydrate the cells to the same 
degree as freezing at —35°C. The resulting radiosensi- 
tivity should be the same as that obtained with cells 
frozen at -35°C; from Fig. 14 (slow O2 curve) this can
be seen to be the case. Secondly, if fast freezing at 
FIG. 14. Relative radiosensitivity of haploid yeast as a function
of temperature, phase state, and rate of freezing.

FIG. 15. Mechanisms for fast and slow freezing in yeast.


—72°C internally freezes the radiobiologically signifi- 
cant water, then freezing at even lower temperatures, 
say, —95°C, should produce the same radiosensitivity 
as that obtained at — 72°C and in the liquid phase; this 
was also found to be the case. Thirdly, if fast freezing 
at — 72°C freezes the radiobiologically significant water 
within the cell, then subsequent x-irradiation of such 
cells at temperatures higher than —72°C should pro- 
duce the same effect as that obtained at —72°C. This 
is true at — 50°C, but at higher temperatures there is a 
progressive loss of radiosensitivity. This decrease in 
sensitivity with increasing temperature for these pre- 
frozen cells is qualitatively similar to the rate of dis- 
appearance of hydrogen peroxide produced by irradia- 
tion of ice that has been observed by Stewart and 
Ghormley and is probably the result of back reactions 
among free radicals not firmly trapped in the ice.” 

No successful explanation of the phase effect in 
radiobiology based on the direct action model has been 
advanced (see Wood* for review). Probably the best 
way to differentiate between direct-action mechanisms 
and indirect-action mechanisms is by use of the phase 
effect. In the haploid-yeast system, 60% of the radia- 
tion damage under either aerobic or anaerobic conditions 
is capable of modification by the change in phase from 
liquid to solid state. Thus, if the above interpretations 
are correct, no more than 40% of the over-all radiation 
damage is owing to direct action. If oxygen modifica- 
tion operates through an indirect-action mechanism, 
then our results both with frozen, anoxic cells and with 
glycerol-dehydrated, anoxic cells indicate that at least
[...] of the x-ray inactivation of this system is owing
[...]

In summarizing, it is difficult to make any broad [...]
that have general applicability for cellular
radiobiology. The modification by oxygen of almost all
radiobiological responses studied indicates the necessity
of understanding the associated mechanisms. The gen- 
erally accepted hydroperoxyl-radical mechanism is 
certainly questionable. At the present, the best opera- 
tional tool to differentiate between direct and indirect 
types of inactivation is the change in radiosensitivity 
with phase state. The success of the migration model 
in dealing with modification of radiosensitivity by 
various physical and biological parameters is most 
encouraging and this model, even though its details 
may need revision, has been most valuable in suggesting 
new experiments. 

In studying a biological unit as complicated as a cell, 
it is desirable to approach the problem from both the 
molecular level and from the gross phenomenological 
level. Phenomenological studies similar to those described
here have greatly profited from work done with
simpler systems such as those described by Pollard 
(p. 273). On the other hand, work with more compli- 
cated cellular systems has given a direction to molecular 
studies and has focused attention on the necessity for 
working with materials in an aqueous environment. It
is to be hoped that it will not be too long before these 
two types of studies have overlapped. 
THE animal body is an assembly of cells. In considering
the effects of radiation on such an assembly,
the relationships between the individual cells
and their assembly must be defined and, moreover, 
performance criteria must be found by which to charac- 
terize both the radiation effects on individual cells and 
the relation of these effects to the performance of the 
cell population. 

There are three important principles which charac- 
terize the biological effects of penetrating radiations: 

A. Radiation effects at the subcellular level are of a 
statistical nature. A single ionizing particle may lead to 
a decisive single radiobiological event in one of the many 
DNA nucleoprotein molecules or in another essential 
molecule. As a result, many different radiobiological 
events occur in the cells of an irradiated population. 

B. In establishing the effect of irradiation on a cell 
population, the measured results are determined by the 
defined criterion for an experimental test. Because of 
the replication of genetic alterations in mitotic cell 
division, the criterion used for measurement frequently 
involves the selection of certain more-fit cells and the 
rejection of others, leading to end results different from 
what might be expected from statistical averaging of 
the entire population. 

C. Radiobiology is an important basic tool because 

FIG. 1. Sporulation of normal and recessive
lethal diploid yeast cells.


* Work at the Donner Laboratory reported in this paper was 
performed under the auspices of the U. S. Atomic Energy Com- 
mission. 
its fundamental interactions occur at the atomic-molecular
level in essential sites of the cell, sometimes in
locations not accessible to the ordinary tools of physics 
and chemistry. These interactions are then amplified to 
a level involving the entire cell—i.e., by a factor of 
about 10!*. Similarly, radiobiologic events that happened 
to a single cell or to a few cells may be amplified to the 
entire animal or human organism affecting the fate of 
as many as some 10" cells. 


CELL-DIVISION DELAY AND INHIBITION 


Zirkle (p. 269) and Wood (p. 282) have admirably 
outlined the major causes of cell-division delay and in- 
hibition in irradiated yeast cells. In order to consider 
population effects in detail, the discussion of the same 
cells is continued with the understanding that in other 
species many additional interesting and diverse radio- 
biological phenomena occur. One cannot describe in a 
single chapter the wealth of phenomena that have been 
observed.†

The following classes of genetic damage are of interest: 

A. Dominant lethal damage. This leads to inhibition 
of colony formation in one, or in a few, cell divisions.' 

B. Recessive lethal damage?* can kill the haploid 
cell, but will allow survival of the diploid. In some forms 
of this damage, the genetic material appears to be re- 
arranged so that progeny of such cells, for many genera- 
tions, divide slower than normal cells. When the diploid 
which bears recessive lethal damage is sporulated, it 
produces two dead spores for each recessive damage. 
The presence of recessive lethal damage in a diploid 
“softens” the cell for reirradiation, already suggested 
by Latarjet and Ephrussi,‘ and the survival of such cells 
indicates greater sensitivity than the normal diploid, as 
shown by Tobias and Stepka® and by Beam.® Recessive 
lethal radiation damage is illustrated schematically 
in Figs. 1 and 2. 


EXTENT OF DAMAGE IN NUCLEAR MATERIAL 


The existence of radiation-induced auxotrophic mutants
of yeast cells in haploid mating types, requiring
specific nutrients, makes it feasible to carry out genetic 
analysis of the damage induced. By mating individual 
cells under the microscope, one can breed diploid cells 


† A good idea of the complexity of radiobiological research may
be obtained by consulting the journal, Radiation Research, and
the book, Proceedings of the International Congress of Radiation
Research, Vermont, 1958, to be published by the Academic Press,
Inc.
FIG. 2. Recessive model of radiation survival of diploid
yeast cell with defects.


with recessive requirements for specific nutrients. Irradi- 
ation of such cells often makes the requirement domi- 
nant due to production of damage in allelic sites. In 
this fashion, it is found that damage often extends to 
regions greater than a single biochemical locus; more- 
over, the probability of damage occurring from site to 
site is very different. The situation reminds one of 
chromosome breaks, which are seen to be produced 
copiously in other organisms. The damage extends most 
of the time to chromosome parts which include more 
than one gene. In relatively rare events, the damage 
extends to a single genetic locus only or to some part 
of it.” 


POSTIRRADIATION CHANGES IN 
NUCLEAR MATERIAL 


There are gene mutations that are extremely stable 
and persist through very many generations. It is 
believed generally that these mutations persist, and do 
not recover, and that their production varies linearly 
with dose. Except perhaps in some mutations of haploid 
cells, this statement is not exactly true, when one regards
all kinds of nuclear damage from radiation.‡ In
cells of higher ploidy, the presence of genetic material 
in two or more sets allows postirradiation rearrange- 
ments in the course of several subsequent cell divisions. 
The fate of a typical diploid survivor in the postirradia- 
tion period is shown in Fig. 3, as reported by Tobias.’ 
Here, the irradiated cell was placed on nutrient agar 
under a microscope and observed continuously. As cell 
divisions occurred, the mother and daughter cells were 
separated. It is seen from the graph that, in the course 
of seven subsequent cell divisions, several cells were 
produced that failed to divide again. The radiated cells 
also exhibited long cell-division delays. A typical chart 
showing how a normal cell gives rise to progeny is given 
also. The process of generation of cells lacking viability 

can go on for many generations so that some colonies 
arise where cell replication is below optimum value. For 


‡ Note added in proof. Most of the evidence for linearity comes
from radiation studies on the sperm of the fruit fly. Very recently,
Russell et al. [Science 128, 1546 (1958)] have furnished experimental
proof of the dose-rate dependence of some mutations by
irradiating male and female mice with x-rays and γ-rays.
instance, by preirradiation it is easy to obtain colonies
which produce one dead cell for every four normal ones. 

One of the suggested mechanisms for such a post- 
irradiation recovery effect is shown in Fig. 4. As the 
cell with recessive radiation damage in its nucleus 
divides, occasionally “chromosome crossover” occurs; 
i.e., segments of the chromosome exchange places with 
each other. The two progeny are not identical in chromo- 
some content. In the instance shown, one of them has 
homozygous lethals, whereas the other will survive, 
having gotten rid of the damaged part. A method of 
experimentally demonstrating chromosome crossover 
uses recessive genetic markers. In the example shown, 
an adenineless locus was carried in the heterozygous 
state. When homozygous, the cells needing adenine 
become pink; otherwise they are ivory colored. Thus, 
a clone established from one irradiated diploid cell may 


appear color segmented, with approximately 1/2, 1/4, or 1/8
of the cells colored, depending upon whether crossover 
occurred in the first, second, or third postirradiation 


cell division. This sort of delayed phenotypic appearance 
of induced chromosome changes has been known since 
the work with bacteria by Witkin, Demerec, and 
Latarjet.- Taking advantage of the detailed genetic 
knowledge available in yeast cells, James demonstrated 
ultraviolet-induced homozygosis'! and Mortimer dem- 
onstrated the x-ray-induced phenomenon.” 
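
The dependence of the colored sector on the timing of the crossover can be made explicit with a short sketch. It assumes, as an idealization not stated in the text, that every cell divides synchronously and that all progeny of the homozygous cell are colored, so that a crossover in the n-th postirradiation division marks one cell out of the 2^n present after that division. The function name is hypothetical.

```python
from fractions import Fraction


def colored_fraction(division: int) -> Fraction:
    """Idealized colored fraction of a clone when crossover occurs in the
    given postirradiation division (1 = first division)."""
    if division < 1:
        raise ValueError("division must be 1 or greater")
    # After n synchronous divisions there are 2**n cells; exactly one of
    # them is the newly homozygous (colored) cell, and its share of the
    # clone stays constant as all lines continue to divide equally.
    return Fraction(1, 2 ** division)
```

Real clones depart from these values because division delays differ from cell to cell, which is why the sectors are only approximately of these sizes.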

Appearance of homozygosis, following damage, of 
recessive genetic traits is a defense mechanism, but it 
also serves to make recessive lethals existing in the 
population homozygous. 

There are other nuclear changes suspected of occur- 
ring in the postirradiation period. In some large plant 
and animal cells, these are the consequences of chromo- 
some breaks and their rejoinings. The result may be 
wholly- or partially-increased ploidy, euploidy, or 
aneuploidy. 


CHANGES IN RADIORESISTANCE 


In a mixed population of cells, various stages of cell 
division are present. It has been shown in haploid 
yeast cells that, in the early phase of budding, the cells 
are more than ten times more resistant than in inter- 
FIG. 3. Postirradiation divisions of diploid cell. (a) Nonirradiated.
(b) Irradiated. Open circles indicate cells which have stopped
dividing.

FIG. 4. Mechanism of postirradiation effect: chromosome crossover.
If crossover occurs in delayed fashion, one may observe the
appearance of segmented clones of cells, indicated on the left.


phase.®!3 Moreover, kinetics of the sensitivity to radia- 
tion are different; while a single ionizing event in the 
proper location is sufficient for inhibition during inter- 
phase when most new DNA synthesis occurs, dividing 
cells require some 100 such events. Accepting DNA 
nucleoprotein as the site of action,§ this phenomenon 
is a clear indication that physical and perhaps chemical 
states of this substance must change radically in the 
course of cell division. DNA molecules undergoing 
synthesis are highly sensitive; existing DNA molecules 
exhibit decreased sensitivity. It would appear that, 
during mitotic division of yeast, the building blocks of 
the new chromosome are present in great multiplicity, 
after which they condense to make just two new nuclei. 
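
The contrast between one-event and many-event kinetics can be visualized with the survival expressions of classical target theory. The text does not say which formal model underlies the figure of "some 100 such events"; the sketch below assumes the multi-target form, in which a cell is inhibited only when each of n equivalent targets has received at least one event. The function names and the dose unit d0 (the dose giving an average of one event per target) are assumptions made for illustration.

```python
import math


def survival_single_hit(dose: float, d0: float) -> float:
    """Single-event kinetics: one event in the proper location inhibits."""
    if dose < 0 or d0 <= 0:
        raise ValueError("dose must be >= 0 and d0 must be > 0")
    return math.exp(-dose / d0)


def survival_multi_target(dose: float, d0: float, targets: int) -> float:
    """Multi-target kinetics: all `targets` sites must each be hit."""
    if targets < 1:
        raise ValueError("targets must be 1 or greater")
    # Probability that one given target has been hit at least once.
    p_hit = 1.0 - survival_single_hit(dose, d0)
    # The cell survives unless every target has been hit.
    return 1.0 - p_hit ** targets
```

With 100 targets the curve has a broad shoulder: at a dose of one d0, where the single-hit survival has already dropped to about 37%, practically all multi-target cells survive, and survival does not fall to 37% until roughly 5.4 d0 under these assumptions.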


RECOVERY FROM MUTATIONAL DEFICIENCIES 


Point mutations induced by ultraviolet- or x-rays 
generally have a low but definite spontaneous reversion 
rate in the haploid cell. It appears that within the same 
biochemical locus many kinds of damage may be pro- 
duced and that each of these has a characteristic rever- 
sion rate. Since many kinds of mutations may be pre- 
pared by radiation at the same locus, these eventually 
may serve to help in the understanding of the chemical 
nature of the genes and their ability to transmit infor- 
mation to other cell constituents. || 

When a diploid cell is made up of two haploids with 
heteroallelic auxotrophic deficiency at the same locus,
the cells cannot grow into colonies in the absence of the 
required nutrient. However, as Roman and Jacob have 
shown, ultraviolet irradiation of such a cell can bring 
about a striking reversion to independence of the same 
nutrient.14 This occurs at unusually low doses and may 
be regarded as a beneficial effect of radiation, since it 


§ Although one generally assumes that chromosome damage
occurs as a result of a single ionizing particle, it has never been
proved that this primary interaction is with the DNA molecule.
Release of an enzyme capable of acting on DNA may have the
same effect, and such an assumption indeed would explain the
chain reaction which results in the breaking of whole chromosomes.

|| See chapters by Levinthal (p. 249) and by Lennox (p. E
induces repair. The mechanism is not known. It is as
though the irradiation helped the cell to make faultless 
replicas from the imperfect parts of the two allelic genes, 
a mechanism sometimes called the copy-choice method 
of replication. 


CYTOPLASMIC EFFECTS 


While the cell nucleus undergoes injury, cytoplasmic 
deficiencies also take place. There is every reason to 
believe that RNA enzyme molecules and proteins are 
equally as vulnerable as DNA. Even if such injury 
occurs, however, the situation for the cell usually is not 
assumed to be too serious, since there are many identical 
RNA molecules and since it is assumed that, with the 
aid of DNA, they may be resynthesized; on the other 
hand, the loss of function of an essential DNA molecule 
may prove fatal to the cell. 

There are some autonomous cytoplasmic particles in 
various cells. According to the work of Ephrussi,! these 
particles in yeast are known to be able to duplicate 
themselves and the rate of their duplication may occur 
somewhat independently from that of DNA.1® When 
the granules carrying the cytochromes are lost from the 
cells, they cannot be resynthesized. Radiation produces 
these small-colony mutants, with ultraviolet- being 
much more efficient than x-rays (as first shown by 
Raut"). 

Almost any other cytoplasmic variable tested has 
been shown to be affected by radiation, though generally
requiring higher doses than do the nuclear effects. 
Examples consist of disturbances in ionic balance, re- 
lease of ATP and other substances from the cells, 
changes in activity of various enzymes, etc. These effects 
in yeast cells are reviewed by Rothstein.'® 


EXTRACELLULAR EFFECTS 


Extracellular effects also are of importance in study- 
ing cell populations. These include the following factors: 


A. Radiation effects on the nutrient medium. 

B. Decrease in the autocatalytic action of cells which 
may synthesize important substances for other cells. 

C. Release of growth-promoting and radiation-pro- 
tective substances by the irradiated cells into the 
medium. 


Interestingly enough, the last item of these seems to 
be of greatest importance. It first was detected in yeast 
cells under ultraviolet light by Loofbourow some 20 
years ago, and more recently has been demonstrated 
for x-rays by Gunter and Kohn.” In mammalian tissue 
culture, Puck and Marcus” have shown such effects, 
while Révész™ has worked on the problem in detail for 
ascites cells.


CYTOLOGICAL EFFECTS OF PENETRATING RADIA- 
TIONS IN TERMS OF INFORMATION THEORY 


Radiations generally decrease the information con- 
tent of the genetic apparatus. They also increase the 
noise of the information channels, namely, extragenic
components, which are instrumental in transmitting
instructions for cellular functions, particularly duplica- 
tion. This necessitates a lag in cell division which has 
a definite limit, however, and, if complete information 
is not transmitted in due time, the new cell is lost. 
Systems with redundancy in the genetic apparatus can 
transmit the information faster and more reliably. 

Radiation damage to the coding system not only 
consists of the knocking out of individual symbols, but 
also consists of deletions of extensive parts of the code 
or of their rearrangement in spatial order. 

The cell has mechanisms to repair the damaged 
genetic code to some extent. Given time, it can develop 
a method of more-reliable information transmission 
even in the presence of continuous radiation which is
producing new damage at a steady rate. Recovery may
occur by increasing the channel capacity and, in some 
instances, by correcting the code or increasing its re- 
dundance. In the course of such events, groups of cells 
with maximum equivocation or greatest information- 
transfer efficiency receive preference and may take over 
the colony. 


POPULATIONS OF CELLS IN THE ANIMAL BODY 


Radiation effects in the animal body are consequences 
not only of irradiation of its cells, but also of changes in 
functions of its organs and in their interactions with 
each other. For purposes of this discussion, an organ 
may be regarded as a heterogeneous population of cells
in a continuous state of activity—growth and anabolism 
for some cells, decay and catabolism for others. The cells 
are bathed in body fluids, which bring fresh nutrients 
and eliminate the products of cell metabolism and, at
times, some of the cells themselves. 


YEAST-CELL POPULATIONS IN THE
STEADY STATE


To approximate conditions in an organ, Welch in our
laboratory” worked with a population of diploid yeast
cells, propagated continuously in a device resembling
the chemostat of Novick and Szilard.” A schematic
view of this device is seen in Fig. 5. When a nutrient
medium flows into this system at a given rate and when
a medium with cells is eliminated from it at the same
rate, steady-state populations of cells are established
with a constant rate of cell division. Welch has irradiated
such cultures continuously with x-rays for more
than one hundred consecutive cell divisions. As the
radiation delayed cell division, the flow rate of the
chemostat was adjusted so that a new steady state of
[...] was established at the same cell density
as the controls. In a population of this kind, radiation
can cause killing by the dominant lethal mechanism.
The recessive lethal mechanism can also kill, either by
[...] of heterozygosis into homozygosis, or by the
[...] of homozygous recessive lethals. If one
[...] that no recovery of gene-radiation
damage takes place, then it follows that in each generation
the population can tolerate a very small fraction
of the dose required for acute killing. 
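
The flow-rate adjustment described above can be illustrated with the standard chemostat balance, in which a steady state requires the dilution rate (flow divided by culture volume) to equal the specific growth rate of the cells. The sketch below assumes simple exponential growth with a rate of ln 2 divided by the generation time, and it ignores the nonviable cells present in an irradiated culture; the function name, the units, and any numerical values used with it are hypothetical and are not taken from Welch's experiments.

```python
import math


def steady_state_flow_rate(volume_ml: float, generation_time_h: float) -> float:
    """Flow rate (ml/h) that holds cell density constant in a chemostat."""
    if volume_ml <= 0 or generation_time_h <= 0:
        raise ValueError("volume and generation time must be positive")
    # Specific growth rate for exponential growth with the given doubling time.
    growth_rate_per_h = math.log(2) / generation_time_h
    # At steady state the dilution rate (flow / volume) equals the growth rate.
    return growth_rate_per_h * volume_ml
```

Under these assumptions, a radiation-induced doubling of the generation time calls for halving the flow rate if the cell density is to remain at the control value.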

Actually, the population has maintained continuous 
proliferation up to the highest dose rate tested, about 
6000 r/generation time. The main effects of radiation 
were the prolongation of the time required for cell 
division, and the generation of some cells incapable of 
further cell division. 

From the number of viable and nonviable cells pres- 
ent, it is possible to evaluate the mean lifetime of cells 
in the population and the average number of daughter 
cells a mother cell is capable of producing before her 
ability to undergo mitotic division is lost. 

Figure 6 indicates how the number of daughter cells 
from each mother cell decreased rapidly with increasing 
dose rate. Each control cell produced about 66 daugh- 
ters; this number decreased to about seven at the 
highest dose rate. At the same time, however, the cells 
appeared to acquire greater radioresistance by a factor 
of two than they had had prior to irradiation. The in- 
creased radioresistance appeared to develop by adapta- 
tion, rather than by mutation. A few generation times 
after radiation was stopped, the cells regained their 
normal rate of cell division. Cells taken out of the 
steady-state culture and irradiated in conditions of 
starvation showed normal radioresistance. 

Although a great deal more needs to be done with 
continuously proliferating cellular systems, Welch has 
demonstrated that diploid populations can live in the 
presence of surprisingly large radiation levels, and that 
part of the radiation-induced damage recovers in the 
course of successive cell divisions. 


RADIATION EFFECTS ON MAMMALS AND MAN 


FIG. 5. Radiation of yeast-cell populations in chemostat.

It is impossible to do justice to the intricate and
detailed research work that has been going on in radiation
physiology during the past decades. The reader is
referred to the excellent review articles and books on
the subject.”-?6 It is important to note that bacteriological
techniques of propagation recently have been
applied successfully to human cells by Puck¶ and that,
as a result, some information is already available on 
the radiosensitivity of various diploid and tetraploid 
human-cell strains. The mean lethal doses range from 
about 90-300 r, and the mechanisms of inhibition of 
colony formation are perhaps not too different from 
those for yeast cells. It is not known, however, what 
the radiation resistance of differentiated human cells 
in the body is, where it is possible that the very rapid 
humoral and metabolic exchange and the changes of 
radiosensitivity due to cell division might lead to an 
increase of radioresistance, as noted in the steady-state 
yeast-cell cultures. 


RADIOSENSITIVE TISSUES 


There is a 60-year-old law in radiobiology, originally 
proposed by Bergonié and Tribondeau; according to
this law, the most radiosensitive tissues are those which 
have the highest mitotic index—i.e., the highest rate of 
cell division. One now understands that this law is only 
approximately true because radiation injury manifests 
itself mainly in the cell-division process. Radiated cells 
usually die when they attempt cell division; if their 
functions in body tissues do not require frequent cell 
division, the cells usually can perform ordinary metabolic
and enzymatic functions for a long time postirradiation,
since these functions are much less radiosensitive
than cell division itself.

Correspondingly, in the acute phase, a single dose of 
FIG. 6. Progeny decrease with dose rate.


¶ See chapter by Puck (p. 433) for detailed discussion.


radiation usually affects the most rapidly proliferating 
tissues; the bone marrow and its varied cell population 
and certain epithelial tissues are affected first. Early 
radiation deaths frequently are due to injury to the 
intestinal mucosa and to white-cell proliferation, while 
failure of the blood clotting mechanism and anemia 
make their appearance later. The neurons of the central 
nervous system of the adult are quite radioresistant, 
since almost no cell division occurs in them. In the 
developing mammalian embryo, however, as shown by 
Hicks,” nerve tissue is rapidly proliferating and has 
the greatest radiosensitivity. 


INDIRECT RADIATION EFFECTS 


There are very intricate relationships between various 
organs of the body, and it is not surprising that remote 
effects of radiations have been observed. For example, 
if an organ receives a dose of radiation, depressive effects 
due to this injury manifest themselves elsewhere in the 
body. Starting with the work of Hevesy, it was shown 
that a small depression of cell division and of DNA 
synthesis occurs everywhere in the body if an organ, 
such as the spleen, is irradiated.?* To illustrate remote 
effects of radiation, one may mention work done at the 
Berkeley cyclotron in irradiating pituitary and various 
regions of brain tissue by high-energy deuterons.” 
Because of their small scatter, these particles are well 
suited to make small radiolesions in various parts of the 
body. It was found that a large localized dose to the rat 
pituitary (size 1X2 mm) will cause progressive develop- 
ment of the hypophysectomized state. The growth of 
the animal, calcification of its bones, and functions of 
adrenals, thyroid, and gonads are all affected; develop- 
ment of the remote effects has different dose-response 
curves for each kind of effect. 

An organ that is shielded from radiation confers a 
degree of protection against effects from whole-body 
radiation to the animal. Shielding of spleen and bone 
marrow is of greatest benefit, sometimes increasing 
radiation tolerance by about a factor of two. Jacobson 
and his associates*! have shown that transplantation of 
bone marrow, spleen, or embryo homogenate to mice 
which have received a dose of x-rays increases radiation 
tolerance. The greatest part of this effect is owing to 
transplantation of healthy, unirradiated bone-marrow 
cells, which prosper on the tissue bed of radiation-killed 
bone marrow. Apparently, one may allow death of all 
cells of the marrow by whole-body x-irradiation, and 
the animal may be resurrected by the administration 
of a sufficient quantity of serologically compatible 
healthy cells. This technique has given rise to valiant 
attempts now being made in various medical installa- 
tions to save terminal leukemia patients. Patients re- 
ceive a lethal dose of whole-body radiation, followed by 
a bone-marrow transplant obtained from healthy rela- 
tives. The radiation is supposed to kill all bone-marrow 
cells, including the leukemic cells. If the transplanted 
marrow saves the patient from the radiation effects, it 
is hoped that leukemia will not recur. Because of immunological
reasons, much research is needed before
bone-marrow transplants are completely safe for 
humans.” 


LONGEVITY 


In normal populations of humans, animals, inverte- 
brates, and even unicellular organisms, there appear to 
be identical laws predicting the rate of death of the 
population. About a hundred years ago, Gompertz 
worked out details of this function, and at the present 
time Jones et al. have applied it in detail to many 
problems of population survival. 

The relative rate of death of a population of N individuals
as a function of time t is

\[ -\frac{d \log N}{dt} \propto e^{at}, \]

where a is a characteristic constant of the population.
To illustrate this principle, a normal survival curve of
mosquitoes is reproduced in Fig. 7.** The relative rate 
of death is an exponentially increasing function. In adult 
humans, the death rate is described by a similar ex- 
ponentially increasing function, with the rate doubling 
itself about every eight years. According to the theory 
by Jones, a single dose of radiation will modify the rela- 
tive rate of death by shifting it upwards, parallel to 
itself. Continuous irradiation is supposed to increase 
the rate constant, a. Although many longevity estimations
were made on the basis of this theory, which can
be used for daily permissible-dose determinations, rigorous
experimental proof for radiation effects is not as yet
available. Several other theories have been proposed.
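
The exponential law above can be made concrete with a short numerical sketch. It assumes natural logarithms and introduces a prefactor `r0` (the relative death rate at t = 0), which is not given in the text and is chosen here only for illustration. The rate constant a is taken from the eight-year doubling time quoted for adult humans.

```python
import math

def rate_constant_from_doubling(doubling_time):
    """Rate constant a for which exp(a*t) doubles every `doubling_time`."""
    return math.log(2.0) / doubling_time

def relative_death_rate(t, a, r0):
    """Relative death rate -d(ln N)/dt = r0 * exp(a*t)."""
    return r0 * math.exp(a * t)

def surviving_fraction(t, a, r0):
    """N(t)/N(0), from integrating the relative death rate from 0 to t."""
    return math.exp(-(r0 / a) * (math.exp(a * t) - 1.0))

a = rate_constant_from_doubling(8.0)  # per year, from the eight-year doubling
r0 = 1.0e-3  # hypothetical initial rate per year (assumption, not from the text)

for years in (0, 8, 16, 40):
    print(years, relative_death_rate(years, a, r0), surviving_fraction(years, a, r0))
```

In this parameterization, the upward parallel shift that Jones attributes to a single dose would correspond to a larger `r0`, and the effect attributed to continuous irradiation to a larger `a`.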


CARCINOGENESIS


It is of great interest to know something of radiation 
carcinogenesis. Cancer cells represent a population 
which is abnormal in the sense that (a) it has become in- 
dependent of some of the growth-controlling factors 
characteristic of the tissue from which its cells originate; 
(b) it is frequently able to metastasize; and (c) it may 
have metabolic products which lead to death of the host. 
The rate of incidence of “spontaneous” cancer, chemi- 
cal-, virus-, or radiation-induced neoplasms, usually, at 
least in some definite time interval, follows a law similar 
to that propounded for the rate of death of populations: 
The rate of onset increases as an exponential function 
of time. As an illustration, the rate of onset of bone 
tumors resulting from radioactive Sr⁹⁰ incorporated in
bone is shown in Fig. 8; this work was done at the 
Argonne National Laboratory.** The rate constant of
tumor incidence increases with increased radiation dose
and is a more important index of carcinogenesis than is 
the over-all increase of tumors in a lifetime, which, in 
addition to the cancer-inducing factor, depends upon 
factors determining longevity. In practice, such an exponential
rate curve means a “time lag” of varying
length before the demonstrable onset of neoplastic 
growth. 
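
The connection between an exponentially rising rate of onset and an apparent time lag can be sketched numerically. The example assumes an onset rate of the form r0·exp(b·t); the values of `r0` and `b` are hypothetical and are not taken from Fig. 8. The lag is taken, arbitrarily, as the time at which the cumulative incidence reaches 1%.

```python
import math

def cumulative_incidence(t, r0, b):
    """Probability of tumor onset by time t for an onset rate r0 * exp(b*t)."""
    return 1.0 - math.exp(-(r0 / b) * (math.exp(b * t) - 1.0))

def lag_time(p, r0, b):
    """Time at which the cumulative incidence first reaches p (0 < p < 1)."""
    hazard_needed = -math.log(1.0 - p)
    return math.log(1.0 + b * hazard_needed / r0) / b

r0 = 1.0e-6  # hypothetical initial daily probability of onset
for b in (0.01, 0.02):  # hypothetical rate constants per day
    print(b, lag_time(0.01, r0, b))
```

With these illustrative numbers, doubling the rate constant shortens the lag considerably, in line with the statement that the rate constant is the more informative index.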

The detailed biochemical steps that lead to cancer 
are not known, but it is useful to draw an analogy be- 
tween diploid cells of animal tissue that have received a 
dose of radiation and diploid yeast cells, discussed 
earlier in this paper. An individual, sublethally irra- 
diated yeast cell frequently develops an abnormal colony 
distinguished by its slow rate of growth in the process 
of “recovery.” Some of the cells regain their full ability 
to proliferate by processes outlined in the section in 
which the recovery from radiation damage was discussed.
This process often takes many cell generations,
and in the course of it a number of changes can occur
that will modify the genetic constitution of the cell.
Changes resembling carcinogenesis in yeast cells also
have been discussed by Lacassagne et al., Maisin
et al., and Warburg.

FIG. 8. Daily probability of bone-tumor development in the
mouse following monthly injections of Sr⁹⁰ (after Brues).
Abscissas: time after first injection (days). Ordinates: daily
probability of bone-tumor development. Curves show effect of monthly
doses of Sr⁹⁰ of 1.0, 0.5, 0.2, 0.1, and 0.05 μC/g.

** Note added in proof. A. Brues and M. Finkel have published
recent interpretations of their experimental work in carcinogenesis
[Science 128, 637, 693 (1958)]. Examples of other views on the
subject are found in the work of E. B. Lewis [Science 125, 96
(1957)] and in references 36, 40, and 42.

Most human cells normally are under hormonal and 
possibly neural control of the homeostatic apparatus 
and, as a result, they usually are in a state of prolifera- 
tion which is below their maximum proliferative ca- 
pacity. As an illustration, consider the human breast. 
This organ develops under elaborate hormonal influence, 
responding to secondary sex hormones elaborated by 
the ovaries and adrenals and to lactogenic hormone. In 
addition, the cells of this gland are influenced by growth 
hormone and posterior pituitary hormones. The state 
of this tissue and its metabolic activities are conveyed 
in some way to the hypothalamus and the pituitary,
and as part of homeostatic control the gland may receive
increased hormonal stimulation when its performance
is below par.

There is increasing evidence available to show that 
hormonal balance is an important factor in radiation 
carcinogenesis. The intricate pattern of the hormonal 
factors has been studied for more than 80 years; here 
a few recent experiments only are noted, with mammary 
cancer as a restrictive example. Lacassagne has shown 
clearly that mammary cancer in mice may be induced 
by estrogen in females, as well as in castrated males. 
More recently, it appears that the joint application of 
estrogen and lactogenic hormone accelerates the onset 
of cancer. Bond et al. found that a sublethal dose of
whole-body irradiation in a strain of rats induced mam- 
mary cancer. However, the frequency of mammary 
cancer was decreased greatly if the ovaries of the ani- 
mals were removed, thus eliminating most of the endog- 
enous, secondary sex hormones. It is known that, 
following subtotal I¹³¹ thyroidectomy in mice, thyroid
tumors as well as pituitary tumors result. Furth et al.
found that some rats recovering from whole-body radia- 
tion developed pituitary tumors, apparently in response 
to hormonal demands from endocrine target organs. 
When such tumors are transplanted into other rats of 
the same inbred strain, hormonal stimulation from the 
graft induces adrenal, thyroid, and mammary tumors
in the new hosts. Apparently, whenever some tissue in 
the hypothalamic-pituitary-endocrine organ system de- 
velops sublethal tissue damage involving chromosome 
derangements, so that the tissue is unable to perform 
its functions satisfactorily, there is strong hormonal 
and/or nervous stimulation of these tissues causing in- 
creased rate of proliferation. Under such conditions, 
genetic rearrangements may lead to neoplastic tissues. 
An interesting illustration is furnished by localized 
deuteron irradiation of the rat pituitary at Berkeley.
Here 945 rep to the rat pituitary led within one year
to the development in every irradiated animal of pituitary
tumors, with 3.5 tumors per gland. Higher doses
led to lessened incidences of tumors, and when most
pituitary cells were killed, none of the animals developed
pituitary tumors.

FIG. 9. Radiation carcinogenesis.

There are many other instances of hormonal influence 
over carcinogenesis. Perhaps the most complete investi- 
gations of this kind were made by Kaplan and his 
group.

It would appear that a better understanding of 
homeostatic control mechanisms may lead to an under- 
standing of the etiology of cancer and possibly to some 
methods of prevention or delay of its onset. In the 
meanwhile, it has been apparent that cancer cells that 
do not become completely undifferentiated in the car- 
cinogenic process still retain some measure of depend- 
ence on hormonal control. Work is in progress in many 
countries to hypophysectomize patients with advanced 
and progressing metastatic carcinoma, in the hope that 
the cancer cells will stop growing and the tumors will 
regress. 

The proton- and α-particle beams of the Berkeley
184-in. cyclotron have proved to be promising tools for
research on hypothalamic functions and for therapeutic 
investigations involving radiation hypophysectomy. 
The techniques and initial results have been described. 
While hypophysectomy is not the final answer to cancer 
therapy, there are some patients who appear to have 
benefited from the pituitary irradiation. Detailed bio- 
chemical study of induced tumor regressions may bring 
additional clues with respect to the nature of the hor- 
mones which control normal or abnormal proliferation. 

In conclusion, radiobiology has been useful as an aid
to understanding not only the complex physiological
changes following irradiation exposure, but also the
mechanism of carcinogenesis. Basic research on molecules
and cells has helped in the understanding of some
phenomena of cell division and of the nature of some 
genetic transformations. In Fig. 9, the present status of
understanding of radiation carcinogenesis is shown in 
simplified fashion. It is apparent from the figure that 
there is a qualitative similarity between postirradiation 
events in diploid yeast cells and in somatic tissue cells. 
Many of the diploid yeast survivors develop small- 
colony mutants, some of them cytochrome deficient, 
that proliferate slowly and revert to rapid cell division 
many generations later. This process may be analogous 
to the slow development of benign precancerous lesions 
and the subsequent somatic mutation to cancerous cells. 
The time rate of occurrence of both sets of phenomena 
is quite similar. It is clear, then, that more work is 
needed to ascertain the genetic and biochemical features 
of delayed recovery in irradiated unicellular organisms 
and changes in the aerobic-anaerobic metabolism which, 
according to Warburg, are known to occur in carcinogenesis.
Perhaps such work may some day furnish direct
clues to the detailed etiology and inhibition of cancer.


THE most important concepts of the structure of
the nucleus are based on information drawn from 
three rather separate approaches which are now con- 
verging. These approaches had their origins in the nine- 
teenth century. The first of these—the optical approach 
—had its origin with Robert Brown’s first recognition 
of the nucleus by the use of the light microscope in 1833. 
This distinguished botanist was also the discoverer of 
what is now called Brownian movement. The amplification
of his original observation, as carried out with
various optical tools, forms an exceedingly important 
part of present knowledge of the nucleus. 

The second stream of knowledge is that of genetics. 
It can be regarded as beginning with Mendel’s observa- 
tion that inherited characteristics are transmitted from 
generation to generation in ratios of simple whole 
numbers. During the early part of the present century, 
derivatives of this discovery, correlated with optical 
observations on the nucleus, led to the field of cyto- 
genetics. Subsequently, the information derived from 
genetic sources has played an essential part in concepts 
of nuclear structure. Indeed, the genetic data provide 
the only evidence regarding certain important struc- 
tural features of the nucleus. 

One can trace the third stream, the chemical one, to 
the work of Miescher, who published an important 
compendium in 1897 on the chemical characteristics of 
extracts derived from pus. Lamentably, this raw ma- 
terial was abundantly available in those days, before 
aseptic surgery had become popular. From this con- 
venient product of human suffering, Miescher extracted 
substances which he named nucleic acids. 

One can ask if the nucleus is an essential part of the 
cell. One can recall the red cell of the mammal, which 
is active for about 120 days without a nucleus. But how 
do other cells fare if the nucleus is removed? This has 
been investigated in a relatively small number of forms. 
It is possible to remove the nucleus from an amoeba by 
surgical means, as Mazia, for example, has done. It is 
possible also to amputate and study non-nucleated 
fragments of certain large algae, such as Acetabularia. 
One can summarize much by saying that a non-nucle- 
ated cell or cell fragment can survive for a certain 
period. An amoeba can eat, it can move about after 
enucleation, but in due time its synthetic capacities 
appear to degenerate, and, like the red cell, it perishes 
without reproducing. The red cell is perhaps the most 
successful of cells in functioning for a long period without
a nucleus.

It is possible to isolate nuclei and to separate them
from cytoplasmic components. If such isolated nuclei are 
studied biochemically, their metabolic behavior can be 
followed. The most elegant work along these lines has 
been carried out by Mirsky and Allfrey, based on earlier 
work by Dounce. The nuclei contain protein, DNA, 
RNA, and small amounts of lipid. Allfrey and Mirsky 
have found that isolated nuclei are capable of carrying 
out amino-acid incorporation into proteins and that 
certain other enzymatic activities are associated with 
the nucleus.

If one examines the unstained nucleus with a light 
microscope, either in the living cell or after fixation with 
ordinary reagents, one sees regions of varying density. 
This inhomogeneity of structure becomes even more 
striking following staining, since the nuclear components 
vary in their affinity for dyes. The components which 
bind the dyes have been called chromatin. As a rule, 
the term “chromatin” is used primarily to designate 
strongly-staining materials in the nucleus of the cell. It 
is a general term without any specific chemical signifi- 
cance. The term “nuclear chromatin” is applied to two 
distinct structures within nuclei. One comprises the 
chromosomes themselves; the second consists of nucleoli 
which are accumulations containing considerable quan- 
tities of RNA. Many of the staining reactions of chromo- 
somes and of nucleoli are very similar or identical, but 
there are a few methods for distinguishing them from 
each other. These depend on chemical differences be- 
tween DNA on the one hand, which dominates many 
of the properties of chromosomes, and RNA on the 
other, which has an important role in the nucleoli. 

The appearance of nuclear chromatin can change 
greatly under different functional states of the cell. As 
the cell approaches a mitotic division, the chromatin 
material rearranges itself. At that time, one easily can 
see well-defined bodies, the chromosomes, so called be- 
cause they can be stained so as to assume very vivid 
colors. It is well established that the chromosomes are 
present in the intermitotic nucleus, where they usually 
occur in a rather tenuous form which makes it difficult 
to recognize them morphologically. Yet chromosomal 
activity can be detected by a number of different means. 
The genetic method is perhaps the most powerful. 

In certain specialized cases, the chromosomes may be 
present in a form quite different from the dispersed state 
which they assume in the normal intermitotic cell. For 
example, in the salivary gland cells of the fruit fly, 
Drosophila, the intermitotic chromosomes occur as discrete
bodies known as “giant” chromosomes, which
may be over 200 μ long. Under the light microscope
they show a longitudinal fibrillar structure and a distinct
series of aperiodic crossbands. These crossbands
are unique in the sense that they exist in distinct and 
recognizable patterns in each chromosome. These patterns
are important for correlating the genetic influence
of the chromosomes with their structure.

Consider now the question: How can one exploit the
properties of the various chemical components of the 
nucleus in such a way as to gain information about their 
disposition, structure, interactions, and relationships? 
Both acid and basic dyes are bound by many of the 
components of the nucleus. From this, it is inferred that 
nucleoproteins behave as a mixture of anionic and 
cationic ion-exchange resins. These properties can be 
attributed to their negatively charged phosphate groups 
and positively charged protein groups. One can exploit 
these characteristics by setting up a suitable competitive 
system which introduces colored cationic molecules to 
bind to the nucleic-acid phosphate groups, replacing 
positively charged protein groups. This procedure tags 
phosphate groups of the nucleic acids with colored 
tracers or indicators. The capacity of anionic polymer 
groups, such as nucleic-acid phosphate groups, to bind 
basic dyes is spoken of as the property of “basophilia.”
Thus, nucleochromatin is often spoken of as being 
strongly basophilic. In this way, the nucleic acid can 
be tagged rather nonspecifically. Analogously, one can 
use anionic dyes such as azosulfonic acids to bind to 
polymers containing basic groups in tissue components. 
Such polymers are found in the nucleus in the form of 
the basic proteins which are associated with the nucleic 
acids to form nucleoproteins. 

Most of these dye-binding methods do not distinguish 
between RNA and DNA. But, from the analytical data 
on isolated nuclei, there is reason to believe that both 
are present in the nucleus. If one turns to another 
property of nucleic acids—namely, the absorption spec- 
trum in the ultraviolet of the purine and pyrimidine 
residues—it can be shown that the same structures 
which take up basic dyes also absorb strongly in the 
ultraviolet. This is consistent with the view that the 
nucleoprotein complex is responsible for the staining
reaction and for the absorption in the ultraviolet. But
the latter method does not permit a distinction between
DNA and RNA components.

There is, however, a reaction which does distinguish
between these two types of nucleic acids. It has proved
exceedingly useful in characterizing the nucleic-acid
chemistry of cells. This reaction is known as the
Feulgen reaction. It depends upon the presence in
nucleic acids of linkages which are differentially susceptible
to hydrolysis. Using appropriate conditions,
one can hydrolyze certain sugar-base linkages
in DNA but not in RNA. The hydrolyzed sugars of
DNA are converted to reducing groups resembling
aldehydes. These can be labeled by coupling them to a
suitable colored reagent which reacts with aldehydes.
Each single aldehyde group formed by hydrolysis is 
then tagged with a colored chromophore. This permits 
the DNA to be distinguished sharply from RNA. This 
useful reaction can be used quantitatively. 

There is another important approach to the problem 
of distinguishing the localization in cells of RNA and 
DNA. This involves the use of specific enzymes. The 
enzyme, ribonuclease, will depolymerize RNA but not 
DNA. If RNA is removed from the nucleus by action 
of this enzyme, a loss of dye-binding capacity may 
appear in certain structures. The material originally 
present which was removed by the ribonuclease is thus 
shown to be RNA. It is also possible to remove DNA 
by deoxyribonuclease, and thus to ascertain which 
dye-binding component contains DNA. Still another 
method, often less reliable and less rigorously specific 
than the others, can be used to stain DNA and RNA in
contrasting colors in the same cell. These three methods
(the Feulgen reaction, the use of specific enzymes, and
the use of differential staining) permit one to distinguish
cytochemically between DNA and RNA and provide
information about the localization of these substances
within the cell.

It turns out that DNA is found characteristically in 
the chromosomes and frequently as a jacket surrounding 
the nucleolar material. RNA is abundant in the central 
portions of the nucleoli and small amounts can be found 
within the chromosomes or scattered about in the 
nucleus. 

If a large number of nuclei from a given species are 
examined by the Feulgen or some other suitable method 
and the amount of DNA in the individual nuclei is 
measured by absorption microspectrophotometry, as 
carried out by Alfert and by Swift, a small number of 
cells (for example, sperm cells) display a certain unit 
quantity of DNA. However, the majority of cells con- 
tains approximately twice this unit amount, while a 
third group of cells yields values clustering around four 
times the unit amount. In certain organisms—in the 
human liver, for example—one may find nuclei with up 
to eight times the unit value. From this, it follows that
the amount of DNA in individual nuclei tends to occur 
in integral multiples of some unit quantity. 
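
As a minimal sketch of how such measurements sort into classes, the following assigns each nuclear DNA value to the nearest member of the doubling series 1, 2, 4, 8 times the unit amount. The sample values are hypothetical, the unit amount is taken as 1.0 in arbitrary units, and restricting the classes to powers of two is an assumption based on the multiples named in this paragraph.

```python
import math

def ploidy_class(dna_amount, unit_amount):
    """Nearest power-of-two multiple of the unit amount, judged on a log scale."""
    ratio = dna_amount / unit_amount
    if ratio <= 0:
        raise ValueError("DNA amount and unit amount must be positive")
    exponent = max(0, round(math.log2(ratio)))
    return 2 ** exponent

def tally(measurements, unit_amount):
    """Count how many nuclei fall into each class."""
    counts = {}
    for value in measurements:
        cls = ploidy_class(value, unit_amount)
        counts[cls] = counts.get(cls, 0) + 1
    return counts

# Hypothetical microspectrophotometric readings, in arbitrary units.
sample = [1.05, 1.9, 2.1, 2.0, 3.8, 4.2, 7.9]
print(tally(sample, 1.0))
```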

The chromosomes in dividing cells can often be 
counted with precision. In most higher animals, sperma- 
tozoa, the spermatids, and fully mature ova are found 
to contain a certain number of chromosomes. This 
number is called the “haploid number,” and the chro- 
mosomes making up this number comprise a single set 
of chromosomes. These cells contain a single unit quan- 
tity of DNA, as mentioned earlier. The great majority 
of cells in most multicellular organisms contains twice 
the number of chromosomes characterizing a single set, 
and contains twice the unit quantity of DNA. Such cells 
are said to contain the diploid number of chromosomes. 
In diploid cells, most of the chromosomes occur in 
homologous pairs. Each member of a pair resembles its 
mate closely, but morphological and genetic differences
between the members of a homologous pair are often
found. One may also find cells with four, six, eight, or 
some other small integral multiple of the haploid number 
of chromosomes, and with the same multiple of the unit 
amount of DNA found in haploid cells. Such cells are 
called tetraploid, hexaploid, or octaploid, respectively. 
In mammalian liver, one can find many tetraploid and 
some octaploid cells in a population which is predomi- 
nantly diploid. 

Each set of chromosomes contains a single set of 
genes. Thus, in diploid cells, many genes are represented 
twice, once in each member of a homologous pair. Such 
cells are said to be homozygous for the characteristics 
represented by genes which occur in identical form 
twice in each cell. But some gene loci are not identical 
in each member of a homologous pair of chromosomes. 
Such genes are represented only once in each cell, which 
is then said to be heterozygous with respect to the 
characteristic represented in those dissimilar loci. These 
data have contributed to the concept that there is a 
definite amount of DNA characteristic of each chromo- 
some set, each chromosome, and each gene locus. 

Some organisms live very well through most of their 
life cycle with haploid numbers of chromosomes in each 
nucleus. The fungus, Neurospora, for example, which 
has figured extensively in genetic studies, has a single 
set of chromosomes in each nucleus throughout long 
phases of the life cycle. The diploid phase may be very 
brief. In most metazoa, however, the diploid phase 
predominates. 

One occasionally finds numbers of chromosomes that 
are not even multiples of the haploid number. Such cells 
are called “aneuploid” and contain amounts of DNA 
which are not integral multiples of the amount of DNA 
in haploid cells. Relevant examples are provided by 
certain cancers and leukemias. As a rule, the presence 
of aneuploidy and of morphologically abnormal chromo- 
somes is interpreted as evidence for some genetic 
abnormality. 
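
The terminology of the preceding paragraphs can be summarized in a short sketch that names a cell by its chromosome count. The function name and the fallback label for unnamed multiples are illustrative choices. The haploid number of 23 used in the example is that of man.

```python
def classify_chromosome_count(count, haploid_number):
    """Name the ploidy of a cell from its chromosome count."""
    if count <= 0 or haploid_number <= 0:
        raise ValueError("counts must be positive")
    multiple, remainder = divmod(count, haploid_number)
    if remainder != 0:
        # Not a whole multiple of the haploid number.
        return "aneuploid"
    names = {1: "haploid", 2: "diploid", 4: "tetraploid",
             6: "hexaploid", 8: "octaploid"}
    return names.get(multiple, f"{multiple}-ploid")

for n in (23, 46, 47, 92):
    print(n, classify_chromosome_count(n, 23))
```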

Levinthal (p. 227) remarks that information from 
genetics has led to the concept that the genetic carriers 
are arranged in linear sequence along some structure. 
The identification of the chromosome as this structure 
marked a major advance in the history of cytogenetics. 
Correlations between genetic and morphological data 
were most successfully worked out initially in Dro- 
sophila, where the chromosomes of the salivary glands 
are large and where genetic studies are readily carried 
out because of the short life cycle of the fly. Similar 
studies correlating genetic behavior with morphological 
abnormalities of chromosomes have now been carried 
out in a number of different forms. However, much of 
the corresponding work in virus and bacterial genetics 
is carried out conceptually, without direct visualization 
of chromosomes. The genetic data are used to construct 
a linear sequence, but no attempt is made ordinarily to 
correlate this sequence with morphological features of
chromosomes—one important reason being that chro- 
mosomes have not been incontestably visualized in 
bacteria or in viruses. In these cases, the structural con- 
cept of linear sequence of units depends entirely upon 
genetic evidence. 

At present, most of the important concepts of nuclear 
structure have been derived from genetic studies, from 
chemical analyses, and from examination with the light 
microscope. When combined ingeniously, these ap- 
proaches have proved to be very powerful. 

The electron microscope has yielded some additional 
information. It is evident that the intermitotic nucleus 
is surrounded by an envelope which consists of two unit 
membranes, an inner one and an outer one. These two 
membranes are joined to each other at certain sites. 
The lines of junctions surround and define pores several 
hundred Angstroms in diameter, perforating the nuclear 
membrane. Through these pores, the nuclear and cyto- 
plasmic matrices communicate directly. The outer mem- 
brane of the nuclear envelope may also be continuous 
with membranes in the cytoplasm. The interior of the 
nucleus, however, is free from membranes, and is thus 
in sharp contrast to the cytoplasm, where membrane 
structures are frequent and often densely packed. 

Electron micrographs usually show a rather irregular 
accumulation of granular material within the inter- 
mitotic nuclei. The granules presumably represent DNA 
and RNA, combined with protein. Very fine intra- 
nuclear helical threads have been described by some 
authors, but these are not seen with clarity and their 
significance is difficult to assess. No one has yet recog- 
nized structures in the nucleus corresponding to the 
nucleic-acid helices as studied by Hall and Doty, with 
the electron microscope, and pictured elsewhere in this 
volume (p. 107). One can find nucleoli in electron micro- 
graphs. They appear as rather irregular accumulations 
of granules more densely crowded together than else- 
where in the nucleus. 

Mitotic cells show no nuclear envelope. The chromo- 
somes are recognizable as dense accumulations of gran- 
ular material without any membranous investment. 

Thus, it appears that direct electron microscopy of 
the nucleus of ordinary cells has provided little informa- 
tion which is satisfying in the light of the physiological 
importance of the nucleus and its contents. 

In certain specialized types of cells, organized struc- 
tures have been detected with the electron microscope 
which are not characteristic of cells in general. Thus, 
in some sperm cells, dense striations running the length 
of the nucleus have been seen. Moses has detected
more-delicate organized structures in certain spermato- 
cytes. These appear as long thread-like elements with 
delicate side chains extending laterally. Efforts have 
been made to associate the filaments with some phase
of chromosome structure. But, at the present time, it [...]
consistent with those of a nucleic-acid helical chain.

[...] nucleus contains particles which closely resemble the
[...] particles of the cytoplasm, it reveals relatively
little concerning the mechanism of transfer of nucleic
acid from nucleus to cytoplasm. Such a transfer would
provide a means whereby the nucleus could transmit
information to the cytoplasm. Yet one can see images
which suggest transition stages in the course of outward
[...] which very closely resemble the nucleoli. Similar
appearances observed with the light microscope have led
to the view that in some cells whole nucleoli may be
discharged from the nucleus carrying large packets of 
RNA to the cytoplasm. On the other hand, it may be 
that much of the RNA escapes into the cytoplasm 
through the pores in the nuclear envelope. Another 
mechanism whereby RNA could pass from nucleus to 
cytoplasm involves a binding of the newly formed RNA
to the inner nuclear membrane. The latter then would
flow into the cytoplasm, carrying the bound particles
with it. Such a mechanism, however, is not well 
documented. 

We are forced, then, to conclude that the task of 
correlating the fine structure of the nucleus with its 
function and with the chemical and genetic evidence 
at our disposal is largely before us. 


Fine Structure of Cytoplasm: The Organization
of Membranous Layers 


FRITIOF S. SJÖSTRAND
Karolinska Institutet, Stockholm 60, Sweden 


WHEN climbing the ladder of levels of organization
of living matter, one finds nowadays that at
least indications of steps are present along the whole 
ladder. Some steps may be weak and might break when 
exposed to too critical attention. Light microscopy has 
furnished information regarding a great number of sub- 
cellular structural elements, the so-called cell organelles, 
as well as of cytoplasmic differentiations of a more 
diffuse character. The most impressive of the cell organelles
which are localized in the cytoplasm are the mitochondria
and the Golgi apparatus. Basophilic regions
of the cytoplasm represent more-diffusely distributed 
parts which long ago were called the ergastoplasm by 
Garnier. The reason for this term was the idea that 
these regions were especially active parts of the cyto- 
plasm. In the exocrine pancreas cell, a filamentous or 
lamellar structure was observed within the basophilic 
regions of the cytoplasm, “die Basalfilamenten” of
Heidenhain, or “die Basallamellen” of Zimmermann. 
These various components of the cytoplasm were con- 
sidered immersed in the ground substance of the cyto- 
plasm, a component which was assumed to lack any 
higher degree of specialized organization. 

The submicroscopic organization of the regularly 
occurring cell organelles was completely unknown, how- 
ever. A few cell components, such as the myelin sheath 
of peripheral nerve fibers and the outer segments of the 
retinal receptors of the vertebrate eye, were assumed 
to consist of alternating layers of lipid and protein mole- 
cules. This assumption was based upon polarization- 
optical data of these strongly birefringent components. 

Electron microscopy has made possible the fairly 
close analysis of the submicroscopical organization of 
these various components. In some cases, the combi- 
nation of electron-microscope data with polarization- 
optical and x-ray diffraction data has made it possible to 
propose models containing some features of the molecular 
architecture of these components. In this latter respect, 
the results can be considered rather crude. 

On the other hand, electron microscopy has not re- 
vealed any definite new structural component of the 
cytoplasm. It has helped in bridging the gap between 
the molecular and the organizational levels of the cell 
organelles by making supramolecular elements available 
for direct observation. 

The most striking feature of the ultrastructural 
organization of the cytoplasm is the frequent occurrence 
of membranous components in various cell organelles, 
as well as in the ground substance of the cytoplasm. 
Membranes of various dimensions and different organization
appear to represent a common and basic principle
of organization in the cytoplasm. 

Some examples of such membranous elements are 
presented here, together with some arguments which 
justify an interpretation of some of the structural 
patterns as reflecting a certain molecular structure of 
the membranes. Also, the probability that the observed 
structural patterns represent preformed patterns exist- 
ing in the intact living cell is discussed. Finally, some 
ideas are presented regarding the functional significance 
of the various membranous elements. 

When evaluating the structural patterns observed by 
means of the electron microscope, it is wise to account 
for a certain degree of deformation introduced by 
preparatory techniques. When comparing two patterns, 
sometimes one pattern easily might be imagined as 
derived from the other as a result of disorganization or 
deformation. In such a case, the second pattern is 
selected as probably more directly related to the in vivo
pattern. When a certain variation in the patterns is 
observed, the most frequently occurring pattern is 
selected as the most representative. 

The chapter on the biochemistry of mitochondria by 
Lehninger (p. 136) makes it quite natural to start this 
survey with some electron micrographs of mitochondria. 
For this purpose, mitochondria may be chosen from 
almost any type of cell. Let us choose those of the 
retinal receptors of the eye. The first figure shows a 
schematic drawing of the rod type of receptor in the 
guinea pig retina. These receptors are segmented cells 
with each segment structurally organized in a charac- 
teristic and different way (Fig. 1). The outer segments 
which are most remote from the pupil of the eye contain 
the photochemically active molecules, such as rhodopsin 
in the rod cells. Here, the primary reactions in convert- 
ing electromagnetic energy into chemical energy take 
place. The inner segment contains all of the mitochon- 
dria of the receptor and can be looked upon, therefore, 
as the energy-generating center of the cell. 

The rod and cone fibers connecting the inner segment 
with the synaptic bodies are organized like unmye-
linated nerve fibers. The vitreous end of the receptor
cell forms the synaptic body where the synaptic con- 
tacts between receptors and nerve cells are located. 

Figures 2 and 3 picture the inner segment, IS, of a
cone cell in the perch retina. Most of the inner segment 
consists of a dense aggregation of closely packed mito- 
chondria which together form the so-called ellipsoid. 

Fig. 1. Schematic drawing of retinal rods in the guinea pig eye.
[...] segment; r.f., rod fiber; r.syn., rod synaptic body;
mit., mitochondria; r.n., rod nucleus [from F. S. Sjöstrand,
Intern. Rev. Cytol. 5, 455 (1956)].

All of these membranes appear to be triple layered.
It is striking to note the rather constant spacing of the
layers of these mitochondrial membranes.

In 1952, when we first presented our structural model 
of mitochondria, Palade proposed a different model
consisting of a single-layered surface membrane which
showed local infoldings which he called “cristae mito-
chondriales.” These cristae left a central space in the
mitochondrion free. After confirming our observations
on the triple-layered character of the surface membrane,
Palade changed his model by enveloping his original
model in a peripheral opaque layer. He assumes that
this layer represents the surface membrane of the mito- 
chondrion and that an outer space or chamber extends 
between the two opaque layers located at the surface 
of the mitochondrion. This space is continuous with 
spaces which extend in the cristae. An inner mitochon- 
drion chamber forms a continuous central space. This 
interpretation differs from our own. 

Our interpretation of the mitochondrial pattern is 
based upon certain observations that we had made on 
some typical lipoprotein systems—namely, the outer 
segments of retinal receptors and the myelin sheath
of peripheral nerves. In both cases, polarization-
optical data, and, in the myelin sheath, x-ray diffraction
data had permitted the proposal of models showing 
alternating layers of lipid and protein molecules. X-ray 
diffraction data revealed some dimensions for the thick- 
ness of the layers in the myelin sheath and of the double 
layer of dried, mixed nerve lipids. For the latter, a 
value of 67.4 A was obtained.

A discussion of the observations made on the outer 
segments of retinal receptors gives a necessary back- 
ground for our interpretation of the mitochondrial 
pattern. In electron micrographs of sections (Figs. 4-6) 
oriented parallel to the long axis of the outer segments 
of rods and cones, alternating light layers and pairs of 
opaque layers are seen, the two opaque layers of a pair 
being fused along their rims. They thus bound a light 
interspace which, in guinea pig rods, has a thickness of
70 to 80 A. These pairs of layers form triple-layered 
disks, the total thickness of which varies from 100 to 
250 A with the species and with the type of receptor, 
but remains constant for any one type of receptor. 
These disks are the unit structure of the outer segments. 

The outer segments are 40% lipid in content. More
than 80% of these lipids are phospholipids. The polarization-
optical data have revealed that presumably the lipid
molecules are oriented parallel to the long axis of the
outer segment, and that the protein molecules form
transversely oriented layers.

When fragmenting isolated outer segments, it is 
possible to obtain round disks of varying thicknesses 
and also fragments of such disks. The thicknesses of
the disks isolated from the guinea pig retina are multi-
ples of 140 A, that is, multiples of the thickness of the
triple-layered unit disk which can be observed in sec-
tions and which measures 140 A in thickness in the

Fig. 2. Survey picture of the bound-
ary between the outer, OS, and inner,
IS, segments of a cone cell in the perch
retina. Most of the inner segment con-
sists of a dense aggregation of mito-
chondria [from F. S. Sjöstrand, Er-
gebnisse der Biologie (Springer-Verlag,
Berlin, 1958), Vol. XXI, p. 128].
X44 000.


guinea pig retina. The thickness of the thinnest mem- 
brane fragments is about 30 A. These fragments are 
osmiophilic. The thickness of the osmiophilic layers of 
the unit disks as observed in sections through guinea 
pig retinas is 30 to 40 A. There seems to be no doubt
that these components in fragmented and in sectioned 
material are identical. The 140-A thick disks which 
were obtained by fragmentation could be demonstrated 
to consist clearly of two thin membranes, and under 
favorable conditions they could be split completely into 
two thinner components with a thickness of 70 A at
the edge. 
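
The dimensions quoted so far can be checked against one another with a little arithmetic. The following is a minimal sketch in Python, not part of the original lecture: it adds two osmiophilic layers (30 to 40 A each) to one light interspace (70 to 80 A) and tests whether a fragment thickness is close to a whole multiple of the 140-A unit disk. The function names and the tolerance value are hypothetical and chosen only for illustration.

```python
# Layer dimensions quoted in the text for guinea pig rod outer segments (angstroms).
OSMIOPHILIC_LAYER_A = (30.0, 40.0)   # each opaque layer, minimum and maximum
LIGHT_INTERSPACE_A = (70.0, 80.0)    # less opaque space between the two layers
UNIT_DISK_A = 140.0                  # triple-layered unit disk seen in sections


def disk_thickness_range():
    """Thickness range of one disk: two opaque layers plus one light interspace."""
    low = 2 * OSMIOPHILIC_LAYER_A[0] + LIGHT_INTERSPACE_A[0]
    high = 2 * OSMIOPHILIC_LAYER_A[1] + LIGHT_INTERSPACE_A[1]
    return low, high


def unit_disks_in_fragment(thickness_a, tolerance_a=20.0):
    """Return the whole number of unit disks matching a fragment thickness.

    tolerance_a is a hypothetical measurement tolerance, not a value from the text.
    Returns None when the thickness is not close to a multiple of 140 A.
    """
    n = round(thickness_a / UNIT_DISK_A)
    if n < 1 or abs(thickness_a - n * UNIT_DISK_A) > tolerance_a:
        return None
    return n


if __name__ == "__main__":
    print(disk_thickness_range())
    for t in (140.0, 280.0, 430.0, 210.0):  # illustrative thicknesses only
        print(t, unit_disks_in_fragment(t))
```

The summed range brackets the 140-A figure measured in sections, which is the consistency the argument relies on.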

The fact that the thicknesses of the various disks
obtained by fragmentation represent multiples of 140 A
makes it justifiable to conclude that, when the outer
segments are fragmented in piles of unit disks of differ-
ent thicknesses and when these piles are dried on a
supporting film, the interspaces between the triple-
layered disks contribute insignificantly to the thick-
nesses of the piles. This means that these interspaces

Fig. 3. Higher magnification of a region at the border between
the inner and outer segments of a cone cell in the perch retina.
Several mitochondria are observed densely aggregated. The triple-
layered [...] similarities to the triple-layered disks of the outer segment [from
F. S. Sjöstrand, Ergebnisse der Biologie (Springer-Verlag, Berlin,
1958), Vol. XXI, p. 128]. X65 000.

are filled mainly with an aqueous, ionic medium con-
taining little or no lipids. The lipid molecules, therefore,
are located presumably in the triple-layered disks. Their
localization (Fig. 7) in the less-opaque space bounded
by the 30- to 40-A thick osmiophilic layers was assumed
because the dimension of this space, 70 to 80 A, could
well accommodate a double layer of lipid molecules.
Furthermore, the osmiophilic layers had shown a rather
remarkable tensile strength in the fragmentation experi-
ments which appeared to point to a protein nature. The
total volumes of the opaque layers and of the spaces
surrounded by these layers are estimated to be about
[...]. This observation is in accordance with the deter-
mined figure of 40% for the concentration of lipids in
dry weight.

The similarity of the mitochondrial pattern and that
of the outer segments of retinal receptors is rather
striking. Values as high as 40% were reported for the
lipid content of mitochondria, and they showed positive
birefringence after freeze-drying, which could be as-
sumed to be due to oriented lipid molecules. The triple-
layered membranes of the mitochondria were interpreted
as [...] layers sandwiching a double
[...] (Fig. 8). This interpretation is
based upon a series of evaluations of indirect evidence
and upon a generalization which well might be erroneous.

The myelin sheaths of peripheral nerves also show a 
characteristic layered structure. Furthermore, periph-
eral nerves can be fragmented into thin membranous frag-
ments. In this instance, polarization-optical data,
as interpreted by Schmidt, revealed that the lipid
molecules were oriented radially and that protein mole-
cules formed concentric layers around the axis cylinder.
X-ray diffraction data by Bear, Palmer, and Schmitt
were interpreted as showing a repeat period of 171 A 
in amphibian nerves, and 185 A in mammalian nerves. 
The repeat period was interpreted as containing two 
bimolecular leaflets of lipids, each about 67 A thick, 
and a layer of proteins about 25 A thick. A concentric- 
ally layered structure could be observed in the electron 
micrographs, obtained in 1952, of osmium-fixed, 
sectioned peripheral nerves. Opaque layers about 25 A 
thick are arranged concentrically in the myelin sheath 
with a definite periodicity. In each period, a fainter 
opaque line is seen to halve the main period. The mean 
periodicity was estimated at 120 A in the electron
micrographs of 20 nerve fibers. 

Although the x-ray diffraction and the electron- 
microscope data did not coincide as to the length of 
the period, the over-all scheme predicted from the 
former data appears to be pretty well confirmed by the 
latter. From the dimensions, it was concluded that the 
opaque layers correspond to the protein layers, and the 
less opaque layers to bimolecular leaflets of lipids. The 
recent extensive work by Fernández-Morán and Finean
on this problem seems to support this interpretation. 
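
The size of the disagreement between the two methods can be put in numbers. The sketch below is an illustration added in editing, using only the periods quoted above (171 A and 185 A from x-ray diffraction, 120 A from electron micrographs); the function names are hypothetical, and no cause for the difference is implied by the calculation.

```python
# Periods quoted in the text, in angstroms.
XRAY_PERIOD_A = {"amphibian": 171.0, "mammalian": 185.0}
EM_PERIOD_A = 120.0  # mean of 20 nerve fibers, osmium-fixed sections


def em_to_xray_ratio(nerve_type):
    """Ratio of the electron-microscope period to the x-ray period."""
    return EM_PERIOD_A / XRAY_PERIOD_A[nerve_type]


def half_period():
    """Spacing implied by the fainter line that halves the main period."""
    return EM_PERIOD_A / 2.0


if __name__ == "__main__":
    for nerve_type in XRAY_PERIOD_A:
        print(nerve_type, round(em_to_xray_ratio(nerve_type), 2))
    print(half_period())
```

On these figures the period seen in sectioned material is roughly 65 to 70% of the x-ray period, while the layered scheme itself is the same in both.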

A membranous component which is frequently pres- 
ent in the basophilic regions of the cytoplasm is dis- 
cussed next. These membranes are particularly abun- 
dant in the excretory pancreas cells, the salivary gland 
cells, the plasma cells, and other types of cells in which 
a somewhat intense protein synthesis takes place. 

In the exocrine pancreas cells (Fig. 9), most of the 
ground substance of the cytoplasm is filled with this 
type of membrane. In osmium-fixed material, the
membrane is easily identified by the numerous opaque 
particles that are attached to one surface. These parti- 
cles have an average diameter of 150 A. Their form 
varies; the particles are more or less irregularly, angu- 
larly shaped. The membrane to which these particles 
are attached consists of an osmiophilic component which 
is about 40 A thick. These membranes are arranged in 
pairs and a pair bounds a narrow space or, as in the 
thyroid epithelium, rather large, irregularly shaped 
spaces. They thus appear to divide the cytoplasm into 
two different parts or compartments.

Opaque particles, identical in appearance to those
attached to the membranes, frequently occur free in
the cytoplasm. Palade and Siekevitz clearly demon-
strated that the opaque particles are responsible for the
RNA content of the microsome fraction from pancreas 
tissue. The terms microsomes, ribosomes, or microsomal 

Fig. 4. Longitudinal section through the outer segment of a retinal rod cell from the perch eye showing the
uniformly layered structure of the outer segment. X59 000.

particles now seem to be used to classify this component
which represents only a minor part of the classical
microsome fraction.

At this point, it seems justifiable to make some
comments on what is known about the fixation and
embedding artifacts that are introduced when fixing

Fig. 5. Longitudinal section through a retinal-rod cell of the
perch eye showing the triple-layered disks, which represent the 
elementary component of the outer segment [from F. S. Sjöstrand, 
Ergebnisse der Biologie (Springer-Verlag, Berlin, 1958), Vol. XXI, 
p. 128]. X82 000. 


and embedding labile living matter. The tissue is fixed
in a solution of osmium tetroxide, dehydrated in a 
graded series of concentrations of ethyl alcohol, and 
then transferred to methacrylate which is polymerized 
before the tissue can be sectioned. 

At the beginning, we were very much concerned about 
this problem and, therefore, we intentionally chose to 
analyze such structural components about which a 
little was known from an ultrastructural point of view 
through polarization-optical and x-ray diffraction analy- 
sis. In the outer segments of retinal receptors and in the 
myelin sheath, patterns in complete harmony with 
polarization-optical data obtained from fresh, unfixed
material were observed by means of electron micros-
copy, as mentioned in the foregoing. For the
exocrine pancreas cells, there were no data available
regarding birefringence of the cytoplasm in vivo. Living
exocrine pancreas cells in mice were analyzed in polar-
ized light and we found that the cytoplasm was bi-
refringent. When analyzed, this birefringence proved
to be negative with the axis oriented perpendicularly
to the cell membrane, a result that could fit in very
well with the electron-microscope picture.

When the direct study of living material is excluded,
a rational way of checking the reliability of such
a technique is to change the technique so
radically that it appears unlikely that the two tech-
niques could produce identical distortions. The second
preferred preparatory technique is then preservation 
by means of freeze-drying. In fact, we proceeded in the 
opposite direction and studied frozen-dried material 
first and later checked the observations made on exo-
crine pancreas cells by means of osmium fixation and
birefringence analysis. 

The freeze-drying technique is a rather delicate 
procedure with many pitfalls. The observations made 
on frozen-dried, sectioned material, stained with phos-
photungstic acid, were rather surprising. Similar struc-
tural patterns could be observed in the ground substance
of the cytoplasm as well as in the mitochondria, with 
the remarkable exception that the patterns appeared 
as negative patterns, as compared to those obtained 
after osmium fixation (Figs. 10 and 11). These results 
mean that these patterns probably are preformed and 
exist in the living cell. At the same time, however, they 
may make it necessary to re-evaluate the interpretation 
of the molecular structure assumed to be responsible 
for these patterns. 

One feature of the frozen-dried specimen must be 
noted. The so-called RNA particles do not show up. 


Fig. 6. Section oriented perpendicularly to the triple-layered
disks of the outer segment of a cone cell in the perch eye. Notice
the tendency to a uniform blackening of the whole triple-layered
disk due to a staining of the middle layer [from F. S. Sjöstrand,
Ergebnisse der Biologie (Springer-Verlag, Berlin, 1958), Vol. XXI,
p. 128]. X87 000.

Fig. 7. Schematic drawing to illus-
trate a proposed interpretation of the
localization and orientation of lipid
and protein molecules in the triple-
layered elementary disks of the outer
segments of retinal receptors [from
F. S. Sjöstrand, Ergebnisse der Biolo-
gie (Springer-Verlag, Berlin, 1958),
Vol. XXI, p. 128].

The cytoplasm between the cytoplasmic membranes
appears homogeneous even after long periods of stain- 
ing. One cannot exclude, therefore, the possibility that 
these particles are artifacts formed through an aggrega- 
tion or precipitation in connection with the preparatory 
procedure. This possibility has to be tested more exten- 
sively than has been done to date. 

Even if some doubts can be raised regarding the par- 
ticulate form of the RNA-protein components in the 
cytoplasm of intact living cells, there is little or no 
doubt that these particles indicate the approximate 
location of RNA in the cytoplasm. The firm connection 
of the particles with the cytoplasmic membranes, on 
the other hand, can be questioned as due to secondary 
absorption. This would explain why such particles are 
attached to the outer opaque layer of the nuclear mem- 
brane in cells where the RNA is located close to the 
nucleus. Furthermore, it explains why the particles are 
absent when the cytoplasmic membrane is located very 
closely to mitochondria where the amount of RNA 
easily can be assumed to be inadequate for the forma- 
tion of such RNA particles. 

Another structural component of the cytoplasm is 
the so-called Golgi apparatus. This component has been
interpreted by light microscopists as an apparatus in-
volved in secretory mechanisms, and it has been ex-
tensively studied in glandular cells. The secretory
products were assumed to be either produced or ac- 
cumulated in this region. 

Electron microscopy has revealed another type of 
cytoplasmic membrane as the main component of the 
Golgi apparatus. These membranes are also arranged in
pairs and bound a space that can be either very
narrow, about 100 A or less in width, or very wide,
appearing as large vacuolar, irregularly shaped spaces
(Figs. 12 and 13). The membranes consist of one osmio- 
philic component, 60 A thick. Presumably, a less osmio- 
philic component separates the membrane pairs, because 
the space between the pairs frequently is very constant, 
measuring about 60 A in width. 

It is rather striking that the zymogen granules which 
contain the secretory products of the pancreatic cells 
show a definite topographic relationship to the Golgi 
apparatus. The zymogen granules can be divided into 
[...] and of high density, and are bounded by a surface
membrane about 50 A thick. The precursor granules
are irregularly shaped and of lower opacity, but are
bounded by a surface membrane which shows out- [...]

[...] membranes described earlier reaches close

Fig. 8. Schematic drawing which
illustrates an interpretation of the
three-dimensional appearance of
the mitochondria and of the mo-
lecular architecture of the mito-
chondrial membranes. The three-
dimensional representation A and
B is based on an interpretation of
the patterns observed in sections
which can be, for instance, like
the ones shown in C and D. E gives
some commonly observed dimen- 
sions of the mitochondrial mem- 
branes, and F a scheme of the 
proposed arrangement of lipid and 
protein molecules in the mem- 
branes [from F. S. Sjöstrand, in 
Fine Structure of Cells, Proc. Symp. 
VIII Cong. Cell. Biol., Leiden, 
1954 (Noordhoff, Ltd., Leiden, 
1955), p. 16]. 

to the Golgi membranes. A zone of granulated cyto-
plasm, however, separates the territories of these two
types of membranes. 

Turning to the problem of the functional significance 
of these various kinds of membranes, the mitochondrial 
membrane is considered first. The article by Lehninger
(p. 136) introduces the present concept regarding the 
importance of the structural organization of the mito- 
chondria. The biochemical data point to a structural 
factor as important in connection with coupled phos- 
phorylation. The electron transport seems to depend 
upon an organization of macromolecular particles. The 

Fig. 9. Cytoplasmic mem-
branes in the basophilic
cytoplasm of exocrine pan-
creas cells. Notice the
opaque 150-A particles at-
tached to the one surface of
the membranes which are 
arranged in pairs. In the 
lower part of the picture a 
cell boundary represented 
by two mutually parallel 
lines, and below the bound- 
ary a mitochondrion [from 
F. S. Sjöstrand and V. 
Hanzon, Exptl. Cell Re- 
search 7, 393 (1954)]. 
X 108 000. 


smooth coordination of various enzymatic reactions was
imagined as being the result of a topographic arrange-
ment of the enzyme molecules according to certain
patterns. The membranes of the mitochondria were
assumed to be composed to a great extent, or altogether,
of mitochondrial enzyme molecules oriented and spread
out in layers in combination with layers of oriented
lipid molecules.

The experiments done so far to isolate the mem-
branous components of the mitochondria from the
ground substance have been rather crude from a mor-
phological point of view, but are interpreted as support-
ing this idea. The mitochondrial membranes [...]
interactions. This organization [...]
be important for a rapid step- [...]
substrate molecules, and possibly [...]
synthesis.

One may assume that the [...] of the se[...]
steps. First, an enzyme-containing membrane is
formed which is able to synthesize the secretory
products (Fig. 14). This membrane is manufactured in
the Golgi apparatus from material delivered by the
cytoplasm as represented by the RNA particles. These
particles swell, fuse, and mix with lipids, and extensive
[...] results in the formation of vacuoles bounded
by a membrane consisting of lipoproteins. The
[...] and pairs of membranes are formed

Figs. 10-11. Sections through exocrine pancreas cells preserved by means of freeze-drying. The methacrylate section was stained with
phosphotungstic acid before the examination in the electron microscope. The cytoplasmic membranes are visible as well as the mito-
chondria, but the pattern is a negative one as compared to that shown in Fig. 2. The spaces between the membranes (Fig. 11) are
filled with material in contrast to the lack of material, except the opaque 150-A particles, after osmium fixation. No such
particles can be observed in these pictures. Mi, mitochondria; CM, cytoplasmic membranes; V, ice crystal vacuole [from F. S. Sjöstrand
and R. F. Baker, J. Ultrastructure Research 1, 239 (1958)]. X30 000 (Fig. 10); X56 000 (Fig. 11).


which disintegrate into fragments forming the surface 
membrane of the precursor granules. The surface mem- 
brane then synthesizes the secretory products which 
are successively accumulated in the granule. Raw 
material can be delivered all of the time by the RNA-
rich cytoplasm.

The foregoing is a wild hypothesis that could explain 
the morphological observations made so far. It is 
derived, however, from still pictures; the various stages 

Fig. 11 (for legend, see
Fig. 10). 


have been selected subjectively from these pictures. 
Experimental evidence remains to be presented. One 
thing seems to be quite clear—namely, that the Golgi 
membranes can form vesicles or the membranes of 
secretory granules. This latter process is demonstrated 
better by the goblet cells in the intestinal epithelium. 
Holmberg, in our laboratory, found that, after ad-
ministration of Diamox, an inhibitor of carbonic anhy-
drase, the Golgi apparatus in the ciliary epithelium of
the eye partially disintegrated into vesicles and the 
whole cytoplasm became loaded with vesicles. These 
reversible changes could be described in a quantitative 
way and be correlated with a temporary 60% inhibition 
of the secretion of aqueous humor. 

In certain types of cells, the plasma membrane ap- 
pears extremely folded. It is not quite clear whether 
these folds are tight infoldings of the plasma mem-
brane or whether they represent interdigitating ridges
of the cell surfaces of adjacent cells. Zetterqvist,
in our laboratory, demonstrated that the plasma mem- 
brane can appear rather different at various parts of 
the surface of the same cell. In the intestinal epithelium 
(Fig. 15) the free surface of the cells forms a number of 
cylindrical processes, the so-called brush border. The 
plasma membrane bounding these processes appears as 
a 105-A thick triple-layered component (Fig. 16). 
Where two epithelial cells are in close contact, the 
plasma membrane appears as an osmiophilic layer 60 
to 70 A thick. Between the two osmiophilic layers of 
the two adjacent epithelium cells, there is a light space 
with a very uniform thickness of about 100 A.
On the other hand, the plasma membrane which 


Fig. 12. A Golgi apparatus in an exocrine cell of a mouse pancreas. [...] Golgi membranes; [...] granules on
the basal side of the Golgi apparatus; [...] cytoplasmic membranes; Z0, prozymogen granules that appear to
develop from the Golgi membranes; Z1, prozymogen granules which can be interpreted as intermediate stages
between Z0 and Z2; Z2, fully developed zymogen granules [from F. S. Sjöstrand and V. Hanzon, Exptl. Cell
Research 7, 415 (1954)]. X44 000.

Fig. 13. High magnification of a part of Fig. 12. For explanation [...], see Fig. 12. X83 000.


covers the surface of the cell facing the intercellular 
spaces and the basement membrane appears as a 
triple-layered structure which measures only 70 A in 
thickness. This structure consists of two opaque layers, 
each about 25 A thick, separated by a 20-A thick light 
interspace. A similar appearance has been reported 
recently by Robertson for the plasma membranes of
the Schwann cells. Robertson has interpreted this
approximately 70-A thick triple-layered structure as
representing the whole plasma membrane (unit
membrane), consisting of two protein mono-
layers sandwiching a double layer of lipid molecules.

Fig. 14. Schematic drawing illustrating a hypothesis which
aims at interpreting the functional significance of the Golgi ap- 
paratus of the excretory cells of the pancreas as an assembly line 
for membrane material with the power to synthesize the secretory 
products of the exocrine cells from raw material in the cytoplasm. 
CM, cytoplasmic membranes; GA, Golgi apparatus with forma- 
tion of membranes from material delivered by the cytoplasmic 
membranes; ZG, zymogen granules formed from precursor granules 
in the Golgi apparatus; CS, cell surface facing the lumen of the 
excretory duct, L [from F. S. Sjöstrand, UNESCO Symp. Patterns 
of Cellular and Sub-Cellular Organisation, Edinburgh, 1957 (to 
be published.) ] 


Fig. 16. Transversal section through the cylindrical processes
of the brush border zone in an intestinal epithelial cell. The 
processes are bounded by a triple-layered surface membrane with 
an average total thickness of 105 A [from H. Zetterqvist, The 
Ultrastructural Organisation of the Columnar Absorbing Cells of the 
Mouse Jejunum. Thesis, Stockholm (1956) ]. X103 000. 


This interpretation is based on a great deal of evidence 
and it is discussed further by Schmitt (p. 455). It is 
assumed that the consistency of the width, about 100 A, 
of the light interspaces between two cells in close con-
tact indicates the presence of some binding material,
such as lipids or mucopolysaccharides. 
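
The plasma-membrane dimensions given above for the intestinal epithelial cell can be collected in one place. The following Python sketch is an editorial illustration rather than anything from the original lecture: it sums the two 25-A opaque layers and the 20-A light interspace of the membrane facing the intercellular spaces, and it also adds up the layers across a close contact between two cells. The class and function names are hypothetical, and the combined contact width is a derived figure that the text itself does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TripleLayer:
    """Two opaque layers of equal thickness around one light interspace (angstroms)."""
    opaque_a: float
    light_a: float

    def total(self):
        return 2 * self.opaque_a + self.light_a


# Plasma membrane facing the intercellular spaces and the basement membrane.
BASAL_MEMBRANE = TripleLayer(opaque_a=25.0, light_a=20.0)


def contact_width(layer_a, gap_a=100.0):
    """Width across two apposed cells: osmiophilic layer + light space + layer.

    This sum is an illustrative derived quantity, not a value from the text.
    """
    return 2 * layer_a + gap_a


if __name__ == "__main__":
    print(BASAL_MEMBRANE.total())                    # compare with the quoted 70 A
    print(contact_width(60.0), contact_width(70.0))  # for 60- to 70-A layers
```

The first sum reproduces the 70-A total quoted for this membrane, which is well below the 105 A measured at the brush border.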

Consider now the formation of membranous material 
in the outer segments of retinal receptors. It is seen, in
the double cones of the perch retina, that the disks of 


Fig. 15. Schematic drawing of a cylindrical epithelial cell lining
the small intestine in the mouse. Notice the so-called brush border
at the upper surface of the cell which faces the lumen of the gut
[from H. Zetterqvist, The Ultrastructural Organisation of the
Columnar Absorbing Cells of the Mouse Jejunum. Thesis, Stock-
holm (1956)].

Fig. 17. The peripheral parts of two outer segments belonging to two twin cones of the perch retina. The outer segment shown in
the lower half of the picture demonstrates that the less opaque interspaces in the triple-layered disks are open, and that the opaque
layers of adjacent disks are continuous. At the opposite side of this segment, the conditions are those observed in the outer segment
in the upper half of the picture. The less opaque interspaces are closed. This shows that the pile of disks is formed by a continuous
membrane [...] repeatedly folded. In other types of receptors, no such obvious folding can be observed but the disks are mutually
connected through tube-like stalks which are continuous with the opaque layers of the disks. X100 000.

outer segments appear as folds of one continuous
membrane (Fig. 17). If one looks at the outer segments 
of the kitten retina a few days after birth, one can 
see the outer segments in the process of being formed. 
They seem to be formed through a folding of a mem- 

Figs. 20-21. Three-dimensional reconstruction of the synaptic body of a retinal rod of the guinea pig eye made from a series of 40
sections according to the classical technique of the three-dimensional reconstructions for series of sections. Two dendrites, D1 and D2,
of [...] cells are shown entering into synaptic contact with the synaptic body of the receptor. The plasma membrane bounding the
synaptic body has been removed with the exception of a fragment, PM, in order to show the large vacuoles and the synaptic ribbon
which, together with synaptic vesicles and granules, constitute the elementary components of the retinal synaptic bodies. In Fig. 21,
[...] of the extensions from adjacent synaptic bodies, R2 and R3, of receptor cells are shown making contact with the surface membrane
of the reconstructed synaptic body [from F. S. Sjöstrand, J. Ultrastructure Research 2, 122 (1958)].


brane (Fig. 18), each fold being transformed into a disk 
later on, each disk remaining associated with its neigh- 
bors through a tube-like connection located close to the 
center of the disk at the end of a deep incision. 

It well might be that the mitochondrial membranes 
are formed in the same way, which explains the fact
that, in 1 to 10%, the light interspace of the inner 
mitochondrial membranes is continuous with the light 
interspace of the outer mitochondrial membrane. This 
would give one explanation of why this kind of contact 
between outer and inner membranes is so rare. 

Recently, we succeeded in obtaining quite extensive 
series of serial sections. From one series of 40 sections 
with an average thickness of 250 A, the author has 
made a three-dimensional reconstruction of the synaptic 
connections and the inner structure of the synaptic 
bodies of the retinal rods in the guinea pig eye. This
reconstruction has made it possible to reveal the very 
complicated arrangement of various membranous com- 
ponents of this part of the cell (Figs. 19-21) and to draw 
the first primitive circuit diagram (Fig. 22) of the 
synaptic contacts of a receptor cell. This technique is, 
of course, of interest when analyzing the central nervous 
system, but it is necessary also for a detailed description 
of the topographical relations between different struc- 
tural components of the cytoplasm. 

One now may survey some of the types of membranes 
as observed in the cytoplasm (Fig. 23). Distinction can 
be made between the cytoplasmic membranes which, 
after osmium fixation, are associated with the RNA 
particles, the infolding of the plasma membrane, and 
the Golgi membranes. These various types of mem- 
branes easily can be distinguished morphologically and a 
neutral terminology is proposed for them: α-, β-, and 
γ-cytomembranes. 

As assumed by Porter and Palade and accepted by 
most American electron microscopists, all of these types 
of membranes, as well as the nuclear membrane, the 
centrosphere, and various types of vesicles in the 
cytoplasm, have been considered to represent different 
aspects of one continuous canalicular system, the 
so-called endoplasmic reticulum, extending throughout the 
whole cytoplasm. We have not been able to confirm any 
evidence that would allow the conclusion that these 
various components form a continuous system. We take 
the opposite standpoint, accepting morphological 
differences as indicating differences in function, and we 
stress that the cytoplasm is differentiated into a limited 
number of structurally and presumably functionally 
different components. 

Fig. 22. An attempt at a circuit diagram of the synaptic 
contacts of a retinal receptor in the guinea pig eye, showing the 
extensive inter-receptor contacts. R1-R5, receptor cells; B1 and 
B2, bipolar nerve cells [from F. S. Sjöstrand, J. Ultrastructure 
Research 2, 122 (1958)]. 

Fig. 23. Schematic presentation of some of the types of 
membranes that appear in cells. A. α-cytomembranes; B. Golgi 
membranes (γ-cytomembranes); C. β-cytomembranes; D. nuclear 
membrane [according to B. A. Afzelius, Exptl. Cell Research 8, 
147 (1955)]. Some characteristic dimensions of the membranes 
are presented [from F. S. Sjöstrand, Methods in Enzymology 
(Academic Press, Inc., New York, 1957), Vol. IV, p. 391]. 

For the future, it is probably less important to extend 
one’s knowledge of the structure of cells to more and 
more types of cells. The structural patterns are repeated 
in a rather monotonous way. It is more important by 
far to try to analyze further the molecular architecture 
of the various components of the cytoplasm by using a 
variety of techniques. What appears most important, 
however, is to try to supplement the collected 
geometrical data with biochemical data. It is necessary 
then to work on fractions of homogenized cells, to 
improve the techniques of fractionation, to work out 
methods that make it possible to identify in the fractions 
the various cell components, and to estimate their 
relative concentrations in a quantitative way. 

the impression that the construction of a surface 
membrane with certain properties represents one of the 
steps in the creation of life, and that this principle 
has been applied in a rather monotonous way by nature. 

INTRODUCTION 


Lamellar structures represent one of the most 
important and widespread components of biological 
systems. Ranging from the individual membranes 
of cells to the complex multilayered structures found in 
the nerve myelin sheath, photoreceptors, and chloro- 
plasts, they embody similar patterns of molecular 
organization. As derivatives of cell membranes, the 
lamellar structures exhibit the characteristic permea- 
bility properties, electrical activity, and other func- 
tional features essential to life. Moreover, by virtue of 
their repetitive and orderly arrangement, the multi- 
layered systems are uniquely endowed for performing 
the functions of conversion, transfer, and storage of 
energy in organisms. 

Investigation of the morphological substrate of these 
processes has been handicapped by the marked lability 
of membranous systems when subjected to the effects 
of analytical techniques. However, with recent ad- 
vances, a general picture of the organization of these 
systems at the macromolecular level is now gradually 
emerging. The successful application of high-resolution 
electron microscopy in combination with polarization- 
optical and x-ray diffraction studies has revealed the 
extraordinary degree of regularity and structural differ- 
entiation which is present throughout the various hi- 
erarchies of organization down to the molecular level. 

This review describes the salient features of the fine 
structure of the nerve myelin sheath, photoreceptors, 
and other representative lamellar systems. It is becom- 
ing increasingly evident that all lamellar systems appear 
to have certain common structural parameters at the 
molecular level, possibly reflecting an underlying 
functional analogy. Therefore, as more information becomes 
available on any one of these specialized systems, new 
approaches are suggested in the correlation of structure 
and function. Thus, many of the biochemical studies 
and recent concepts derived from solid-state physics in 
investigating the mechanisms of photosynthesis 
may eventually prove to be of operational value in 
elucidating the primary events of nerve function and 
sensory reception. 


FINE STRUCTURE OF THE NERVE 
MYELIN SHEATH 


The myelin sheath of nerve fibers is one of the most 
highly ordered biological systems, and exhibits all of the 
properties of the smectic fluid-crystalline state. Despite 
the high water content and marked lability of the 
myelin sheath, its ultrastructure can be studied by 
combined application of different techniques under various 
conditions. The analysis of the fine structure of the 
myelin sheath can be regarded, in fact, as one of the 
best examples of the systematic application of 
complementary biophysical and biochemical methods. Since 
the myelin sheath derives from a multiply-folded 
Schwann-cell surface, it can also be considered a 
model system for the study of cell-membrane structure 
in general. 

* Part of these studies were aided by a research grant (C-3174) 
from the National Institute of Neurological Diseases and 
Blindness of the National Institutes of Health, Public Health Service, 
U. S. Department of Health, Education, and Welfare. 

Polarization-optical studies of the myelin sheath 
indicated that it is composed of concentrically arranged 
protein or lipoprotein lamellae which alternate with 
layers of lipid molecules oriented with their long axes 
in the radial direction. This concept was confirmed and 
extended by x-ray diffraction studies of fresh nerve.4—8 
It was assumed that the thickness of the concentric 
lipid-protein unit layers corresponded to the funda- 
mental small-angle x-ray diffraction spacings of 170 A 
recorded in amphibian, and of 180 to 185 A found in 
mammalian peripheral nerves. 
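
As a rough illustration of why these spacings are recorded at small angles, Bragg's law (n λ = 2 d sin θ) can be evaluated for the periods quoted above. The sketch below is not taken from the original studies: it assumes Cu Kα radiation (about 1.5418 angstroms), which the text does not specify, and the function name is purely illustrative.

```python
import math

# Assumed wavelength (Cu K-alpha), in angstroms; the review does not state
# which radiation was used, so this value is an illustrative assumption.
CU_K_ALPHA = 1.5418


def bragg_two_theta_deg(d_spacing, order=1, wavelength=CU_K_ALPHA):
    """Return the scattering angle 2*theta (degrees) from Bragg's law.

    d_spacing and wavelength are in angstroms; order is the reflection order n.
    """
    s = order * wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        raise ValueError("reflection cannot be observed at this wavelength")
    return 2.0 * math.degrees(math.asin(s))


if __name__ == "__main__":
    # Fundamental periods quoted in the text for amphibian and mammalian nerve.
    for label, d in (("amphibian, 170 A", 170.0), ("mammalian, 184 A", 184.0)):
        print(label, round(bragg_two_theta_deg(d), 3), "degrees")
```

Under this assumption the first-order reflections fall at roughly half a degree of scattering angle, which is why a dedicated low-angle camera is needed to resolve them from the direct beam.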

Subsequently, the postulated layers and the 
concentric laminated structure of the sheath were observed 
directly in electron micrographs of thin nerve sections 
fixed with osmium tetroxide. However, since the 
preparation of these specimens required fixation, 
dehydration and embedding in a plastic medium, followed 
by slicing into ultrathin sections, the possibility of 
artifacts had to be seriously considered. Moreover, the 
regular patterns of selective deposition of osmium could 
not be interpreted in terms of specific regions containing 
the lipids, lipoproteins, water, and other components of 
the normal myelin structure. 

One is obviously dealing here with a case in which 
low-angle x-ray diffraction techniques and 
high-resolution electron microscopy, applied in combination, 
can supplement each other in many ways. It is instructive 
to consider the merits and drawbacks of each technique 
in order to appreciate the type of approach required in 
analyzing biological systems. 


The x-ray diffraction method offers the great 
advantage of permitting examination of the intact nerve 
trunk in the living animal. The information [...] 
of the myelin sheath. From the 
low-angle x-ray diffraction pattern, the dimensions and 
approximate distribution of scattering groups in the 
radial direction of the sheath can be deduced. Reliable 
reference parameters are, therefore, provided for 
checking the results of electron microscopy in studies of the 
normal and modified myelin sheaths. 

Finally, the x-ray diffraction pattern recorded in a few 
hours represents an average of the main structural 
parameters of all of the nerve fibers contained in the ex- 
posed nerve trunk which contribute to the diffraction. 
The practical significance of this is immediately ap- 
parent when comparing it with the far greater time 
factor involved in the corresponding electron-micro- 
scope study. Examination of nearly 100000 serial 
ultrathin cross sections would be required for a com- 
plete electron-microscope study of a 1-mm long segment 
of a few fibers only. 
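
The order of magnitude quoted here follows from simple arithmetic. The sketch below assumes a section thickness of 100 A, the lower end of the 100 to 200 A range this review gives for ultrathin sections; the function name is illustrative only and does not come from the original work.

```python
ANGSTROMS_PER_MM = 10_000_000  # 1 mm = 1e7 A


def serial_sections_needed(length_mm, thickness_angstrom):
    """Number of consecutive sections needed to span length_mm (rounded up)."""
    total_angstrom = length_mm * ANGSTROMS_PER_MM
    # Ceiling division: a partial final section still has to be cut.
    return int(-(-total_angstrom // thickness_angstrom))


if __name__ == "__main__":
    for thickness in (100, 200):  # assumed thicknesses, in angstroms
        print(thickness, "A sections:", serial_sections_needed(1, thickness))
```

At the assumed 100 A thickness a 1-mm segment corresponds to 100 000 sections, and even at 200 A it would still require 50 000, which makes the contrast with a diffraction exposure of a few hours concrete.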

Conversely, electron microscopy offers the unique 
advantage of revealing directly the complex patterns of 
macromolecular organization in selected areas of the 
specimen. 


Structure of the Normal Myelin Sheath 


X-ray diffraction patterns recorded from fresh 
peripheral nerve in a direction perpendicular to the fiber 
axes feature a series of very well-defined reflections at 
low angles. The wide-angle reflections at about 4.7 A 
exhibit meridional intensifications, while the low-angle 
reflections are precisely oriented in the equatorial 
direction. The low-angle reflections can be interpreted in 
terms of a fundamental radial repeating unit varying 
from 170 A in amphibians to 184 A in mammalian 
peripheral nerve. The low-angle reflections also show 
a characteristic alternation of intensities in the even 
and odd orders [Fig. 1(b)]. It is assumed that the 
fundamental repeating unit consists of two parts having 
very similar distributions of x-ray scattering power; 
from the intensities of the odd-order reflections, the 
magnitude of the difference between the two parts can 
be estimated. Finean has determined that this 
“difference factor” is appreciable in peripheral nerve 
myelin, but appears to be negligible in the optic nerves. 

High-resolution electron micrographs of ultrathin 
sections of osmium-fixed peripheral nerve (Fig. 1) show 
a series of concentrically arranged, dense lines separated 
by lighter spaces with an average period of 130 to 140 A. 
There is, thus, a correspondence of the periodic layers 
observed in the electron micrographs of myelin 
preparations and the fundamental radial unit derived from 
the x-ray diffraction patterns. 

The discrepancy of 20 to 30 A between the layer 
thickness and the x-ray fundamental spacing is owing 
to shrinkage effects introduced by osmium fixation 
and other preparative procedures connected with the 
examination of the thin sections in the electron 
microscope. In addition to the periodic dense bands about 
[...] wide, the electron micrographs feature light 
intermediate bands which are much narrower (10 to 15 A). 
The variation in the densities of the two principal 
bands of osmium deposition is related to the “difference 
factor” which manifests itself in the low-angle pattern 
as an increase in the intensity of the first-order reflection. 
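
The link between the “difference factor” and the alternation of even and odd orders can be illustrated with a deliberately simplified model. The sketch below is an assumption made for illustration, not the analysis used in the cited studies: it replaces each half of the repeating unit with a single point scatterer, of weight a at the origin and weight b at half the period, and evaluates the structure factor F(h) = a + b exp(iπh).

```python
import cmath


def structure_factor(h, scatterers):
    """Sum w * exp(2*pi*i*h*x) over (weight, fractional position) pairs."""
    return sum(w * cmath.exp(2j * cmath.pi * h * x) for w, x in scatterers)


def order_intensities(a, b, orders=range(1, 6)):
    """Intensities |F(h)|^2 for a toy unit with two point-like halves.

    a and b are hypothetical scattering weights of the two halves, placed
    at fractional positions 0 and 1/2 of the repeating unit.
    """
    unit = [(a, 0.0), (b, 0.5)]
    return {h: abs(structure_factor(h, unit)) ** 2 for h in orders}


if __name__ == "__main__":
    print("identical halves:", order_intensities(1.0, 1.0))
    print("unequal halves:  ", order_intensities(1.0, 0.8))
```

In this toy model the even orders scale as (a + b)^2 and the odd orders as (a - b)^2, so identical halves extinguish the odd orders entirely (the apparent period is halved), while any difference between the halves restores them, the first order included.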

The present lack of precise information on the 
chemistry and localization of the sheath proteins, the lipids, 
the water layers, and other components of myelin has 
not as yet permitted a direct correlation with the 
structural data. However, by combined 
electron-microscope and x-ray diffraction studies of controlled 
physical and chemical modifications of the myelin sheath, 
a general picture of the molecular arrangements in the 
radial units can be obtained. Satisfactory 
correlation and interpretation of structural relationships 
at the macromolecular level requires the preparation of a 
large number of ultrathin (100 to 200 A) undistorted 
serial sections of the same specimens utilized for the 
parallel x-ray diffraction studies. This has been greatly 
facilitated by improved thin-sectioning techniques using 
a diamond knife. 

Such a combined approach has led to a detailed 
analysis of the preparation procedures, thus validating 
important findings and defining artefact sources. The 
extraction and enzymatic-digestion experiments per- 
formed on fresh myelin proved to be particularly 
revealing. 


Lipid-Extraction Experiments 

Extraction of fresh nerve with acetone at 0°C 
removes about 30% of the cholesterol, leaving the other 
lipid components essentially intact within the 
framework of a still highly organized residual myelin sheath. 
The main modifications revealed by electron 
microscopy and x-ray diffraction [Figs. 2(d) and 2(e)] of the 
residual sheath comprise expansion of the layered 
structure with internal rearrangements and formation 
of collapsed layer systems [Figs. 2(a) and 2(b)]. The 
dense lines are relatively resistant to lipid solvents, and 
their persistence in the collapsed layered structure may 
indicate that the heaviest osmium deposition is in the 
region of the protein or stable lipoprotein constituents. 
These modifications emphasize the importance of 
cholesterol in the myelin structure, and lend further support 
to the existence of the phospholipid-cholesterol complex 
postulated in the unit cell by Finean. 

Low-angle diffraction patterns of the lipid extract 
show a strong 34.2-A reflection [Fig. 2(c)] character- 
istic of cholesterol. When the dried lipid extract is 
examined by electron microscopy, thin crystalline lam- 
ellae are found embedded between condensed multi- 
layered aggregates. These crystalline lamellae give 
electron-diffraction patterns which are similar to those 
recorded from pure cholesterol. 

A more extensive breakdown of the sheath is observed 
after alcohol extraction. The fine-layered structures 
observed in certain areas may represent the refractory 
protein framework in addition to recrystallized lipid 
components. The extraction experiments suggest, 
therefore, that the dense bands observed in electron 
micrographs of the normal myelin sheath represent areas of 
selective osmium deposition at lipoprotein interfaces, 
whereas the light bands are regions occupied by lipid 
chains and associated myelin components which do not 
react primarily with osmium tetroxide. 

Fig. 1. (a) High-resolution electron micrograph of myelin-sheath segment from a transverse section of an 
osmium-fixed rat sciatic nerve embedded in gelatin. The average layer spacing is 110 A. Note compact and 
well-preserved dense lines with moderate enhancement of the intermediate lines. X850 000. (b) Low-angle x-ray 
diffraction pattern of rat sciatic nerve recorded with Finean camera. This pattern features a fundamental period 
of 176 A, with characteristic alternation of the intensities of the even and odd orders. 


Enzymatic-Digestion Experiments 


Although digestion with trypsin is not specific, its 
application to fresh nerve fibers produces characteristic 
modifications of the myelin fine structure. In 
addition to slight expansion of the concentric layers, the 
uniform dense lines appear to dissociate into rod-shaped 
granules, 30 to 50 A wide and 40 to 60 A long (Fig. 3). 
The unit myelin lamellae isolated from the sheath after 
trypsin digestion likewise exhibit extensive dissociation 
into composite elongated granules of similar 
dimensions. An analogous granular fine structure of the 
dense layers is commonly found after freezing and 
thawing of fresh nerve, KMnO4 fixation, and other 
types of preparations. Moreover, there are also 
numerous indications of a compact granular fine structure 
in the dense layers of the normal myelin sheath. 


X-ray diffraction data furnish supporting evidence 
for the presence of a regular organization within the 
plane of the layers. A strong 60- to 70-A vector, which 
appears necessary to account for the relative intensities 
of the low-angle reflections, can be related to this type 
of organization within the planes of the lipoprotein 
layers. 


Nerve-Degeneration Studies 


The breakdown of myelin known to occur during 
in vitro degeneration of nerve affords an opportunity to 
study the fine structure of the disintegrating sheath 
without introducing extraneous reagents. Comprehen- 
sive studies of nerve-fiber degeneration, which are now 
being carried out by the author with combined applica- 
tion of high-resolution electron microscopy and low- 
angle x-ray diffraction techniques, reveal interesting 
details of the layer structure. 

When fresh peripheral nerve is enclosed in a glass 
capillary under aseptic conditions and left to 
degenerate at 20°C, low-angle x-ray diffraction patterns 
recorded at periodic intervals show characteristic changes 
(Fig. 4). Compared with the normal pattern, marked 
modifications of the distribution of x-ray scattering 
power within the unit are noted as degeneration 
progresses. There is marked intensification of the 
second-order diffraction, followed by the appearance of a 70-A 
reflection, and gradual extinction of the lower orders in 
later stages of degeneration. The corresponding electron 
micrographs (Fig. 5) show various forms of granular 
dissociation of the dense layers, resembling the 
structures observed after trypsin digestion. In addition to 
this supporting evidence of granular structure within 
the layers, numerous other features of myelin 
organization are being uncovered as the complex dissolution 
is followed at the submicroscopic level. 

Fig. 2. (a) Myelin-sheath segment from transverse section of fresh rat sciatic nerve extracted with acetone at 0°C for 12 hours prior 
to fixation with 2% osmium tetroxide and embedding in butyl methacrylate. Notice the expanded periods (160 A) of the modified 
layers at the right, and the transitions (arrows) to the collapsed period (43 A) at the left. X300 000. (b) Myelin-sheath segment from 
longitudinal section of fresh rat sciatic nerve extracted with acetone at 0°C for 12 hours prior to fixation with osmium tetroxide and 
embedding in butyl methacrylate. Notice the transitions (arrows) from the expanded layer system to the collapsed layer system. X280 000. 
(c) Low-angle x-ray diffraction pattern of the lipid material extracted from fresh sciatic nerve by immersion in acetone at 0°C for 12 
hours, showing the characteristic cholesterol spacing at 34.2 A. (d) Low-angle x-ray diffraction pattern of residual dried rat sciatic nerve 
after extraction with acetone at 0°C for 12 hours. (e) Low-angle x-ray diffraction pattern of acetone-extracted nerve fixed with buffered 
2% osmium tetroxide and embedded in methacrylate. 

Fig. 3. Myelin-sheath segment from a rat sciatic nerve which was incubated with 1% crystalline trypsin for 12 hours prior to fixation 
in buffered 1% isotonic osmium-tetroxide solution and embedding in butyl methacrylate. Notice the marked dissociation of the dense 
lines (arrows) into elongate granules. X520 000. 


Suggested Molecular Organization of Myelin 


Based largely on their pioneering x-ray diffraction 
studies, Schmitt and co-workers concluded that 
the fundamental radial unit of the myelin sheath in the 
internodal portion of peripheral nerve comprised two 
lipoprotein layers, each of which consisted of 
bimolecular leaflets of 67 A sandwiched between protein layers 
of 25 A, with interposed water layers contributing a 
further 25 A. Taking into consideration the contraction 
of the lipid layers during drying, and the results of other 
experimental modifications, Finean arrived at a 
more detailed picture of the molecular arrangement in 
the myelin unit. According to this conception, each 
lipoprotein layer would consist of two 
phospholipid-cholesterol complexes associated with a cerebroside 
molecule, intercalated between monolayers of protein. 
The two units are distinguished by a “difference factor” 
which has not been accurately determined, but can 
nevertheless be explained by the mechanism of myelin 
formation. As Geren showed, the myelin sheath is 
formed by an infolding and multiple wrapping of the 
Schwann-cell membrane around the axon in embryonic 
fibers. If the Schwann-cell lipoprotein membrane is 
assumed to be asymmetric, this process of rolling 
onto the axon will produce the observed symmetry 
difference in successive layers. 

Fig. 4. Low-angle x-ray diffraction patterns of rat sciatic nerve recorded during in vitro degeneration at 20°C (panel labels include 
12 days and 16 days). Notice marked modifications of the distribution of x-ray scattering power within the radial unit as degeneration progresses. 

Fig. 5. Electron micrograph of nerve myelin sheath during in vitro degeneration (4 days), showing dissociation of the dense lines into 
granular structures, disappearance of the intermediate line, and other modifications of the layered structure. X280 000. 

The lipids are considered to be oriented with their 
long axes radially in the sheath. The 4.7-A, meridionally 
intensified ring recorded from fresh nerve has been 
related to the cross sections of the lipid molecules, 
indicating the average interchain separation of their 
hydrocarbon chains. However, in order to account for 
the permeability properties of the membranes, various 
possibilities including the existence of real or potential 
“holes” have been discussed. One of the suggested con- 
figurations envisages a continuous layer of lipids sand- 
wiched between monolayers of protein with a fenestra- 
ted or open-network type of structure. The filtering 
action of the protein layer would combine with the 
selective solubility of the penetrating ions or molecules 
in the lipid layer to account for the sieve mechanism of 
permeability. The formation of submicroscopic pores 
by radial extensions of the protein layers across the 
lipid layers has also been considered, but no direct 
evidence for the existence of these discontinuities is yet 
available. Nevertheless, the x-ray diffraction and elec- 
tron-microscope data clearly indicate that there must 
be a considerable degree of organization within the 
plane of the lipoprotein layers which remains to be 
investigated. 

Water is one of the most important components of 
fresh myelin, constituting at least 35% of the myelin 
sheath. Information on the precise localization of 
water within the myelin layers is also of great 
importance when considering the possible pathways of 
diffusion of ions and other solutes between the axon and the 
extracellular fluids. The water layers at the aqueous 
interfaces of the fundamental repeating unit in the 
compact myelin are about 12 to 15 A thick. Under 
certain experimental conditions, larger amounts of 
water can be incorporated between the aqueous 
interfaces in the myelin unit. The water is presumably 
coordinated on the protein layer located at the aqueous 
surface of each bimolecular leaflet. The thickness of the 
water layers is largely determined by the electrical 
charge density at the aqueous interfaces, and by the 
ionic strength, since water is expelled from lipid layers 
in the presence of ions. These changes in thickness of 
the aqueous interfaces may be of significance in 
connection with ion movements in the intermembrane 
spaces. However, as Schmitt has pointed out, the 
physicochemical properties, such as pH and ionic 
strength in capillary spaces of these macromolecular 
dimensions, would be quite different from those in bulk 
solutions. The effects of the electrically charged lipid 
groups on the structure of the hydration water in these 
aqueous interfaces may also induce the formation of a 
crystalline arrangement resembling “frozen” hydration 
sheaths postulated in protein molecules. 

Through the application of high-resolution nuclear 
magnetic-resonance spectrometry, it now appears 
possible to obtain important supplementary information 
on the water content of fresh whole nerve and the 
hydration state of its components, in a rapid and 
nondestructive way. The size of the proton 
magnetic-resonance signal recorded from biological materials is 
proportional to the water content. Since the mobility 
of the protons in the solid, nonaqueous constituents is 
much less than in the aqueous phase, the resonance 
spectrum consists of a narrow line owing to the water, 
superimposed on a broad line owing to protons in the 
supporting solid. After suitable calibration, the water 
content can be accurately determined by evaluating 
the line width of the proton resonance, or by measuring 
the amplitude of the derivative of the narrow absorption 
curve. Both methods have been applied in preliminary 
determinations of the water content of fresh nerves 
under various conditions, using commercially 
available equipment and a special transistorized NMR 
spectrometer. 

Fig. 6. Single unit disk isolated from the outer segment of frog retinal rod, showing the granular surface structure and the marginal 
cord. Latex particles (2800 A diam) have been added for calibration in this shadowed preparation. X25 000. 

Fig. 7. (a) Transverse section through the basal region of the outer segment of a retinal rod of the guinea pig, showing the 
characteristic tubular structures and vesicular formations. Notice cross-sectioned filaments which establish connection with the inner rod segment. 
X32 000. (b) Longitudinal ultrathin section through the basal part of the outer segment of retinal rod from guinea-pig eye showing the 
regular arrangement of the double membrane disks. X48 000. 
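
The calibration step mentioned above can be sketched in a few lines. This is a minimal illustration that assumes only the proportionality stated in the text between the narrow-line signal amplitude and the amount of water; the calibration readings, sample mass, and function names are hypothetical and are not measurements from the studies described.

```python
def fit_slope_through_origin(amplitudes, water_masses_mg):
    """Least-squares slope k in water_mass = k * amplitude (no intercept)."""
    numerator = sum(a * w for a, w in zip(amplitudes, water_masses_mg))
    denominator = sum(a * a for a in amplitudes)
    return numerator / denominator


def water_fraction(amplitude, slope, sample_mass_mg):
    """Water content of a sample as a fraction of its total mass."""
    return slope * amplitude / sample_mass_mg


if __name__ == "__main__":
    # Hypothetical calibration: known water masses and the narrow-line
    # amplitudes (arbitrary units) they produced.
    calib_amplitudes = [10.0, 20.0, 30.0]
    calib_water_mg = [1.0, 2.0, 3.0]
    k = fit_slope_through_origin(calib_amplitudes, calib_water_mg)
    # Hypothetical unknown: a 4-mg sample giving an amplitude of 16 units.
    print(round(water_fraction(16.0, k, 4.0), 3))
```

Fitting through the origin encodes the assumption that zero water gives zero narrow-line signal; a real calibration would also have to verify linearity over the working range and hold the instrument settings fixed.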

The protons in the hydration shells postulated around 
the macromolecules lack the high mobility of the water 
in the bulk aqueous environment, and give a broad and 
weak resonance line. By comparison with the strong 
narrow line owing to the more-mobile water protons, 
the proportion of water forming the hydration shells of 
nucleic acids has been estimated. Similar experiments 
can be carried out on living nervous tissues, and should 
yield valuable information on the hydration state of the 
bound water in the myelin sheath. 


The present concepts of the myelin’s molecular 
architecture refer almost exclusively to the lipoprotein 
framework which stands up to various analytical techniques. 
Within this framework, no detailed localization is yet 
possible of the numerous enzymes, electrolytes, trace 
metals, and other essential components of the 
myelin sheath. Although deuterium and radiophosphate 
studies indicate that myelin has a relatively low 
metabolic turnover, more information is required on 
the structural relationships with the phospholipases, 
phosphatases, and other enzymes involved in lipid 
metabolism. Likewise, the molecular models will be 
incomplete until specific localizations can be assigned 
to the cholinesterases and related enzymes which play 
an important role in nerve conduction. Recent 
investigations which link the metabolism of copper with 
certain demyelinating diseases also focus attention on the 
possible relationships of trace metals with the 
lipoprotein layers. In view of the close association of the 
porphyrins and related photodynamic agents with the 
central myelin, it may prove of interest to establish a 
more specific correlation with the ultrastructure of the 
myelin sheath. 


FINE STRUCTURE OF PHOTORECEPTORS 


Electron microscope studies have revealed that most 
of the photoreceptors in biological systems, where light 
is converted into chemical or electrical energy, feature 
a regular lamellar fine structure. In the vertebrate eye, 
the photoreceptor elements are composed of 
submicroscopic plates orderly disposed in a pile; whereas in the 
invertebrates, closely packed thin-walled tubules build 
up the light-percipient constituents. The process of 
visual excitation is initiated by the absorption of light 
in the characteristic photopigments which form an 
integral part of all photoreceptors. In the vertebrates, 
the visual pigments, rhodopsin and iodopsin, are located 
in the outer segments of the rods and cones, 
respectively. Polarized-light studies had already indicated 
that the rod outer segments consist of transversally 
oriented protein layers alternating with layers of 
longitudinally oriented lipid molecules, and could, therefore, 
be compared with a radial cylinder of the myelin sheath. 
Electron microscopy confirmed that the entire rod 
outer segment consists of several hundred unit disks, 
about 150 A thick, stacked up in regular array. 
The unit disks of the frog retinal rods have a diameter 
of about 6 μ and exhibit a peculiar lobulated shape with 
numerous incisures outlined by a dense marginal cord 
(Fig. 6). Thin sections disclose the remarkably regular 
layering of the unit disks [Fig. 7(b)], which are 
arranged with their planes perpendicular to the rod axis. 
Each disk consists of dense double membranes which 
stain intensely with osmium and enclose a light space 
of about 70 to 80 A. The repeating unit along the axis 
of the rod in osmium-fixed preparations corresponds 
closely to the spacing of about 320 A recorded in the 
low-angle x-ray diffraction patterns. Recent studies 
demonstrate tubular processes at the incisures which 
are closely associated in the basal portion with the 
bundle of thin fibrils connecting the outer and inner rod 
segments [Fig. 7(a)]. The inner segments contain 
numerous densely aggregated long mitochondria, which 
are regularly encountered in all types of 
photoreceptors. The retinal cones show a similar layered 
structure. 

Fig. 8. (a) Oblique longitudinal section through a rhabdomere of the housefly retinula showing the lamellar type of internal structure. 
The highly regular submicroscopic organization is illustrated by the nearly perfect parallel array of several score dense bands occupying 
the entire rhabdomere. These parallel bands correspond to the dense walls of the tubular compartments sectioned longitudinally. 
X60 000. (b) High-resolution electron micrograph of longitudinal ultrathin section through a rhabdomere of the housefly retinula, 
demonstrating the fine structure of the tubular compartments. The regular array of sharply-outlined annular profiles (400 to 500 A in 
diameter) bounded by a dense osmiophilic substance is interpreted as representing cross sections of the closely packed tubular 
compartments. This differentiation of the dense “walls” of the compartments into a thin, osmiophilic boundary line, 20 to 30 A wide, associated 
with a dense layer of approximately 60 A, may be of interest in relation to the postulated models of the macromolecular photoreceptor 
complexes. X100 000. 

It has been tentatively assumed that the protein 
membranes stain densely with osmium, while the lipid 
molecules may be predominantly localized in the 
intermembrane compartments of the disk. However, 
knowledge of the detailed chemical composition of the 
photoreceptors is still too meagre, and controlled extraction 
and digestion studies of the type described earlier 
remain to be done. The exact location of the visual 
pigments is also unknown, but Wolken and co-workers 
have suggested that the pigment molecules are oriented 
as monolayers at the aqueous protein and lipoprotein 
interfaces. The chloroplasts exhibit a very similar 
layered structure, and the chloroplastin-pigment 
complex appears likewise to reside in the aqueous 
protein-lipoprotein interfaces. 


Light Receptors in Insect Compound Eyes 


In the insect compound eye, each separate visual 
element or ommatidium consists of eight sensory retin- 
ula cells.® The differentiated medial part of each 
retinula cell, known as a rhabdomere, is built up of 
approximately 20 000 to 30 000 closely packed tubular 
Fig. 9. Oblique cross section through a retinula from the eye of the giant tropical moth, Erebus, demonstrating the symmetrical
arrangement of the fused rhabdomeres. Note that the general configuration and orientation of the tubular compartments of any one of 
the rhabdomeres is matched by the corresponding rhabdomere pattern directly or diagonally opposite to it. The equivalent rhabdomeres 
of contiguous ommatidia are also oriented with their main axes in the same direction, thus introducing a degree of “structural polariza- 
tion” in the over-all pattern which must bear relation to the analysis of polarized light in the insect eye. X 6000. 


compartments which vary from 400 to 1200 A in diam- 
eter, and are oriented in highly regular array [ Fig. 8(a) ] 
with their long axes normal to the longitudinal rhab- 
domere axis.‘7—?:! These tubular rhabdomere compart- 
ments are regarded as differentiated microvilli of the 
retinula cell membrane. The walls of the tubular com- 
partments exhibit a dense boundary line of about 30 A 
in which the oriented visual pigments may possibly be 
located’? [Fig. 8(b)]. Within each ommatidium, the 
rhabdomeres are radially arranged in a symmetrical 
pattern formed by matched pairs of opposite rhab- 
domeres displaying similar orientation of their layered 
structure (Fig. 9). 

This differentiated submicroscopic organization of the 
visual elements—and their structural coupling in co- 
ordinated groups'’—suggests a correlation with the 
remarkable ability of insects to recognize the regional 
patterns of polarization of the sky as a basis for light- 
compass orientation.*-' It is assumed that the 
radially arranged rhabdomere pairs containing dichroic 
visual pigments similarly oriented in their periodic 
tubular compartments may correspond to the functional 
units of the analyzer for polarized light postulated in 
the compound eye.17-19,51,55
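
How radially arranged pairs of dichroic rhabdomeres could act as an analyzer for polarized light may be sketched numerically. The model below is an illustration under stated assumptions and is not taken from the text: the four matched pairs are assigned axes at 0°, 45°, 90°, and 135°, each pair is given a hypothetical dichroic ratio of 2, and absorption is taken to vary as the squared cosine of the angle between the E-vector and the pair axis.

```python
import math

# Assumed axis orientations (degrees) of the four matched rhabdomere pairs.
PAIR_AXES_DEG = (0.0, 45.0, 90.0, 135.0)


def pair_absorption(e_vector_deg, axis_deg, dichroic_ratio=2.0):
    """Relative absorption of linearly polarized light by one rhabdomere pair.

    dichroic_ratio is absorption parallel / absorption perpendicular to the
    pair axis (a hypothetical value chosen for illustration).
    """
    delta = math.radians(e_vector_deg - axis_deg)
    return dichroic_ratio * math.cos(delta) ** 2 + 1.0 * math.sin(delta) ** 2


def estimate_e_vector_deg(responses):
    """Recover the E-vector angle (0 to 180 degrees) from the four pair responses."""
    r0, r45, r90, r135 = responses
    # Opposed channels differ by cos(2*psi) and sin(2*psi) terms respectively.
    angle = 0.5 * math.degrees(math.atan2(r45 - r135, r0 - r90))
    return angle % 180.0


def analyze(e_vector_deg, dichroic_ratio=2.0):
    """Simulate the four pair responses and return the recovered E-vector angle."""
    responses = [pair_absorption(e_vector_deg, a, dichroic_ratio) for a in PAIR_AXES_DEG]
    return estimate_e_vector_deg(responses)
```

In this idealized picture, comparing the responses of differently oriented pairs is sufficient to determine the plane of polarization independently of the overall light intensity.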

The visual system of the insects and other arthropods 
features an intimate association with a complex three- 
dimensional network of air-filled tubules or tracheoles 
and with pigment granules, which is probably of con- 
siderable functional significance.” 


The structural analysis of the insect compound eye is 
proving to be one of the most revealing examples of how 
the highly differentiated tubular and lamellar textures 
can provide a periodic array of gas-liquid-solid inter- 
faces at the supramolecular level, which are ideally 
suited for selective interaction (absorption, reflection, 
diffraction, scattering, etc.) with incoming light signals. 


RELATION OF LAMELLAR STRUCTURES TO 
ENERGY-TRANSFER PROCESSES 


Essentially the same type of macromolecular archi- 
tecture described in the unit layers of the myelin sheath 
and the photoreceptors is encountered with only minor 
modifications in the organization of the mitochondria, 
intracytoplasmic lamellar complexes, Golgi apparatus, 
and other specialized lamellar systems.2!:*!:%3 If this
striking structural similarity is actually the expression 
of an underlying functional analogy, then interesting 
general correlations may be derived from critical colla- 
tion of the available information. 

Correlation of fine structure with energy-conversion 
processes has been most fruitful in those systems—like 
mitochondria and chloroplasts—which can be isolated
in quantity and studied outside of the cell without
appreciably impairing their functional activity. Bio- 
chemical studies®*~** have shown that certain fragments 
of mitochondria are still capable of carrying out the 
essential functions of electron transport and oxidative 
phosphorylation. Based on this evidence, the mitochondrion
has been compared with a giant polymer made up
of certain repeating units incorporated in the internal 
and external membranes.*” Recently, it has been pos- 
sible by digitonin treatment to obtain fragments of the 
mitochondrial membrane which contain relatively in- 
tact “respiratory chain assemblies” including the en- 
zymes necessary for coupling phosphorylation to elec- 
tron transport.*® These fragments contain protein, 
phospholipid, and a small amount of cholesterol ;** they 
may possibly correspond in many respects to the double 
membrane components obtained by sonic fragmenta- 
tion of mitochondria.***’? The available evidence indi- 
cates that these enzymes and related components of the 
electron-transfer chain are precisely arranged in specific 
patterns built into the matrix of the lipoprotein layers, 
which provide the highly ordered ‘‘floor space” where 
these assemblies are anchored.57:55 
It has been suggested that the electrons are trans- 
ported in this comparatively rigid structure, approxi- 
mating the solid state, by being relayed through small, 
mobile molecules within the lipoprotein matrix.*® 7 
Considering the important participation of water in all 
lamellar systems, it is tempting to associate the water 
molecules or hydrated small molecules with this transfer
function.

The general problem of structural interaction between
an ordered substrate and its aqueous environment
appears to be of particular significance in the
fluid-crystalline lamellar systems with their high water
content. Many unique features of proteins in aqueous
solutions suggest that the macromolecules may be
surrounded by a hydration sheath of icelike character.*?-96.9
The induction of a crystalline lattice of
hydration water is ascribed to the local and long-range
cooperative electric-field effects of the nonpolar side
chains in proteins.” In the case of the 10- to 15-A thick, 
aqueous layers intercalated between the protein-lipo- 
protein layers, these effects are expected to be even more 
pronounced in determining the formation of crystalline 
lamellae of “frozen” water. 

On first consideration, this concept of “frozen” hydration
lamellae permeating in periodic array such a
multilayered complex would seem to be incompatible
with the smectic fluid-crystalline character of biological
systems. However, it should be borne in mind that the
“icelike” hydration sheaths lodged in ultracapillary
spaces of these molecular dimensions may exhibit
physical and physicochemical properties which are quite
different from those encountered in bulk solutions.
Also, only that part of the hydration water which is in
immediate contact with the protein or lipoprotein layers
may be ordered in a crystalline lattice, while the remaining
“liquid” aqueous strata and the additional
interstitial water channels would still allow sufficient
freedom of movement of the flexible layered substrate.


In this connection, it is interesting to note that the


FERNANDEZ-MORAN 


of the crystalline lamellae, and is largely responsible for 
the lubrication properties of this material. 

It is, therefore, conceivable that the postulated crys- 
talline hydration sheaths could provide an organized 
framework interconnecting the lipoprotein layers and 
serving primarily as a transfer medium for charged 
carriers like electrons, holes, protons, and ions. By 
virtue of the extensive hydrogen bonding in the icelike 
lattice, the effects of ion movements could be produced 
over comparatively long distances without involving 
bodily displacement of the ions, which would be an 
important factor in the relatively compact lamellar 
systems. 

Regardless of these hypothetical assumptions, it is 
obvious that much more must be known about the 
hydration state and properties of water in lamellar 
systems before detailed functional correlations can be 
attempted. Nuclear magnetic-resonance spectrometry, 
supplemented by spin-echo techniques, now offers the 
possibility of obtaining essential quantitative data on 
relaxation times and correlation times of the water 
components in living biological systems under various 
physiological conditions. It is also suggested that a 
structural determination of the hydration shells of 
lattice-ordered water may eventually be possible by 
extending the neutron-diffraction studies already suc- 
cessfully carried out on ice® to suitable models of 
multilayered systems, or to liquid crystals of adequate 
size. 

Important additional evidence that the processes of 
energy transfer and conversion in the chloroplasts are 
dependent on the highly ordered molecular arrange- 
ment in their layered structure has been furnished by 
comprehensive recent studies of the mechanisms of 
photosynthesis.’ =7® Calvin and co-workers*+:* have 
strongly indicated that the initial process of energy 
transfer is a physical one, dependent on the specific 
physical organization of the chlorophyll and its associ- 
ated pigments in the ordered, ‘‘quasicrystalline” layered 
structures. Absorption of a light quantum by chloro- 
phyll raises an electron to a conduction band, leaving 
behind a “hole,” both of which migrate together as an 
“exciton” through the oriented array of chlorophyll 
molecules. The conjugated carotenoid molecules act as 
conductors of electrons, while the lipids in the chloro- 
plasts would function as an insulator, permitting a 
separation of charges when the electrons are “trapped” 
on one side of the layer and the “holes” on the other. 
This polar character is similar to that which exists at a 
p-n semiconductor junction. The electrons participate 
in the formation of a reducing agent leading to the 
reduction of carbon dioxide; the positive holes react 
with water to form oxygen, thus initiating the two basic 
reaction sequences of the photosynthetic mechanism. 
Direct experimental evidence has also been presented 
for the existence of trapped electrons and holes by 
recording electron spin-resonance signals from illumin- 
ated whole chloroplasts’-® at low temperatures 
Fig. 10. Suggested experimental arrangement to investigate semiconductor phenomena associated with the mobility of electrons and
holes in active biological lamellar systems. By applying microelectrodes (filled with a suitable solid free radical, ‘‘C’’) directly to the 
outer segments of retinal rods, it might be possible to determine the drift velocity and density of injected current carriers. Addition of 
a transverse magnetic field would then permit determination of the Hall effect and other magnetic concentration effects. Experiments 
of this type, carried out on photoreceptors and chloroplasts or on artificial multilayered structures containing specific photopigments, 
are needed to furnish direct experimental proof of the postulated semiconductor properties in biological systems. 


(-150°C), and, therefore, excluding the possibility of
any ordinary enzymatic process. 

By tentatively extending these concepts to other 
lamellar systems, one arrives at a general picture of 
paracrystalline lipoprotein layers containing assemblies 
of specific enzymes associated with photopigments or 
specialized electron-transfer systems, all of which are 
organized in highly ordered patterns. These biological 
lamellar systems, permeated with hydration water of 
possible icelike character, might have many properties 
in common with semiconductors, although, as pointed 
out earlier,” the concepts of solid-state physics are 
not directly applicable to the fluid crystalline state. 
However, merely as a working hypothesis, the semi- 
conductor analogy should prove useful as a possible 
basis for the physicochemical amplifying devices sug- 
gested in connection with the events leading from the 
primary process of sensory excitation to the complex 
bioelectric phenomena associated with nerve-impulse 
propagation. ® © i 

Although these ideas are admittedly speculative, they 
may, nevertheless, serve to encourage new experimental 
approaches. Of particular interest would be systematic 
investigations designed to furnish direct experimental 
proof in biological lamellar systems of the two processes 
of electronic conduction in semiconductors, correspond- 
ing to positive and negative mobile charges, which are 
characteristic of transistor action. In the suggested 


experiment (Fig. 10), two capillary microelectrodes 
filled at the tip with a suitable solid free radical are 
applied directly to the outer segments of freshly isolated 
retinal rods, or, alternatively, to large chloroplasts. 
With the outlined arrangement, it might be possible to 
determine the drift velocity and the density of injected 
current carriers, in analogy to the classical experiments® 
carried out on small samples of germanium using micro- 
manipulator techniques. By introducing a transverse 
magnetic field, determination of the Hall effect and 
other magnetic-concentration effects® should also be 
possible. Application of these techniques to simple 
model systems containing multilayers of organic semi- 
conductors would probably be the first step in this type 
of study. The study of organic semiconductors®®:® is 
just beginning now, and further developments in this 
field are bound to have far-reaching implications in the 
interpretation of biological phenomena. 
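
The quantities named in the suggested experiment (drift velocity, carrier density, Hall effect) are related by the standard expressions of semiconductor physics, which can be sketched as follows. This is a minimal illustration assuming a single carrier type and a uniform slab geometry; the function names are hypothetical and the numerical values are arbitrary, not measurements from any biological preparation.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs


def drift_mobility(distance_m, transit_time_s, field_v_per_m):
    """Mobility (m^2 V^-1 s^-1) from a time-of-flight measurement.

    The injected carriers drift a known distance in a known field:
    v_d = distance / time, and mu = v_d / E.
    """
    drift_velocity = distance_m / transit_time_s
    return drift_velocity / field_v_per_m


def hall_carrier_density(current_a, b_field_t, hall_voltage_v, thickness_m):
    """Carrier density (m^-3) for one carrier type: n = I*B / (q * t * |V_H|)."""
    return current_a * b_field_t / (
        ELEMENTARY_CHARGE_C * thickness_m * abs(hall_voltage_v)
    )


# Arbitrary illustrative numbers only.
mu = drift_mobility(distance_m=1e-3, transit_time_s=1e-5, field_v_per_m=1e3)
n = hall_carrier_density(current_a=1e-3, b_field_t=0.1,
                         hall_voltage_v=1e-3, thickness_m=1e-4)
print(f"mobility = {mu:.3g} m^2/(V s), carrier density = {n:.3g} m^-3")
```

The sign of the Hall voltage, which the sketch discards by taking its absolute value, is what would distinguish conduction by electrons from conduction by holes.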
Ultrastructural studies are confronted with more im- 
mediate tasks as a result of recent improvements in the 
preparation techniques and the useful resolving power 
of electron microscopy. Thus, continuation of earlier 
experiments based on rapid freezing of glycerine-treated 
tissues with liquid helium,” followed by special fixation 
and embedding procedures at low temperatures, may 
ultimately permit a more reliable correlation of organi- 
zation at the molecular level with sequentially arrested 
states of activity in biological systems. Application of 
moiré effects® to directly visualize equivalent configurations
of atomic dimensions may also prove to be of
considerable operational value in the study of paracrys- 
talline components. 

From the foregoing cursory survey, it is evident that 
the correlative investigation of fine structure and func- 
tion of lamellar systems poses some of the most chal- 
lenging and rewarding problems in modern biology, 
which will require an integrated approach enlisting the 
best efforts of biophysicists, biochemists, and physicists. 
THE fine structure of specialized organelles such as
mitochondria, which are known to be concerned
with the metabolic side of the carbon cycle, has been dis- 
cussed in the three foregoing chapters (Bennett, p. 297; 
Sjöstrand, p. 301; Fernández-Morán, p. 319). The
structures concerned with the other half of this cycle— 
namely, the photosynthetic aspect—are considered 
here. 

It is of interest, in terms of the evolution of specia- 
lized photosynthetic organelles, to examine the fine 
structure of certain primitive types of cells, since these 
are presumably representative of organisms arising very 
early in the evolutionary process. Such a group of 
organisms are the blue-green algae. Examination in the 
electron microscope shows that, in certain types such 
as Nostoc (Fig. 1), there are no specialized membrane- 
bounded organelles such as mitochondria, nuclei, and 
chloroplasts; rather there exists a generalized mem- 
brane structure ramifying throughout the cytoplasm. 
It seems reasonable to conclude that, in such cells, the 
various metabolic functions such as oxidative phos- 
phorylation and photosynthesis are carried on in rela- 
tion to an apparently undifferentiated membrane 
system (or perhaps in specialized patches of the 
membrane), rather than in well-defined organelles. In 
other blue-greens (e.g., Anabaena), the photosynthetic
material is present in a more specialized and segregated 
state, but the primitive organelles lack limiting mem- 
branes. It is of interest to note that, in higher organisms, 
an intracellular membrane system (the endoplasmic 
reticulum)! appears to be universally present, and to 
have clear continuity with the membranous envelope 
of the nucleus (and possibly with the limiting mem- 
branes of other organelles). It seems likely, therefore, 
that comparative study of such primitive cells might 
yield important clues concerning the origin and inter- 
relationship of the endoplasmic reticulum and specia- 
lized organelles such as mitochondria, chloroplasts, and 
nuclei in the cells of higher organisms. 

Proceeding a little higher in the evolutionary scale to 
a typical green alga, such as Nitella (Fig. 2), one ob- 
serves typical membranous elements of the endoplasmic 
reticulum, the characteristically structured mitochondria
found in all higher organisms, and chloroplasts,
bounded by a well-defined, double limiting membrane 


* Most of the work described herein was carried out in the 
Chemical Physics Section, C. S. I. R. O., Melbourne, Australia, in 
collaboration with Dr. F. V. Mercer and Dr. J. D. McLean of the 
Botany Department, University of Sydney, N. S. W. 


and containing a number of relatively well-ordered 
dense lamellae within a finely granular matrix material.”
Chloroplasts of this structural type are highly charac- 
teristic of the lower forms of plant cells.3~® 

In the higher plants, of which corn (Zea mays L.) is 
chosen as an example, the structure of the chloro- 
plasts’— usually differs from the more primitive pat- 
tern already described. Figure 3 shows in transverse 
section the leaf of a three-week-old corn plant. Around 
each vascular bundle are a number of parenchyma 
sheath cells, the chloroplasts of which are specialized 
for the formation and storage of starch. Most of the 
photosynthetic activity, however, is carried out in the 
chloroplasts of the mesophyll cells. Even at the light- 
microscopical level, there are obvious differences in 
appearance between the chloroplasts of the parenchyma- 
sheath cells and those of the mesophyll cells. In the 
electron microscope (Fig. 4), the parenchyma-sheath 
chloroplasts of Zea’? resemble algal chloroplasts (cf. 
Nitella, Fig. 2) in their over-all plan of organization. 
Within a double external limiting membrane, a number 
of densely staining lamellae (each about 130 A thick) 
are set in a finely granular matrix which presumably 
contains most of the soluble enzymes of the chloroplast. 

The mesophyll chloroplasts are similarly lamellated 
but, in addition, possess well-defined regions (the grana) 
in which the lamellae are more densely and regularly 
packed (Fig. 4). If the section is in the plane of the 
lamellae, the grana appear as circular profiles. If, on the 
other hand, the plane of the section is normal to the 
lamellar plane, the grana appear as regular rectangular- 
shaped regions. The concentration of lamellar surface 
within the grana is about twice as high as in the inter- 
vening (intergrana) regions, and it can be seen (Fig. 5) 
that this comes about because of a pairing or bifurca- 
tion of the lamellae at the edges of the grana. This 
type of chloroplast with well-defined grana is charac- 
teristic of higher plants, most of which do not possess 
plastids of the type found in the parenchyma-sheath 
cells of Zea. 

In both types of chloroplasts, the individual lamellae 
(each 130 A thick) of both the grana and intergrana 
regions exhibit a compound-layer structure (Figs. 6 and 
7), consisting of a central dense line (the P zone, about
40 A thick and often resolvable as a doublet) on both
sides of which are less dense layers (the L zones).
Finally, the entire compound layer structure is edged 


by very thin dense lines, the C zones. Such differences
in density arise in part from differences in reactivity
Fig. 1. Vegetative cell of a Nostoc spp., showing the generalized whorl-like lamellar system and the absence of specialized organelles
such as chloroplasts and mitochondria. X60 000. (Unless otherwise indicated, all illustrations are electron micrographs of thin sections
of plant material fixed in osmium-tetroxide solutions of appropriate pH and tonicity, and embedded in methacrylate. Details are given
in references 2, 7, 13, and 14.)


with osmium tetroxide as well as from intrinsic differ- 
ences in electron density within the various components 
of the structure. Within the grana, the close packing of 
such compound lamellae results in close apposition of 
C zones and, thus, gives rise to a layer structure (Figs. 
8 and 9) with a repeat period of 130 A, bearing a re- 
markable resemblance to that found in the myelin 
sheath of nerve fibers. The central dense lines of the 
compound lamellae correspond to the major dense lines 
of the periodic structure within the grana and are often 
observed as doublets (Fig. 9), each line of the pair being 
about 15 A thick. It is clear, therefore, that the sym- 
metrical structure of the individual compound lamellae 
arises from the close apposition of two structurally 
asymmetric “unit membranes” (each about 70 A thick) 
as shown schematically in Fig. 11. The I zones, which
occur midway between P zones in the grana, arise by 
apposition of the C zones of the unit membranes. The 
work of a number of people, notably Robertson," 
indicates that the cell membranes and intracellular 
membrane systems of most, if not all, cells consist of 
such unit membranes in single, double, or compound 
array. This also appears to be true of plant cells. In 
Fig. 10, the tonoplast is seen as a single membrane with 


a pair of dense edges, and the double limiting membrane 
of the chloroplast consists of two such unit membranes. 
The intrinsic asymmetry of these unit membranes is 
not usually evident in the electron microscope except 
where they are stacked in double or multiple array, as 
in the lamellae and grana of chloroplasts and in myelin 
sheath. In both cases, this stacking results in a set of 
major dense lines with less-dense intermediate lines in 
between. As already indicated by Fernández-Morán
(p. 319), there is good reason to believe that each 70-A 
membrane comprises a double layer of mixed lipids 
(the L zones) sandwiched between two thin monolayers
of protein which stain densely with osmium tetroxide, 
with a probable contribution to this latter density 
arising from reaction of the hydrophilic groups of the 
phospholipids with osmium tetroxide. In the case of 
myelin sheath, a very satisfactory correlation between 
electron-microscopic and x-ray diffraction data has 
been achieved” and the chemical composition is reason- 
ably well known. It seems fairly certain, therefore, that 
this type of structure is essentially correct for most 
cellular membranes, with only such minor differences 
in detail arising as are demanded by differences in
function.
Fig. 2. Transverse section through the cytoplasm of a Nitella cell, showing the lamellated structure of a chloroplast surrounded by
its limiting membrane c. m., mitochondria m, various membranous elements within the cytoplasm, and the tonoplast lining the lumen of
the central vacuole C. V. X75 000.


Fig. 3. Phase-contrast micrograph of a relatively thick transverse section through the leaf of a three-week-old plant of Zea mays L.,
illustrating the difference in appearance between the chloroplasts of the parenchyma-sheath cells P. S. and those of the mesophyll cells
M. A vascular bundle lies in the center of the field.
Fig. 4. Electron micrograph of a region similar to that outlined in Fig. 3, illustrating the lamellar structure both of parenchyma-sheath
cells P. S. and mesophyll M chloroplasts of Zea. In the mesophyll chloroplasts, the grana appear as dense rectangular regions when the 
section is normal to the lamellar plane, and as circular regions when the plane of section is parallel to the lamellar plane. X30 000. 


Fig. 5. Two grana in a mesophyll chloroplast of Zea, showing the type of connection between the grana lamellae and those in the
intervening (intergrana) regions. (Cf. Fig. 11.) X85 000.


In the case of the chloroplast, the situation is less 
well documented, but certain conclusions can be drawn 
from the available structural and chemical data, and 
by extrapolation from the better characterized myelin 
system. It can be seen from Fig. 10 that the membranes 
making up the lamellae and grana of the chloroplast 
are considerably more dense than the external limiting 
membranes of the chloroplast and the tonoplast, which 
exhibit densities more characteristic of the usual cellular 
membrane systems. In particular, the L-zones of the 
chloroplast lamellae exhibit higher densities than are 
found in cell membranes, endoplasmic reticulum, etc. 
A possible explanation of this may lie in differences of 
chemical composition. It is known that the chloroplast 
is deficient in phospholipids as compared with myelin 
sheath.’ This deficiency is counterbalanced by the 
presence of considerable amounts of carotenoids and 
chlorophyll. Furthermore, as is shown later, there is 
good evidence to suggest that chlorophyll is an integral 
component of the chloroplast lamellae and that suffici- 
ent is present for its incorporation as a monolayer over 
the entire lamellar area of the chloroplast as estimated 
Fig. 6. Mesophyll chloroplast of Zea, illustrating the high degree of order within the grana, and the compound-layer structure of
the intergrana lamellae (arrow). X160 000.


seems justified that the photosynthetic lamellar ele- 
ments of the chloroplast consist of structurally asym- 
metric unit membranes (Fig. 11) in which chlorophyll 
and carotenoids are incorporated in orderly array. It is 
of interest to note that the exciton-migration theory of 
energy transfer proposed by Calvin appears to require 
the presence of such a structural asymmetry. The type 
of model suggested by Calvin (p. 157) has the necessary 
asymmetry (Fig. 12) and appears to be consistent with 
the presently available chemical and structural evi- 
dence. Such a staggered, partially overlapping configu- 
ration of the chlorophyll molecules has the further 
advantage over other models (which usually involve 
coplanarity of the porphyrin “heads” of the chloro- 
phyll) in that it offers a plausible explanation for the 
low values of dichroic effects so far observed in chloro- 
plasts. Furthermore, it would allow a greater degree of 
π-electron interaction than allowed by the coplanar
type of model, thus facilitating energy transfer, and is
consistent with the type of packing found in crystals of
polycyclic organic compounds by x-ray diffraction. As
has been seen, the chloroplasts of higher plants characteristically
contain grana in which the unit membranes
are arranged in a highly ordered and close-packed array. 
The significance of this specialization is unknown at the 


present time, but it is tempting to speculate that the
high degree of order might

Fig. 7. Compound layer structure of the lamellae in a chloroplast of Zea. X270 000.
Fig. 8. Mesophyll chloroplast of Zea to illustrate the myelin-like structure within a granum. Note the main periodicity of about 130 A
and the presence of fainter intermediate lines. X370 000.
ping of photons, or perhaps result in facilitation of
exciton migration as a result of a stabilization of the 
layer structures which possess the conduction bands 
necessary for energy transfer. 


SWELLING PROPERTIES OF THE CHLOROPLAST 


The swelling characteristics of chloroplasts offer a 
striking demonstration of the “plasticity” of such lipoprotein
layer systems. As already mentioned, the swelling
of mitochondria can be largely prevented by active
expenditure of energy from ATP hydrolysis. A similar 
phenomenon exists in the case of chloroplasts. Klein 
(unpublished work) has shown that the swelling response
of isolated chloroplasts depends on whether the
suspension of plastids is illuminated or is kept in the
dark.

When isolated Nitella chloroplasts are placed in a
hypotonic medium, swelling results first in a separation
of the lamellae to distances many times that characteristic
of the intact chloroplast.? In highly hypotonic
media, the lamellae break up. The hydrophobic hydrocarbon
chains of the lipids thus are exposed at the free
edges of the membranes, giving rise to an unstable
situation, with the result that such free edges “zip”
together to form membrane-bounded vesicles. Similar
results have been obtained for isolated Zea chloroplasts
(Fig. 13), and the characteristically vesicular structures
found in so-called microsome fractions isolated by 
conventional procedures arise in similar fashion (e.g.,
Hodge") from the endoplasmic reticulum. This plastic 
behavior is understandable in terms of what is known 
of membrane structure. In the type of layer structure 
under consideration here, it seems likely that the ori- 
ented lipid molecules possess a high degree of rotational 
freedom and considerable translational freedom within 
the plane of the membrane. Both properties are charac- 
teristic of the smectic fluid-crystal state and confer a 
remarkable capacity for changes in the topographical 
configuration of such membrane systems. 


CHLOROPLAST DEVELOPMENT 


The development of the chloroplast is of interest in 
that it appears to take place by a process which is 
essentially the reverse of that involved in the swelling 
phenomena just discussed. In brief, the compound- and 
extended-layer structures of the lamellae and grana 
appear to be formed by a process involving the fusion 
of small vesicles or micelles one with another to form 
extended cisternae or “double-membrane structures.”

The development of a leaf or of a plant is a compli- 
cated sequence of events, the time and spatial relation- 
ships of which lead to difficulty in following the process 
of chloroplast development. The etiolated plant (one 
grown from seed in total darkness) is a much more 


Fig. 9. A granum in a mesophyll chloroplast of Zea, illustrating
the splitting of the main dense lines (arrow). The central dense
lines (P zones) of the intergrana lamellae are also occasionally
resolvable as doublets, thus showing that the compound lamellae
consist of two closely apposed asymmetric unit membranes.
X220 000.
Fig. 10. Chloroplast in a three-week-old wheat plant, showing lamellar structure very similar to that in Zea mays. The tonoplast t
appears as a single membrane, and where suitably oriented with respect to the plane of the section appears as two fine dense lines with
a less dense layer between them, the over-all thickness being about 70 A. The chloroplast envelope c. m. comprises two unit membranes
spaced about 100 A apart. X160 000.


favorable system for such a study, since such seedlings 
can be exposed to light for various periods and the 
effects on chloroplast structure noted at definite time 
intervals following such exposure. This system depends 
on the fact that light is essential for chlorophyll syn- 
thesis in higher plants. 
Fig. 11. Diagram to illustrate the densities observed in electron micrographs of osmium-fixed chloroplast lamellae and grana
and the structural relationships involved in the formation of the compound lamellae (and grana) from structurally asymmetric
unit membranes.


Figure 14 shows typical plastids in an etiolated leaf
of Zea. In the absence of chlorophyll, the plastids,
although recognizable as well-defined organelles with
external limiting membranes, fail to develop lamellae.
Instead, the interior of each plastid is partially filled
with a mass of small vesicles (the prolamellar body).


Fig. 12. Diagram to illustrate one way in which the compound
lamellae observed in electron micrographs of chloroplasts could
be built up from structurally asymmetric unit membranes having
a molecular structure of the type proposed by Calvin (p. 157).


On placing the plant in daylight, chlorophyll is synthe- 
sized (as evidenced by a progressive greening of the 
leaves), and definite structural changes are observed in 
the plastids. Double membranes are observed (Fig. 15) 
apparently emerging from the prolamellar body." At a 
later stage, typical chloroplast lamellae are seen, usually 
in a pattern radiating from a progressively decreasing 
prolamellar body (Fig. 16). It is of interest to note that 
in certain Zea mutants which are unable to synthesize 
chlorophyll, the plastids are similar to those of etiolated 
normal Zea in that they lack formed lamellar elements 
and contain only small vesicular structures. The evidence
thus strongly suggests that chlorophyll is at
least essential for the formation of typical chloroplast 
lamellae and is probably an integral component of 
these membrane structures. 

At a later stage, rudimentary grana with a fine 
structure indistinguishable from that of the more ma- 
ture grana already described may be observed (Fig. 17) 
within the mesophyll chloroplasts. After two days, the 
developing plastids closely resemble normal chloroplasts 
and no prolamellar bodies can be discerned, presumably 
as a result of complete conversion of the small vesicles 
into formed lamellar elements. At this stage and in 
earlier stages of recovery from etiolation, however, and 
in young chloroplasts of normal plants, one commonly 
observes immediately under the chloroplast limiting 
membrane a number of vesicles and sacs (Fig. 18) with 
membrane densities considerably lower than those 
characteristic of the already differentiated lamellae 
and grana. These observations suggest the possibility 
that this region, or the limiting membrane of the chlo- 
roplast itself, may be responsible for the formation of 
the small vesicles, perhaps by outpocketing and pinch- 
ing-off from the limiting membrane, or by some direct 
synthetic process. 

The concept of vesicle fusion either with other vesi- 
cles or with already existing extended membranes is 
attractive in relation to (1) the formation and main- 
tenance of structures of higher order, such as the 


Fig. 13. Mesophyll chloroplast of Zea isolated in 0.5 M glucose in neutral phosphate buffer, showing a marked swelling reaction. (Compare
with Figs. 4 and 6.) Note that the intergrana lamellae have broken up and formed large numbers of vesicles. ×40 000.
Fig. 14. Plastids in an etiolated leaf of Zea. The plastids contain masses of minute vesicles, but are devoid of lamellae,
presumably because the absence of chlorophyll prevents their formation.


Fig. 15. Plastid from an etiolated leaf of Zea after several hours exposure to daylight showing an early stage in the
formation of lamellae. Note that the lamellae appear to arise initially as double membranes, apparently by a process
involving an orderly fusion of the small vesicles comprising the prolamellar body P. B. ×55 000.
Fig. 16 (upper left). A somewhat later stage in recovery from etiolation than that shown in Fig. 15, illustrating the progressive
reduction in size of the prolamellar body P. B. and formation of typically dense chloroplast lamellae.

Fig. 17 (upper right and lower area). Mesophyll plastid of etiolated Zea after 20-hr exposure to
daylight showing the beginning of grana formation. ×210 000.


endoplasmic reticulum and the various organelles of
the cell (the vast increase in membrane area required
for the formation of the myelin sheath of nerve fibers
could be accounted for by fusion of vesicles with the
Schwann-cell membrane); and (2) the transport both
into and out of the cell of various substances. Such a
Fig. 18. Mesophyll plastid of etiolated Zea after 48-hr exposure to daylight. Typically dense chloroplast lamellae and grana are
present. Note the presence of less-dense vesicles and cisternae in the region immediately beneath the chloroplast envelope c. m.
×85 000.


mechanism has been proposed for the extrusion of 
acetylcholine in “quantized” amounts from the nerve
endings in myoneural junctions, and for the passage of 
zymogen granules from the apical regions of acinar 
pancreas cells through the limiting membrane into the 
lumen of the acinus. Similarly, the occurrence of the 
reverse process (i.e., pinocytosis) seems to be well 
documented. While mechanisms of this general type 
are widely supported by indirect experimental evidence, 
and appear to be thermodynamically feasible in terms 
of current knowledge concerning the structure of lipo- 
protein layer systems, the crucial problem of energy 
coupling in the physical control of such membrane 
systems remains unelucidated. 
In discussing biological membranes one deals with a
considerably higher level of organization than has
characterized most of the material covered in the foregoing
papers, even in the consideration of cellular
biology, since one must deal with everything from the
presumed limiting membranes of individual cells to
entire layers of cells, each of considerable complexity
and, in some cases, even consisting of several different
cell types. However, in general, the problem is oversimplified
here by disregarding whenever possible all
details of structure which are not obviously concerned
directly with the processes under immediate
examination.

The animal body can be considered to be made up of
a number of compartments separated one from another
and from the external environment by a series of membranes.
Each compartment has a more or less characteristic
composition, often differing markedly from that
of the adjacent compartment so that across these
membranes there may be rather steep gradients of
chemical concentration and often of electrical potential
as well. The problem is to clarify the means by
which various materials cross these membranes and by
which the striking gradients are maintained. To some
extent the penetration of various materials can be
defined in terms of relatively simple forces; in other
instances it is apparent that the movements of certain
substances require the utilization of energy, a process
which has been designated as active transport and
which is currently the subject of numerous studies. In
no case, however, can it be said that the mechanisms
involved have been identified.

Consider, in a general fashion, the possible ways in
which materials might cross biological membranes and
the forces which might effect these movements. These
considerations are based heavily upon the treatment of
the subject of membrane transport as developed by
Ussing. In general, the definable forces which produce
movement across membranes can be classified as:
(1) those due to gradients of chemical activity including
differences in concentration or activity coefficients;
(2) those due to gradients of electrical potential; and
(3) those exerted upon solutes by the flow of solvent
through solvent-filled channels. In addition, one finds
in practically all biological systems movements of
solute which are explained by none of these definable
forces and which, therefore, are designated as “active
transport.” The definition of active transport is a
rather arbitrary affair. As defined, it has the advantage
of singling out those solutes which must be considered
directly involved in the process by which metabolic
energy is utilized to perform transport work. However, 
clearly all concentration or electrical gradients and 
flows in response to pressure or osmotic gradients 
ultimately depend upon metabolic work even though 
this may be remote from the particular membrane 
under consideration. So the requirement of metabolic 
work is not sufficiently exclusive for an adequate 
definition of active transport. 

However, the definition does not include all proc- 
esses which have specificities beyond simple diffusion 
and flow and does not include the process by which 
certain substances can cross membranes only, say, by 
combining with a specific carrier even though the move- 
ment is entirely downhill with respect to gradients of 
electrochemical potential. One must recognize, there- 
fore, the arbitrariness of this definition. Presumably, 
when, and if, a better understanding of the discrete 
processes involved is attained, the need for the term 
active transport will disappear. 

Of course, the manner in which solutes may cross 
membranes depends upon the nature of the membrane. 
If it is a continuous lipid layer, substances can pass 
this layer only by dissolving in it, although their pas- 
sage through such a layer might be facilitated by 
combination with some component of the lipid layer 
which would enhance the lipid solubility of the solute. 
For such membranes, the permeability to particular 
solutes should, in general, be a function of lipid solu- 
bility. Indeed, observations going back many years 
have shown that in several chemical series the permea- 
tion of cells can be related to lipid solubility. However, 
there are many features of cellular membranes which 
cannot be explained if they are assumed to consist of 
a solid lipid layer and it is generally believed that the 
lipid layer contains pores, through which water forms a 
continuous phase and through which the water soluble 
solutes pass. Indeed, it is only through such channels 
that forces produced by the flow of solvent can act. 
Since, as is pointed out later, there is evidence that 
solvent flow does produce movement of solute, it is 
apparent that such solvent-filled pores (and in this 
case, of course, the solvent referred to is water) are 
present in some membranes. For present purposes, it 
is convenient to accept the view that cell membranes
generally consist of lipid layers penetrated by aqueous 
pores, the walls of which may bear an excess of fixed 
charged groups which will impede the passage of ionic 
species of the same sign. In some cases where mem- 
branes consist of more complicated structures than the 
plasma membrane of single cells and may consist of an
entire layer of cells attached to a common basement 
membrane, there may be reason to believe that the 
passage of water soluble substances occurs between the 
cells rather than through them. This appears to be the 
case in blood capillaries, but probably not in others 
such as the mucosal lining of the alimentary tract and 
the skin of frogs and toads. 

Perhaps the best-studied biological membrane sys- 
tem and the one in which theoretical considerations 
have been applied most explicitly is the frog skin which, 
in particular, Ussing has used as a model system for 
extensive studies.*~> Unfortunately, from the structural 
point of view, the frog skin is rather complicated, but 
perhaps it may be simplified for present purposes by 
making the probably valid assumption that only the 
basal layer of cells is involved in the transport processes 
of concern here. The skin can be stripped rather easily 
from the abdomen of the frog and stretched in vitro as 
a diaphragm between two chambers. It will then 
survive in this isolated state, perform its metabolic 
operations, and retain a well-maintained capacity to 
perform active transport for some hours. The sheet of 
tissue thus obtained consists of a loose layer of connec- 
tive tissue, a continuous basement membrane, and 
several layers of epithelial cells. As one proceeds from 
the basement membrane toward the outer skin surface, 
the cells progressively become more flattened and 
keratinized and, presumably, lose their metabolic ac- 
tivities. In any case, it may be assumed for present 
purposes that all of the observed phenomena are attrib- 
utable to the basal layer of cells—an assumption to 
some extent supported by the fact that toad bladder 
has been shown to exhibit many similar properties but 
to have only the single layer of cells corresponding to 
the basal layer of cells in the skin.® 

Now, it has been known for many years that the frog 
skin generates an electrical potential, the solution in 
contact with the inner surface being positive with re- 
spect to the outside. It has also been known that the 
frog in the intact state is able to extract salt from its 
surroundings even though the outside concentration of 
salt is far lower than in the tissue fluids of the animal. 
For instance, Krogh’ found that the frog which has a 
concentration of sodium in its extracellular fluid of 
something over 100 mEq/L (milliequivalents per liter) 
was able to take up salt from its surroundings when the 
outside concentration was as low as one hundredth of a 
milliequivalent per liter—a concentration ratio of 
10000:1. Furthermore, the uptake was completely 
specific for sodium—no potassium or calcium was taken 
up even when the concentration was many times higher. 
One other unusual observation was made by Hevesy 
et al. when the technique of using isotopic tracers was 
introduced. They found that the frog sitting in water 
had a net uptake of water several times greater than 
predicted from the measured influx of tagged water and 




the assumption that the flux in each direction across 
the skin should be proportional to the water activity in 
the solution of origin—in this case, the lymph or ex- 
tracellular fluid of the frog and the water in which the 
frog was immersed. Observations such as this one led 
to the hypothesis that water was actively transported 
across such membranes (similar observations were 
made by Visscher and his associates using the intestine 
of the dog®)—however, a simpler physical explanation 
is shown to make the assumption of active water trans- 
port unnecessary. One further observation on the intact 
frog might be mentioned before going on to the more 
detailed observations—this is the fact that the rate of 
uptake of water from the surroundings can be enhanced 
greatly by injection of the frog with posterior pituitary 
extracts.” Hence, the term amphibian water-balance 
principle was introduced to describe this activity, which, 
however, has been since shown to be at least qualita- 
tively reproduced by the use of the purified polypeptide 
hormones of the posterior pituitary. 

These, then, are the gross observations which Ussing
and his collaborators set about to dissect by the applica- 
tion of isotopic tracers. Disregarding for the time being 
the net movement of water which in any case is negli- 
gible when the skin is bathed on both sides with solu- 
tions of equal osmotic pressure, Ussing showed that the 
forces acting on an ion to cause movement across the 
intervening membrane would produce a flux from one 
surface to the other described by the equation 


$$M = -\frac{d\bar{a}}{dx}\,\frac{ART}{GN_0}\,\exp\!\left(-\frac{zF\psi}{RT}\right)$$
where M is the unidirectional flux across the membrane,
ā the chemical activity of the species under consideration,
A the area of the membrane available for penetration
by the species under consideration, G is a frictional
coefficient describing the interaction between the membrane
and a diffusing ion, N₀ is Avogadro’s number,
z the valence of the ion, ψ the electrical potential, and
x, F, R, and T have their usual meaning. The equation
contains several unknowns which are not determinable, 
and, therefore, it is not too useful in this form. However, 
if attention is directed to the ratio of fluxes from one 
side of the membrane to the other, the indeterminate 
quantities cancel out leaving upon integration the 
relatively simple relationship, 


$$\frac{M_{\mathrm{in}}}{M_{\mathrm{out}}} = \frac{a_o}{a_i}\,\exp\!\left[\frac{zF}{RT}\,(\psi_o - \psi_i)\right]$$

which contains only experimentally determinable quantities,
except perhaps the activity coefficients, which,
however, also cancel out when the solutions on the two 
sides of the membranes are essentially the same. 
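
As a rough numerical illustration of the flux-ratio relationship, the following Python sketch evaluates the ratio of influx to outflux expected for an ion that moves only passively. It assumes the standard form of the relationship, M_in/M_out = (a_o/a_i) exp[zF(ψ_o − ψ_i)/RT]. The function name, the temperature of 20°C, and the 100-mV potential are illustrative assumptions and are not values taken from the experiments described here.

```python
import math

R = 8.314      # gas constant, J/(mol K)
F = 96485.0    # Faraday constant, C/mol


def passive_flux_ratio(a_out, a_in, z, psi_out_minus_psi_in, temp_k=293.15):
    """Influx/outflux ratio expected for an ion moving only passively.

    a_out, a_in: activities outside and inside (same units).
    z: valence (+1 for sodium, -1 for chloride).
    psi_out_minus_psi_in: potential difference in volts (outside minus inside).
    temp_k: absolute temperature (assumed 20 C by default).
    """
    return (a_out / a_in) * math.exp(z * F * psi_out_minus_psi_in / (R * temp_k))


# Hypothetical case: identical solutions on both sides, inside 100 mV positive.
ratio_na = passive_flux_ratio(1.0, 1.0, +1, -0.100)
ratio_cl = passive_flux_ratio(1.0, 1.0, -1, -0.100)
print(f"passive Na influx/outflux: {ratio_na:.3f}")
print(f"passive Cl influx/outflux: {ratio_cl:.1f}")
```

Under these assumed conditions a passively moving cation would show a ratio well below one, so a measured sodium influx that exceeds the outflux cannot be accounted for by the gradients alone.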

The skin was set up between two chambers, each 
containing solutions vigorously stirred and oxygenated 
by a stream of air or oxygen. When the inside solution 
Fig. 1. Sodium fluxes across isolated frog skin as functions of
outside sodium concentration [from H. H. Ussing, Cold Spring
Harbor Symposia Quant. Biol. 13, 193 (1948)].


was Ringer’s solution and the outside various dilutions 
of Ringer’s, it was found! that the influx of sodium 
always was greater than the outflux so long as the out- 
side concentration was not less than about 1 mM/l. 
The influx of sodium increased in nonlinear fashion as 
outside sodium was increased; there also was some 
small increase in outflux but this always remained small 
(Fig. 1).

These, of course, represent merely some quantifica- 
tion of what was already well known—that sodium 
could be transported inward against combined chemical 
and electrical gradients and that its transport was, therefore, an active
process. However, it was found that the chloride fluxes
corresponded well to the predictions of the flux-ratio 
equation and that chloride movements could be con- 
sidered entirely passive. The sodium influx was gen- 
erally greater than that of chloride so that there could 
be no question of bulk movements of elements of out- 
side solution to the inside. In general, the potential 
was higher when the sodium influx was greater and 
higher when chloride influx was decreased as could be 
achieved by treating the skin with 10⁻⁵ M Cu⁺⁺ solutions.
It was proposed that the potential was produced
by the transport of sodium and that the passive move- 
ments of other ions tended to reduce the potential.“ 


TABLE I. Unidirectional fluxes, net flux, and electric current in
short-circuited frog skin [from H. H. Ussing, The Relation between
Active Ion Transport and Bioelectric Phenomena (Instituto de
Biofisica, Universidade do Brasil, Rio de Janeiro, 1955)].

        ΔNa           Current
        (μa/cm²)      (μa/cm²)
        17.7          17.8
         9.6           9.9
        39.2          38.6
        60.3          56.8
        45.4          44.3
Studies of the short-circuited frog skin served to
show that only sodium was actively transported and 
that, when the potential across the skin was kept at 
zero, there was a remarkable correspondence between 
the current flowing through the skin and the net 
movement of sodium ions" (Fig. 2 and Table I). 

These observations, then, were compatible with the 
hypothesis that the only active process involved was 
the transport of sodium ions from outside to inside.
No other ion, except—to a small extent—lithium could 
substitute for sodium,* and, in particular, potassium 
could not, although the complete absence of potassium 
from the bathing solutions very markedly reduced the 
transport of sodium—a circumstance to be mentioned 
again later. 

These studies, then, have defined the behavior of the 
frog skin, but the question remains as to what kind of 
process might be involved in the transport of sodium 


Fig. 2. Apparatus used for determining sodium fluxes and
current in short-circuited frog skin [from H. H. Ussing and K. 
Zerahn, Acta Physiol. Scand. 23, 110 (1951)]. 


which, of course, is the major question in the field of 
active transport at present. Two additional types of 
study involving the amphibian skin have yielded infor- 
mation pertinent to this question. 

One of the favorite theories of ion transport has 
involved the so-called redox or electron-transport pump, 
variations of which have been suggested by a number of 
investigators. These hypotheses are based on the as- 
sumption that oxidation and reduction of cytochrome 
iron are spatially separated and that hydrogen ions 
formed in the reductive step are available for exchange 
for some cation from outside the cell. The electrons 
then are passed along the chain, the cations taken up 
possibly moving with them, and the electrons finally 
are donated to oxygen to yield hydroxyl ions. Such a
scheme has the stoichiometric limitation that only four 
electrons and four univalent ions can be transported 
per oxygen molecule consumed. By careful measure- 
ment of oxygen consumption and current flow (net 
sodium transport) in the frog-skin system, both Zerahn
and Leaf and Renshaw have been able to show that
more than four ions are transported per oxygen con- 
sumed. The average figure obtained by Leaf and 
Renshaw (Fig. 3) was close to seven and some of the 
ratios were in excess of ten. Thus, even if all of the 
oxygen consumed were utilized in ion transport—a 
doubtful assumption—the ratio of transport to oxygen 
consumption is too great to be attributed to a redox 
pump mechanism. (Even higher ratios, going up to 20, 
can be obtained, if one divides the change in transport 
by the increase in oxygen consumption when sodium 
transport is increased in one of several ways.) 
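
The stoichiometric argument can be put in numbers with a short Python sketch. It takes the limit of four univalent ions per oxygen molecule stated above and asks what fraction of an observed transport rate a redox pump could supply at most, on the generous assumption that all of the oxygen consumed serves transport. The function name and the sample ratios (chosen to match the figures quoted in the text) are illustrative.

```python
MAX_IONS_PER_O2_REDOX = 4  # four electrons, hence four univalent ions, per O2


def max_fraction_explained(ions_per_o2):
    """Largest fraction of the observed transport a redox pump could supply,
    even if every oxygen molecule consumed were devoted to transport."""
    return min(1.0, MAX_IONS_PER_O2_REDOX / ions_per_o2)


for observed in (4, 7, 10, 20):
    frac = max_fraction_explained(observed)
    print(f"{observed:>2} ions per O2: at most {frac:.0%} attributable to a redox pump")
```

Any observed ratio above four leaves part of the transport unexplained by such a scheme, and the shortfall grows as the ratio rises.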

An alternative type of mechanism is suggested by 
other recent observations of Koefoed-Johnsen and 
Ussing. The experiments were done with membranes 
in which maximum potentials were induced by reducing 
to a minimum the effective anion permeability. This 
Fig. 3. Relationship of sodium transport to total oxygen consumption
of isolated frog skin. Frequency distribution of 120
periods of simultaneous measurement [from A. Leaf and A.
Renshaw, Biochem. J. 65, 82 (1957)].


was done by treating the skin with Cu⁺⁺, or by replacing
the chloride in the outside solution with sulfate to
which the skin has a very low permeability. The con- 
centrations of sodium and potassium on each side of 
the membrane were then varied systematically and the 
effect on the potential observed. Under these circum- 
stances, the inside of the skin behaved very closely as 
if it were a potassium electrode, the potential across the 
skin decreasing approximately 60 mv with each tenfold 
increase in potassium concentration. Changing the 
potassium concentration on the outer surface of the 
skin had no effect, but changing the sodium concentra- 
tion did, the potential increasing as sodium concentra- 
tion was increased. This strongly suggested that the 
two surfaces of the cells making up the skin had quite 
different and specific properties—a fact that is apparent 
also from other properties. The two potentials thus 
behaved as diffusion or junction potentials at boundaries
specifically permeable to single cations. The explanation
offered is as follows: A sodium ion is extruded
from the inner cell surface by forced exchange for 
potassium; that is, a charged carrier drops off a sodium 
ion and takes up a potassium ion. At the inner surface, 
by some exergonic reaction, the specificity of the 
carrier is changed and it gives up the potassium in 
exchange for another sodium ion. Thus, the sodium 
concentration inside the cell is reduced and the potas- 
sium concentration raised in a process which, in itself, 
produces no electrical potential. However, since the 
inner surface membrane is highly permeable to potas- 
sium, potassium diffuses out leading to a diffusion 
potential with the cell contents negative. The nega- 
tivity of the cell contents tends to produce the uptake 
of a cation through the outer surface, but this surface 
being permeable only to sodium, it is a sodium ion which 
enters. The net effect is a cycling of potassium at the 
inner surface and a net movement of sodium ion from 
outer bathing solution to the inside of the skin. In the 
presence of chloride and a normal permeability to it, 
the potential is maintained at a lower level by the 
diffusion of chloride from outside to inside, and in the 
normal state the net movement of sodium and chloride, 
of course, must be very nearly equal. 
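
The figure of approximately 60 mv per tenfold change in potassium concentration is what an ideal electrode for a univalent cation would give. A minimal Python sketch of that calculation follows, assuming the Nernst relation and a temperature of 20°C. Both are assumptions introduced here for illustration, and the function names are hypothetical. The sketch returns magnitudes only; the sign depends on the convention adopted for the potential.

```python
import math

R = 8.314      # gas constant, J/(mol K)
F = 96485.0    # Faraday constant, C/mol


def nernst_slope_mv(z=1, temp_k=293.15):
    """Potential change (mV) per tenfold concentration change for an ideal electrode."""
    return 1000.0 * math.log(10.0) * R * temp_k / (z * F)


def potential_change_mv(conc_ratio, z=1, temp_k=293.15):
    """Magnitude of the potential change (mV) for a given concentration ratio."""
    return nernst_slope_mv(z, temp_k) * math.log10(conc_ratio)


print(f"slope at 20 C: {nernst_slope_mv():.1f} mV per decade")
print(f"hundredfold change: {potential_change_mv(100.0):.1f} mV")
```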

This model is supported further by potential meas- 
urements made by Hoshiko and Engbaek" in Ussing’s 
laboratory. By thrusting microelectrodes progressively 
through the skin, Hoshiko found the change to occur 
in two jumps which can be interpreted as representing 
changes occurring at the two surfaces of the active 
layer of cells. 

This, then, provides a well-studied example of elec- 
trolyte transport by a membrane and it is generally 
hoped that, in principle, sodium transport is similar in 
other systems which are less accessible to isolation and 
study—such as the electrolyte transport in the renal 
tubules. Of course, here as elsewhere, very big questions 
remain—namely, what is the nature of the ion carrier? 
How does it derive its unusual specificity, and how is 
this specificity modified to provide the specific orienta- 
tion of the transport process? Incidentally, mention 
should be made of the fact that the specificity for 
sodium on the outward trip (across the inner surface) 
in frog skin is not matched by a similar specificity for 
potassium on the inward trip, since varying the pH of 
the inside solution has an effect on the frog skin po- 
tential similar to that of changing potassium concen- 
tration," suggesting the possibility that hydrogen and 
potassium may be interchangeable in this process. A 
similar situation appears to hold in the renal tubule.¹⁸

So much for the potential-generating and ion-trans- 
port properties of frog skin. There remains one aspect 
not mentioned yet—namely, the roles of diffusion and 
flow in the movement of water across the skin and the 
effect of such movements on the movement of solute. 
These phenomena again can be used to illustrate more
general phenomena characteristic of biological membranes.
Actually, even in the conditions already discussed,
there is some net movement of water inward in
an amount such as to make the net transport of salt result
in no net change in solute concentration. However,
this is so small as to be justifiably neglected. Under other
conditions, however, there may be appreciable net
movements of water. How may such water movements
be brought about and what might be their effects on the
fluxes of water and various solutes? Aside from the
possibility of active water transport, a phenomenon
the existence of which is doubtful and which certainly is
not required to explain anything in connection with
frog skin, movements of water might be the result of
hydrostatic or activity gradients. There are no appreciable
hydrostatic gradients involved in the skin system
so one is left with gradients of water activity or osmotic
pressure. If one has only osmotic gradients, does this
exclude the possibility of hydrodynamic flow through
pores? It has been claimed that this is the case.
However, Ussing has illustrated with a simple model a
mechanism by which hydrodynamic flow can result from
a gradient of solute concentration only. There are undoubtedly
more rigorous approaches, but this is an
easily visualized model. Consider a pore permeating a
membrane impermeable to some solute contained in a
solution on one side and not on the other. Consider the
boundary of the pore at the end facing the solution.
Just inside the pore, which the solute cannot enter, is a
pure water solution of molar fraction one while just
across the boundary of the pore is a solution containing
a finite concentration of solute and having a molar
fraction of water something less than one. Now clearly
there will be a net diffusion of water from just inside
the pore into the solution. But since the loss of water
from the end of the pore cannot change its molar fraction,
there is no reason why water should diffuse from
deeper in the pore. Nevertheless, there will be a net
movement of water through the pore. This then must
be the result of a decrease in pressure at the end of the
pore from which water has been lost and, hence, there
actually is a hydrostatic gradient causing the water
movement and, of course, hydrostatic gradients can
cause flow through pores. It has been reported by
Mauro that gradients of osmotic pressure across artificial
membranes produce flow equal to that produced
by the equivalent hydrostatic pressure.
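
To give a sense of scale for the "equivalent hydrostatic pressure," the following Python sketch applies the van 't Hoff relation for a dilute, ideal solution of a solute that cannot enter the pores. That relation, the temperature of 20°C, and the concentration difference of 0.1 mole per liter are assumptions introduced here for illustration and do not come from the measurements cited.

```python
R = 8.314  # gas constant, J/(mol K)


def osmotic_pressure_pa(delta_c_mol_per_l, temp_k=293.15):
    """van 't Hoff estimate of the osmotic pressure difference, in pascals,
    for a dilute ideal solution of an impermeant solute."""
    return R * temp_k * delta_c_mol_per_l * 1000.0  # mol/L converted to mol/m^3


pressure_pa = osmotic_pressure_pa(0.1)  # hypothetical 0.1 mol/L difference
print(f"equivalent hydrostatic pressure: {pressure_pa / 101325.0:.2f} atm")
```

Even a modest concentration difference thus corresponds to a pressure of a few atmospheres, which is why osmotic gradients can drive appreciable flow in the absence of any externally applied pressure.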

The effect of flow through a pore is to accelerate the
diffusion of molecules in the direction of flow and to
retard it in the opposite direction. Since the solvent
drag force is inversely proportional to the diffusion
coefficient, the effect in inducing asymmetry should be
greatest for molecules as large as possible which can
still pass readily through the pores. Andersen and
Ussing have used thiourea and acetamide to examine
the phenomenon. Thiourea or acetamide was added
to the solution on both sides of the skin (in this case,
toad skin because it showed more marked responses to
the addition of posterior pituitary extracts) in equal 
concentration, but the molecules in the outside solution 
were labelled differently from those in the inside solu- 
tion. It thus was possible to measure the flux in each 
direction. Experiments were also done with labeled 
water. The rate of net movement of water was varied 
by diluting the outside solution. When the solution on 
both sides was Ringer’s, there was very little net move- 
ment of water and the fluxes of the test substance were 
very nearly symmetrical (Table II). When the outside 
solution was diluted, there was a net movement of 
water inward and a marked asymmetry of the fluxes 
of thiourea and acetamide. When the log of the ratio of 
inward permeability constant to the outward permea- 
bility constant was plotted against the net water trans- 
fer, a linear relationship was obtained as predicted 
theoretically and the ratio of the slopes of 2.3 fits 
closely with predicted ratios of 2.4 and 2.7 for aceta- 
mide and thiourea, respectively (Fig. 4). 
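
The linear relationship can be checked roughly against the permeability ratios listed in Table II for the diluted outside medium. The Python sketch below fits a straight line through the origin to log10(Kin/Kout) against net water transfer, using the thiourea values after hormone and the acetamide values as they appear in the table. Forcing the line through the origin is an assumption made here (symmetrical fluxes at zero net flow), and the function name is hypothetical. This is an illustration of the kind of plot described, not a reproduction of the authors' analysis.

```python
import math

# (Kin/Kout, net water transfer in microliters/cm^2/hr), outside medium 1/10 Ringer's,
# values as listed in Table II.
thiourea = [(1.27, 27), (1.21, 45), (1.58, 74), (1.17, 33), (1.25, 38), (1.38, 71)]
acetamide = [(2.08, 105), (1.67, 80), (1.66, 100), (2.22, 135)]


def slope_through_origin(points):
    """Least-squares slope of log10(ratio) against net water transfer,
    with the fitted line constrained to pass through the origin."""
    sxy = sum(dw * math.log10(ratio) for ratio, dw in points)
    sxx = sum(dw * dw for _, dw in points)
    return sxy / sxx


slope_thiourea = slope_through_origin(thiourea)
slope_acetamide = slope_through_origin(acetamide)
print(f"thiourea slope:  {slope_thiourea:.5f} per (microliter/cm^2/hr)")
print(f"acetamide slope: {slope_acetamide:.5f} per (microliter/cm^2/hr)")
```

Both slopes come out positive, in keeping with the statement that the inward flux is favored in proportion to the inward net movement of water.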

Of particular interest was the effect of adding neuro- 
hypophyseal hormone (Table II) which greatly in- 
creased the net water movements and the asymmetry 
of the fluxes, but had only a very small effect on the 
total water flux. Thus, it seemed to increase the area 
available for flow with very little change in the area 
available for diffusion. Such an effect might be the 
result of changing the shape of pores without changing 
their total area. In any case, these experiments seem to 
provide a clear indication of the existence of pores and 
of the production of bulk flow by osmotic gradients. 
FIG. 4. Relationship between the rate of osmotically induced
net water transfer (abscissa) and the logarithm of the ratio of
inward to outward rate constants (ordinate) in isolated toad
bladder [from B. Andersen and H. H. Ussing, Acta Physiol. Scand.
39, 228 (1957)].
TABLE II. Effect of neurohypophyseal hormone on permeability of isolated toad skin. Inside bathing solution Ringer’s solution, outside
solution as specified. Kin = inward permeability coefficient (cm/hr), Kout = outward permeability coefficient (cm/hr), and ΔW = net
water transfer (µl/cm²/hr) [from B. Andersen and H. H. Ussing, Acta Physiol. Scand. 39, 228 (1957)].


Before hormone After hormone 


r Kin Kout Kin Kin Kout Kin 
Date Outside medium Test substance X101 X101 Kou AW X10 X101 Kout AW 
March 26 Ringer’s Thiourea 6.63 4.72 1.40 0 
June 11 10.5 8.08 1.30 0 278 278 1.00 0 
June 14 11.5 Ons MAR O 106 109 0.97 2 
June 22 11.5 13.7 0.84 3 247 223 1.11 12 
June 23 12.4 14.1 0.88 2 64.7 62.3 1.04 1 
June 25 3.17 447 0.71 1 7.71 8.45 0.91 1 
1.05 * 0.98 £ 
June 15 Acetamide 537 516 1.04 14 
June 22 363 321 1.13 10 
March 25 1/10 Thiourea 2.90 3.03 0.96 8 20.6 16.2 1.27 27 
April 6 Ringer’s 6.48 6.51 1.00 14 17.4 14.4 1.21 45 
April 8 11.4 8.73 133 14 49.3 31.2 1.58 74 
April 26 4.00 4.02 1.00 8 7.52 6.42 1.17 33 
April 29 4.25 4.07 1.04 17 15.4 12.3 1.25 38 
May 3 9.86 6.52. 1.52 14 156 113 1.38 71 
May 25 Acetamide 251 121 2.08 105 
June 1 175 105 1.67 80 
June 6 299 180 1.66 100 
June 10 530 239 2.22 135 
June 29 129 100 1.29 24 ý 
April 6 81.5 76.1 1.16 16 196 81.5 2.40 108 
April 26 Heavy water 3730 3510 1.06 22 5550 4510 1.23 104 
May 3 4090 3890 1.05 20 4250 3750 1.13 50 
May 9 3630 3420 1.06 21 4270 3730 1.14 54 
4040 3760 1.07 28 4560 3850 1.18 71 
They also make unnecessary any assumed active water
transport by the amphibian skin. Similar phenomena 
also have been demonstrated to occur in some marine 
eggs,” and it is a reasonable conclusion that the earlier 
observation of unexplained high net water fluxes in 
intestinal loops has a similar basis. 

The phenomenon of exchange diffusion is of interest 
with respect to the mechanism by which ions cross cell 
membranes. The concept was invented by Ussing! to 
rationalize the very high rate of isotopic exchange 
between the sodium in cells and that in their surroundings.
Thus, with the potential inside negative relative
to the outside and with the sodium concentration much 
higher outside, work is required to extrude sodium. If 
it is assumed that the inward movements are entirely 
by diffusion and that the outward movement proceeds 
at the expense of metabolic energy, then the energy 
requirements would exceed half of the total metabolic 
energy production of muscle in the resting state. Ussing 
proposed that sodium ions attached to a charged carrier 
might exchange for other sodium ions in the external 
solution or in the cell interior, the carrier making the 
round trip with sodium attached so that no net change 
in sodium concentration on either side of the membrane 
would occur. Yet, if the sodium on one side of the 
membrane were isotopically labeled, the process would 
result in movement of label. 

The concept of exchange diffusion has been useful in
the interpretation of a number of other phenomena,
some of which provide more direct evidence for the idea
of reversible combination of the same ion with a carrier,
so that exchange is greater than actual transport or
movement of the free ion. For instance, Hogben has
studied the secretory membrane of the frog stomach
using the Ussing technique. The stomach secretes
hydrochloric acid: chloride against both the electrical and
concentration gradients, hydrogen ion with the electrical
gradient which, however, is smaller than the very steep
opposing concentration gradient. Sodium behaves
passively. However, the striking observation in connection
with exchange diffusion is the very large flux in the
direction opposite to active transport (Table III). The
flux from the secretory surface, corresponding to the
inside of the stomach, to the nutrient surface,
corresponding to the blood, is more than twice the
membrane conductance as determined by the relationship
between imposed current and developed potential. A
considerable part of the chloride

TABLE III. Chloride fluxes in stomach mucosa of the bull frog
[from A. Hogben, Electrolytes in Biological Systems, A. Shanes,
Editor (American Physiological Society, Washington, 1955),
p. i

Values in µEq cm⁻² hr⁻¹.

Net chloride transfer          Net charge transfer

Flux N→S     10.65             Current          3.05
Flux S→N      6.38             H⁺ secretion     1.20
Net           4.27             Net              4.25

Mean conductivity: Start 3.21 mmhos
                   End   2.28 mmhos
TABLE IV. Na active transport and exchange diffusion
in HK sheep red cells.

External cation     Na outflux     Difference
Na+K                3.5
Mg+K                0.78           2.72 E.D.a
Mg+K+Stroph         0.16           0.62 A.T.b
Mg                  0.16

a E.D. = Exchange diffusion.
b A.T. = Active transport.


movement must occur in a form which does not con- 
tribute to the transfer of charge—presumably the same 
carrier complex by which the active transport is effected. 
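
The balance implied by Table III can be checked with a few lines of arithmetic. The sketch below simply recomputes the net chloride transfer from the two unidirectional fluxes and the net charge transfer from the current and the hydrogen-ion secretion; the variable and function names are illustrative and do not come from the original work.

```python
# Values from Table III, in microequivalents per cm^2 per hour.
FLUX_N_TO_S = 10.65   # chloride flux, nutrient to secretory side
FLUX_S_TO_N = 6.38    # chloride flux, secretory to nutrient side
CURRENT = 3.05        # charge carried by the measured current
H_SECRETION = 1.20    # hydrogen-ion secretion


def net_chloride_transfer(forward, backward):
    """Net chloride movement is the difference of the two one-way fluxes."""
    return forward - backward


def net_charge_transfer(current, h_secretion):
    """Charge that the net chloride movement has to account for."""
    return current + h_secretion


if __name__ == "__main__":
    print(round(net_chloride_transfer(FLUX_N_TO_S, FLUX_S_TO_N), 2))
    print(round(net_charge_transfer(CURRENT, H_SECRETION), 2))
```

The two totals agree to within 0.02 unit, while the back flux of 6.38 is much larger than either, which is the observation the text attributes to exchange diffusion.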

Another similar case involves the movement of 
chloride across the large intestine of the frog and has 
been studied by Cooperstein and Hogben.” In this case, 
there is no active transport, chloride fluxes being 
symmetrical when the electrical potential is zero. How- 
ever, at various levels of potential, the flux ratio is 
lower than predicted from the electrochemical potential 
ratio. This could be explained if some part of the flux 
were owing to exchange diffusion and thus was not 
affected by the electrochemical potential gradient. 

Finally, an example of quite a different sort involves
sodium transport by the red blood cell. The red cell is
rather unusual in a number of respects: it is
non-nucleated, lacks the machinery for oxidative
metabolism, and, in general, transports sodium and
potassium at rates several orders of magnitude lower than
most body cells. Red cells also contain a rather high
chloride concentration, 60 to 70% of that in the plasma,
whereas most other cells contain very little; since they
have an extremely high permeability to chloride, with
a half-turnover time measured in milliseconds, the
presumption is that the distribution is passive and,
therefore, that the electrical potential can be estimated
from the chloride-distribution ratio. If this is true, the
accumulation of potassium in this cell cannot be
considered as a passive consequence of sodium extrusion,
as is frequently considered the case in other cells. In
any case, the slow rate of transfer and the ease of
direct analysis make the red cell a suitable object for
study, and the relative simplicity of the red-cell
metabolism somewhat facilitates the study of the
metabolic correlates of ion transport.
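
The estimate of the potential from the chloride-distribution ratio can be sketched with the Nernst relation for a passively distributed univalent anion, E = (RT/F) ln([Cl]in/[Cl]out). The temperature of 37 °C and the direct use of the 60 to 70% figure as the concentration ratio are assumptions made for illustration; the function name is hypothetical.

```python
import math

R = 8.314462618    # gas constant, J mol^-1 K^-1
F = 96485.33212    # Faraday constant, C mol^-1


def chloride_potential_mv(ratio_in_over_out, temp_c=37.0):
    """Inside potential (mV, relative to outside) at which a univalent
    anion is at equilibrium for the given inside/outside ratio."""
    temp_k = temp_c + 273.15
    return 1000.0 * (R * temp_k / F) * math.log(ratio_in_over_out)


if __name__ == "__main__":
    for ratio in (0.6, 0.7):
        print(ratio, round(chloride_potential_mv(ratio), 1))
```

Under these assumptions the inside of the cell comes out roughly 10 to 14 mV negative, a small potential compared with that of most other cells.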

Some data obtained with sheep red cells by Tosteson
and Hoffman in our laboratory illustrate the phenomenon
of exchange diffusion (Table IV). When cells are
incubated in normal plasma, an outflux of sodium of
3.5 mM/l/hr is observed. The removal of sodium and
its replacement with magnesium very markedly
decreases the outflux of sodium. Since the only type of
[...] it appears that close to 80% of the sodium
[...] sheep red cell is attributable to this
[...] the potassium also is removed from the
[...] the output is reduced even further to a
[...] complete absence of active transport
and which is the same low level reached when
transport is inhibited with strophanthidin, a glycoside
related to digitalis and a member of a group of
powerful and apparently specific inhibitors of sodium
transport. These results again suggest that the sodium
extrusion is effected by a carrier-linked exchange for
potassium. From the amount of glycoside required to
inhibit potassium uptake by human red blood cells,
Glynn has estimated that there are fewer than 1000
transport sites on the surface of the human red blood
cell, a disk 7 µ in diameter and perhaps 1 µ thick.
These similarities between membrane-transport phe- 
nomena in diverse systems and even in orders of 
vertebrates encourage one to believe that the processes 
involved fundamentally are limited in number and are 
not different in each cell type and species, and that the 
study of simpler systems may yield information applicable
to more-complex and less-accessible ones.
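
Two of the numerical statements in the red-cell discussion can be reproduced directly. The sketch below partitions the sodium outflux of Table IV into exchange diffusion and active transport by successive differences, and converts Glynn's upper limit of 1000 sites into a surface density. Treating the cell as a flat cylinder 7 µ across and 1 µ thick is a simplifying assumption, and all names are illustrative.

```python
import math

# Sodium outflux (mM per liter per hour) from Table IV.
OUT_NA_K = 3.5            # external Na + K
OUT_MG_K = 0.78           # external Mg + K
OUT_MG_K_STROPH = 0.16    # external Mg + K + strophanthidin

EXCHANGE_DIFFUSION = OUT_NA_K - OUT_MG_K          # lost when Na is removed
ACTIVE_TRANSPORT = OUT_MG_K - OUT_MG_K_STROPH     # lost on inhibition
EXCHANGE_FRACTION = EXCHANGE_DIFFUSION / OUT_NA_K


def disk_area_um2(diameter_um, thickness_um):
    """Surface area of a flat cylinder: two faces plus the rim."""
    radius = diameter_um / 2.0
    return 2.0 * math.pi * radius ** 2 + 2.0 * math.pi * radius * thickness_um


def max_site_density(sites, diameter_um=7.0, thickness_um=1.0):
    """Upper limit on transport sites per square micron."""
    return sites / disk_area_um2(diameter_um, thickness_um)


if __name__ == "__main__":
    print(round(EXCHANGE_DIFFUSION, 2), round(ACTIVE_TRANSPORT, 2))
    print(round(100 * EXCHANGE_FRACTION), "percent by exchange diffusion")
    print(round(max_site_density(1000), 1), "sites per square micron at most")
```

The exchange-diffusion share comes out near 78%, in line with the "close to 80%" quoted in the text.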
ORGANIZATION and the interaction of organized
structures at all levels of complexity are of cardinal
significance in life processes. This organization
may be manifested within the protein or other
biomolecule or may depend upon the interaction of many
individual entities.

For purposes of biophysical and biochemical
investigation of such interaction of organized systems, it is
profitable to deal with relatively simple systems,
preferably with pure substances. Such systems are available
in materials which are primarily one-dimensional
(fibrous), two-dimensional (membranous), or
three-dimensional (crystalline). The present paper is concerned
exclusively with the properties of highly elongate,
fibrous macromolecular systems. In a study of the forces
between such highly organized fibrous systems lies rich
opportunity for fundamental biophysical research.

Many substances of crucial importance in cells and 
tissues occur as very thin (10 to 30 A), highly elongate 
(1000 to 5000 A) particles. Proteins (such as myosin, 
paramyosin, actin, fibrinogen-fibrin, collagen, and the 
nerve-axon protein), nucleic acids (DNA and RNA), 
and polysaccharides (such as cellulose, hyaluronic acid, 
chondroitin sulfate) are examples. Many of these sub- 
stances are themselves polymers (as the protein macro- 
molecules are polymers of many amino-acid residues) 
but, as monomers, these elongate macromolecules 
polymerize end-to-end and laterally to form fibrous 
structures. 

These macromolecular systems lend themselves to 
detailed physicochemical, crystallographic, and electron- 
optical study. Polyelectrolyte theory may be applied 
fruitfully to them. In many cases, energy is coupled 
with the macromolecular system by interaction with 
“energy-rich” substances such as adenosine triphosphate
(ATP), but the mechanism by which this avail-
able energy is caused to do work (mechanical, chemical, 
electrical, or osmotic) is still poorly understood. 


SOME TYPICAL BIOLOGICAL SYSTEMS IN WHICH 
FUNCTION DEPENDS UPON SPECIFIC 
INTERACTION OF ELONGATE 
MACROMOLECULES 


It will help to identify the types of biophysical problems
involved by mentioning a few typical examples.


* These studies were aided by a research grant from the Na- 
tional Institute of Allergy and Infectious Diseases, National 
Institutes of Health, Public Health Service, U. S. Department of 
Health, Education, and Welfare. 


1. Functions Involving Mechanical Properties 


In this category, mention is made of but three types 
that may serve to illustrate fundamental problems: 
fibrous systems designed to afford high tensile strength; 
those which produce tension or undergo contraction; 
and those which, by forming gelled clots, occlude 
regions such as blood vessels and prevent bleeding. 

An excellent example of a fibrous system that achieves
high tensile strength (ca 100 000 lb/in.²) by a lateral
bonding of macromolecular polymers is provided by
structures such as skin, tendon, and other forms of
connective tissue involving the protein collagen. The
macromolecules are synthesized within fibroblast cells,
find their way into the intercellular tissue spaces, and
eventually aggregate at the appropriate place and time to
form fibers. The mechanism of fibrogenesis may be very
complex, involving processes of activation and
homeostatic control of such processes so as to facilitate
fibrogenesis when needed (as in wound repair) and to
prevent excessive fibrogenesis (such as occurs in
aging and in certain pathological processes, as in
atherosclerosis and in certain rheumatoid and so-called
“collagen diseases”).

The second mechanical function, that of contraction 
or tension production, poses a problem as to whether 
the shortening occurs essentially as an intramolecular 
process of superfolding of polypeptide chains, or is 
rather an intermolecular process involving a rapid and 
reversible change in the affinity or interaction between 
two or more species of fibrous proteins leading to a 
shortening of the fibrous system without substantial 
change in the helical configuration of the intramolecular 
chains characteristic of the native macromolecules. 

Another rather striking problem is well illustrated in 
the embryogenesis of fibrous structures such as striated 
muscle, in which the axial repeat (sarcomere length) is 
so precise as to give several orders of diffraction with 
visible light. This production of supermacromolecular 
patterns well may involve the specific aggregation of 
several species of macromolecules, each having lengths 
of the order of several thousands of Ångström units. 
Perhaps some day it will be possible to produce such 
super-repeating patterns by interacting several kinds of 
fibrous macromolecules (protein, nucleic acid, or poly- 
saccharide) under appropriate conditions in vitro. 
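
The remark that the sarcomere repeat gives several orders of diffraction with visible light can be illustrated with the grating relation d sin θ = nλ. The sarcomere length of 2.5 µ and the wavelength of 0.55 µ used below are assumed round values, not figures from the text, and the function name is hypothetical.

```python
import math


def diffraction_angles_deg(spacing_um, wavelength_um):
    """Angles (degrees) of every diffraction order n >= 1 that satisfies
    spacing * sin(theta) = n * wavelength at normal incidence."""
    angles = []
    order = 1
    while order * wavelength_um <= spacing_um:
        angles.append(math.degrees(math.asin(order * wavelength_um / spacing_um)))
        order += 1
    return angles


if __name__ == "__main__":
    # Assumed values: 2.5 micron repeat, green light of 0.55 micron.
    for n, angle in enumerate(diffraction_angles_deg(2.5, 0.55), start=1):
        print(n, round(angle, 1))
```

With these assumed values four orders are geometrically allowed, consistent with "several orders" for a repeat a few wavelengths long.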

The third type of mechanical function mentioned in 
the foregoing is that of the clotting of fibrous protein, 
as in the transformation of the soluble fibrinogen in the
blood into the insoluble fibrin of blood clot. This
involves an enzymatic activation of the soluble protein
monomers by means of an enzyme (itself activated by
a series of interdependent processes) to produce
“intermediate polymers” several thousand Ångström units
long which then polymerize spontaneously to form the 
clot. The enzymatic activation itself is controlled by a 
highly complex system of kinases and antikinases by 
means of which a high degree of homeostatic control of 
this vital process is achieved. 


2. Functions Involving Enzyme Action 


As was emphasized by Engelhardt and Ljubimova,¹
large macromolecules may function enzymatically. 
This may occur either because the macromolecule as a 
whole acts like an enzyme or because a portion of the 
molecule is enzymatically active. Myosin, with which 
Engelhardt and Ljubimova were concerned,¹ was found
to exert enzymatic action in splitting ATP. It is known 
now that only one component (“heavy meromyosin”) 
is enzymatically active as ATPase. It remains to be 
seen how many other kinds of elongate macromolecules 
will be found to have enzymatic properties. It may 
be mentioned that, when an enzymatic group forms 
part of a large macromolecular complex, the configura- 
tion of the macromolecules or smaller molecules in the 
environment, for steric reasons, may strongly influence 
the availability of the enzymatic site. Such regulatory 
action well may be involved in muscle contraction. 


3. Functions Involving the Maintenance of a 
Specific Linear Sequence of Chemical 
Groups as in Genetic Determiners 


It is believed that in the linear sequence of nucleotide 
residues in the double helix of DNA is to be found the 
coding responsible for transmitting genetic information. 
The DNA occurs as highly elongate macromolecules 
extractable from the chromosomes by mild methods 
(such as by treatment with hydrogen bond breakers). 
It seems probable that the ability of such macromole- 
cules to exert their specific controlling influence during 
development and differentiation must depend importantly
[...]

Engelhardt suggested that macromolecules may perform
[...] such as osmotic and electrical.
[...] of certain fibrous proteins, such
[...] protein of nerve, remains completely
[...]

[...] action postulated by Weiss as important in determining
the ordering of cells into typical tissues by means of the 
interaction of specific types of macromolecules at the 
surfaces of the cells that form the tissues. Presumably 
because of a complementarity or ordered type of inter- 
action of the surface molecules, the cells are caused to 
aggregate in patterns characteristic of each tissue. 

From this very brief description, it is obvious that 
crucial biological functions depend importantly upon 
the ways in which highly specifically structured macro- 
molecules interact with one another and upon the 
manner in which environmental conditions change or 
regulate this interaction. It is the purpose of this paper 
to illustrate this specificity of interaction by a consider- 
ation of the properties of one particular type of macro- 
molecule, collagen, chosen because of its specially 
favorable chemical and structural properties. Like many
other biological macromolecules, collagen is very
asymmetric (ca 14 × 2800 Å). It has, however, the
valuable property that, in its aggregation patterns, it forms
fibrous structures characterized by highly specific band 
patterns as seen in the electron microscope. By analysis 
of these band patterns, it is possible to deduce the 
type of macromolecular interaction responsible for each 
characteristic pattern. From such studies, valuable 
lessons are learned that have direct application to other 
types of macromolecular systems in which no band 
patterns exist to serve as guides. 


BIOPHYSICAL AND BIOCHEMICAL PROPERTIES 
OF COLLAGEN 


In contradistinction to most proteins, collagen (or 
perhaps more correctly the collagen class) possesses 
characteristic structural and chemical properties which 
permit its definitive identification. For present purposes, 
it is necessary only to sketch those properties necessary 
for an understanding of the internal structure, and the 
chemical properties of the collagen macromolecule. For 
recent excellent surveys of these properties, see Gus- 
tavson,* Highberger,® the CIOMS Symposium on Con- 
nective Tissue,® and that on Gelatin and Glue Research.’ 

Collagen occurs in dense fibrous tissue of high tensile 
strength, as in tendons, or less tightly woven tissue 
fabrics, as in skin, or in more sparse distribution as in 
loose connective tissue. The fibrous protein occurs in 
various hierarchies of fiber size, including the following: 
fibers, visible macroscopically or microscopically and 
having diameter of the order of micra; fibrils with 
widths of the order of a few hundred to several thousand 
Ångström units, observable in the dark-field microscope 
and resolvable in the electron microscope; the
protofibrils, which originally were defined as constituting
“the unit columnar arrays which, when associated
laterally, form the collagen fibril”; and the collagen
(or “tropocollagen”) macromolecules which constitute
the monomeric units of the protofibrillar polymer.

For x-ray diffraction and for chemical-analytical
FIG. 1. Large-angle x-ray diffraction patterns of collagen from rat-tail tendon. (a) Unstretched; (b) stretched 8%
[from J. T. Randall, J. Soc. Leather Trades’ Chemists 38, 362 (1954)].


studies, gross macroscopic fibers or whole tissues are 
used. For electron-microscopic investigation, the fibrous 
category of interest is the fibril which manifests a de- 
tailed and characteristic band pattern which, as is 
brought out below, results from a specific pattern 
of aggregation of the elongate native collagen 
macromolecules. 

The collagen class of proteins, as Astbury® referred 
to them, is uniquely characterized by its amino-acid 
composition, its x-ray diffraction pattern, and its 
banded appearance in the electron microscope. These 
characteristics may be described briefly as follows. 

The collagenous proteins differ from other proteins 
in that they contain the amino acids hydroxyproline and 
hydroxylysine. In mammalian collagen, about one- 
third of the amino-acid residues are glycine. Proline and 
hydroxyproline together make up almost another third, 
leaving approximately one-third for other amino-acid 
types. From a determination of the hydroxyproline 
and glycine content of a given preparation, one can 
make a good estimate of the collagen content. 

Perhaps the most distinctive characteristic of collagen 
is its large-angle x-ray diffraction pattern (Fig. 1) which 
reflects the internal organization of the collagen macro- 
molecule and is, therefore, characteristic of this class 
of proteins. Astbury® early called attention to the 
2.86-A meridional reflection, which he considered to 
represent the length of the amino-acid residue, along 
the fiber axis, in a coiled polypeptide chain, and to the 
equatorial reflections at 10 to 15 A (depending upon the 
degree of hydration) which he attributed to the separa- 
tion between main chains. With more refined technique, 
it has been possible to obtain far more reflections in the
pattern and to achieve a higher degree of orientation by 
stretching fresh tendon. From such patterns, it has 
been possible for several groups of workers to agree 
that the diffractions are interpreted best in terms of a 


macromolecule containing three chains coiled in helical 
fashion about each other to form a coiled coil (see 
particularly the papers of Crick and Rich). The
proposed triple-stranded structure is shown schemati- 
cally in Fig. 2. 

It has been proposed also that there are only two 
types of three-stranded helical models of collagen struc- 
ture, based on the so-called structure I and structure II, 
derived from a consideration of polyglycine, and com- 
patible with the x-ray, infrared, analytical, and physico- 
chemical data. In these models, the axial repeat occurs 
at 28.6 A. In collagen II, to agree best with the diffrac- 
tion data, the OH groups of the hydroxyproline residues 
extend radially from the three chains, making it pos- 
sible to form hydrogen bonds with CO groups of adja- 
cent three-stranded macromolecules. In the collagen I 
structure, the hydrogen bonds from hydroxyproline are 
directed internally, bonding the three chains intramolec- 
ularly. The hydroxyproline content appears to be 
determinative of the denaturation temperature, which 
is a measure of the energy needed to disrupt the internal 
organization of the macromolecule. This fact tends to 
support the collagen-II type of structure. Rich (p. 50) 
has suggested that one type of structure might be 
convertible into the other and that this may result 
from application of stress to the fiber. 

Although there is fairly general agreement that the 
collagen macromolecule is a three-stranded helix, it is 
not certain that the macromolecule is thus constructed 
over its entire extent. Gallop™ suggests that as much 
as 30% may have a different configuration. 

Treatment of soluble collagens with hydrogen-bond 
breakers like urea, or by heating, causes denaturation 
with the liberation of the constituent chains to form 
“parent gelatin.” From the original macromolecule, 
having a weight of 360 000, there is formed, according
[...] 120 000 and another with a weight of 240 000. These
authors believe that an alkali-labile ester bond links
two chains of weight 120 000 to form the heavier chain
obtained from denatured collagen. This is to be
compared with the corresponding values of Orekhovitch and
Shpikiter.¹⁶

FIG. 2. Three-stranded helical structure of collagen
macromolecules (courtesy A. Rich). Labeled residues:
hydroxyproline, proline, glycine; axial repeat 28.6 Å.

In addition to the large-angle x-ray pattern, arising
from the internal, presumably three-stranded, structure
of the macromolecule, collagen also manifests a
well-developed small-angle x-ray pattern (Fig. 3) consisting
of many (ca 50) orders of a large axial repeat which
Bear¹⁸,¹⁹ showed to be 640 Å in air-dried fibers and
nearer to 700 Å in moist fibers. Although all native
collagen fibers from a wide variety of sources showed
this axial repeat, its significance in terms of molecular
structure was not obvious. The simplest early
interpretation was that it represents the molecular length of the
molecules, an assumption that seemed to gain
support from the fact that a similar axial repeat was
found in the band pattern observed in electron
micrographs. Bear and Morgan attempted to relate
the positions of the intraperiod bands observed electron
optically with the characteristic intensities of the various
orders of the small-angle x-ray pattern. As is shown in
the following, the collagen macromolecule probably
has a length of four times the 700-Å period, i.e., about
2800 Å.
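
The dimensions quoted so far can be tied together with simple arithmetic. The sketch below assumes, for illustration only, that the 2.86-Å residue rise holds along the whole macromolecule (the text notes that part of the molecule may have a different configuration), and derives the molecular length, the number of residues per chain, and the number of residues in the 28.6-Å helical repeat. The constant names are illustrative.

```python
AXIAL_PERIOD_A = 700.0     # small-angle axial repeat in moist fibers
RESIDUE_RISE_A = 2.86      # meridional reflection (rise per residue)
HELIX_REPEAT_A = 28.6      # axial repeat of the three-stranded models

# Length of the macromolecule as four axial periods.
MOLECULE_LENGTH_A = 4 * AXIAL_PERIOD_A

# Assumption: the 2.86 A rise applies over the entire length.
RESIDUES_PER_CHAIN = MOLECULE_LENGTH_A / RESIDUE_RISE_A
RESIDUES_PER_REPEAT = HELIX_REPEAT_A / RESIDUE_RISE_A

if __name__ == "__main__":
    print(MOLECULE_LENGTH_A)              # length in angstroms
    print(round(RESIDUES_PER_CHAIN))      # residues along one chain
    print(round(RESIDUES_PER_REPEAT))     # residues per helical repeat
```

On this assumption each chain carries somewhat under a thousand residues, with ten residues per 28.6-Å repeat.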

The band pattern observed in high-resolution electron 
micrographs of teased collagen fibrils stained with 
phosphotungstic acid (PTA), or other heteropolyacid, 
is uniquely characteristic of collagen (see Fig. 4). This 
axial pattern repeats at about 700 A and contains a 
number of bands and interbands of characteristic density
and position. It was suggested by Bear²¹ that the
bands represent regions of relative disorder due to the 
interaction of side chains of relatively large size while 
the interbands represent regions of relative order due 
to the interaction of the smaller side chains which are 
found in considerable abundance in collagen. Another 
interpretation of band structure depends upon the 
characteristic interaction of groups such as the guani- 
dino groups of the arginine side chains with PTA, as 
suggested by Kühn et al.” It is noted from Figs. 4 and 
5 that the band pattern—i.e., the intraperiod positions 
and the relative densities of the bands—is a polarized, 
asymmetric pattern. The significance of this pattern 
was discovered only after it became possible to take the 
native fibrils apart into their constituent macromole- 
cules and to cause these to re-aggregate in characteristic 
and new band patterns. These results may now be 
described briefly. 


FORMATION OF ORDERED AGGREGATION STATES 
OF COLLAGEN BY PRECIPITATION 
FROM SOLUTION 


A formidable difficulty in the characterization of the 
collagen molecules lay in the relative insolubility of 
collagen fibers. However, certain types of collagen, such 
as in rat-tail tendon and in the fish swim bladder, are 


FIG. 3. Small-angle x-ray diffraction patterns of kangaroo-tail
tendon collagen; (a) moist preparation; (b) after brief exposure
to water and drying under tension. Layer line indices are indicated
[from R. S. Bear, O. E. A. Bolduan, and T. P. Salo, J. Am.
Leather Chemists’ Assoc. 46, 107 (1951)].


FIG. 4. Electron micrograph of calfskin collagen reconstituted
from solution, stained with phosphotungstic acid. Band pattern
is of the native type. Axial repeats marked (courtesy A. J. Hodge).


soluble in dilute acid. From the classical early work of 
Zachariades, Nageotte, Fauré-Fremiet, Wyckoff, and 
Corey, and others, it is known that, by appropriate 
adjustment of the pH and ionic strength of such acid 
solutions, the collagen can be precipitated reversibly in 
fibrous form. Examined in the electron microscope 
after staining with PTA, the reprecipitated fibrils were 
found to possess structure, the type of which depends 
upon the conditions of precipitation. With increasing 
ionic strength, the band pattern may be that character- 
istic of native fibrils (period 700 A), it may be about 
one-third this value, or the precipitate may have 
tactoidal appearance, showing no bands at all. These 
different forms can be produced reversibly from acid 
solutions of highly purified collagen; presumably the 
different ordered states depend only upon the collagen 
and require no additional organic material. 

When certain types of extracts are made from con- 
nective tissue or when certain organic substances, par- 
ticularly highly negatively charged substances, are 
added to the collagen solutions and the conditions are 
adjusted appropriately, a new modification is found 
which manifests an axial repeat or identity period 
about four times that of normal collagen (i.e., about 
FIG. 5. Diagrammatic illustration of patterns of aggregation of
tropocollagen macromolecules in native, FLS, and SLS types.
Polarization of macromolecules indicated by arrow.
2600 to 3000 Å) and which, therefore, are called
“long-spacing” types. Two such forms, called “fibrous
long-spacing” (FLS) and “segment long-spacing” (SLS), are
shown diagrammatically in Fig. 5. The FLS modification
is produced routinely by addition of α-1 acid
glycoprotein to an acetic-acid solution of collagen,
followed by dialysis against water. The SLS modifica- 
tion is produced routinely by addition of ATP to the 
acid solution of collagen; the precipitate forms directly 
without further adjustment of conditions. 

It is noted that the FLS type has a symmetrical, 
nonpolarized band structure while the SLS has an 
asymmetrical, polarized pattern of banding. 

Each of the five band patterns described may be 
produced reversibly from an acid solution of collagen. 
The particular patterns produced depend for their 
specificity upon the collagen rather than upon the 
other substances added or conditions imposed; rather 
these substances and conditions serve to evoke the 
structure inherently characteristic of the collagen itself. 


MACROMOLECULAR MONOMER OF COLLAGEN— 
THE “TROPOCOLLAGEN” HYPOTHESIS


The structures described in the foregoing, discovered
in collaboration with Gross and Highberger, were
interpreted as follows (see summaries of this work by
Schmitt, Gross, and Highberger; Schmitt; Gross;
and Highberger). It is assumed that the long spacing
(about 2800 Å) represents the length of the native
collagen macromolecule, which has a three-stranded
helical internal structure, as deduced from the
large-angle x-ray pattern (see the foregoing). The long, thin
macromolecules were given the term “tropocollagen”
(TC), because they are capable of “turning into” or
forming the native collagen structure, and also to
distinguish them from various other collagen fractions
(such as procollagen) previously described.

The TC macromolecules are assumed to be essentially
identical in structure and composition and to be
themselves polarized in the sense of the linear sequence
of amino-acid residues in the constituent intramolecular
strands. This is indicated by the arrows on the

† The unit of native collagen structure is referred to as a
macromolecule rather than as a molecule because it is
composed of several covalent polypeptide chains bonded
by hydrogen bonds.
FIG. 6. Electron micrograph of tropocollagen macromolecules
prepared by the method of Hall (courtesy C. E. Hall).


macromolecules in Fig. 5. The hypothesis assumes that 
the various types of ordered patterns of TC aggregation 
occur by virtue of relatively stable bonding between 
terminal groups on the side chains of laterally adjacent 
macromolecules. Each type of ordered aggregation
represents a particular pattern of interacting side 
chains. In the SLS form, it is assumed that the TC 
macromolecules are essentially in register with respect 
to their ends and are “pointing” all in the same direc- 
tion; i.e., they are in parallel array. The SLS pattern, 
therefore, provides a molecular “fingerprint” of the 
sequence of amino-acid residue types along the TC 
macromolecule—information not deducible by examina- 
tion of the band pattern of native fibrils. The FLS is 
assumed to be formed by an antiparallel packing of TC 
in which the macromolecular ends are approximately in 
register (see Fig. 5). 

Since the axial repeating pattern of native fibrils,
both as seen in electron micrographs and as measured
in the small-angle x-ray pattern, is about a quarter of
the length of the TC macromolecules, it was
assumed that the latter are arranged in parallel array
but are displaced in the axial direction by one-quarter
of a length in adjacent macromolecules (see Fig. 5). A
specific suggestion somewhat along the same line has
been proposed by Tomlin and Worthington.²⁶
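
The quarter-stagger argument can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the values 2800 A and one-quarter are the round figures quoted in the text, not fitted data. It lists the axial offsets of successive laterally adjacent macromolecules and shows that the pattern repeats every quarter length.

```python
# Minimal sketch of the quarter-stagger packing described in the text.
# Assumptions: a round TC length of 2800 A and an exact stagger of 1/4.

TC_LENGTH_A = 2800.0      # long-spacing length quoted in the text (angstroms)
STAGGER_FRACTION = 0.25   # axial displacement between adjacent macromolecules


def stagger_offsets(n_rows, length=TC_LENGTH_A, fraction=STAGGER_FRACTION):
    """Axial offset of each adjacent macromolecule, taken modulo its length."""
    return [(i * fraction * length) % length for i in range(n_rows)]


def axial_period(length=TC_LENGTH_A, fraction=STAGGER_FRACTION):
    """Axial repeat expected when every neighbor is shifted by the same step."""
    return fraction * length


offsets = stagger_offsets(6)
period = axial_period()
print("offsets (A):", offsets)
print("axial period (A):", period)
```

With these inputs the offsets cycle through four values before returning to zero, and the period comes out at 700 A, the figure used for the native-type repeat later in the text.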
This concept of the structure and properties of the
native macromolecule of collagen was deduced from the
electron-optical observations of the various ordered
aggregation types observed. The hypothesis received
confirmation from the physicochemical studies by
Boedtker and Doty²⁷ performed on solutions which
were highly monodisperse with respect to the monomer
macromolecules (achieved by centrifuging out the
larger aggregates). These data indicated that the
macromolecules behave like rigid rods with dimensions
14 × 2800 A and molecular weight about 360 000.
Previous estimates of other workers about particle
[...] were heterodisperse, containing polymers
[...]collagen as well as the monomers.
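
As a rough plausibility check on the rod model, the quoted dimensions and molecular weight can be combined into an effective density. The sketch below is not part of the original analysis; it assumes that "14 × 2800 A" means a cylinder of 14 A diameter and 2800 A length, and the function name is hypothetical.

```python
import math

# Rough consistency check, assuming the rod is a solid cylinder of
# diameter 14 A and length 2800 A (an interpretation of "14 x 2800 A").

AVOGADRO = 6.02214076e23   # molecules per mole
A3_TO_CM3 = 1e-24          # one cubic angstrom in cubic centimeters


def rod_density(diameter_a, length_a, molar_mass_g):
    """Effective density (g/cm^3) of a cylindrical rod of given molar mass."""
    radius_a = diameter_a / 2.0
    volume_cm3 = math.pi * radius_a ** 2 * length_a * A3_TO_CM3
    mass_g = molar_mass_g / AVOGADRO
    return mass_g / volume_cm3


density = rod_density(14.0, 2800.0, 360_000)
print(f"effective density: {density:.2f} g/cm^3")
```

The result is close to 1.4 g/cm^3, which is of the order usually quoted for dry protein, so the three figures are at least mutually compatible under the cylinder assumption.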
Finally, the tropocollagen macromolecules were
visualized directly in the electron microscope by a method
developed by Hall.” This consists in depositing the 
molecules upon the atomically smooth surface of 
freshly cleaned mica by spraying a very dilute solution 
of the protein. After drying, this surface is shadowed 
by evaporation of platinum at a small angle. The 
metalized layer then is backed with a thin collodion 
supporting film, stripped from the mica, and examined 
in the electron microscope at high resolution. From 
such electron micrographs (see Fig. 6), Hall” found the 
fibrous particles to be about 15 A in width but their 
lengths to be somewhat smaller than had been predicted 
by the physicochemical data of Boedtker and Doty. 
Subsequently, with improved technique, Hall and
Doty³⁰ obtained a weight-average length of 2820 A, in
good agreement with the physicochemical data and 
with the lengths determined in this laboratory on the 
same solutions used by Hall and Doty by conversion 
to the FLS modifications and measurement of the axial 
period (average value was 2700 A). 

The problem of the nature of the precursor of fibrous 
collagen in the fibrils of connective tissue has been the 
subject of much investigation. Orekhovitch et al.³¹
suggested that the fraction soluble in citrate buffer
(μ=0.2, pH=3.5) is such a precursor and, therefore,
gave the material the name “procollagen.” However,
from turnover studies or the incorporation of C¹⁴-labeled
glycine, Harkness et al.³² suggested that the
precursor is to be found in the material soluble in
slightly alkaline buffer, with a much shorter half-life
than citrate-soluble collagen. This conclusion was
confirmed by Jackson³³ using other methods. The
possibility that tropocollagen macromolecules soluble
in neutral salt solutions (Gross, Highberger, and
Schmitt³⁴) may be the precursor of fibrous collagen
has been discussed in some detail by Gross.³⁵
Orekhovitch and Shpikiter¹⁶ have concluded that procollagen
and tropocollagen are, in fact, identical. The possibility
that collagen, as synthesized in the fibroblasts, requires 
activation before it is capable of being incorporated 
into fibrous tissue has been much investigated, but the 
details of the process remain to be disclosed. 


FRAGMENTATION OF TROPOCOLLAGEN 
MACROMOLECULES BY SONIC 
IRRADIATION 


The discovery by Nishihara and Doty³⁶ that sonic
irradiation of tropocollagen rapidly reduces the
viscosity without substantial reduction in optical rotation
suggested that the irradiation fragments the
macromolecules into shorter pieces, actually into halves and
quarters, which retain the triple-chain helical structure
characteristic of the native macromolecules. This
possibility was confirmed by Hodge and Schmitt³⁷ by
electron-microscopic examination of irradiated collagen.
The loci along the macromolecules which undergo scis- 
sion could be determined with considerable precision 
Fig. 7. Diagrammatic illustration of the chief effects of sonic irradiation on the tropocollagen macromolecules
[from A. J. Hodge and F. O. Schmitt, Proc. Natl. Acad. Sci. U. S. 44, 418 (1958)].


by reference to the band patterns of the SLS-type 
aggregates produced by the addition of ATP to the 
acid solutions after irradiation. 

It was discovered that sonic irradiation produces 
profound effects in addition to that of scission of the 
macromolecules into smaller fragments. The most 
striking of these is an alteration of “end regions,” 
produced by relatively short periods of irradiation, 
without change in the length of the macromolecules. 
The results are shown schematically in Fig. 7, wherein 
the native TC macromolecules are represented by an 
arrow with A and B ends, indicative of the asymmetric 
distribution of amino acid residues reflected in the SLS 
type of aggregation pattern. It is thus possible to tell 
at a glance which is the A and B end of any particular 
SLS; in addition to the specific band pattern at each 
end, the polarization of the TC is shown at once by the 
position of the broad, slightly off-center interband 
(labeled F-G by Schmitt, Gross, and Highberger²³). As
was indicated earlier, the formation of the native type 
(700-A axial repeat) involves the formation of proto- 
fibrils that are actually linear polymers of TC by end- 
to-end interaction of the A-B type; lateral aggregation 


of such protofibrils occurs in a manner such that 
adjacent protofibrils are displaced axially with respect 
to one another by a quarter of a macromolecular length 
(ca 700 A). It is this A-B type of interaction of macro- 
molecular ends that is first affected by sonic irradiation. 
In the diagrammatic representation, the altered ends 
are designated A’ and B’. 

Following irradiation sufficient to prevent the
formation of native-type fibrils (tested for by dialysis vs


Fig. 8. Dimeric aggregation form of tropocollagen macromolecules of the type B’—A’—A’—B’. Produced by sonic irradiation of calfskin collagen for 20 min
[from A. J. Hodge and F. O. Schmitt, Proc. Natl. Acad. Sci. U. S. 44, 418 (1958)].




1% NaCl), i.e., by alteration of macromolecular ends, 
two pronounced changes are found in the ATP pre- 
cipitates: (1) an increased side-to-side interaction, pro- 
ducing a highly exaggerated lateral aggregation into 
long ribbons of SLS forms; and (2) a progressive in- 
crease in the amount of A’—A’ and B’—B’ types of 
interaction. As a result, the formation of dimeric and 
polymeric forms is favored (see Figs. 8 and 9). With 
longer irradiation, the macromolecules are fragmented, 
the locus of the scission being indicated by the band 
pattern of the fragments. 

It is noteworthy that end-to-end polymerization of 
scission products never involves ends produced by the 
fragmentation (such as those designated C, D, or E in 
Fig. 8). Apparently, the original ends of macromole- 
cules are different from those produced by sonic scis- 
sion. From the high density of bands (i.e., regions which 
Fig. 10. Diagrammatic illustration of suggested end-to-end
interaction of normal collagen macromolecules by formation of
[...] chains (at A-B junctions). Abnormal [...] in irradiated
preparations. [...] chains at C illustrates effect of longer
periods of sonic [...]. Note that no C—C linkage occurs [from ...].
Fig. 9. Whole polymeric aggregation types of
SLS aggregates from a solution of calfskin collagen 
treated with sonic irradiation for 240 min. Locus of 
A’ and B’ ends of macromolecules labeled. Arrow 
points to dense band at the junction between macro- 
molecules at the B’ end [from A. J. Hodge and F. O. 
Schmitt, Proc. Natl. Acad. Sci. U. S. 44, 418 (1958)]. 


combine preferentially with phosphotungstic acid) in 
end regions of SLS, it seems clear that certain amino- 
acid side chains (possibly the guanidino groups of
arginine, as suggested by Kühn, Grassmann, and
Hofmann,²² or the ε-amino groups of lysine) may be
concentrated in the end regions of the macromolecule.

Because of the special properties of the end regions 
in end-to-end polymerization, it is important to obtain 
evidence concerning the structure in these regions. 
Clues are afforded by a study of the band fine structure 
in A’—A’ and B’—B’ linkages in the dimeric and 
polymeric forms. As shown in Fig. 9, the first bands at 
the A’ ends are separated by a region, about 100 A 
long, which is a typical interband (i.e., shows no dense 
band, hence presumably contains relatively few side 
chains reacting with the phosphotungstic-acid “stain”). 
In the case of the B’— B’ junctions, however, the separ- 
ation between the first bands is about 180 A, and a 
darkly staining band occurs in the middle of the 
junctional region. 

This behavior is highly suggestive concerning the 
nature of macromolecular ends and of end-to-end pol- 
ymerization of macromolecules as follows: (1) chain 
appendages may occur at both ends of the native TC 
macromolecule; (2) these appendages may have lengths 
of about 100 A and 200 A at the A and B ends, respec- 
tively; (3) the amino-acid composition of the terminal 
chain appendages resembles typical interband regions— 
i.e., lacking concentrations of basic amino-acid side 
chains thought to characterize the band regions (except 
for a portion of the B end as mentioned in the 
foregoing). 

Such considerations led Hodge and Schmitt³⁷ to
suggest that normal end-to-end polymerization of TC 
monomers may involve a specific type of coiling of the 
terminal chains at A and B ends about each other to 
form a highly ordered, possibly helical, structure (see 
Fig. 10). The “coiling energy” of such interaction in
fact may be represented by the difference between the
thermal shrinkage and denaturation temperatures
(found by Doty and Nishihara¹⁵ to be constant and
equal to 29°C in three different types of collagen). It
seems probable that such end-chain interaction involves
primarily rather weak hydrogen bonding. This would
be consistent with the findings of Gross” that thermal 
gelation of salt solutions of collagen is inhibited by
hydrogen-bond breakers, such as urea. In addition to 
such weak bonds, more stable bonds may be formed 
between terminal chains in the collagens of certain 
kinds of connective tissue, such as those which resist 
acid solution. 


DISCUSSION 


It is apparent from the behavior of TC macromole- 
cules already described that the interaction between 
these macromolecules and the various types of ordered 
structures that result from such interaction is deter- 
mined by two important factors: 




(1) The specific properties and reaction potentialities 
that are built into the macromolecules by virtue of the 
linear sequence of amino-acid residues in the covalent 
chains, the number of chains in the macromolecule and 
the specific type of coiling of these chains. 

(2) The chemical environment of the macromole- 
cules which may evoke one or another of the various 
types of interaction patterns made possible by virtue 
of the internal structure of the macromolecules. 


A few fibrous proteins other than collagen have been 
investigated along similar lines and results consistent 
with the foregoing conclusions obtained. Thus, Hodge 
(p. 409) found paramyosin to be a long (ca 1400 A) 
macromolecule capable of forming an FLS type of 
structure when packed in antiparallel array, though 
in the native fiber giving rise to a 145-A repeat or a 
larger period five times this long. Tropomyosin also 
appears to behave somewhat similarly (Hodge³⁸). Also,
the repeat pattern in fibrin fibrils is considerably less
than the length of the fibrinogen molecules (Hawn and
Porter³⁹ and Hall⁴⁰), and it has been suggested that the
latter are staggered with respect to molecular ends
(Ferry et al.⁴¹).

Glimcher, Hodge, and Schmitt⁴² presented evidence
in support of the concept that the initiation of
mineralization involves the nucleation of inorganic crystals by
a precise juxtaposition of groups in the organic matrix
to form specific stereochemical arrays. In the nucleation 
of hydroxyapatite in calcification, only when the puri- 
fied TC macromolecules were precipitated as native- 
type fibrils (640-A repeat) did the system induce the 
formation of hydroxyapatite crystals when exposed to 
otherwise stable solutions of calcium and phosphate 
ions. If the TC macromolecules were allowed to assume
other types of aggregation patterns, no nucleation
occurred. In the most general case, an example is seen
here of how specific macromolecular interaction serves 
to govern very fundamental processes, such as that of 
the deposition of inorganic material of specific crystal- 
line form. 

If macromolecules also possess enzymatically active 
sites, it is obvious that even more-complex interaction 
behavior may occur, particularly if products of the 
enzyme reaction themselves strongly influence the be- 
havior of one or more macromolecular species in the 
system (e.g., ATPase in myosin). 

Crucial clues to many vital biological processes, such 
as those briefly mentioned at the beginning of this 
paper, eventually may be found if such biophysical and 
biochemical properties of elongate macromolecules are 
kept in mind. 


BIBLIOGRAPHY 


1 W. A. Engelhardt and M. N. Ljubimova, Nature 145, 668 (1939).
2 W. A. Engelhardt, Bull. Acad. Sci. USSR Ser. biol. 5, 182 (1945).
3 P. Weiss, Int. Rev. Cytol. 7, 391 (1958).
4 K. H. Gustavson, The Chemistry and Reactivity of Collagen (Academic Press, Inc., New York, 1956).
5 J. H. Highberger in The Chemistry and Technology of Leather, F. O'Flaherty, W. T. Roddy, and R. M. Lollar, editors (Reinhold Publishing Corporation, New York, 1956), p. 65.
6 R. E. Tunbridge, editor, Connective Tissue: A CIOMS Symposium (Blackwell Scientific Publications, Oxford, England, 1957).
7 G. Stainsby, editor, Recent Advances in Gelatin and Glue Research (Pergamon Press, London, 1958).
8 F. O. Schmitt, C. E. Hall, and M. A. Jakus, J. Cellular Comp. Physiol. 20, 1 (1942).
9 W. T. Astbury, J. Int. Soc. Leather Trades’ Chemists 24, 69 (1940).
10 J. T. Randall, J. Soc. Leather Trades’ Chemists 38, 362 (1954).
11 F. H. C. Crick and A. Rich, Nature 176, 780 (1955).
12 A. Rich and F. H. C. Crick, Nature 174, 915 (1955).
13 F. H. C. Crick and A. Rich, see reference 7, p. 20.
14 P. Gallop (personal communication, 1958).
15 P. Doty and T. Nishihara, see reference 7, p. 92.
16 V. N. Orekhovitch and V. O. Shpikiter, Science 127, 1371 (1958).
17 R. S. Bear, O. E. A. Bolduan, and T. P. Salo, J. Am. Leather Chemists’ Assoc. 46, 107 (1951).
18 R. S. Bear, J. Am. Chem. Soc. 64, 727 (1942).
19 R. S. Bear, J. Am. Chem. Soc. 66, 1297 (1944).
20 R. S. Bear and R. S. Morgan, see reference 6, p. 321.
21 R. S. Bear in Advances in Protein Chemistry, M. L. Anson, K. Bailey, and J. T. Edsall, editors (Academic Press, Inc., New York, 1952), Vol. VII, p. 69.
22 K. Kühn, W. Grassmann, and U. Hofmann, Naturwissenschaften 44, 538 (1957).
23 F. O. Schmitt, J. Gross, and J. H. Highberger in Fibrous Proteins and Their Biological Significance, Symposia Soc. Exptl. Biol. 9, 148 (1955).
24 F. O. Schmitt, Proc. Am. Phil. Soc. 100, 476 (1956).
25 J. Gross, J. Biophys. Biochem. Cytol. 2, 261 (1956).
26 S. G. Tomlin and C. R. Worthington, Proc. Roy. Soc. (London) A235, 189 (1956).
27 H. Boedtker and P. Doty, J. Am. Chem. Soc. 78, 4267 (1956).
28 C. E. Hall, J. Biophys. Biochem. Cytol. 2, 625 (1956).
29 C. E. Hall, Proc. Natl. Acad. Sci. U. S. 42, 801 (1956).
30 C. E. Hall and P. Doty, J. Am. Chem. Soc. 80, 1269 (1958).
31 [...] Orekhovitch, A. A. Tustanowsky, K. D. Orekhovitch, [...] Plotnikova, Biokhimija 13, 55 (1948).
32 [...] Harkness, A. M. Marko, H. M. Muir, and A. Neuberger, [...] J. 56, 558 (1954).
33 [...] Jackson, see reference 6, p. 62.
34 [...] Gross, J. H. Highberger, and F. O. Schmitt, Proc. Natl. [...] S. 41, 1 (1955).
35 [...] Gross, see reference 6, p. 45.
36 T. Nishihara and P. Doty, Proc. Natl. Acad. Sci. U. S. 44, [...].
37 A. J. Hodge and F. O. Schmitt, Proc. Natl. Acad. Sci. U. S. 44, 418 (1958).
38 A. J. Hodge, Proc. Natl. Acad. Sci. U. S. 38, 850 (1949).
39 C. V. Z. Hawn and K. R. Porter, J. Exptl. Med. 86, 285 (1947).
40 C. E. Hall, J. Biol. Chem. 179, 857 (1949).
41 J. D. Ferry, S. Katz, and J. Tinoco, J. Polymer Sci. 12, 509 (1954).
42 M. J. Glimcher, A. J. Hodge, and F. O. Schmitt, Proc. Natl. Acad. Sci. U. S. 43, 860 (1957).




REVIEWS OF MODERN PHYSICS    VOLUME 31, NUMBER 2    APRIL, 1959


Molecular Biology of Mineralized Tissues with
Particular Reference to Bone*


MELVIN J. GLIMCHER†


Department of Biology, Massachusetts Institute of Technology, Cambridge 39, Massachusetts 


INTRODUCTION 


THERE are several reasons for presenting the topic
of biological mineralization. In the first place, the 
organization of these tissues, from the macromolecular 
level to the level of the macroscopic tissue elements, 
offers superb examples both of structural order in bio- 
logical systems and of the relation between tissue archi- 
tecture and tissue function. Secondly, the process of 
mineralization illustrates the way in which the physio-
logical function of a tissue or organ may be interpreted 
on the basis of the molecular structure and macro- 
molecular organization of its components, and in terms 
of fairly well-characterized physicochemical phenomena. 
Thirdly, many of the unsolved phenomena in mineral- 
ization provide challenging problems, particularly for 
those with training in the physical sciences. 

Certain tissues of the organism must perform a 
variety of mechanical functions, such as the mainte- 
nance of the form and shape of the organism against 
the forces of gravity, and the protection of certain 
delicate organs (or the organisms themselves) by en- 
closing them in rigid vaults. As sites of attachments for 
muscles and by virtue of articulations, they also provide 
the organism with a system of movable but structurally 
rigid levers. Thus, certain organisms have not only 
definite form and shape, but also flexibility and a means 
for locomotion and prehension. 

Nature has devised a number of different ways of 
differentiating these specialized tissues. In the case of 
certain insects, the polymeric chitinous-protein procu- 
ticle becomes highly crosslinked providing the structural 
properties required for the exoskeleton. In the case of 
certain Elasmobranchii, such as the shark, a fibrous, 
gel-like structure, cartilage, provides both resiliency and 
structural rigidity. A third method, the subject of this 
paper, is the deposition of a substantial amount of 
inorganic crystals within an organic matrix: tissue 
mineralization. This process is widespread in biology 


* These studies were aided by research grants E-1469 (C1)
from the National Institute of Allergy and Infectious Diseases,
and A-2317 from the National Institute of Arthritis and Metabolic
Diseases of the National Institutes of Health, Public Health
Service, U. S. Department of Health, Education, and Welfare,
and by research grant 12A from the Orthopedic Research and
Education Foundation. Many of the major concepts involved have
already been dealt with in an earlier paper [M. J. Glimcher,
A. J. Hodge, and F. O. Schmitt, Proc. Natl. Acad. Sci. U. S. 43,
860 (1957)]. The detailed evidence and documentation for these
concepts are currently in the press.

† Special Postdoctoral Research Fellow of the National Heart
Institute, National Institutes of Health, Public Health Service, 
U. S. Department of Health, Education, and Welfare. 


(Table I), both in the plant and in the animal kingdoms, 
ranging from the most primitive to the most highly 
ordered species. Classical examples are the exoskeletons 
of certain marine mollusks (the shells of clams, oysters, 
etc.) and the endoskeletons of the vertebrates (calcified 
cartilage, and bone). Also, a combination of methods is 
used, such as the crosslinking of the chitinous-protein 
shell of the lobster and the collagen matrix of bone, 
both of which are also mineralized. 

The inorganic crystals not only serve to confer new 
structural properties on the tissues, but also provide a 
storehouse of inorganic ions which may be used to help 
maintain the constancy of the ionic environment of the 
organism. 

Space does not permit the discussion of the molecular 
biology of all of the mineralized tissues. Illustrated here 
are some of the major points and problems in one 
representative tissue, bone. 


COMPOSITION AND MOLECULAR STRUCTURE OF 
THE MAJOR COMPONENTS OF BONE 


Analytically, on a dry-weight basis, bone consists 
roughly of 65 to 70% of the inorganic crystals of the 
calcium-phosphate salt, apatite, and 30 to 35% of 
organic matrix of which collagen makes up the major 
fraction (95 to 99%).! The collagen in bone appears to 
be structurally similar to that in other tissues, although 
it is probably highly crosslinked.* The other components 
in the organic matrix include a number of ill-defined 
proteins, as well as the acid mucopolysaccharides, and 
constitute part of the ground substance. Some of the 
mucopolysaccharides probably are present as muco- 
proteins (noncollagenous protein complexes). The exact 
anatomical location of these components at a macro- 
molecular level is not certain, and their state of aggre- 
gation and polymerization is not well known. 
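
The composition figures above imply a range for collagen as a fraction of the whole dry bone. The short Python sketch below simply multiplies the quoted percentages; the function name is hypothetical and the pairing of lower with lower and upper with upper bounds is an assumption made only to bracket the result.

```python
# Arithmetic sketch: collagen as a fraction of dry bone weight, using the
# ranges quoted in the text (30 to 35% organic matrix, 95 to 99% collagen).


def collagen_fraction_of_dry_bone(organic_fraction, collagen_share):
    """Fraction of dry bone weight contributed by collagen."""
    return organic_fraction * collagen_share


low = collagen_fraction_of_dry_bone(0.30, 0.95)
high = collagen_fraction_of_dry_bone(0.35, 0.99)
print(f"collagen: {100 * low:.1f}% to {100 * high:.1f}% of dry weight")
```

On these figures collagen accounts for roughly 28 to 35% of the dry weight, so the remaining organic components amount to well under 2% of the dry tissue.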

Although it had been known for nearly two hundred 
years that bone contains calcium and phosphate, it was 
not until 1926 that DeJong demonstrated by x-ray 
diffraction that the crystal structure was similar to 
that of the apatites, more specifically hydroxyapatite 
[Ca₁₀(PO₄)₆(OH)₂], the structure of which is
illustrated in Fig. 1. The exact nature of the apatite in bone,
however, is still being debated." This is partly because 
of the very broad x-ray diffraction reflections resulting 
from the extremely small crystal size, preventing 
crystallographic differentiation between a number of 
very similar proposed structures.


TABLE I. Examples of biologically mineralized tissues.

Species        Tissue mineralized   Crystal chemistry   Mineral form         Major organic matrix components
Plants         Cell wall            CaCO₃               Calcite              Cellulose, pectins, lignin
Radiolarians   Exoskeleton          SrSO₄               Celestite            (?)
Diatoms        Exoskeleton          Silica              (?)                  Pectins
Mollusks       Exoskeleton          CaCO₃               Calcite, aragonite   Chitin, protein
Arthropods     Exoskeleton          CaCO₃               Calcite, aragonite   Chitin, protein
Vertebrates    Endoskeleton
                 Bone               Ca₁₀(PO₄)₆(OH)₂     Hydroxyapatite       Collagen
                 Cartilage          Ca₁₀(PO₄)₆(OH)₂     Hydroxyapatite       Collagen, acid mucopolysaccharides
               Tooth
                 Dentin             Ca₁₀(PO₄)₆(OH)₂     Hydroxyapatite       Collagen
                 Cementum           Ca₁₀(PO₄)₆(OH)₂     Hydroxyapatite       Collagen
                 Enamel             Ca₁₀(PO₄)₆(OH)₂     Hydroxyapatite       Eukeratin


Further difficulties arise because the stoichiometry
of the bone mineral (and many synthetically prepared
apatite crystals) departs from the theoretical value of 
hydroxyapatite (lower Ca/P ratio). Because of their 
extremely small size and, therefore, large surface area, 
attempts have been made to explain this nonstoichi- 
ometry on the basis of the surface adsorption of excess 
phosphate on stoichiometric hydroxyapatite.!!3 Aside 
from a number of theoretical objections to this explana- 
tion, direct experimental evidence has not verified this 
hypothesis." 

Other investigators have felt that lattice defects 
account for the aberrant stoichiometry. In most 
instances, these investigators have proposed that cal- 
cium atoms were missing from the atomic structure 
(either internally or on the surface) and were replaced 
either by other cations, water, or by protons (as hydro- 
gen bonds between the oxygens of orthophosphate 
groups and protons). The latter have been referred to as 
defect apatites. However, defect crystals in which 
there are vacancies in certain lattice positions are ones 
in which the order of magnitude of these “holes” or 
“defects” is possibly one per thousand or ten-thousand 


Fig. 1. [...] of the constituents of hydroxyapatite
[...] the 001 plane. The numbers refer to the
height in the unit cell of the atoms in the plane
perpendicular to the paper (c-axis), as reported by Carlström [from
[...] Clin. Orthoped. 9, 5 (1957)].
atoms. To satisfy the requirements in this particular
case, however, one or even two calcium atoms out of 
every ten would have to be absent from the lattice 
structure. It is very doubtful that so many calcium 
atoms could be absent or substituted for without some 
structural change in the lattice. 
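
The scale of the problem can be made concrete by computing the Ca/P ratio directly from the formula Ca₁₀(PO₄)₆(OH)₂. The sketch below is illustrative: the function names are hypothetical, and treating "one per thousand" as one calcium site in a thousand is a simplifying assumption (the text speaks of atoms in general).

```python
from fractions import Fraction

# Sketch: Ca/P molar ratio of hydroxyapatite, Ca10(PO4)6(OH)2, when some
# calcium atoms are missing. Phosphate content is assumed unchanged.

CA_PER_FORMULA = 10
P_PER_FORMULA = 6


def ca_p_ratio(missing_ca=0):
    """Exact Ca/P ratio with a whole number of Ca missing per ten sites."""
    return Fraction(CA_PER_FORMULA - missing_ca, P_PER_FORMULA)


def ca_p_ratio_dilute(vacancy_fraction):
    """Ca/P ratio when a small fraction of all Ca sites is vacant."""
    return CA_PER_FORMULA * (1.0 - vacancy_fraction) / P_PER_FORMULA


ratios = {n: ca_p_ratio(n) for n in (0, 1, 2)}
dilute = ca_p_ratio_dilute(1e-3)   # ordinary defect level, assumed per Ca site

for n, r in ratios.items():
    print(f"{n} Ca missing per 10: Ca/P = {r} = {float(r):.3f}")
print(f"1 vacancy per 1000 Ca sites: Ca/P = {dilute:.4f}")
```

An ordinary defect concentration barely moves the ratio from 5/3, whereas one or two missing calcium atoms in ten lower it to 3/2 or 4/3, which is the size of departure the passage regards as implausible without a structural change.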

Such structural changes have been postulated in the 
case where water or hydronium ions replace Ca atoms 
in the lattice structure. Experiments have shown (1) 
that a considerable amount of water is lost both from 
bone apatite crystals and in vitro prepared apatite 
crystals when subjected to progressively increasing 
temperatures after initial dehydration, and (2) 
that apatite crystals precipitated from aqueous solu- 
tions have associated with them an amount of water 
many times greater than that adsorbed by initially dry 
crystals from a vapor phase of water.!® This “excess” 
water cannot be separated from the crystals by me- 
chanical centrifugation at 80 000 g.!6 

Since the structure of hydroxyapatite allows for no 
water of crystallization, several explanations have been 
offered to account for this “excess” water. 

Thus, the substitution of water (or hydronium ions) 
for Ca atoms, would not only explain the low Ca/P 
ratios, but would also be consistent with the data on 
water discussed above. 

The objection that such a relatively large replace- 
ment of Ca atoms would lead to structural changes in 
the lattice (vide supra) is acknowledged by one group 
which believes that there are structural differences be- 
tween hydroxyapatite and the bone apatite crystals, 
and refers to the latter as a-tricalcium phosphate or 
hydrated tricalcium phosphate without specifying 
either the exact differences in the structural parameters 
or the position of the water in the new structure.2-"" 

Considering the water as specifically replacing
columnar calcium atoms in the apatite lattice, it has been
proposed also that certain low-ratio apatites are layered 
structures consisting of “sheets” of hydroxyapatite held 
together by hydrogen bonds between the phosphate 
groups and water.!8 

This layered apatite structure, referred to as
octocalcium phosphate (OCP), gave an x-ray diffraction
pattern almost identical to that of “pure” hydroxyapatite
except for several additional reflections,*!7
presumably thought to be owing to the superlattice effect
of the layered water. 

On the other hand, the “excess” water (particularly 
in the case of in vitro precipitated apatite crystals in
aqueous solutions) has also been postulated to be the 
result of a large hydration shell composed of about 60 
molecular layers of water “bound” to the crystals be- 
cause of an electrical-field asymmetry of the crystal 
surfaces.!® This amount of water is many times greater 
than that adsorbed by initially dry crystals from a 
vapor phase of water and consists of many more molecu- 
lar layers of water than usually considered possible for 
surfaces to bind as a result of electrical-field effects. 

It appears possible from the data on the replacement 
of calcium atoms by water (or hydronium ions) that, 
when apatite crystals are precipitated in aqueous solu-
tion, part of the “excess” water may be accounted for 
by its incorporation into the crystal lattice, resulting in a 
phase change of the solid. 

Another factor in determining the amount of water 
associated with apatite crystals, particularly those pre- 
cipitated in aqueous solutions, is the size, shape, and 
interaction properties of the crystals which are discussed 
later in this paper (page 386). Recently, however, 
single crystals of OCP have been prepared, and it has 
been shown to be a distinct compound whose structure, 
while closely related, is not identical to hydroxyapa- 
tite. These investigators also felt, however, that OCP 
was also a layered structure with the layers separated 
by water molecules. 

Dehydration of OCP caused a characteristic x-ray 
reflection (18.4 A) to shift to progressively lower spac- 
ings and to disappear completely upon elimination of 
approximately two-thirds of the water. The crystals 
then showed a typical apatite pattern, indicating that, 
with loss of a certain amount of water, the OCP struc- 
ture is unstable and a true phase change occurs from 
OCP to apatite. 
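
To see where an 18.4 A reflection falls in a diffraction pattern, and which way it moves as the spacing shrinks, Bragg's law (nλ = 2d sin θ) can be applied. The sketch below is an assumption-laden illustration: the text does not state the radiation used, so the Cu Kα wavelength of 1.5418 A is assumed here, the 17.0 A comparison spacing is an arbitrary example, and the function name is hypothetical.

```python
import math

# Sketch: Bragg angle for a given lattice spacing.
# Assumption: Cu K-alpha radiation (1.5418 A); the source does not say
# which wavelength was used.

CU_K_ALPHA_A = 1.5418


def bragg_two_theta_deg(d_spacing_a, wavelength_a=CU_K_ALPHA_A, order=1):
    """Scattering angle 2-theta in degrees from n*lambda = 2*d*sin(theta)."""
    sin_theta = order * wavelength_a / (2.0 * d_spacing_a)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("no diffraction possible for this spacing and order")
    return 2.0 * math.degrees(math.asin(sin_theta))


two_theta_ocp = bragg_two_theta_deg(18.4)      # reflection quoted in the text
two_theta_smaller = bragg_two_theta_deg(17.0)  # arbitrary smaller spacing

print(f"2-theta for 18.4 A: {two_theta_ocp:.2f} deg")
print(f"2-theta for 17.0 A: {two_theta_smaller:.2f} deg")
```

Under the assumed wavelength the 18.4 A reflection lies at a low angle, a little under 5 degrees in 2θ, and a shift to lower spacings on dehydration corresponds to a shift of the reflection to higher angles.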

Octocalcium-phosphate (OCP) crystals prepared in 
the laboratories of A. S. Posner, American Dental As- 
sociation,” Research Division, National Bureau of 
Standards, have been well characterized by x-ray dif- 
fraction and have shown minor differences both in the 
lattice spacings and intensities of the reflections as 
compared to those of hydroxyapatite. These OCP crys- 
tals have been examined in this laboratory by electron- 
microscopy and electron diffraction. The electron- 
diffraction patterns were indistinguishable from those 
of hydroxyapatite indicating that, under the conditions 
of vacuum and temperature in the electron microscope, 
a phase change had occurred from an OCP lattice to an 
apatite lattice by the dehydration process, confirming 
the report” referred to earlier. 

Another problem is the determination of the position 
and state of aggregation of carbonate, which has been 


postulated to exist in the crystal lattice per se (as a
carbonate apatite), as a separate phase (calcium and
magnesium carbonate) or adsorbed on the crystal sur- 
faces as carbonate ions.™ 10.21.22 

These examples illustrate that, although the evidence 
is quite substantial that the lattice structure of the 
inorganic crystals in bone is either that of hydroxyapa- 
tite or of something similar, the intimate details such as 
the relationship between OCP, other hydrated calcium 
phosphate compounds, defect apatites, hydroxyapatite, 
and the structure and nonstoichiometry of the bone 
crystals and synthetically prepared “apatite” crystals 
are obviously not yet clear.


ANATOMICAL STRUCTURE 


Figure 2(a) demonstrates the gross appearance of the 
upper end of a femur. The honeycomb-like appearance 
of the head and metaphysis is referred to as the spon- 
giosa and is composed of delicate spicules of bone called 
trabeculae. The bone in the cortex of the shaft is much 
more densely packed and is referred to as compact bone. 
In both cases, the structure of the adult bone consists 
of a series of layers or lamellae. In the cortex of many 
long bones, the lamellae are further arranged concen- 
trically around a central canal forming hollow “cylin- 
ders” containing small blood vessels. These small 
cylinders of bone are the basic units of such compact 
bone and are called osteones or Haversian Systems 
(Fig. 3). They are longitudinally oriented in the general 
direction of the long axes of the long bones (Fig. 4). 

The collagen fibers in a lamella are arranged in small 
bundles which encircle the canal in continuous spirals 
crossing one another and resulting in a trellis-like 
arrangement.”*4 Although the general direction of the 
bundles in the same lamella is similar, it varies from 
one lamella to the next, giving a characteristic appear- 
ance when viewed in polarized light.” This arrangement 
of the fibers within any one lamella and in consecutive 
lamellae imparts maximum structural properties to the 
tissue. 

The inorganic crystals of apatite are deposited in this 
highly organized and ordered matrix of collagen and 
ground substance. Their distribution can be visualized 
by the use of microradiography (Fig. 5). Note the in- 
homogeneity, not only with respect to the entire section, 
but even within any one Haversian system. 

The trabeculae, or “bone girders” of the spongiosa 
are oriented functionally in that they closely parallel 
the trajectories of maximum stress [compare Figs. 2(a) 
and 2(b) ]. This allows the structure to resist mechanical 
stress and strain in the most efficient manner and with 
the greatest economy of material in accordance with 
sound engineering principles. 

The size, number, disposition, and orientation of 
trabeculae of the spongiosa also can change in response 
to altered mechanical demands. This functional adapta- 
tion of bone to mechanical stress was emphasized first 
by Wolff in 1892 (Wolff’s law), yet, to date, no adequate
explanation of the mechanism has been offered.

The structural order evident in the arrangement of
the macroscopic trabeculae and in the microscopic 
Haversian systems of bone is also evident at a lower 
order of magnitude. In Fig. 6, an electron micrograph 
of a longitudinal section of cortical bone, the collagen 
fibrils are well oriented and the 640-A axial repeat of 
the fibril is accentuated in many areas by the specific 
location of the dense inorganic crystals. The crystals 
are well oriented with their long axes parallel to the 
collagen fibrils and intimately related to them. Electron 
diffraction (Fig. 7) confirms that it is the crystallogra- 
phic c-axis of the inorganic crystals which is parallel to 
the fibril axis of the collagen. 

Most workers have felt that the inorganic crystals of 
adult bone were in the ground substance between the 
collagen fibrils,?’:*5 although more recent work indicates 
that some of the crystals might be within the fibrils.” 
Interpretation of electron micrographs has been difficult
because of the dense packing of the collagen fibrils
(Fig. 5), making it impossible to say with certainty
whether the crystals were within or on the surface of
the collagen fibrils, or just generally associated and
oriented with them by virtue of the over-all organiza- 
tion, orientation, and dense packing of the fibrils in the 
tissue. Our laboratory has recently resolved this issue 
by studying bone in which the collagen fibrils are widely 
separated, and it is possible to see the relationship 
between individual fibrils and individual crystals by 
high-resolution electron microscopy. 

Figure 8, an electron micrograph at low magnifica- 
tion, indicates the regular arrangement of the collagen 
fibrils in alternate layers and the relative looseness in 
the packing of the individual fibrils. It is quite obvious 
that the crystals are within the fibrils and not in the 
intervening ground substance. Figures 9-12 are longi- 
tudinal and cross sections at higher magnifications, and 
more clearly emphasize the position of the inorganic 
crystals within the collagen fibrils. That this is not an 
initially random crystal precipitation between the 
fibrils, later reorganized within the fibrils by recrystalli- 
zation, is evident from an examination of electron 
micrographs of the earliest stages of calcification in
embryonic bone.**! Figures 13 and 14 are electron
micrographs of the initial stage of calcification in
embryonic bone. These clearly show that the earliest
crystals are regularly spaced and deposited within the
collagen fibrils.

Fig. 2(a). Coronal section of the upper end of a human femur
[from D. W. Fawcett, in Histology, R. Greep, editor (The Blackiston
Company, New York, 1954), p. 133; reproduced from original kindly
supplied by the author]. Figure labels: spongiosa; compacta.

The location, distribution, and orientation of the 
crystals, by virtue of their position within the fibrils, 
provide the most efficient arrangement for effectively 
resisting mechanical stresses. 

The extremely small size of the crystals (200 to 400 A 
X15 to 30 A) and the fact that one and probably two 
of their dimensions consist of only a few unit cells, 
result not only in a tremendous surface area, but also 
in a large percentage of the atoms being in surface or 
near-surface positions. This easy accessibility to the 
crystal interior allows for a rapid exchange of ions 
between the crystals and the interstitial fluids.
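
The geometric argument above can be made concrete with a rough
calculation. The following Python sketch treats a crystal as a
square rod and estimates what fraction of its volume lies within
a thin shell at the surface. Only the size range (200 to 400 A by
15 to 30 A) comes from the text; the rod idealization, the function
names, and the 5-A shell depth are illustrative assumptions.

```python
def surface_to_volume(length_a: float, width_a: float) -> float:
    """Surface-to-volume ratio (1/A) of a rod: length x width x width."""
    area = 2.0 * width_a**2 + 4.0 * width_a * length_a
    volume = width_a**2 * length_a
    return area / volume


def near_surface_fraction(length_a: float, width_a: float,
                          shell_a: float = 5.0) -> float:
    """Fraction of the rod volume lying within shell_a angstroms of a face.

    shell_a is an assumed depth for "surface or near-surface" positions;
    it is not a value given in the text.
    """
    inner_width = max(width_a - 2.0 * shell_a, 0.0)
    inner_length = max(length_a - 2.0 * shell_a, 0.0)
    inner_volume = inner_width**2 * inner_length
    return 1.0 - inner_volume / (width_a**2 * length_a)


if __name__ == "__main__":
    # Dimensions spanning the range quoted in the text (angstroms).
    for length_a, width_a in [(200.0, 15.0), (300.0, 25.0), (400.0, 30.0)]:
        ratio = surface_to_volume(length_a, width_a)
        fraction = near_surface_fraction(length_a, width_a)
        print(f"{length_a:.0f} A x {width_a:.0f} A: "
              f"S/V = {ratio:.3f} per A, near-surface fraction = {fraction:.2f}")
```

Under these assumptions, well over half of the material in a rod of
this size lies within a few angstroms of a face, which is the sense in
which the crystal interior is readily accessible to the surrounding
fluid.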

In summary, starting with the major components 
themselves (collagen and apatite) which are ordered 
materials, the organization of the tissue, from a macromolecular
level to the arrangement of the macroscopic
trabeculae, constitutes a highly ordered, well-organized 
structure, superbly adapted both mechanically and 
chemically to perform its biological functions. 


MECHANISM OF CALCIFICATION 


Introduction 


The many theories of calcification may be classified, 
in general, into two groups. The first, referred to as the 
“booster” theory,!® proposed that a specific enzyme (or 
enzymes) exists in the areas undergoing calcification 
and that it splits off inorganic phosphate from an 
organic substrate.” This local “boosting”? of the con- 
centration of phosphate then exceeds the level of spon- 
taneous precipitation and results in crystallization. 

A second concept, first proposed in 1921% and recently
revived, suggests that the organic matrix of calcifiable
tissues initiates the formation of crystals. Recent
biochemical evidence” and the inadequacy of the “booster”
theory in explaining the precise localization of the
crystals at an ultrastructural level strongly support the
role of the organic matrix in calcification. Opinion as to
what substance (or substances) in the organic matrix is
responsible for the induction of crystallization and the
nature of the mechanism of induction has varied.**-*

Because of the complexities in biological mineralization,
it is important that the physicochemical nature of
the mineralization phenomenon be clearly defined before
proceeding to the experimental evidence and hypothesis
which follow. Crystallization, or the formation of inorganic
crystals from solutions where none previously
existed, represents a phase change. This physical change
in state can be divided arbitrarily into crystal nucleation,
the process of forming the initial fragments of the new
phase, and crystal growth, the subsequent growth of
these fragments into clearly defined crystals. In addition,
the phenomena of recrystallization (the growth of
large crystals at the expense of smaller ones) may also
play a role even after the solid state has been achieved.
Failure to distinguish these interrelated but separate
phenomena and failure to differentiate between the
many regulatory processes controlling them and their
underlying mechanisms can be confusing. A brief review
of some of the thermodynamic and kinetic principles
related to phase transition is given in the Appendix at
the end of this paper.

Fig. 2(b). Diagram of the lines of stress in the upper femur
(after Koch) [from Gray's Anatomy, W. H. Lewis, editor
(Lea and Febiger, Philadelphia, 1942), twenty-fourth edition,
Fig. 245]. The legend distinguishes compression, tension,
combined compression and tension, and the neutral axis.

Fig. 3. Cross section of decalcified compact bone X80. Note the
alternating dark and light layers in the Haversian systems and
interstitial lamellae attributable to consecutive changes in
collagen-fiber orientation [from D. W. Fawcett in Histology,
R. Greep, editor (The Blackiston Company, New York, 1954),
p. 134; reproduced from original kindly supplied by the author].
Figure labels: Haversian systems; Haversian canal; interstitial
lamellae.

Certain misconceptions concerning the formation
of apatite crystals from solution are clarified by the
phenomenological description in the Appendix regarding 
the manner in which phase changes occur. 

It has been assumed®!64%44 that, in the precipitation
of apatite crystals, the formation of brushite
(CaHPO4·2H2O) must occur first, later to be hydrolyzed
or otherwise converted to apatite [Ca10(PO4)6(OH)2].
This assumption has been based on the incorrect premise 
that an 18-body collision is required for hydroxyapatite 
formation whereas only a two-body collision is required 
for brushite formation. 
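
The figures of two and 18 in this premise are simply the numbers of
ions in one empirical formula unit of each solid. A minimal Python
sketch of that count follows; the dictionary layout and names are
illustrative assumptions, and the count says nothing about how the
crystals actually form.

```python
# Ions per empirical formula unit, written as {ion: count}.
# Water of hydration in brushite is not an ion and is left out.
BRUSHITE = {"Ca": 1, "HPO4": 1}                  # CaHPO4.2H2O
HYDROXYAPATITE = {"Ca": 10, "PO4": 6, "OH": 2}   # Ca10(PO4)6(OH)2


def ions_per_formula_unit(formula: dict) -> int:
    """Total number of ions written in one empirical formula unit."""
    return sum(formula.values())


if __name__ == "__main__":
    print("brushite:", ions_per_formula_unit(BRUSHITE))
    print("hydroxyapatite:", ions_per_formula_unit(HYDROXYAPATITE))
```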

Several conceptual errors need clarification. In the 
first place, when calcium and phosphate ions in solution 
aggregate to form a crystal, the change in state which 
occurs is a physical change in state—i.e., a change in the 
state of aggregation of the ions from a solution phase to 
a solid phase. The empirical formulas of the solids in
such cases [CaHPO4·2H2O or Ca10(PO4)6(OH)2] do
not refer to a molecule or molecules of brushite (containing
two ions) or of hydroxyapatite (containing 18 ions), but
merely represent the [...] all of the constituent ions of
the solid phase in terms of [...] crystalline solid containing
thousands or usually millions of ions
arranged in a definite spatial configuration characteristic
for the particular crystal.

Fig. 4. [...] Haversian systems (original magnification X120)
[from J. P. Weinmann and H. Sicher, Bone and Bones
(The C. V. Mosby Company, St. Louis, [...])].

Secondly, even in chemical reactions where a physical 
change of state does not occur, neither the reaction 
order nor the reaction mechanism can be deduced simply 
on the basis of the chemical formulas of the reactants 
or the reaction products. In the nucleation of new 
phases, nuclei probably arise by the stepwise addition 
of single molecules, atoms, or ions**46, and there is no 
obvious association between the empirical formula of 
the solid and the size and composition of the nucleus or 
the order of the reaction. 

Inspection of the empirical formulas of various pos- 
sible solids in no way permits one to predict the prob- 
ability that the formation of one particular solid is more 
likely than another. This, of course, will depend on the 
relative amount of work which is necessary for the 
formation of a cluster of a particular composition, size, 
shape, and configuration under the specific conditions 
of the experiment, i.e., pH, temperature, etc. 
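
One common way to express the work of forming a cluster is the
classical picture of a spherical cluster, in which a surface term
opposes a bulk term. The Python sketch below illustrates that assumed
model only; it is not taken from this paper or its Appendix, and the
interfacial energy and bulk driving force used in the demonstration
are hypothetical numbers chosen for illustration.

```python
import math


def cluster_work(radius: float, gamma: float, dg_volume: float) -> float:
    """Work (J) to form a spherical cluster of the given radius (m).

    gamma     : interfacial energy, J/m^2 (hypothetical value in the demo)
    dg_volume : free energy released per unit volume of new phase, J/m^3
    """
    surface_term = 4.0 * math.pi * radius**2 * gamma
    bulk_term = (4.0 / 3.0) * math.pi * radius**3 * dg_volume
    return surface_term - bulk_term


def critical_radius(gamma: float, dg_volume: float) -> float:
    """Radius at which the work of formation is a maximum."""
    return 2.0 * gamma / dg_volume


def nucleation_barrier(gamma: float, dg_volume: float) -> float:
    """Work of formation at the critical radius."""
    return 16.0 * math.pi * gamma**3 / (3.0 * dg_volume**2)


if __name__ == "__main__":
    gamma, dg_volume = 0.1, 1.0e8      # hypothetical illustrative values
    r_star = critical_radius(gamma, dg_volume)
    print("critical radius (m):", r_star)
    print("barrier (J):", nucleation_barrier(gamma, dg_volume))
```

In this model the barrier depends on the interfacial energy and the
driving force under the conditions of the experiment, and not on the
number of ions written in the empirical formula.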

Fig. 5. A microradiogram of a thin cross section of normal
cortical bone. Note the variation in density (indicating
inorganic-crystal concentration) within any one Haversian
system, as well as within the section as a whole. Reproduced
from original kindly supplied by A. Engström, Karolinska
Institutet, Stockholm, Sweden.

Fig. 6. Electron micrograph of a longitudinal section of compact
bone. Note the accentuation of the characteristic axial repeat
of the collagen fibrils in some areas by the small inorganic crystals,
whose long axes approximately parallel the collagen-fiber axis.
X100 000.

Fig. 7 (lower right-hand corner). Selected-area electron diffraction
of the specimen in Fig. 6, showing characteristic pattern of
apatite. Arcing of the 002 and 004 reflections indicates that the
crystallographic c-axes of the crystals are oriented parallel to the
collagen-fiber axis, and, therefore, correspond to the long axes of
the crystals.

Some simple examples illustrate the foregoing points.
The formation of ice from water may be written
H2O (liquid) → H2O (solid). It is obvious that one is not
dealing with single “molecules” of ice, but with a solid
containing many water molecules in a specific steric
configuration. It is also apparent that neither the size
of the critical cluster nor the reaction kinetics can be
deduced, either from the chemical formula or from the
equation which describes the change in state.
Physicochemical evidence* suggests that the nucleus which
initiates this phase change is composed of about 80 to
100 water molecules. These nuclei do not form by the
simultaneous collision of 80 to 100 water molecules, but
are believed to arise as the result of the stepwise addition
of single water molecules. The reaction is, therefore,
bimolecular in mechanism, and of the 80th to 100th
order.
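
The distinction between mechanism (each step bimolecular) and
over-all order (set by the size of the nucleus) can be illustrated
numerically. The Python sketch below assumes, purely for illustration,
that every addition step is in rapid equilibrium with the same
equilibrium constant, so that the concentration of an n-molecule
cluster varies as the n-th power of the monomer concentration. The
function names and the constant k_step are hypothetical.

```python
import math


def cluster_concentration(monomer: float, n: int, k_step: float = 1.0) -> float:
    """Quasi-equilibrium concentration of an n-molecule cluster.

    The cluster is built by n - 1 successive bimolecular additions of a
    single molecule, each with the same assumed constant k_step.
    """
    return (k_step ** (n - 1)) * monomer ** n


def apparent_order(n: int, c1: float = 1.0, c2: float = 1.1,
                   k_step: float = 1.0) -> float:
    """Slope of log(cluster concentration) against log(monomer concentration)."""
    y1 = math.log(cluster_concentration(c1, n, k_step))
    y2 = math.log(cluster_concentration(c2, n, k_step))
    return (y2 - y1) / (math.log(c2) - math.log(c1))


if __name__ == "__main__":
    for n in (80, 100):
        print(n, "molecule nucleus -> apparent order", round(apparent_order(n), 6))
```

Although no step involves more than two colliding bodies, the apparent
order recovered from the concentration dependence equals the number of
molecules in the nucleus.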

As a second example, the formation of sodium chloride
crystals from solution may be written as Na+ (ion,
aqueous) + Cl- (ion, aqueous) → NaCl (solid). This is a
physical change in state, i.e., a change in the state of
aggregation of the sodium and chloride ions from ions
in solution to ion constituents of a crystal. In this solid,
there are no NaCl “molecules,” but an array of Na+
and Cl- ions, with each sodium ion equally shared by
six chloride ions and each chloride ion by six sodium
ions. The chemical formula of the solid (NaCl) obviously
does not indicate that this change occurs as a result of
a 2-body collision.

Fig. 8. Electron micrograph of a cross section of fish bone. Note
the alternating direction of the collagen fibers in consecutive layers
and the packing of the fibrils. Even at this low magnification it is
obvious that the dense inorganic crystals are within the fibrils and
not in the intervening ground substance. X58 000.

There appears to be no doubt that, under certain 
physicochemical conditions (low pH, 6.0-6.2), one can 
precipitate the calcium phosphate solid, brushite, and 
that this can in turn be hydrolyzed or otherwise converted
to apatite under the proper physicochemical
conditions (raising the pH, for example). However, one 
cannot infer that, when a calcium-phosphate salt is 
precipitated, the formation of brushite is more probable 
or more likely than apatite, simply because its empirical 
formula contains fewer atoms and ions. 


MECHANISM OF CRYSTAL INDUCTION 


The paper by F. O. Schmitt (p. 349) notes that not
only is the collagen macromolecule a “crystalline”
protein but also the collagen fibril, by virtue of its
ordered aggregation of such macromolecules and high
degree of structural regularity, as seen both by electron
microscopy and by low-angle x-ray diffraction, may
be considered “crystalline.”

These facts and the obvious association between the 
apatite crystals and the collagen fibrils in normally 
calcified tissues recommended investigation of the pos- 
sibility that this intimate anatomical relationship was 
evidence that either the macromolecular structure of 
collagen or the macromolecular aggregation state of
the collagen fibrils was responsible for the induction of 
calcification,” by acting as a catalytic heterogeneity for 
the nucleation of apatite crystals. Other studies had also 
suggested a role for collagen**:*9 or collagen chondroitin 
sulphate complexes in calcification.** #6 

For such a mechanism to be operative, however, the
solutions from which the crystals are formed must be in
metastable equilibrium. It is, therefore, important to
establish the state of equilibrium of the fluids immediately
surrounding the fibrils in relation to the formation
of this new phase (apatite crystals) in vivo. Although it
has been demonstrated that the bulk of the extracellular
interstitial fluid is metastable with respect to
the formation of apatite crystals,‡ the local values
immediately surrounding the collagen fibrils in any
particular tissue might be markedly different because
of active cellular-controlled compositional changes, or
active or passive transport and diffusion phenomena.
Although it is informative to examine the data with
regard to the state of equilibrium of the unaltered
extracellular fluids, not only with reference to the mechanism
of induction but also particularly with regard to
the regulation and control of this process in other
nonmineralized tissues, the immediate objective
was to establish whether or not collagen could induce
the formation of apatite crystals from solutions which
were metastable, and to determine the nature of this
specificity (if any) and its mechanisms. The experiments
discussed in the following were conducted at the
Massachusetts Institute of Technology in collaboration
with A. J. Hodge and F. O. Schmitt, and were designed
to answer these questions.§

Fig. 9. Electron micrograph of fish bone in a region of loosely
packed fibrils such as in Fig. 8. Despite variations in individual
fibril direction, long axes of inorganic crystals are parallel to the
individual collagen fibrils with which they are associated by virtue
of their position within the fibrils. X75 000.

Fig. 10. Higher magnification of an area in Fig. 9. The crystals
appear to be rod-shaped, approximately [...]

Fig. 11. Higher magnification of cross sections of fibrils in fish
bone. The apatite crystals are obviously within the collagen fibrils.
X165 000.

Fig. 12. Cross section of two fibrils in an area similar to Fig. 11.
The crystals have appearance of rods viewed on end and appear to be
hexagonally packed. X430 000.

‡ The ion product (aCa++ × aHPO4--) in the serum of many
vertebrates is considerably higher than the same product in
solutions of equivalent ionic strength, pH, temperature, etc., where
equilibrium has been approached either through precipitation or
through dissolution of apatite crystals.*®:® Since this product
appeared to be critical in determining whether or not precipitation
of a solid phase occurred, it was concluded that serum and
interstitial fluid were supersaturated with respect to the bone mineral.
While these thermodynamic data are not conclusive, direct
experimental evidence supports this supposition. Inorganic
solutions with ion products (aCa++ × aHPO4--) similar to serum were
stable for indefinitely long periods of time, but showed rapid
[...] more solid when exposed to bone mineral‘? or to
[...].48 It would appear, therefore, that the
[...] fluid is metastable with respect
to the formation of apatite crystals.
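
The notion of a metastable solution used here can be stated as a
simple comparison of ion products. The Python sketch below is an
illustration only: the function name, the three-way classification,
the term "labile" for the region of spontaneous precipitation, and the
numbers in the demonstration are assumptions introduced for the
example and are not data from this paper.

```python
def saturation_state(ion_product: float,
                     equilibrium_product: float,
                     spontaneous_product: float) -> str:
    """Classify a solution by its ion product (arbitrary but common units).

    equilibrium_product : ion product of a solution at equilibrium with
                          the solid phase
    spontaneous_product : ion product above which the solid precipitates
                          without any seed or nucleating surface
    """
    if not 0.0 < equilibrium_product < spontaneous_product:
        raise ValueError("expected 0 < equilibrium_product < spontaneous_product")
    if ion_product <= equilibrium_product:
        return "undersaturated or saturated"
    if ion_product < spontaneous_product:
        # Supersaturated, yet stable until a nucleating surface is supplied.
        return "metastable"
    return "labile"


if __name__ == "__main__":
    # Hypothetical values in arbitrary units.
    print(saturation_state(2.0, equilibrium_product=1.0, spontaneous_product=5.0))
```

A solution in the middle category is the kind used in the experiments
described here: it deposits no solid on standing, but can do so when a
suitable nucleating surface is introduced.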

The collagen of many connective tissues can be dis- 
solved in a number of weak acids and neutral buffers 
yielding viscous solutions of the macromolecules (see 
F. O. Schmitt, p. 349). These can subsequently be 
reaggregated and reconstituted into fibrils with the 
typical and characteristic axial repeat and intraperiod 
fine structure of native fibrils. By appropriate treatment 
of the connective tissue prior to dissolving the collagen 
and by several recrystallizations, the collagen fibrils 
can be prepared relatively pure, with only minute traces 
of ground-substance constituents. 

§ Although we are fully aware of the limitations of a model
system which is admittedly a great deal simpler than the process
as it occurs in vivo, it has enabled us to characterize and
distinguish what appears to be the mechanism of crystal induction as
well as several closely allied phenomena in the process of
calcification and to gain an insight into their mechanisms. Many of the
minute details and a quantitative description of these phenomena
and the manner in which they are biologically regulated will, of
course, have to be gathered from the intact organism. However,
one must remember that it is difficult even to know which details
to search for if the basic underlying mechanisms are not at least
clearly defined conceptually.

Figs. 13 and 14. Electron micrographs of avian embryonic bone in
the earliest stages of calcification. Note the regular and periodic
arrangement of the dense apatite crystals within the collagen fibrils
[from S. Fitton-Jackson, Proc. Roy. Soc. (London) B146, 270 (1957);
reproduced from originals kindly supplied by the author]. X100 000.


The first experiments were carried out by exposing 
such reconstituted 640-A axial-repeat collagen fibrils to 
calcium-phosphate (Ca-P) solutions shown previously 
to be metastable with respect to the formation of apatite 
crystals. Because of the technical difficulties involved, 
and because the thermodynamic properties of calcium- 
phosphate solutions and their relation to the formation 
of solid phases are not well-enough characterized, it has 
not yet been possible to procure quantitative data on 
nucleation rates, crystal growth rates, free energy of 
nucleation, etc. The more qualitative methods used were 
designed to detect the ability of the test materials to 
induce the formation of apatite crystals, and depended 
mainly on the identification of the formed crystals. They 
did not distinguish between nucleation rate, crystal 
growth, recrystallization, etc. 

The results of these experiments demonstrated that 
native-type, 640-A axial-repeat, reconstituted collagen 
fibrils, prepared from normally uncalcified tissues such 
as rat-tail tendon, calf skin, guinea-pig skin, fish swim 
bladder, etc., were able to nucleate apatite crystals 
from metastable calcium-phosphate solutions. Figure 
15 is an x-ray diffraction pattern of such a calcified 
collagen preparation showing the typical diffraction 
rings of apatite. 

Ironically, we were not able to prepare reconstituted
collagens from bone, presumably because of its highly
crosslinked nature. However, bone could be decalcified
under a variety of conditions, which maintained the
640-A structure of many of the collagen fibrils. This
preparation was also able to induce crystallization of
apatite. Although it is very difficult to make quantitative
comparisons, it appeared that many of the decalcified
bones were not grossly as potent in this respect as
the reconstituted collagens, presumably because of
changes in the structure of many fibrils and alteration
of certain important functional groups during
decalcification.9-!

Fig. 15. X-ray diffraction pattern of in vitro calcified collagen.
Note the lack of crystal orientation (evidenced by complete rings)
and the broadness of the diffraction lines attributable primarily
to small crystal size.

Fig. 16. Modes of aggregation of collagen macromolecules in vitro
[from F. O. Schmitt, Proc. Am. Phil. Soc. 100, 476 (1956)].

Since, in nucleation phenomena, a number of discontinuities
are able to induce phase changes, the next
problem was to determine whether or not this property
was specific for collagen and, if so, where the specificity
resided. Paramyosin fibrils, another well-organized and
ordered fibrous protein structure, showing a regular
periodic pattern by electron microscopy and low-angle
x-ray diffraction, were obtained from the adductor
muscles of clams, and used in similar experiments under
identical physicochemical conditions. They were not able
to induce this phase change.

A number of other fibrillar forms can be reconstituted
in vitro from the same solution of collagen macromolecules
(see F. O. Schmitt, p. 349). These include fibrils
with 220-A axial periods, structureless fibrils with no
discernible band pattern, fibrous long-spacing (FLS),
and segment long-spacing (SLS) (Fig. 16). These are
different ordered-aggregation states of the same
macromolecules and reflect differences in the packing and
intermolecular geometry of the fibrils. These differences
are, of course, also reflected in differences in the kinds
of amino-acid side chains which interact, and in differences
in their stereochemical relations to one another.
In addition, the isolation of the collagen macromolecules
and linear polymers of collagen macromolecules
(protofibrils), both in solution and in the solid state, provided
a unique system with which to determine the steric
nature of the specificity.

Figure 17 is a schematic representation of the basis 
for these experiments. Using SLS as a typical example 
with which to compare the native-type fibril, it is 
apparent that within any region of the fibril, although 
the same macromolecules and, therefore, the same 
amino acids are present, the steric relations between 
amino-acid side-chain groups are different. 

The macromolecules, protofibrils, and the various
aggregation states were exposed to metastable Ca-P
solutions. Neither the macromolecules, linear aggregation
of macromolecules, nor any of the fibril forms,
other than those with the native-type 640-A axial repeat,
were able to initiate this phase change under identical
physicochemical conditions. Since it was possible to
pass reversibly from one fibril form to another and to
demonstrate nucleating ability only while in the native
640-A form, it was evident that failure to initiate crystal
formation was the result of the unfavorable configurations
of the other types of reconstituted collagen fibrils
and not the result of denaturation or other changes in
the macromolecules produced in aggregating them from
solution, or of the nature of the metastable solutions.

Fig. 17. The steric differences between the reactive side-chain
groups of adjacent macromolecules in the native and SLS forms of
collagen diagrammatically portrayed. Figure labels: 640-A fibril;
SLS fibril; 2900 A.

These experimental findings indicated that the 
process of induction of crystallization was a heterogene- 
ous nucleation of apatite crystals from metastable Ca-P 
solutions by the collagen fibrils, and was not dependent 
on the macromolecules per se or on single linear adlinea- 
tions of macromolecules, but on groups of macromolecules 
and protofibrils polymerized laterally and longitudinally 
in a highly specific fashion characteristic of native collagen. 
Thus, it would appear that the particular type of aggre- 
gation and packing of macromolecules characteristic of 
native-type fibrils creates highly specific steric relation- 
ships between reactive amino-acid side-chain groups 
from adjacent macromolecules within the fibril which 
serve as centers for nucleation. 

Additional confirmation of the necessity of a specific 
juxtaposition of certain reactive groups was obtained by
subjecting aliquots of reconstituted native-type collagen
fibrils and demineralized bone to a number of physical 
and chemical agents, without altering the reactivity of 
the amino-acid side chains, and demonstrating that the 
ability to induce crystallization had been lost. 

These included the effect of heat, acid, and alkali. In 
the case of heat, no change was noted until the thermal- 
shrinkage temperature was reached. Thermal shrinkage 
of collagen results in a phase change as regards its state 
of aggregation similar to gelatinization.®: This dis- 
rupts both the molecular structure and the macro- 
molecular-aggregation state of collagen. 

In the case of acid-treated demineralized bone, or 
alkaline-treated reconstituted fibrils, although the 
collagen fibrils undergo a good deal of swelling and 
distortion, leading to loss of the typical 640-A repeat by 
low-angle diffraction and electron microscopy, the mac- 
romolecular structure remains intact. In both cases, the 
amino-acid side chains are not chemically altered, but 
only their steric interrelations are changed. 
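
The pattern in the qualitative results described so far can be laid
out as a small table. The Python sketch below encodes each preparation
with two yes/no entries: whether the native-type 640-A aggregation
state was present, and whether apatite crystals were induced. Reducing
the observations to booleans is a simplification made for this
illustration, and the type and function names are hypothetical; the
methods described in the text were qualitative and did not measure
nucleation rates.

```python
from typing import List, NamedTuple


class Trial(NamedTuple):
    material: str
    native_640a_packing: bool   # native-type 640-A aggregation state present
    induced_apatite: bool       # apatite formed from metastable Ca-P solution


TRIALS: List[Trial] = [
    Trial("reconstituted native-type collagen fibrils", True, True),
    Trial("decalcified bone with 640-A structure maintained", True, True),
    Trial("paramyosin fibrils", False, False),
    Trial("collagen macromolecules", False, False),
    Trial("protofibrils (linear polymers)", False, False),
    Trial("fibrils with 220-A axial periods", False, False),
    Trial("structureless fibrils", False, False),
    Trial("fibrous long-spacing (FLS)", False, False),
    Trial("segment long-spacing (SLS)", False, False),
    Trial("native-type fibrils after thermal shrinkage", False, False),
    Trial("acid- or alkali-swollen fibrils", False, False),
]


def packing_predicts_outcome(trials: List[Trial]) -> bool:
    """True if induction occurred exactly where native packing was present."""
    return all(t.native_640a_packing == t.induced_apatite for t in trials)


if __name__ == "__main__":
    print("native packing tracks induction:", packing_predicts_outcome(TRIALS))
```

Laid out this way, the same macromolecules appear in both the inducing
and the non-inducing rows, which is the basis for attributing the
specificity to the aggregation state and not to the macromolecule
itself.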

Fig. 18. Unstained, unshadowed preparation of an early stage of
in vitro calcification of collagen fibrils. Note the regular and
periodic distribution of the crystals along the collagen-fibril axis
corresponding to the intraperiod fine structure of the fibrils and
occurring primarily once per axial period.

Fig. 19 (lower right-hand corner). Selected-area electron-diffraction
pattern of preparation similar to Fig. 18, showing the characteristic
apatite reflections. There is no evidence of preferred crystal
orientation, even though the diffraction pattern was obtained from an
area where the fibrils were well oriented. X42 000.

With reference to the phenomenon of nucleation,
three distinctly different aspects of the relation between
collagen and apatite must be considered. The first is the
stereochemical relationship between the reactive
side-chain groups which constitute a nucleation center. The
second is the demonstration that there are preferred
nucleation centers and the determination of their location
with respect to the known electron-microscopic
intraperiod fine structure of the collagen fibrils. The
third is the nature of the chemical groups in the nucleation
centers, and the nature of the intermolecular forces
between these groups and the mineral ions.

An undue emphasis seems to have been placed on the
absolute value of 640 A, characteristic of native-type
fibrils, and its relation to nucleation and other calcification
phenomena. It should be apparent, however, that
there is nothing specific about this absolute value, since
it is merely a visible manifestation of the particular
aggregation state of the collagen macromolecules and
acts as a “fingerprint” for the recognition of the
native-type fibrils. Its importance lies in the fact that when the
macromolecules are arranged in a specific three-dimensional
array so that this characteristic electron-density
distribution occurs along the fibril, certain regions are
created within the fibril whose reactivity and steric
relations permit them to act as sites of heterogeneous
nucleation. It is this stereochemistry between reactive
side-chain groups within the nucleating regions which is
directly related to the mechanism of nucleation, and
this has no direct relation to the absolute value of 640 A.


In attempting to correlate the molecular structure of 
collagen and the property of nucleation, therefore, one 
must distinguish clearly between the structural charac- 
teristics of the macromolecules and the higher-ordered 
configurations which are the result of the specific 
arrangement of the macromolecules in the fibril. 

Unfortunately, unlike the case of nucleation of inor- 
ganic crystals by other inorganic crystals, a direct 
correlation cannot be made between the wide-angle 
diffraction spacings of collagen and the known lattice 
spacings of apatite, as has been attempted by one group 
of workers.'* The wide-angle diffraction pattern of 
collagen cannot be treated as arising from simple 
Bragg planes as in inorganic crystals, but must be 
interpreted on the basis of the theory of helical dif- 
fractors.*'5> Except for the 10- to 12-A equatorial spac- 
ings, indicating the distance between the centers of 
adjacent macromolecules, the wide-angle reflections 
are characteristic of the triple-chain helical-coiled 
structure of the macromolecule and give no information 
as to the interatomic distances and configurations of 
the side-chain groups in the fibril, which information is
directly related to the nucleation phenomena. It should
be clearly understood that not only do the 640-A-type
fibrils give the characteristic wide-angle x-ray diffrac- 
tion pattern, but also the macromolecules themselves 
and all other fibrillar forms, as well as cold gelatin films. 

To date, neither the exact sequence nor the stereochemistry
of the amino acids in the collagen macromolecule is
well enough known to permit determination of this 
specific configuration, although some general statements 
may be surmised from other data presented below. 

As regards the second point—the demonstration of 
specific nucleation centers and their location—we have 
attempted to elucidate this aspect by several different 
methods. Direct visualization by electron microscopy 
and correlation of the changes in electron density, 
either in electron micrographs or from low-angle x-ray 
diffraction patterns, have been tried.®® Interpretations 
based on electron-micrograph density changes and low- 
angle x-ray diffraction density changes are fraught with 
technical and theoretical difficulties, and no definite 
statement is as yet possible from these data. 

Electron micrographs taken during the course of 
time-sequence studies of in vitro calcification (Figs.
18-20) and during the earliest stages of in vivo calcifica- 
tion of embryonic bone (Figs. 13, 14) provide direct 
evidence that there are specific regions in the fibrils 
which act as nucleation centers. These show that, 
during the earliest stages of calcification in vitro and in
vivo, small dense particles, varying in size from 20 to
150 A, are deposited regularly spaced along the fibrils.
Selected-area electron-diffraction of these particles in
both cases revealed that they are crystals of apatite 
(Fig. 20). 

Fig. 20. Higher magnification of an area in Fig. 18. ×80 000.

Since the smallest crystals visualized in electron
micrographs represent the summation of nucleation and
crystal growth, interpretations based on such visual
observations must be viewed with some caution. It is
entirely possible that regions most favorable for crystal 
growth may be quite different from those where nuclea- 
tion occurs. Also, there may be different areas capable 
of nucleation but differing in their degree of catalytic 
potency. 

In addition, it should be pointed out that the nucle- 
ating center is not represented by an entire “band” or 
“interband” in the collagen fibrils, but only by regions
within such transverse sections of the fibril. This can 
be deduced easily if one considers that it takes a certain 
number of repeating groups from adjacent macro- 
molecules in just the right configuration to make a 
nucleation center, and that statistically there will only 
be a finite number of such groups with the proper steric 
configurations within any one transverse section. 

Although it has not been possible to identify the 
exact location of the nucleation centers with respect to 
the intraperiod fine structure of the collagen fibrils, 
observations thus far indicate that these centers do 
correspond to certain of the so-called band (electron 
dense) regions. This would correlate well with related 
chemical and low-angle diffraction studies57,58 which
have shown that the band regions are the most reactive 
sites for many electron stains, tanning agents, etc., and 
that these areas probably contain many of the amino 
acids with long, reactive, polar side-chain groups. 

In this respect, it is interesting to note the recent 
demonstration of the direct correlation between the 
degree of mineralization and the reactivity of the 
ε-amino groups of lysine in bone and tooth during
demineralization,” since lysine is considered to be a 
major constituent of the band regions.57-® 

Whereas these experiments have indicated the steric
specificity in the heterogeneous nucleation of apatite 
crystals by native-type collagen fibrils, they do not 
provide information concerning the manner in which 
these steric factors operate or concerning the nature 
and importance of the chemical interaction between 
collagen and the mineral ions, and the specific amino 
acids and mineral ions involved. 

In this regard, certain general principles may be 
stated. Thus, while it is obvious that some sort of 
chemical interaction must occur between certain re- 
active groups in the collagen fibrils and the appropriate 
ions for nucleation to occur, it is imperative that the 
nature of the forces involved be such that the ions are 
still capable of interacting with the other constituent 
ions of the crystal lattice. For, if either calcium or 
phosphate ions were very strongly bound by the collagen 
fibrils, collagen would act as a demineralizer similar to 
chelating agents, unless a second mechanism were in- 
voked which released the ions after they were bound. 

There are several distinct ways in which the steric 
factors could operate to induce crystallization. The 
simplest case would be by facilitating the local concen- 
trations of calcium and/or phosphate ions without 
requiring that the ions themselves be arranged in any 
particular steric fashion. This local increase in ion
concentration would lead to crystallization by exceeding
the metastable limit.

Another possibility is that the precise array of groups
necessary for nucleation imparts a specificity for the
selective adsorption or binding of the calcium or
phosphate ions, either randomly or in specific stereochemical
fashion, and that they then interact with the other
constituent ions of the lattice to produce the first fragments
of the new phase.

A third method would be the most specific of all, and
would statistically require the fewest number of atoms.
In this case, the precise array of reactive groups would
sterically closely approximate certain low-index planes
of apatite in a fashion similar to that proposed by
Turnbull et al.,®=® and discussed in the Appendix.
Thus, calcium, phosphate, and hydroxyl ions, either
singly or in combination, depending upon which atomic
plane of apatite is involved, would be “lightly” bound
in such a geometric configuration that they would
constitute a reactive nucleus.

Of course, one must consider the possibility that
certain clusters of ions are bound by the reactive groups
in the collagen structure. This is really a semantic issue,
however, since the binding of such clusters, which
presumably have the configuration of the bulk-solid phase,
would require similar steric considerations.

Experiments in which nucleation failed to occur, after
[...] reconstituted fibrils and demineralized bone
were alternately exposed many times to solutions
containing [...]
molecular forces between collagen and the inorganic
ions must be just strong enough to affect their inter- 
action energies without firmly binding them to the 
collagen structure. 

As to the specific organic groups and mineral ions 
involved in calcification, most investigators have felt 
that the initial step in calcification involves the presence 
in the organic matrix of an anionic group which com- 
bines with calcium ions. The anionic site has been 
variously assigned to chondroitin sulfate? to chon- 
droitin sulfate-collagen complexes,**** or to a phos- 
phorylated polysaccharide.“ In addition to the fact 
that the experimental evidence indicates that the 
nucleating centers consist of reactive amino-acid side- 
chain groups in the collagen fibril, other considerations 
cast some doubt on the hypothesis of the primary role 
of calcium-binding in crystal induction. 

In the first place, the apatites are phosphate salts, and 
the structural characteristics of the apatite lattice are 
primarily attributable to the phosphate groups, not to 
the calcium atoms which can be replaced by a number 
of other cations (Sr, Pb, etc.) without changing the 
major features of the crystal structure and symmetry. 
Since the phosphate groups are the “backbone” of the 
lattice, their role in the formation of the initial crystal 
structure would appear to be as important as, if
not more important than, that of the calcium ions.

The findings of Solomons et al.,®° referred to earlier,
which showed a direct correlation of the available
ε-amino groups of lysine (and hydroxylysine) and the
degree of mineralization in bone and tooth, also are very
suggestive, and may indicate that the primary
collagen-mineral-ion interaction is between the ε-amino groups
of lysine and the phosphate ions. Phosphorylation of
NH2 groups has been proposed also by Polonovski and
Cartier® as the initial step in calcification. Experiments,
now under way in this laboratory, in which specific 
amino-acid groups are being blocked singly and in com- 
bination, should provide the necessary data for the 
interpretation and elucidation of the actual molecular 
mechanism of the nucleation process. 


LOCALIZATION, REGULATION, AND INHIBITION 
OF CRYSTALLIZATION 


The phenomena of localization, regulation, and inhibi- 
tion of the physicochemical mechanism which initiates 
crystallization both in normally mineralized and un- 
mineralized tissues are different but closely related. On 
the basis of the hypotheses and experimental data 
presented, it would appear that all collagenous matrices 
are inherently capable of nucleating apatite crystals from 
metastable solutions. Since collagen is the major fibrous 
protein of all of the connective tissues (skin, tendon, 
ligaments, etc.), the questions arise: Normally, why do 
all of these tissues not calcify? Why do they calcify in 
certain pathological conditions? Also, even under 
normal circumstances, apatite crystals are deposited in 
tissues which do not contain collagen fibrils (enamel),
and in abnormal states in other noncollagenous tissues
as well. 

It is impossible to answer all these questions defin- 
itively at the present time, but the controlling factors 
and the circumstances under which they may be opera- 
tive can be developed within the framework of the 
physicochemical concepts of solution metastability and 
heterogeneous nucleation sites. 


Localization of the Crystals to 
Specialized Tissues 


It is not suggested that collagen is the only organic 
compound capable of nucleating apatite crystals. This 
is quite unlikely, as, in all nucleation processes, many 
materials can act as nucleation catalysts with varying 
degrees of potency. In the case of tooth enamel, the 
apatite crystals are closely associated with another 
structural and “crystalline” protein, a eukeratin,®&—®
and it is likely that a similar mechanism for crystal 
induction exists here. 


Regulation and Inhibition 


Since the native collagen fibrils of most connective 
tissues do not calcify under normal conditions, one of
several general situations or a combination of them 
must exist. Either the collagen in those tissues not 
normally mineralized is different from that of bone or 
calcified cartilage; or the degree of metastability of the 
extracellular fluids immediately surrounding and within 
the collagen fibrils in the various tissues is different; or 
other local phenomena increase the catalytic potency 
of the collagen fibrils in the normally mineralized tissues. 

Since reconstituted native-type collagen fibrils from a 
wide variety of tissues normally not calcified were able 
to initiate crystal formation in vitro in our experiments, 
failure to mineralize, if attributed to the collagenous 
component, would involve subtle structural differences
between native and reconstituted fibrils, as yet unresolved
by the physical and chemical methods employed
to date. However, it is possible that, during the
extraction and reconstitution of the fibrils, parts of
the collagen macromolecules which normally inhibit
calcification in vivo are lost. There are no data available
with which to evaluate the feasibility of this suggestion. 

With respect to the second possibility, that differ- 
ences in the degree of solution metastability account 
for the specific localization and regulation of calcifica- 
tion, two different points of view are possible. If one 
assumes that the degree of metastability of the un- 
altered interstitial fluid is sufficiently great so that 
collagen can induce perceptible rates of nucleation, 
some mechanism must be operative in the normally 
unmineralized tissues which prevents crystallization. If
the degree of metastability of the unaltered interstitial 
fluids is not sufficiently high, either an increase in the 
degree of metastability or an increase in the catalytic 
potency of the collagen fibril is necessary for crystalliza- 
tion even in the normally mineralized areas, and a 
minimal protective mechanism is required in the normally
uncalcified regions.

These variations in the degree of metastability could 
result either (a) from cellular-controlled compositional 
changes in the interstitial fluids, or (b) from the pres- 
ence of other substances in the tissue which actively or 
passively controlled diffusion and specific ion transport 
and transfer, or competed with the mineral ions for 
active sites in the collagen fibril. 

Cellular-controlled compositional changes of the 
interstitial fluids include variations in the calcium and/ 
or phosphate concentrations, calcium to phosphate 
ratios, pH, ionic strength, ion complexes, etc. These 
could be mediated directly by the cells or by substances 
secreted by the cells, such as enzymes. For example, 
the demonstration that phosphorylative glycogenolysis 
can produce a local increase in phosphate concentration 
in epiphyseal cartilage* illustrates the role of such a 
device in the regulation of calcification in normally 
mineralized tissues, either by merely making more 
phosphate available, or by participating in another 
enzymatic cycle® which actively transfers phosphate 
ions to specific groups in the collagen fibril. 

As to another method of regulating the degree of 
metastability (particularly in normally unmineralized 
tissues) by controlling diffusion, ion transport, etc., the
possible role of the ground substance and certain of its
components has also been investigated.

Although much research has been done on the ground 
substance of connective tissues, its exact composition, 
state of aggregation, and anatomical distribution are 
not clear. Prominent among its components are the 
various acid mucopolysaccharides. These are thought 
to exist in the tissues as complexes with noncollagenous 
proteins.70,71 In the past, most investigators have postulated
that, of the ground-substance components, the
chondroitin sulfates specifically, either alone or in com- 
bination with collagen, have played a role in the initia- 
tion of calcification.*>—7 

That it is not solely the amount of the acid muco- 
polysaccharides which determines whether calcification 
is initiated is obvious from the fact that hyaline cartilage,
one of the richest sources of this material, is normally
uncalcified, whereas bone, which does calcify, contains 
extremely small amounts of these substances. 

Differences in the kinds of mucopolysaccharides 
present are not a plausible explanation, since the various
acid mucopolysaccharides found in adult bone or in the 
growing ends of bone (bone, epiphyseal cartilage, etc.) 
are present in other tissues.” 

It is instructive to review the findings as to the nature 
of the ground substance in cartilage, since a transition 
occurs from the normally uncalcified hyaline portion to 
the normally calcified epiphyseal region over a relatively 
short anatomical distance. 

When tissue sections of hyaline cartilage, epiphyseal 
cartilage, and bone are stained with metachromatic
dyes or by the Hotchkiss procedure, definite differences
are noted. The staining characteristics of hyaline
cartilage gradually change as the epiphyseal cartilage is 
approached. This change is also evident in bone which 
is being actively deposited or resorbed. Since meta- 
chromatic dyes react with negatively charged, high 
molecular-weight compounds, these staining character- 
istics have been linked primarily to the sulfated muco- 
polysaccharides in such connective tissues. The change 
in metachromasia and in the staining properties of these 
tissues by the Hotchkiss procedure has been interpreted 
as evidence for the depolymerization of the anionic 
mucopolysaccharides, as the zone of calcification is 
approached and reached.*-* 

Although there is obviously a number of other possible
reasons for this change in staining properties, the
important point is that there is some alteration, either in
the amount, state of aggregation, or reactivity of 
charged groups, etc., in the ground substance, and that 
this change accompanies calcification. Analytical data’ 
which demonstrate a very marked loss of organic sulfate 
during cartilage calcification and during the formation 
of bone matrix indicate that it is the depolymerization 
and subsequent removal of these compounds which are 
related to calcification. 

In attempting to assign a specific role to the ground 
substance in calcification, two properties stand out as 
important. One is that its state of aggregation is probably
that of a gel in which the mucopolysaccharides exist
complexed with a protein moiety as a mucoprotein.70,71
Such a physical state of aggregation might well limit 
the diffusion of interstitial fluids and ions to the collagen 
fibrils. 

The anionic groups of this mucoprotein are free and 
reactive,“ which accounts for the second property of 
importance, the large cation-binding capacity of the
ground substance. Under normal circumstances, therefore,
the mucoproteins may help to inhibit calcification
by limiting the available mineral ions, both by diffusion
and by selective cation (calcium) binding. It is also 
possible, but less likely (since the chondroitin
sulfates, for example,” do not react with collagen
above pH 4.0), that the reactive groups of the muco- 
polysaccharide portion of the mucoprotein compete 
with the mineral ions for positions in the collagen fibrils, 
but it is possible that the noncollagenous protein portion 
of the mucoprotein may interfere with the process by 
just such a mechanism. 

Depolymerization and removal of these compounds 
by eliminating the diffusion barrier, decreasing the 
cation-binding capacity of the ground substance, 
and possibly “freeing” some of the reactive groups of 
the collagen from their interaction with the protein
moiety of the mucoprotein, would allow the free
ions to react with the collagen. The cation-binding
capacity of the remaining depolymerized mucoprotein
would also most likely be decreased, since
metallic cations are more strongly [...]
weight acids and bases than [...]

In vivo, the rapid depolymerization of the
mucopolysaccharide-protein complexes and the subsequent
decrease in their cation-binding properties might well lead
to a local release of free cations including calcium, so 
that the resultant increase in calcium-ion concentration 
might then actually aid the initiation of crystallization 
by increasing the degree of metastability and making 
additional calcium ions available. 
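
The cation-binding argument can be illustrated with a toy calculation. The sketch below assumes simple 1:1 binding between calcium and the anionic sites of the ground substance; the function name `free_calcium`, the concentrations, and the dissociation constant are illustrative assumptions and are not values taken from the experiments described here.

```python
import math

def free_calcium(total_ca, total_sites, kd):
    """Free Ca for 1:1 binding Ca + site <-> Ca-site.

    All three arguments share one concentration unit (hypothetical values).
    The bound concentration is the smaller root of the mass-action quadratic.
    """
    b = total_ca + total_sites + kd
    bound = (b - math.sqrt(b * b - 4.0 * total_ca * total_sites)) / 2.0
    return total_ca - bound

# Hypothetical example: the same total calcium with many binding sites
# (intact ground substance) and with few (after depolymerization and removal).
for sites in (10.0, 1.0, 0.0):
    print(f"sites = {sites:5.1f}  free Ca = {free_calcium(1.5, sites, 1.0):.3f}")
```

In this simplified picture, reducing the number of available binding sites raises the free calcium concentration at constant total calcium, which is the direction of the effect proposed above. Diffusion barriers and competition for sites on the collagen are not represented.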

This hypothesis as to the role of the ground substance 
is supported by a series of in vitro experiments, con- 
ducted in our laboratory, in which native collagen-rich 
tissues, such as rat-tail tendon, calf skin, guinea-pig 
skin, etc., failed to mineralize under physicochemical 
conditions identical with those used in testing recon- 
stituted native-type fibrils from these same tissues. In 
addition, when these tissues were treated so as to extract 
many of the components of the ground substance, 
including the chondroitin sulfates, either directly or by 
enzymatic depolymerization, these same native tissues 
readily mineralized. 

Further support for this hypothesis comes from the 
experimental observation that hyaline cartilage, rich in
the acid mucopolysaccharide-protein complexes, fails to
mineralize in vitro but selectively removes Ca++ from
the calcium-phosphate mineralizing solutions. This 
property has been shown to be related to the sulfate 
groups of the chondroitin sulfates.” 

This “shielding” of reactive sites in the collagen fibrils 
by ground substance components in intact tissues has
also been noted in the tanning and dyeing of skins, a
process which depends on the interaction of certain 
dyes, complex metallic ions, etc., with specific groups in 
the collagen. It was found that such interactions were 
markedly facilitated when the tissues were first treated
by procedures designed to extract the ground substance 
components, and even more so by the prior depoly- 
merization of the chondroitin sulfates by testicular 
hyaluronidase.’* The latter method, in addition to being 
more specific, was also carried out under milder condi- 
tions so that the collagen fibrils were presumably less 
distorted and less swollen. 

Figure 21 is a schematic representation of many of 
the experiments described. It illustrates both the speci- 
ficity of the collagen macromolecular-aggregation state 
in initiating calcification, and the possible role of the 
ground substance as a factor in inhibiting it. 

Since nucleation rate is so markedly dependent on the 
degree of metastability (p. 390), the organism has the 
dual problem of keeping the metastability of the extra- 
cellular fluids at a sufficiently high level for collagen to 
induce a perceptible rate of nucleation in normally 
mineralized tissues, and at the same time not so high 
that aberrant calcification cannot be safely controlled. 
It seems likely that a compromise is obtained by main- 
taining the degree of metastability in unaltered extra- 
cellular fluid just below the point where collagen is very 
effective. The rate of nucleation in bone and cartilage 
could then be increased and controlled by very small 
Fig. 21. Composite and diagrammatic illustration of the experiment demonstrating the specificity of the macromolecular aggregation
state of native-type collagen fibrils in calcification, and the possible role of the ground substance (heavily stippled regions surrounding 
fibrils in native tissues) in inhibiting and controlling it. Enzymatic treatment of the native tissue is not shown, since it is not yet certain 
whether depolymerization is itself effective, or whether depolymerization and subsequent removal from the tissue is necessary. 


transfer of mineral ions by enzyme mechanisms, such 
as the one mentioned earlier for phosphate ion, and the 
prevention of pathological calcification assured by de- 
creasing the degree of metastability by utilizing the 
properties of the ground substance, as enumerated in 
the foregoing. 
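
The steep dependence of nucleation rate on the degree of metastability can be sketched numerically. The example below assumes the classical-nucleation-theory form J = A exp(-B / (ln S)^2), where S is the supersaturation ratio; it is not established here that this is the expression used in the Appendix, and the constants A (`prefactor`) and B (`barrier_constant`) are hypothetical placeholders. A nucleation catalyst such as collagen is represented, again as an assumption, by a smaller B.

```python
import math

def nucleation_rate(supersaturation, prefactor=1.0e25, barrier_constant=200.0):
    """Rate J = A * exp(-B / (ln S)^2); A and B are hypothetical placeholders."""
    if supersaturation <= 1.0:
        return 0.0  # no driving force at or below saturation
    return prefactor * math.exp(-barrier_constant / math.log(supersaturation) ** 2)

# Uncatalyzed versus catalyzed (lower barrier) nucleation at several values of S.
for s in (2.0, 3.0, 4.0, 6.0):
    plain = nucleation_rate(s)
    catalyzed = nucleation_rate(s, barrier_constant=100.0)
    print(f"S = {s:3.1f}  J = {plain:.3e}  J(catalyzed) = {catalyzed:.3e}")
```

With any constants of this kind, a modest change in S shifts the rate by many orders of magnitude, which is why a level of metastability just below the effective range of the catalyst leaves room for control by small local changes.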

There is every reason to expect that, like other func- 
tions of such vital importance to the organism, mineral- 
ization is under the control of many factors, delicately 
balanced to provide biological and cellular regulation 
of the physicochemical mechanism which initiates
crystallization. 


COLLAGEN-INORGANIC CRYSTAL RELATIONSHIPS 


Location of the Inorganic Crystals with Respect 
to the Collagen Fibrils 


The location of the crystals within the collagen fibrils 
supports the theory of heterogeneous nucleation as pre- 
sented, since the probability of arranging a number of 
side-chain groups in the proper steric configuration 
is statistically higher within the fibrils, and where the
density, intermolecular forces, and interaction energies 
are highest. 

The longitudinal position of the crystals tends in 
many instances to be localized to certain regions accen- 
tuating the 640- to 700-A axial repeat of the collagen 
fibrils. It is difficult to say exactly where these regions 
are in relation to the intraperiod fine structure of the 
fibrils, although density considerations and shadowed 
preparations indicate that they are most likely in those 
regions where the density of the bands is greatest. This 
is especially true in the less heavily calcified fibrils. In 
many other fibrils, however, especially where calcifica- 
tion is quite high, no such localization is apparent. It 
would appear, therefore, that—although initially there 
are preferred regions in the fibril for nucleation and for 
crystal growth—as the fibril continues to calcify, 
crystals are deposited throughout the fibril. 


X-ray diffraction evidence that the average size of [...]

Figs. 22, 23, and 24. X-ray diffraction patterns of longitudinally oriented native, calcified fish bone [...] and embryonic metatarsal rudiment [Fig. 24(a)], showing apatite orientation; same specimens [Figs. 22(b)-24(b)] decalcified, showing collagen orientation.

[...] speculate that three such crystals were aligned
longitudinally per axial period of the collagen fibril.” Such
an arrangement would result in the structure having an
axial repeat of 220 A. This has not been borne out by
direct visualization of the fibrils and the inorganic
crystals by electron microscopy.


Crystal-Collagen Fibril Coorientation


[...] have demonstrated by electron [...]
crystals, while closely parallel to the fibril axis in bone,
is for the most part randomly oriented in calcifying
cartilage.® Since the process of orientation was felt to
be a true oriented overgrowth directed by certain
crystallographic planes in the collagen fibrils (epitaxy)°"
and directly related to the process of mineral phase
induction, it was suggested that possibly different
mechanisms were involved in the initiation of calcification
in these two closely related tissues.!®

Fig. 25. X-ray diffraction patterns of in vitro (a) recalcified fish bone and (b) rat-femur bone. Note that, although there is some orientation
of the apatite crystals as evidenced by the arcing of the 00l reflections, it is not as prominent as in the original native, calcified
bone [compare Fig. 22(a) with Fig. 25(a), and Fig. 23(a) with Fig. 25(b)].


that, in the zone of provisional calcification in cartilage, 
the size, appearance, and organization of the collagen 
fibrils were markedly different from those in bone.® 
More specifically, the collagen fibrils in this calcifying 
zone of cartilage are quite thin (50 to 250 A) and do not 
show any visible intraperiod fine structure. The fibrils
are, in addition, widely separated and randomly ori- 
ented. Correlations of this type, however, depend both 
on an intimate knowledge of the mechanism of crystal 
orientation and its relation to the induction of crystal- 
lization, and on information concerning the macro- 
molecular organization of the apparently “structureless” 
collagen fibrils. Furthermore, several recent studies also 
have demonstrated that the embryonic bone of some 
species (and even early postfetal bone in others) does 
not show any preferred orientation of the inorganic 
crystals5 =% (as deduced by x-ray diffraction, based 
primarily on the arcing of the 00l reflections of apatite),
and that crystal orientation varies not only from bone 
to bone, but also in different areas of the same bone.®:® 

As in other problems in calcification, the term “ori- 
entation” must be defined carefully. Orientation of the 
inorganic crystals has been related to the long axis of 
the entire bone and to the various hierarchies of collagen 
structure. These definitions obviously are not identical, 
and the orientation of the crystals with respect to the 
collagen fibrils with which they are associated is the rela- 
tionship of interest. 

The evaluation and interpretation of such co- 
orientation by x-ray diffraction evidence alone is limited 
and may be quite misleading. The difficulties stem both 
from the basic nature of the method and from the failure 
to interpret the results in the light of the gross and 
microscopic arrangements of the collagen fibers in the 
tissue. 

An x-ray beam, regardless of how small a collimator 
is used, integrates the orientation of the collagen and 
crystals over an enormous volume as compared with 
the order of magnitude discussed here. Since the x-ray
diffraction data are summations from many crystals 
and fibers, it is not possible to discern individual inor- 
ganic crystal-collagen fibril relationships, particularly if 
the collagen fibrils are themselves randomly dispersed 
or if account is not taken of the higher-ordered organiza- 
tion of the structures in the tissue. 

Electron microscopy, on the other hand, gives a 
direct result if the crystals can be visualized with respect 
to individual fibrils. In cases where small local areas are 
in question, selected-area electron diffraction—since it 
integrates over a much smaller volume than x-ray dif- 
fraction—also can be of great help. 

In order to evaluate whether the marked variation 
in crystal orientation as deduced by x-ray diffraction is 
indicative of varying degrees of collagen fibril-inorganic 
crystal orientation, or whether it is the result of the 
geometric factors in tissue organization and of x-ray 
technique, a number of specimens showing the complete
spectrum of orientation was examined by electron microscopy,
electron diffraction, and x-ray diffraction. Figures 22-24
are x-ray diffraction patterns of fish bone,
rat bone, and 16-day-old embryonic chick bone, calcified 
and decalcified. In the case of fish bone, there is marked 
orientation of both the apatite crystals and the collagen 
fibrils. The rat bone shows less orientation of both the 
apatite crystals and the collagen fibrils, and in the 
embryonic bone, no orientation of the apatite is present 
and there is very little evidence of collagen orientation. 

Electron micrographs and selected-area electron dif- 
fraction of the two extreme cases (fish bone and em- 
bryonic chick bone) clarified the issue. Figure 6 demon- 
strates that the fish bone consists of extremely well- 
oriented collagen fibrils with the inorganic crystals 
lying with their long axes (crystallographic c-axis)
parallel to the collagen-fibril axis. Figure 26 is an electron
micrograph of embryonic bone which reveals the
general randomness of the collagen fibrils. In
local areas, however, the same general parallel alignment
of the crystals and the collagen fibrils is present, as
confirmed by selected-area electron diffraction (Fig. 27).
The orientation of individual crystals with the fibrils
with which they are associated is also apparent in certain
adult bones where the collagen fibrils are clearly separated,
as shown in Figs. 8 and 9. It is clearly evident,
therefore, that x-ray diffraction evidence alone cannot
be used as a criterion in assessing or interpreting crystal
orientation; it may, as in this case, be quite misleading.

Fig. 26. Electron micrograph of a section of the metatarsal rudiment of embryonic chick bone (16 days). Despite the general randomness
of the collagen fibrils, the crystals are oriented in local regions corresponding to the individual collagen-fibril directions. The size
and shape of the crystals are similar to those in adult bone. ×40 000.

Fig. 27. Selected-area electron diffraction of specimen shown in Fig. 26. The preferred orientation of the crystals is
evident by the arcing of the 002 and 004 reflections.

Fig. 28. Unstained, unshadowed preparations of in vitro calcified collagen at an early stage. Although the crystals are situated regularly
spaced in certain regions of the fibrils, the haphazard arrangement of the crystal reflections (white spots) indicates lack of preferred
orientation of individual crystals with the fibrils with which they are associated. ×70 000.

Thus, the complete lack of preferred crystal orientation 
as evidenced by x-ray diffraction data in the embryonic 
bone is simply the result of the randomness of the 
collagen fibrils themselves, and, as far as the crystals 
within any one fibril with which they are associated are 
concerned, they are well oriented similar to adult bone. 


Mechanism of Crystal Orientation 


As further experimental results unfolded, it became 
apparent that the reasons for crystal orientation were 
twofold: the size and habit of the inorganic crystals, and 
the location of the crystals within the fibrils. 

Electron micrographs taken during time-sequence 
studies of in vitro calcified reconstituted collagen have 
shown a striking similarity in the appearance of the 
inorganic crystals to those of embryonic bone during 
the initial stages of calcification (Figs. 13, 14, 18, and 
20). In both cases, the crystals appeared dot-like with 
none of the axes elongated in any direction, and in both
cases selected-area electron diffraction (Fig. 19) showed 
no preferred crystal orientation from areas where the 
collagen fibrils were relatively well oriented. On the 
other hand, in later stages of embryonic bone, once 
crystal growth (and/or recrystallization) has occurred,
leading to small asymmetric crystals within the fibrils,
the crystals do become oriented with their long axes 
approximately parallel to the collagen fibrils with which 
they are associated (Figs. 26 and 27). 

Although it was impossible to obtain single-crystal 
patterns from the in vitro calcified specimens, individual 
diffraction spots, rather than complete rings, were 
obtained in many instances from small local areas with 
relatively few crystals and in regions where there were 
several well-oriented fibrils. 

In order to obtain more information about the rela- 
tionship of individual crystals to the collagen fibrils, 
recourse was had to another experimental device. The 
objective aperture of the electron microscope was 
removed and the specimen photographed slightly out 
of focus. The resultant crystal reflections, seen as white 
dots in Fig. 28, are from a definite set of crystal planes, 
confirmed by measuring the distance from the crystals 
to the reflections. One easily can see the haphazard 
patterns of these reflections, which clearly indicate that 
the individual crystals are not oriented with their crystal 
axes parallel to the fibril axes, but are randomly oriented.
These data indicate that the initial crystals in vivo and
in vitro have not grown in an epitactic fashion and that
their subsequent orientation must be attributable to 
other factors. 


A detailed electron-microscopic examination of calcification
in cartilage has been made by Robinson et al.
They observed that, in the initial stages of cartilage
calcification, the crystals are directly related to the
collagen fibrils and are propagated in the intervening
interfibrillar areas. The crystals are not preferentially
oriented, except for a few which lie directly on, or possibly
in, the fibrils.

FIG. 29. Electron micrograph of in vitro apatite crystals precipitated
under conditions similar to those of the in vitro collagen-nucleation
studies. Note large hexagonal-type crystals. ×50 000.

The other point concerning calcifying cartilage—that 
is, the lack of the usual interperiod fine structure of the 
collagen fibrils—appears to be attributable to the very 
small size of the fibrils and possibly to the large amount 
of ground substance surrounding them. The question of 
collagen-fibril size is also important from the standpoint 
of the number of collagen macromolecules that must be 
polymerized laterally in order to constitute a nucleus. 

With these factors in mind, reconstituted collagen 
fibrils were prepared, approximately 50 to 100 A in 
diameter and not showing cross-striations when stained 
with phosphotungstic acid (PTA). Well-oriented fibers 
of this material, however, demonstrated the character- 
istic 640-A low-angle x-ray diffraction pattern, indicat- 
ing the macromolecular organization to be similar to the 
larger fibrils. Preparations of these fibrils were able to 
initiate crystallization of apatite in vitro from metastable
solutions in experiments similar to those described.

The data may be summarized as follows. In calcifying
cartilage, embryonic bone, and in vitro calcified collagens,
crystallization is initiated in direct relation to the
collagen fibrils. During the first stages of mineralization
in embryonic bone, reconstituted collagens, and calcifying
cartilage, the inorganic crystals are not oriented
with respect to the collagen-fibril axis. In both embryonic
bone and reconstituted collagens, the crystals
are similar: dot-like with none of the crystallographic
axes elongated.

Further crystal growth results in asymmetric crystals
zation of the collagen fibrils in the tissue, but is dependent
upon the location of the crystals within the fibrils,
is shown by the observations on embryonic bone and on
certain adult bone. In the former (Fig. 26), despite the
general randomness of the fibrils, and in the latter
(Figs. 8 and 9), although the collagen fibrils are clearly
separated and somewhat randomly directed, the inorganic
crystals are well oriented with their long axes
parallel to the collagen fibril axis with which they are
associated. In calcifying cartilage, however, because of
the very small diameter of the fibrils, crystal growth
is propagated primarily between the widely spaced
fibrils, and these crystals, therefore, incapable of being
oriented by the fibrils, are randomly oriented.

The conclusions are reached that: (1) the basis for 
the orientations of the inorganic crystals is the asym- 
metric growth of small crystals within the collagen fibrils, 
between tightly packed, longitudinally oriented chains of 
macromolecules (protofibrils), which necessarily results 
in the crystals being similarly oriented; (2) the process 
of crystal orientation, therefore, is unrelated to the 
mechanism of induction of crystallization; (3) since 
crystal orientation is a function of the size and shape of 
the crystals and their position within the fibrils and not 
part of the crystal induction mechanism, the lack of 
preferred orientation in cartilage in no way suggests 
that the mechanism of crystal formation (heterogeneous 
nucleation by the collagen fibrils) is different in cartilage 
from that in bone. 

In fact, the theory of heterogeneous nucleation of 
apatite crystals by collagen fibrils may be more convincing
in cartilage than in bone, since in bone the collagen
fibrils are so closely packed (particularly in compact
bone) that there is very little space for crystal formation 
except within the fibrils, whereas in cartilage, the fibrils 
are relatively far apart and randomly oriented with 
large amounts of intervening interfibrillar ground sub- 
stance available for crystallization. Despite this, calci- 
fication does not begin randomly in the tissue, but is 
initiated in direct relation to the very thin fibrils and is 
then propagated throughout the intervening ground 
substance. 


CRYSTAL HABIT AND SIZE 


A good deal of disagreement exists as to the exact 
habit and size of the apatite crystals in bone. Electron 
micrographs obtained in our laboratory, in agreement 
with those of Speckman and Norris,** and Fernández- 
Morán and Engström, show what appear to be rod- 
shaped particles, the majority varying in thickness from 
15 to 30 A and in length from 200 to 400 A. It is pos- 
sible, however, that these represent extremely thin, 
lath-like crystals, several of which are stacked together.
Perfect cross sections of well-oriented fibrils are obtained
with great difficulty, but a number of fibrils in
many fields appear to show a dot-like appearance in 
cross section consistent with a rod-like habit of the 
crystals (Fig. 12).

Because the size of the crystals approaches the limit
of resolution of the electron microscope, particularly in 
a tissue technically difficult for sectioning such as bone, 
and because of other factors such as overlap, etc., it is 
not possible to be absolutely certain of the exact size 
and shape of the inorganic crystals. In any event, the 
issue is somewhat semantic: when does a rod become a 
plate? There may be variations in the habit of the 
crystals from true rods to a somewhat lath-like habit
where one of the faces is slightly larger than the other.
But from many cross sections, longitudinal sections, and
oblique sections, one can say with certainty that the
crystals are not the large hexagonal plates (500 × 250 ×
100 A) reported by some workers on the basis of the
examination of blended or autoclaved bone.27,28

On the other hand, the apatite crystals formed by 
simple precipitation in the test tube,** or crystals nucleated
and grown in the presence of collagen, are usually
hexagonal plates quite unlike those occurring in bone.*7
Occasionally, lath-like crystals can be formed in the
test tube, but again these are several orders of magni- 
tude larger than the bone crystals. Thus, although the 
early crystals nucleated by collagen in vitro are similar 
to those nucleated in vivo, further crystal growth 
appears to be quite different. 

This difference in crystal habit and size between bone 
crystals and precipitated calcium phosphate crystals 
might be explained on the basis of the mechanical 
factors resulting from crystal growth within the fibrils, 
or by an active process controlled by the structure of 
collagen such as has been suggested by others.” It is 
also possible that it is entirely unrelated to the structure 
of collagen. 

The position of the bone crystals within the collagen 
fibrils suggests that the collagen fibrils play a primary 
role in determining crystal size and shape. The finding, 
however, that collagen fibrils were not able to alter the 
size or the habit of the crystals under the physico- 
chemical conditions of the in vitro experiments casts 
some doubt on the validity of this hypothesis. This is 
also confirmed by an examination of electron micro- 
graphs of calcifying cartilage which reveals that the 
crystals lying in the interfibrillar space and anatomically
unrelated to the collagen fibrils are the same size and
shape as those within the collagen fibrils of bone.

FIG. 30. Under altered physicochemical conditions,
crystal size decreases markedly, but
crystal habit remains the same. ×300 000.

Since it is a well-established fact that the size and 
shape of crystals can be altered by changes in the 
physicochemical environment—including the addition 
of a number of substances bound on certain crystal faces 
and limiting their further growth—studies were carried 
out on the precipitation and growth of apatite crystals 
from solution under a wide variety of experimental 
conditions. 

Figure 29 is an electron micrograph of a calcium phos- 
phate precipitate in which the physicochemical condi- 
tions were similar to the in vitro collagen-nucleation 
experiments. These large, hexagonal plates are some- 
what larger than those usually seen under these circum- 
stances, but serve as a control to illustrate the pro- 
gressive change in size and habit that can be produced 
in vitro in the absence of collagen. X-ray diffraction of 
this preparation revealed a typical apatite pattern 
without any evidence of the presence of octacalcium
phosphate or brushite which also crystallize in large 
plate-like form. 

Figure 30 is another preparation in which an attempt 
was made to alter crystal size and shape. One can easily 
see the marked difference in the size of the crystals, 
although they are still in the form of hexagonal plates. 
Note that where the crystals overlap there is a sugges- 
tion of a dense “rod,” particularly if viewed without 
reference to the neighboring crystals. 

Figures 31-33 are electron micrographs of apatite 
preparations where the crystal size was further reduced, 
and it appears that the habit has also been altered so 
that a number of rod-like crystals have formed. It is 
impossible, however, to tell whether these are truly rods 
or represent extremely small, thin plates stacked to- 
gether or supporting each other and viewed along their 
edges. The “rods” are now approximately the size of 
the bone crystals. 

Figure 34 is an electron micrograph of another prepa- 
ration in which the crystals are even thinner (many 
~10 A or less) and very similar in size and appearance 
to the bone crystals, and in which there is no apparent 
evidence of hexagonal plates. If one examines the electron
micrographs carefully, however, it is obvious that
there is a minimal but definite background electron
density which may indicate that these crystals are
extremely thin plates, some of which are stacked together
or are upright and supporting each other, and,
when viewed along their edges, give the appearance of
rods. The fact that many of the rods appear to be bent
in several directions may indicate that this is the correct
interpretation. This same appearance has been seen in
very lightly calcified collagen fibrils of bone.

FIG. 31. Further alteration in the physicochemical environment produces not only a further decrease in crystal size but also,
for some, a change in the crystal habit to a rod-like appearance. ×180 000.

FIG. 32. Higher magnification of an area in a sample similar to
Fig. 31, where the “plates” predominate. ×260 000.

Although it is still impossible to state dogmatically 
the exact size and shape not only of the bone crystals 
but also of many in vitro precipitated crystals, these
experiments show that it is possible to grow from solution 
under controlled physicochemical conditions apatite crys- 
tals whose size, shape, and appearance in electron micro- 
graphs and whose x-ray diffraction characteristics are 
similar to those of bone crystals. 

The facts (1) that in vitro collagen per se is not able 
to modify the crystal habit or size, (2) that in vivo 
(calcified cartilage) crystals similar in size and shape to 
the bone crystals lie in the interfibrillar spaces between 
the collagen fibrils and anatomically unrelated to them, 
and (3) that crystals can be grown in vitro similar to the 
size and habit of bone crystals in the absence of collagen, 
indicate that the size and shape of the apatite crystals are 
primarily controlled by other physicochemical factors in 
vivo and not by the collagen fibrils. 

Obviously, the conditions under which crystal growth 
occurs within longitudinally oriented and closely packed 
fibrils may exert some secondary influence on the crystal 
size and habit, both mechanically and by selectively 
facilitating and inhibiting the diffusion of ions; but
when the other physicochemical conditions are not met,
crystal growth is not altered by the collagen fibrils. 

Under such conditions, the crystals ordinarily tend 
to grow to a relatively large size as compared with 
normal bone crystals, and since such crystal growth 
cannot take place within the fibrils, further seeding, 
recrystallization, etc., occur outside the fibrils. The
results of experiments in which crystals initially nucleated
within the collagen fibrils in vitro proceeded to
recrystallize and grow outside of the fibrils substantiate
this conclusion and clarify the observations
related to crystal orientation in in vitro calcified col- 
lagens including bone. 

Thus, in vitro calcified, reconstituted collagens showed
no preferred inorganic crystal orientation by x-ray diffraction,
even though the collagen was moderately
oriented. On the other hand, specimens of bone recalcified
in vitro did show some preferred inorganic crystal
orientation, but not as much as similar samples of native
bone, as shown in Figs. 22(a), 25(a) and 23(a), 25(b).
In the case of the recalcified bone, the degree of orientation
of the crystals was related to the degree of orientation
of the collagen matrix [Figs. 22(b) and 23(b)].

Electron micrographs of heavily calcified preparations 
of in vitro calcified collagens demonstrated that the 
majority of the crystals were in the interfibrillar space
and not within the fibrils. In addition, the crystal habit
and size were unlike those of native bone and resembled
the large in vitro precipitated hexagonal plate-like crystals.
Because of size and shape, crystal growth was more
favorable in the interfibrillar regions than within
the fibril. These crystals were, therefore, in no way
capable of being oriented by the individual fibrils.

FIG. 33. Higher magnification of an area in a sample similar to
Fig. 31, where the rod-like crystals predominate. ×260 000.

FIG. 34. In vitro precipitated apatite crystals. Note the rod-like appearance without evidence of plate formation, except for the
minimal background electron density. Crystals vary from 8 to 20 A in width, and many appear to be “bent” similar to areas of lightly
calcified bone. ×260 000.

In samples of reconstituted collagens, the inter- 
fibrillar spaces were large, and the over-all organization 
of the fibrils was also not able to influence the orienta- 
tion of the crystals (similar to cartilage). In the case of 
demineralized bone, however, the collagen fibrils are 
more closely packed and well oriented. Under these 
circumstances, the elongated plate-like crystals mechan- 
ically orient with their flat, thin faces between adjacent 
fibrils and their long dimensions approximately parallel 
to the collagen fiber axis. This arrangement gives a 
co-oriented x-ray diffraction pattern of apatite and
collagen but not as good a one as that in native bone, 
where the majority of the crystals are within the fibrils. 

Experiments are now underway in which the physico- 
chemical conditions during crystal growth as well as 
crystal induction are being carefully controlled in an 
attempt to grow the crystals primarily within the col- 
lagen fibrils similar to that of native bone. 


Crystal Nonstoichiometry 


As mentioned in the beginning of this paper, one of 
the perplexing problems in the study of the nature of 
the bone crystals or of in vitro precipitated apatites has
been their nonstoichiometry. Recall that, in bone and
in some in vitro precipitated crystals, the average width
of the rods or of the stacked plates (or both) varied
from 10 to 30 A. This dimension corresponds to the a- or
b-axis of the unit cell, which is approximately 9.43 A.
Thus, the crystals are composed of 1 to 3 unit cells in this
dimension. Although the unit cell of hydroxyapatite 
may be represented as in Fig. 1, it must be remembered 
that it depicts only conceptually the spatial relations of 
the constituent atoms and molecules, and that many of 
the individual atoms and molecules are shared by 
adjoining unit cells in an actual crystal. 

Therefore, the stoichiometry of the solid phase may 
be represented by the structure of the unit cell only 
when there is a very large number of unit cells compris- 
ing the crystal, so that the unit cells or the portions of 
them making up the surface are not statistically significant.
When the entire crystal is composed (in any
dimension or dimensions) of only a few unit cells, however,
and if the planes on which crystal growth ceases
are relatively uniform along the length of the crystal,
it is not possible to obtain the theoretical stoichiometry.
The actual stoichiometry in these cases will be determined
by the atomic planes in which crystal growth
ceases. If the crystal surface planes on which crystal
growth ceases vary a good deal, it is possible that a
fortuitous combination might occur which would result
in a theoretical stoichiometry, but this is highly unlikely
and would not be very meaningful. Although there are
obviously a number of other previously mentioned factors
such as internal lattice substitution, surface adsorption,
vacant lattice sites, etc., which may influence the
stoichiometry, the point is that, if none of these existed,
it still would not be possible to obtain stoichiometric
hydroxyapatite with crystals the size of those obtained in
bone or prepared in vitro.
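
The unit-cell arithmetic behind this argument can be illustrated with a short sketch. It assumes, purely for illustration, a rod whose cross section is a square array of n by n unit cells with the a-axis length of 9.43 A quoted above, and it counts the fraction of unit cells that lie on the lateral surface. The square geometry and the function names are hypothetical and are not part of the original analysis.

```python
A_AXIS_ANGSTROM = 9.43  # a- (or b-) axis of the apatite unit cell, as quoted in the text


def cells_across(width_angstrom: float) -> int:
    """Approximate number of unit cells spanning a crystal of the given width (at least 1)."""
    return max(1, round(width_angstrom / A_AXIS_ANGSTROM))


def surface_fraction(n: int) -> float:
    """Fraction of unit cells on the lateral surface of an n x n cross section.

    Hypothetical geometry: a square array of unit cells in which every
    cell on the perimeter counts as a surface cell.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    interior = max(0, n - 2) ** 2
    return 1.0 - interior / n**2


# Widths reported for bone crystals (10 to 30 A), plus a large crystal for contrast.
for width in (10.0, 20.0, 30.0, 943.0):
    n = cells_across(width)
    print(f"width {width:6.0f} A: {n:3d} cell(s) across, "
          f"surface fraction {surface_fraction(n):.3f}")
```

Under these assumptions, crystals 1 to 3 unit cells wide consist almost entirely of surface cells, whereas a crystal 100 cells wide has only a few percent of its cells at the surface, which is the regime in which the unit-cell formula can represent the bulk stoichiometry.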

The larger “defect” apatite crystals reported by
Posner and Perloff, of course, pose another problem,
but it is possible that such large crystals represent a 
somewhat different structure than hydroxyapatite and 
are similar to OCP. 

The small size of the crystals also may be an impor- 
tant factor in determining the amount of water associ- 
ated with apatite crystals precipitated from aqueous 
solutions as discussed on p. 361. This “excess” water is 
many times greater than that adsorbed by initially dry 
crystals from a vapor phase of water and consists of 
many more molecular layers of water than that usually 
considered possible for a surface to bind because of 
electrical-field effects. 

In addition to the fact that some of the water may 
enter the lattice structure itself, since the minute apatite 
crystals in aqueous suspension tend to stick together 
and form stacks and bundles, the very large surface 
tension and capillarity effects between such crystal sur- 
faces could well “trap” a large amount of water (in
addition to a bound monolayer) and still resist separa- 
tion from the crystals by mechanical centrifugation. 


Closing Comments 


As regards other mineralized biological tissues, one 
can see from Table I that the various crystals are ana- 
tomically related to specific organic matrices. Note the 
wide variety both of inorganic crystals and of organic 
matrices in which mineralization occurs. In fact, in some 
marine mollusks, calcium carbonate exists in two of its 
three possible crystallographic forms (calcite and arag- 
onite) in the same shell, in well-demarcated regions 
which border on each other. 

Another example of such an intimate chemical and 
ultrastructural relationship between the inorganic and 
organic phases is that which occurs in oyster shells. 
Electron micrographs have shown a close and ordered 
relationship between the calcium carbonate crystals 
and the organic matrix of conchiolin.**-® Here too, 
calcification is initiated by small seed-like crystals which 
are initially deposited in a regular fashion in the organic 
matrix and appear similar to synthetically grown crys- 
tals from supersaturated solutions.® 

Although the mechanisms by which mineralization is 
initiated and regulated in the organism are undoubtedly 
far more complex than those in the relatively simple 
model system described, the process of the heterogene- 
ous nucleation of inorganic crystals by highly specific 
regions in the organic matrix as the result of a characteristic
stereochemical array of certain reactive groups is
probably a fundamental one, not only for the calcification
of bone and cartilage but also for biological mineralization
in general.


APPENDIX A. SOME THERMODYNAMIC AND 
KINETIC CONSIDERATIONS IN PHASE 
TRANSFORMATIONS|| 


Crystallization is a specific case of a more general 
phenomenon in which a change in state involves a phase 
transformation. A complete consideration of phase 
transformations would encompass (1) the conditions 
under which a system can or cannot persist in a distinct, 
physically homogeneous state (a single phase); (2) a 
mechanism for the formation of new phases from such 
systems; and (3) a quantitative description of the time- 
dependence of the process. 

Since the primary purpose of this exposition, however,
is to provide a background of some basic concepts in
phase transformations in order to understand better the
general phenomenon of crystallization and of biological 
mineralization, attention is directed primarily to some 
general considerations of the first two aspects. 


Equilibrium and the Stability of Phases¶


In most general terms, Gibbs®! has defined equilib- 
rium as a state independent of time, that is, a state in 
which all of the sensible properties which describe the
system do not vary with time. Thus, in formulating 
both the necessary and sufficient conditions of equilib- 
rium in terms of thermodynamic functions (e.g., 
energy, entropy, free energy, etc.), Gibbs considered 
equilibrium with respect to all possible variations of the 
state of the system, and not with respect to certain 
variations only. In practice, however, it is necessary to 
idealize most systems by imposing certain conditions of 
restraint and then by studying them with respect to
certain possible variations only. In the case at hand, 
the possible variation of interest is the formation of 
entirely new phases from an initially homogeneous 
phase. 

The usual criteria of equilibrium [for example, that
for all possible variations in the state of an isolated
system $(\delta E)_S \ge 0$, $(\delta S)_E \le 0$, $(\delta F)_{p,T} \ge 0$, etc.] do not give
the information necessary for the interpretation of phase 
transformations, since many systems which meet these 
general criteria of equilibrium may vary widely in their 
relative tendency or ability to form new phases. On the 
basis of this relative stability, Gibbs” distinguished four 
kinds of equilibrium: (1) stable, (2) neutral, (3) un- 
stable, and (4) metastable equilibria. 

|| The reader is referred to the following article for a more com- 
prehensive review of the subject: D. Turnbull, Advances in 
Solid-State Phys. 3, 225 (1956). 

¶ The material presented here has been compiled from a series
of lecture notes from a graduate course in Chemical Thermodynamics,
based on the writings of J. Willard Gibbs, given at the
Massachusetts Institute of Technology, Department of Physical
Chemistry, by Professor James A. Beattie. I would like to thank
Professor Beattie for permission to use some of this material,
including Figs. 35 and 36, and for many helpful suggestions and
criticisms in the preparation of the Appendix.

FIG. 35. Artist's (P. Lund) conception of the Gibbs energy-entropy-volume
surface for a one-component system. The dotted
lines on the primitive surface connect points of common tangency
(neutral equilibrium), and the construction of a derived surface
from such points is illustrated also. The orthogonal projection of
the plane triangle on the S-V plane, representing the triple point,
is shown. bc represents binodal curves.


With respect to the stability of phases, the fundamental
principles can be visualized conveniently
by reviewing certain pertinent features of the Gibbs
energy-entropy-volume thermodynamic surface. For 
purposes of exposition, consider the special case of a 
pure substance when effects produced by gravity, elec- 
tricity, capillary tensions or distortion of solid phases 
are absent, or may be neglected. 

For all possible infinitesimal changes of state involv- 
ing only expansion work in a closed system, the first and 
second laws of thermodynamics state that 


$$dE = T\,dS - p\,dV. \tag{A1}$$

Integration of (A1) gives the equation

$$E = E(S, V). \tag{A2}$$


Thus, the relations between the energy, entropy, and 
volume may be represented by a surface in space whose 
rectangular coordinates are represented by the volume, 
energy, and entropy of the system (Fig. 35). This 
surface, called the primitive surface, represents equilib- 
rium states of a homogeneous phase. At any point 
on the primitive surface, the equilibrium temperature 
and pressure are defined by the slopes of the surface at 
that particular point: 


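$$\left(\frac{\partial E}{\partial S}\right)_V = T, \qquad \left(\frac{\partial E}{\partial V}\right)_S = -p, \tag{A3}$$

and the tangent plane at the point is given by

$$E = E_0 + T_0 (S - S_0) - p_0 (V - V_0). \tag{A4}$$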


The intersections of the tangent plane with various other
planes and with the coordinate axes give various thermodynamic
functions. For example, its intersection with
the E axis gives $F_0$, the Gibbs free energy of the substance
at the point $P_0$.

States of the system consisting of two or more phases
in equilibrium (heterogeneous equilibrium) can also be
represented in E-S-V space, since the system as a whole
has a definite energy, entropy, volume, free energy, etc., 
as well as temperature and pressure. In heterogeneous 
equilibrium, the substance in each of its aggregation 
states has the same temperature, pressure, and molal 
free energy. Thus, planes tangent to points on the 
primitive surface representing, for example, one mole 
of the substance in each of these aggregation states will 
have the same slopes ($-p$, $T$) and the same intercept on
the E axis (free energy). Hence, these points have a
common tangent plane. 
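
The tangent-plane relations can be checked numerically. The sketch below assumes, only for illustration, an ideal-gas-like primitive surface E(S, V) = V^(-2/3) exp(2S/3R) for one mole, with all other constants set to one. It estimates the slopes T and -p by finite differences and confirms that the tangent plane meets the E axis at E0 - T0 S0 + p0 V0, the Gibbs free energy. The functional form, the sample point, and all names are hypothetical choices, not taken from the original text.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def energy(S: float, V: float) -> float:
    """Hypothetical primitive surface E(S, V) for one mole (arbitrary energy scale)."""
    return V ** (-2.0 / 3.0) * math.exp(2.0 * S / (3.0 * R))


def slopes(S: float, V: float, h: float = 1e-6):
    """Central-difference estimates of T = (dE/dS)_V and p = -(dE/dV)_S."""
    T = (energy(S + h, V) - energy(S - h, V)) / (2.0 * h)
    p = -(energy(S, V + h) - energy(S, V - h)) / (2.0 * h)
    return T, p


def tangent_plane(S: float, V: float, S0: float, V0: float) -> float:
    """Height of the plane tangent to the surface at (S0, V0), evaluated at (S, V)."""
    E0 = energy(S0, V0)
    T0, p0 = slopes(S0, V0)
    return E0 + T0 * (S - S0) - p0 * (V - V0)


# Arbitrary sample point on the surface (hypothetical values).
S0, V0 = 10.0, 2.0
E0 = energy(S0, V0)
T0, p0 = slopes(S0, V0)

intercept = tangent_plane(0.0, 0.0, S0, V0)  # where the plane cuts the E axis
gibbs = E0 - T0 * S0 + p0 * V0               # free energy at the point

print(f"intercept on E axis = {intercept:.6f}")
print(f"E0 - T0*S0 + p0*V0  = {gibbs:.6f}")
```

For this particular surface the slopes also satisfy pV = RT, which serves as a consistency check on the finite-difference estimates.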

In the case of two phases (say liquid and vapor) in 
equilibrium at a definite temperature and pressure, 
there are two points on the primitive surface, one 
representing one mole of liquid and the other represent- 
ing one mole of vapor at this pressure and temperature 
which have a common tangent plane. Lying along the 
straight line connecting the points are the states of 
mixtures of liquid and vapor in equilibrium at this 
particular temperature and pressure. By taking all of 
the lines determined by such tangent planes for varying 
temperatures and pressures, another surface may be 
generated called the derived surface which represents 
states of heterogeneous equilibrium. One may conceive 
of this derived surface as being produced by rolling a 
double tangent plane (tangent to the primitive surface
at two points) on the primitive surface and by connecting
the successive pairs of conjugate points of tangency
by a series of straight lines (Fig. 35). 
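
The statement that mixtures of two coexisting phases lie along the straight line joining the two points of tangency can be sketched as a simple linear interpolation. The coordinates below are hypothetical numbers chosen only to make the arithmetic easy to follow; the `State` type and `mixture_state` function are illustrative names.

```python
from typing import NamedTuple


class State(NamedTuple):
    E: float  # molal energy
    S: float  # molal entropy
    V: float  # molal volume


def mixture_state(liquid: State, vapor: State, x_vapor: float) -> State:
    """State of one mole of a two-phase mixture containing mole fraction x_vapor of vapor.

    Each extensive property is the mole-weighted average of the values for
    the two coexisting phases, so the point lies on the straight line
    joining the two points of tangency.
    """
    if not 0.0 <= x_vapor <= 1.0:
        raise ValueError("x_vapor must lie between 0 and 1")
    return State(*((1.0 - x_vapor) * l + x_vapor * v for l, v in zip(liquid, vapor)))


# Hypothetical coordinates of the two conjugate points of tangency.
liquid = State(E=1.0, S=2.0, V=0.5)
vapor = State(E=5.0, S=10.0, V=20.0)

half = mixture_state(liquid, vapor, 0.5)
print(half)
```

Sweeping `x_vapor` from 0 to 1 traces one ruling of the derived surface; repeating this for each pair of conjugate points generates the surface itself.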

In the case of three phases in equilibrium (i.e., liquid, 
vapor, solid), there are three points on the primitive 
surface with a common tangent plane, and states of the 
substance at the triple point (mixtures of all three 
phases in equilibrium) are presented in E-S-V space by 
points in a plane triangle (Fig. 35). The derived sur- 
faces, therefore, include all of the plane triangles repre- 
senting three phases in equilibrium, and all of the de- 
velopable surfaces representing two phases in equilib- 
rium. It is a continuous ruled surface (generated by 
the motion of a straight line), but it is not cylindrical.


Properties of the Thermodynamic Surfaces Which 
Indicate the Stability of Thermodynamic
Equilibrium 

Consider a specified amount of a pure substance A, 
immersed in a large medium M, at constant pressure 
p0 and temperature T0, with an initial energy, entropy,
and volume (E′,S′,V′), and which undergoes the following
change of state

$$A(E',S',V') \rightarrow A(E'',S'',V'')$$

under conditions where the action of the system on the
medium is substantially reversible. It can easily be shown
from the first and second laws of thermodynamics that
the following relation must hold:

$$(E'' - T_0 S'' + p_0 V'') \le (E' - T_0 S' + p_0 V'). \tag{A5}$$

With reference to the thermodynamic surface, the
terms enclosed in parentheses are the vertical distances
of the points (E″,S″,V″) and (E′,S′,V′) representing
the final and initial states of the system above a plane
passing through the origin having the slopes −p0, T0.
Geometrically, the relation defines the conditions under
which changes in state can occur spontaneously: where this
vertical distance representing the final state is less than
or at most equal to the distance representing the initial state.
Changes in state which result in an increase of this
distance are, therefore, not possible. This principle is
crucial to an understanding of phase changes, for it
clearly delineates the regions in E-S-V space (and,
therefore, the equilibrium of such states) where phase 
transitions are possible. 
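
The geometric criterion of Eq. (A5) can be illustrated numerically. What follows is only a minimal sketch and not part of the original treatment: the function names `height_above_plane` and `change_is_possible`, and all numerical values, are hypothetical and chosen simply to show that a change of state is permitted when the quantity E − T0 S + p0 V does not increase.

```python
def height_above_plane(E, S, V, T0, p0):
    """Vertical distance E - T0*S + p0*V of the point (E, S, V) above the
    plane through the origin with slopes (T0, -p0)."""
    return E - T0 * S + p0 * V


def change_is_possible(initial, final, T0, p0, tol=1e-12):
    """True when the final state lies no higher above the plane than the
    initial state, i.e., when the inequality (A5) is satisfied."""
    return (height_above_plane(*final, T0, p0)
            <= height_above_plane(*initial, T0, p0) + tol)


# Hypothetical states (E in J, S in J/K, V in m^3) in a medium at T0, p0.
T0, p0 = 300.0, 1.0e5
initial = (1000.0, 2.0, 1.0e-3)   # height = 1000 - 600 + 100 = 500
lower = (900.0, 2.0, 1.0e-3)      # height = 400, permitted
higher = (1100.0, 2.0, 1.0e-3)    # height = 600, not permitted

print(change_is_possible(initial, lower, T0, p0))
print(change_is_possible(initial, higher, T0, p0))
```

With these made-up numbers the first change lowers the distance above the plane and is allowed, whereas the second raises it and is excluded.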

In examining the possible equilibrium states of a pure 
substance in a medium at temperature To and pressure 
po (represented by points on the primitive surface), two 
possible categories of change must be considered. The 
behavior of systems may be related to (1) continuous 
changes—that is, changes considered with respect to 
nearby or adjacent states, and (2) discontinuous changes 
—that is, in relation to states at a finite distance from 
the point in question. The curvatures of the primitive
surface determine the former, while the over-all relation 
of the tangent plane to the surface determines the latter. 


Stable Equilibrium 


If the primitive surface falls above the fixed tangent
plane, except at the single point of contact (representing
the initial state), the state of the system represented by
that point is “absolutely stable” toward a phase change
when in a medium of constant pressure and temperature.
That is, there are no points in E-S-V space for
which Eq. (A5) is satisfied, and there are, therefore, no
possible changes in state which can occur. Any
“unnatural” perturbations of the system (such as a local
fluctuation in density) would necessarily lead to changes
in state represented by points in E-S-V space where the
distance of any such point above the tangent plane
would be greater than that of the original point. Since
Eq. (A5) has shown that such changes in state are not
possible, natural processes will occur which will return
the system to its original state. It can be shown that
that part of the primitive surface representing such
stable states of equilibrium is concave upward in both
of its principal curvatures.

FIG. 36. Orthogonal projection of the limits of absolute stability
(binodal curve) and limits of essential instability (spinodal curve)
on the S-V plane for a substance having one solid phase.

The points of tangency of the rolling double-tangent 
plane representing two phases in equilibrium trace the 
binodal curves (Fig. 36) or the limits of absolute stability. 
The tangent plane to points on the primitive surface 
which fall outside the binodal curves has the primitive 
surface entirely above it except at the single point of 
contact. That portion of the primitive surface outside 
the binodal curves represents states of absolute stability. 
This part of the primitive surface (referred to as the 
surface of absolute stability) together with the derived 
surfaces constitute the surface of dissipated energy. The 
tangent plane to a point on the surface of absolute sta- 
bility is always below the surface of dissipated energy 
except at the point of tangency.

Solutions in such states of equilibria would be abso- 
lutely stable with regard to the formation of new phases. 
The mechanical analog of such a system may be repre- 
sented by a marble in a hemispherical bowl fitted with 
a cover, with the conditions of restraint that the marble 
cannot sink through the bowl or be otherwise removed 
from it [Fig. 37(a)].


Neutral Equilibrium 


If the primitive surface does not fall anywhere below 
the fixed tangent plane but meets it at more than one 
point, the equilibrium of such states of the system is 
considered neutral. In such cases, if the system in such 
an equilibrium state is changed from its original state 
to a new state represented by another point of tangency 
to the primitive surface, the distances of both points 
above the fixed tangent plane (at constant T and p) 
obviously will be equal (zero). Therefore, such systems 
will have no tendency to pass into one of the other states
(as represented by other points of common tangency)
or to return to their original state if so displaced. However,
such systems are stable with respect to continuous
changes in state similar to systems in stable equilibrium.

FIG. 37. Diagrammatic illustration of the four types of equilibria
based on their relative stability.

A marble in a horizontal trough is an example of such 
a state of neutral equilibrium [Fig. 37(b) ]. Although 
the displacement of the marble up the sides of the con- 
tainer eventually results in the return of the marble to 
its original position at the bottom of the trough, there 
are a number of positions of the marble along the bottom 
of the trough where no spontaneous tendency to change 
exists and where no spontaneous tendency for a return 
of the marble occurs if such horizontal displacements 
are made. Solutions in such states of neutral equilibrium 
are, therefore, also stable with regard to the formation 
of new phases. 


Unstable Equilibrium 


If the primitive surface be continuous, there must 
necessarily exist regions between the binodal curves 
where the curvature of the primitive surface is concave 
downward in at least one of its principal curvatures. 
Points on the binodal curves (representing states of 
neutral equilibrium) separate such states of stability 
and instability with respect to discontinuous changes. 
The lines on the primitive surface dividing the portion 
which is concave upward in both of its principal curva- 
tures from the portion which is concave downward in 
one or both of its principal curvatures represent the 
limit of essential instability or the spinodal curves 
(Fig. 36). 

In such regions, it is obvious that, where part of the 
surface falls below the fixed tangent plane, it is possible 
to change the initial state of the system such that the 
point representing the final state is now below the fixed 
tangent plane. In this case, according to Eq. (A5), natural
processes occur which cause the system to continue to 
change (represented by moving the point further from 
the tangent plane) until a state is reached which is 
entirely different from the initial state [Fig. 37(d)]. 
Such states of equilibrium are unstable with respect both 
to continuous and discontinuous changes. That part of 
the primitive surface which lies inside the spinodal 
curve represents such states of unstable equilibrium. 
This type of equilibrium is rarely—if ever—attained in 
practice, and solutions in such a precarious state of 
equilibrium would hardly remain so for very long 
periods. 


Metastable Equilibrium 


The primitive surface which lies between the spinodal
and binodal curves is concave upward in both of its
principal curvatures and is stable with respect to continuous
changes (adjacent states). However, it is so
situated with respect to the over-all thermodynamic
surface that other states do exist a finite distance away
where the primitive surface falls below the tangent
planes drawn through points on the primitive surface
which lie between the binodal and spinodal curves. Such
systems which are stable with respect to continuous 
changes (adjacent states) and unstable with respect to 
discontinuous changes (distant states) are said to be in 
metastable equilibrium. If the conditions necessary for 
such a discontinuous change were not present, the 
equilibrium would remain stable indefinitely. But, for 
example, if small portions of the same substance in one 
of the more stable states of aggregation, represented by 
points below the tangent plane, are introduced or other- 
wise caused to form by very small disturbances (perhaps 
ones that cannot be detected experimentally), the 
equilibrium would be destroyed and a change of state 
(a phase change) would occur [Fig. 37(d)]. Solutions
in metastable equilibrium, therefore, although capable 
of remaining stable indefinitely, can form new, more- 
stable phases under certain conditions without changing 
the entire system. 

With reference to the thermodynamic surface, it is 
thus apparent, since points on the surface of dissipated
energy represent stable or neutral equilibrium and since 
points below the surface of dissipated energy have no 
physical significance, that points on the primitive sur- 
face which lie above the surface of dissipated energy 
representing states of unstable or metastable equilibrium 
delineate the regions in E-S-V space from which phase 
changes can occur at constant temperature and pressure. 

On the basis of these conditions, it can be shown that 
a single analytical expression representing both the 
necessary and sufficient conditions of equilibrium for a 
system in a medium of temperature T0 and pressure p0 is

$$\delta(E - T_0 S + p_0 V) = \delta F = 0 \tag{A6}$$

for all variations of the state of the system. The system
is absolutely stable with respect to first-order changes
if F is a minimum, and it is unstable if F is a maximum.
Finally, if Eq. (A6) holds, but for some finite changes the
inequality ΔF < 0 exists, the state is metastable.
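
The three cases just distinguished can be illustrated with a one-dimensional analog. The sketch below is an assumption-laden simplification, not the surface construction of the text: a single hypothetical coordinate x replaces the state variables, the tilted double-well function `F` is invented for illustration, the sign of the second derivative stands in for the principal curvatures, and comparison with the lowest stationary value stands in for the finite-change inequality ΔF < 0.

```python
def F(x):
    # Hypothetical tilted double-well free energy (arbitrary units).
    return x**4 - x**2 + 0.2 * x


def dF(x):
    return 4 * x**3 - 2 * x + 0.2


def d2F(x):
    return 12 * x**2 - 2


def stationary_points(lo=-2.0, hi=2.0, n=4000):
    """Locate roots of dF by bisection on each grid interval with a sign change."""
    roots = []
    step = (hi - lo) / n
    for i in range(n):
        a, b = lo + i * step, lo + (i + 1) * step
        if dF(a) * dF(b) < 0:
            for _ in range(60):
                m = 0.5 * (a + b)
                if dF(a) * dF(m) <= 0:
                    b = m
                else:
                    a = m
            roots.append(0.5 * (a + b))
    return roots


def classify(x, all_points):
    """Label a stationary state (dF = 0) by curvature and by finite changes."""
    if d2F(x) < 0:
        return "unstable"        # F is a local maximum
    lowest = min(F(p) for p in all_points)
    # A local minimum is metastable if some distant state has lower F.
    return "stable" if F(x) <= lowest + 1e-12 else "metastable"


points = stationary_points()
labels = [classify(x, points) for x in points]
for x, label in zip(points, labels):
    print(f"x = {x:+.3f}  F = {F(x):+.4f}  {label}")
```

For this invented curve the deeper well is classed as stable, the shallower well as metastable, and the intervening maximum as unstable.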

Although the foregoing discussion is limited to sys- 
tems of one component, Gibbs (reference 91, pp. 100- 
115) has extended his treatment to a study of the 
internal stability of homogeneous fluids of many com- 
ponents. 


Mechanisms and Time Dependence of 
Phase Transformations 


The discontinuous change of metastable or unstable 
systems resulting in the formation of the initial frag- 
ments of a new, more stable phase is called nucleation. 
When such a phase-change occurs in the interior of a 
metastable or unstable system in the absence of struc- 
tural impurities, it is referred to as homogeneous nuclea- 
tion, whereas a phase-change initiated by and on foreign 
inclusions extraneous to the system is called heterogeneous
nucleation.


Homogeneous Nucleation


Theories of homogeneous nucleation have
assumed that phase-changes occur by the formation of
intermediaries as a result of local fluctuations of certain
properties (such as density) in initially homogeneous 
phases in metastable or unstable equilibrium.‘6-22-% 

These intermediaries consisting of clusters of mole- 
cules (or ions) of the initial phase vary in size (and 
possibly composition, shape, and structure). They are 
referred to as “embryos” and are considered to exist as 
a true heterogeneous system with the mother phase. 
The cluster of critical size, composition, structure, etc., 
capable of further spontaneous growth (with a net 
decrease in free energy) and, therefore, capable of initiat- 
ing the formation of the new phase, is called a nucleus 
(Fig. 38). 

The embryos and nuclei are considered to arise by the 
stepwise addition of single molecules, i.e., a bimolecular 
mechanism.** The order of the reaction, therefore, is
equal to the number of molecules in the nucleus. 


As first proposed by Volmer and Weber, the theory
considered that nuclei can exist in a state of unstable
equilibrium with the mother phase, making it possible, 
at least in principle, to define the nucleus in terms of 
reversible thermodynamics. Considering surface effects, 
the free energies of formation of spherical clusters can 
be calculated as a function of the radius (Fig. 38). It is 
obvious from Fig. 38 that the over-all free energy goes 
through a maximum which corresponds to the critical 
size cluster (nucleus) of radius r*, which can grow
spontaneously with a net decrease in free energy. From
fluctuation theory, the probability of nuclei formation
and, therefore, also of their concentration can
be approximated since they are proportional to
exp(−ΔF*/kT), where ΔF* is equal to the free energy
or the work of formation of a nucleus corresponding to
the maximum as shown in Fig. 38. The condensation
velocity or nucleation rate is then evaluated by com- 
puting the collision frequency of single molecules with 
the nuclei. This free-energy barrier is, therefore, similar 
to the activation energy of ordinary chemical reactions 
in permitting the derivation of a nucleation rate.††
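
The free-energy maximum described here can be made concrete with the customary capillarity form for a spherical cluster, ΔF(r) = 4πr²σ − (4/3)πr³Δg. This explicit expression is an assumption introduced for illustration and is not written out in the text; the symbols `sigma` (interfacial free energy per unit area) and `delta_g` (bulk free-energy decrease per unit volume of new phase), and their numerical values, are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def cluster_free_energy(r, sigma, delta_g):
    """Free energy of forming a spherical cluster of radius r:
    a surface term (cost) minus a volume term (gain)."""
    return 4.0 * math.pi * r**2 * sigma - (4.0 / 3.0) * math.pi * r**3 * delta_g


def critical_radius(sigma, delta_g):
    """Radius r* at which the cluster free energy is a maximum."""
    return 2.0 * sigma / delta_g


def barrier_height(sigma, delta_g):
    """Work of formation of the nucleus, i.e., the free energy at r*."""
    return 16.0 * math.pi * sigma**3 / (3.0 * delta_g**2)


# Hypothetical values: sigma in J/m^2, delta_g in J/m^3, T in K.
sigma, delta_g, T = 0.072, 1.0e8, 300.0
r_star = critical_radius(sigma, delta_g)
dF_star = barrier_height(sigma, delta_g)
boltzmann_factor = math.exp(-dF_star / (K_BOLTZMANN * T))

print(f"r*  = {r_star:.3e} m")
print(f"dF* = {dF_star:.3e} J")
print(f"exp(-dF*/kT) = {boltzmann_factor:.3e}")
```

Clusters smaller than r* lower their free energy by shrinking and those larger than r* by growing, which is the sense in which the nucleus is in unstable equilibrium with the mother phase.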

Becker and Döring treated the problem differently.
They assumed a size distribution of “embryos” and
evaluated the rate of condensation by the solution of a
set of equations relating how the number of embryos of
any particular size changed with time. By assuming a
steady state, the mechanism of change was, therefore, 
simply the kinetic process of unit condensation and unit 
evaporation. 

†† In more-complicated systems, such as occur in the case of
inorganic crystallization, the various-sized embryos may have a
number of compositional and structural differences which also
involve free-energy changes, and, in particular instances, there
may be important effects owing to strain energy as well. Thus, the
over-all free-energy barrier to nucleation will be provided by a
combination of the chemical, interfacial, and strain free energies.

FIG. 38. Schematic illustration indicating mechanism of nucleation as
proposed principally by Volmer et al. and by Frenkel. [Recoverable
figure labels: "Single ions or ..." and ΔF.]

In both of the cases, expressions for nucleation rate
were quite similar [J ~ exp(−1/ln²S), where J is the
nucleation rate and S the supersaturation ratio p/pe
(where p is the actual vapor pressure and pe the equilibrium
vapor pressure)], and indicated a very marked
dependence of nucleation rate upon the degree of
supersaturation, particularly near the critical supersaturation
ratio (i.e., where the nucleation rate rather suddenly
becomes sufficiently large for ready measurement). For
example, in the case of water vapor condensing to
liquid drops, the time that must elapse for the appearance
of the first drop at a supersaturation ratio of 4 is
0.1 sec, whereas increasing the supersaturation ratio to
5 decreases the time to 10⁻¹³ sec, and decreasing the
ratio to 3 increases the time to 10³ years!
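
The steepness of this dependence can be seen by evaluating the exponential form alone. The sketch below assumes a rate proportional to exp(−B/ln²S) with a single lumped constant `B`; the value of `B` is hypothetical, the pre-exponential factor is ignored, and no attempt is made to reproduce the times quoted for water vapor.

```python
import math


def relative_rate(S, B):
    """Relative nucleation rate exp(-B / (ln S)^2).
    B is a hypothetical lumped constant (surface energy, temperature, etc.)."""
    if S <= 1.0:
        return 0.0  # no driving force at or below saturation
    return math.exp(-B / math.log(S) ** 2)


B = 100.0  # hypothetical, chosen only for illustration
reference = relative_rate(4.0, B)
for S in (3.0, 4.0, 5.0):
    ratio = relative_rate(S, B) / reference
    print(f"S = {S:.0f}: rate relative to S = 4 is 10^{math.log10(ratio):+.1f}")
```

Even with this crude form, a change of one unit in the supersaturation ratio shifts the rate by many orders of magnitude, which is why a fairly sharp critical supersaturation is observed.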

Although, based on the theory of Volmer and Weber
(fluctuation theory), it is possible that homogeneous
nucleation can occur at any level of metastability,
certain theoretical considerations (as well as experimental
findings) indicate that this is not possible, except
at the limit of essential instability or from systems in
unstable equilibrium rather than in metastable equilibrium.
The difficulties arise conceptually from the definitions
of equilibrium (a state independent of time), of
metastability (stable with respect to continuous changes
in state, but unstable with respect to certain discontinuous
changes in state), and of what constitutes a
discontinuous change in state, and from the applicability
of exp(−ΔS/k) to predict the probability of fluctuations
which are large enough to be considered a new phase.
Thus, Frenkel considers that the density fluctuations
which lead to the formation of embryos (heterophase)
transcend the limits usually considered in the ordinary
statistical theory of homogeneous systems wherein
ordinary fluctuations (homophase) lie “within the limits
compatible with the preservation of a given aggregation
state” (Frenkel, p. 375).


** While this is usually considered the mechanism of nucleation
for most systems, there is recent evidence that, in the nucleation of
solid-state transformations, the nuclei arise by a cooperative
phenomenon and not by discrete atomic jumps.

From these considerations, it would appear that
nucleation from true metastable systems occurs only as
a result of the introduction of heterogeneities and not
spontaneously. This would also seem to be Gibbs'
interpretation of the stability of metastable systems,
since he states:

“. .. the mass in question must be regarded as in 
strictness stable with respect to the growth of a globule 
of the kind considered, since W, the work required for 
the formation of such a globule of a certain size (viz.,
that which would be in equilibrium with the surrounding
mass), will always be positive. Nor can smaller
globules be formed, for they can neither be in equilibrium
with the surrounding mass, being too small, nor
grow to the size of that to which W relates. If, however,
by any external agency [Ital.: Ed.] such a globular mass
(of the size necessary for equilibrium) were formed,
the equilibrium has already (page 243) been shown to
be unstable, and with the least excess in size, the interior 
mass would tend to increase without limit except that 
depending on the magnitude of the exterior mass” 
(Gibbs, pp. 255-256). 


Heterogeneous Nucleation 


The discontinuous change in state which results in 
nuclei formation by the introduction of a foreign inclu- 
sion may be the result of a number of different effects. 
Thus, either by virtue of nonspecific surface forces or
by very specific interactions between certain groups on
the surface and the molecules or ions of the initial phase,
such heterogeneities may adsorb or bind the molecules
or ions of the initial phase on their surface; by acting
either as a core whose surface is now composed of the
molecules of the initial phase or by binding the molecules
or ions in a particular steric array, they may form
a cluster of critical size, shape, and configuration necessary
for nucleation. Since the critical-size cluster
(assuming no structural or compositional changes) 
varies with the supersaturation ratio, nucleation rates 
for heterogeneous nucleation will, as in the case of homo- 
geneous nucleation, be strongly dependent upon and 
markedly dominated by the degree of metastability of 
the system. 

With regard to crystallization, it is of interest to note
the findings of Turnbull and Vonnegut, who have proposed
a crystallographic theory of crystal nucleation.
The theory emphasizes the importance of geometric and 
structural factors of nucleation catalysts with regard to 
their catalytic potency and ability to initiate phase- 
changes. Based on a number of their own observations 
as well as on experiments of others, they have postulated 
that those substances acting as effective nucleation 
catalysts have atomic arrangements and lattice spacings 
on certain low-index planes which are very similar to 
those of the crystal being nucleated. 

The potency of the catalyst is formulated to be 
directly proportional to the reciprocal of the dis- 
registry between the lattice parameters of the catalytic 
surface and the forming crystal on these planes. Thus, 
in the formation of ice crystals from supersaturated
droplets of water vapor, for example, the most potent
nucleation catalyst was found to be AgI, whose lattice
structure and atomic arrangement are remarkably
similar to those of ice. Similarly, in the case of the
nucleation of hydrated sodium sulfate (Na2SO4·10H2O) by
sodium tetraborate (Na2B4O7·10H2O) crystals, it was noted
that both crystals belong to the same space group and
that the disregistry was very small between the two
types of crystals on certain basal planes. That some
heterogeneities will not act as catalysts at all is probably 
related to such specific interactions and geometrical 
factors. 
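
The proportionality between catalytic potency and reciprocal disregistry can be sketched as follows. The definition of disregistry used here, the fractional mismatch |a_catalyst − a_crystal| / a_crystal between lattice parameters on the matching plane, is an assumption adopted for illustration, and the lattice parameters and catalyst names are hypothetical rather than measured values.

```python
import math


def disregistry(a_catalyst, a_crystal):
    """Fractional lattice mismatch between catalyst surface and forming crystal."""
    return abs(a_catalyst - a_crystal) / a_crystal


def relative_potency(a_catalyst, a_crystal):
    """Potency taken as proportional to the reciprocal of the disregistry."""
    d = disregistry(a_catalyst, a_crystal)
    return math.inf if d == 0.0 else 1.0 / d


# Hypothetical lattice parameters (angstroms) on a low-index plane.
a_crystal = 4.52
candidates = {"catalyst A": 4.58, "catalyst B": 4.90, "catalyst C": 5.60}

ranked = sorted(candidates, key=lambda name: disregistry(candidates[name], a_crystal))
for name in ranked:
    d = disregistry(candidates[name], a_crystal)
    print(f"{name}: disregistry = {d:.3f}, relative potency = {1.0 / d:.1f}")
```

On this simple measure the candidate whose spacing most nearly matches that of the crystal ranks first; specific chemical interactions and surface defects, which also matter, are left out of the sketch.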

Although many considerations such as specific inter- 
actions, types of chemical bonds, etc., as well as rela- 
tively nonspecific factors such as surface dislocations 
and defects, influence the potency or ability of the 
catalyst in any particular case, the importance of 
geometric and structural factors should be kept in mind 
because of their possible implications for the theory of 
biological mineralization which is presented. 


Other Related Considerations 


Whereas nucleation (referring to the process of
forming the infinitesimally small fragments of the new
phase) can in concept be separated from the subsequent
growth of these fragments into macroscopic particles of
the new phase, nucleation rate (which is usually determined
by measuring the appearance of the macroscopic
particles of the new phase) operationally is not completely
independent of the kinetics of such growth.
Thus, the operational definition of nucleation rate
is not entirely consistent with the theoretical concept,
since the latter strictly speaking
refers only to the rate at which the nuclei are forming
and not to the subsequent growth of the nuclei to 
crystals, nor to the phenomenon of recrystallization. 

Furthermore, measurements of nucleation rate,
particularly in the case of heterogeneous nucleation, have
given little or no evidence of the molecular or atomic
sequence of the nucleation phenomenon itself (i.e., the
mechanism of nuclei formation), which is of particular
importance in the case of biological mineralization, 
where certain of these steps may be enzymatically con- 
trolled and regulated, for example. 

Moreover, in the case of crystallization, the number 
and size of crystals is determined by the relative rates 
of crystal growth and nucleation rate once nucleation 
has been initiated. Considered independently, crystal 
growth may vary in just the opposite fashion as does 
nucleation rate with respect to certain variables such as 
temperature. In addition, the growth of crystal nuclei 
is dependent not only on diffusion rate (i.e., supersatu- 
ration ratio) but also on the presence of surface defects 
(e.g., growth by screw dislocations) even at low
supersaturation ratios.

An interpretation of the entire phenomenon of the 
crystallization of a specific mineral would, therefore, 
require not only a study of nucleation phenomena but 
also considerations of the factors influencing crystal 
growth, crystal size, crystal habit, etc., for a particular 
crystal structure. Furthermore, the conditions influ- 
encing recrystallization (the growth of large crystals at 
the expense of smaller ones), might also be quite im- 
portant in the over-all process even after the solid state 
had been achieved. 

Structure of Muscle Cells*


H. STANLEY BENNETT 
IT IS my task to talk about the microscopic aspects
of muscle structure. Muscle cells are specialized 
structurally for the function of contractility. In a muscle 
fiber, the ratio between deforming force and elongation 
can change rapidly and reversibly. This change charac- 
terizes the phenomenon of contractility. Contractility 
involves changes in the viscoelastic behavior of some 
fibrous protein structures in the muscle. In the relaxed 
state, a muscle fiber subjected to tensile stress along its 
axis will elongate, and a characteristic stress-strain 
diagram can be prepared. If a corresponding curve is 
taken while the muscle is contracted, the slope of the 
curve is much steeper. The shift from one elastic state 
to another can occur in a few milliseconds. 

Contractile structures of this sort are found in a great 
many different forms. Inoué (p. 402) discusses certain 
features of contractile mechanisms other than those 
called muscle. Now, consider primarily a specialized 
class of biological contractile mechanisms called muscle. 
The boundary between muscle and nonmuscle is often 
not easy to draw. There are protozoa, for example, 
which have contractile filaments within certain portions 
of their complicated cells. Many of these contractile 
structures have properties of muscle. 

Metazoans may likewise contain such dual-purpose 
cells. For example, some of the cells which surround the 
blood vessels of the earthworm serve the functions both 
of lining the channel and of controlling the size of the 
lumen. These endothelial cells contain contractile fila- 
ments which resemble those of muscle cells elsewhere 
in the organism. As another example, in hydra, a small 
metazoan organism, certain cells lining the gut cavity 
have an important role in the digestion and in the 
absorption of the food consumed. A part of these cells
contains contractile filaments, and this part could be
regarded as being similar to muscle. One might cite
many other examples of dual or triple function cells, 
one of whose properties is that of contractility. 

But even the well-defined muscle cells found in the 
limbs of a vertebrate organism may display more than 
one physiological function. In addition to the main 
function of contractility, the muscle fibers of the mam- 
mals have a very important role in the production of 
heat. A good fraction of the body temperature is gener- 
ated by combustion in the muscle cells. If the heat 
losses from the body are sufficient, a very special type 
of contractile response known as shivering occurs. This
is characterized by very rapidly repeated contractions 
which do not accomplish any useful work but which 
generate a good deal of heat, thus helping to restore a 
satisfactory thermal balance. In addition, there are very 
important electrical characteristics of muscle. The phe- 
nomenon of excitation of muscle is accompanied by 
electrical changes. These electrical changes involve 
voltage transients of 20 to 100 mv, depending on the 
muscle. These are generally only of local importance as 
far as that muscle is concerned. But, in an aqueous 
medium, these electrical transients can escape into the 
surrounding fluid and be detected by sensitive electrodes. 
In electric fishes, the muscle becomes modified so that 
the electrical transients become very highly directional 
and strengthened. In most cases, the contractility of 
such specialized muscle fibers is lost. In such a fish, each 
modified muscle cell is polarized, forming an electroplax. 
Considerable numbers of these electric generating units 
may be stacked up in series and arranged in parallel. In 
the most highly developed forms, the electric organ can 
produce formidable power. The torpedo ray, for example, 
can deliver a current of 60 amp at 80 v as a result of 
series and parallel summation of many of these specialized
muscle cells. In the case of the electric eel
(Electrophorus electricus), more of these cells are summed in
series and potentials can achieve values of 300 or more 
volts. Currents in this case, with fewer generating units 
in parallel, are of the order of 0.3 to 0.5 amp. Thus, by 
suitable electrical coupling (by giving a suitable vector 
to the electrical discharge of the muscle and connecting 
the modified muscle units in series and in parallel), 
rather astonishing electrical effects can be produced. In 
addition, many fish have very sensitive electric sensing 
organs and can broadcast electric pulses (some of them 
small) to explore the surrounding water. By sensing 
these pulses with their special electric receivers, they 
test the conductivity of the water and use this informa- 
tion for navigation, for escaping enemies, and for 
finding prey. 
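
The arithmetic behind these figures is easily checked. The sketch below uses the currents and potentials quoted above; the further assumption that each modified cell contributes 100 mv (the upper end of the quoted 20 to 100 mv range) and that the potentials of cells in series add ideally is introduced only for illustration, as are the function names.

```python
def power_watts(current_amp, potential_volt):
    """Electrical power delivered, P = I * V."""
    return current_amp * potential_volt


def cells_in_series(total_millivolt, per_cell_millivolt):
    """Minimum number of cells in series, assuming the per-cell
    potentials add ideally (ceiling division on integers)."""
    return -(-total_millivolt // per_cell_millivolt)


# Figures quoted in the text.
torpedo_power = power_watts(60, 80)        # 60 amp at 80 v
eel_power_low = power_watts(0.3, 300)      # 0.3 amp at 300 v
eel_power_high = power_watts(0.5, 300)     # 0.5 amp at 300 v

# Assumed 100 mv per cell (hypothetical upper-range value).
torpedo_cells = cells_in_series(80_000, 100)
eel_cells = cells_in_series(300_000, 100)

print(f"torpedo ray: {torpedo_power} W, at least {torpedo_cells} cells in series")
print(f"electric eel: {eel_power_low:.0f} to {eel_power_high:.0f} W, "
      f"at least {eel_cells} cells in series")
```

On these assumptions the torpedo ray delivers several kilowatts from comparatively few cells in series and many in parallel, whereas the eel attains its higher potential with a longer series stack and a much smaller current.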

Muscle fibers proper contain not only a contractile 
mechanism itself, but also other structures which control 
and fuel the contractile devices and render the whole 
apparatus a workable and effective part of the organism. 
The unit which accomplishes all of these coordinated 
physiological functions is a muscle fiber. These are really 
muscle cells, but since many of them are long and 
thread-like, they are called fibers. There are many kinds 
of muscle fibers. They vary from 10 to 100 μ in diameter
and from 20 μ to several centimeters in length. The
smaller muscle fibers usually contain only one nucleus.
Many of the larger fibers, however, contain very many 
nuclei, up to hundreds. 

Muscles also vary greatly with respect to the fraction 
of their resting length by which they can shorten. At 
one extreme are the flight muscles of insects, such as 
those of the fly or wasp. The change in length between 
the relaxed state and contracted state in this type of 
muscle is relatively small, of the order of 2 to 5%. At 
the other extreme, one might cite the proboscis retractor 
muscles in certain marine worms, where the changes 
upon shortening are of the order of 90% of the original 
length. In vertebrate skeletal muscle such as is found 
in humans, the changes in length between the contracted 
and relaxed state do not exceed 20% under ordinary 
circumstances. 

For over a century, it has been realized that an ex- 
planation of contraction of muscle depended on an 
understanding of its molecular components and on a
characterization of the actions and interactions of these
components in terms of the contractile process. This
concept of the participation of molecular units in
contraction was explicitly stated in 1859 by Kühne,¹ when
he made the first attempt to obtain from muscle proteins 
which might be important in the contractile process. 
He pressed out muscle juice and obtained from it a 
crude protein preparation which he called “myosin.” 

In recent years, with improvements in chemical and 
physical techniques, there have been a number of 
attempts to relate the molecular components of muscle 
with the ultrastructural anatomy and the actual process 
of contraction. Two important hypotheses have been 
proposed (see Morales, p. 426): 

The first may be called the one-filament hypothesis, 
and the second the two-filament hypothesis. In the first 
case, contraction is considered to be the result of a 
change in the configuration of a unit consisting of a 
single filament. Shortening of many of these unit fila- 
ments in parallel and in series would occur when a 
muscle cell contracts. So far as I am aware, the clearest 
and earliest explicit statement of this hypothesis can 
be laid to Meyer.? He proposed a number of models in 
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Fic. 1. Diagram of a conceptual model of the simplest possible 
muscle, based on the two-filament hypothesis of Huxley and 
Hanson. Each filament, represented by a horizontal solid line, is 
fitted with a series of binding sites, shown as short vertical dashed 
lines. The filaments constituting a pair overlap to a certain extent, 
and are bound to each other by a number of binding groups. In 
the relaxed state, the length of the overlapping portions is rela- 
tively short, and only a few of the binding sites are in contact. 
In contraction, the interaction between the binding sites changes, 
more binding sites become engaged, and one filament is drawn up 
alongside the other, increasing the proportion of overlap, shortening
the total length of the system, and exerting tension in a direction
parallel to that of the filaments. The filaments are of molecular
dimensions.
FIG. 2. Diagram of a conceptual model of a modification of the
two-filament hypothesis designed to account for a muscle which 
can contract to a small fraction of its resting length. The concept 
holds that in the relaxed state a series of myofilaments, each 
fitted with two sets of bonding sites capable of interacting 
with those of fellow filaments, overlaps a neighboring filament to 
a slight extent at each end. Upon contraction, the length of overlap 
between adjacent filaments is caused to increase as each filament 
is pulled lengthwise along its neighbor by a change in the inter- 
action between binding sites. The assembly thus behaves like a 
multiple set of extension ladders. The terminal extremities of the 
two terminating members of the series of extended relaxed fila- 
ments, however far separated, can be pulled by the contractile 
process to a distance apart limited only by the length of the longest 
filament of the series. 


which attraction and repulsion of charged groups on a 
helix were responsible for contraction. Astbury,³ from
his x-ray diffraction work, elaborated on this concept. 
The view that the unit of contractility resides in a single 
filament dominated the literature until very recent years. 

The second hypothesis involves the view that the 
unit structure in contractility consists of two parallel, 
overlapping and closely spaced filaments. One of the 
filaments moves parallel to the other during the contractile
process. Thus, the over-all length of the system
is decreased, although each of the individual filaments
maintains its original length (Fig. 1). This two-filament
hypothesis, which has been most explicitly stated by
Hanson and H. E. Huxley⁴ and H. E. Huxley,⁵ has
been considered from the energetic and physiological
point of view rather elaborately by A. F. Huxley.⁶ It is
now an important topic of discussion amongst muscle 
physiologists. This two-filament hypothesis can be re- 
garded as a reversible special version of the concept of 
aggregation of macromolecules and protofibrils which 
F. O. Schmitt discusses in connection with collagen (p. 
349). He points out that there is evidence that collagen 
macromolecules can aggregate in various spatial rela- 
tionships with respect to one another. As applied to 
muscle, the theory would state that the protein fila- 
ments in muscle could shift reversibly from one position 
of parallel alignment to another. 

According to this hypothesis, the simplest muscle 
would consist of two filaments (Fig. 1). An elaboration 
of this would permit one to build an analogous model 
to fit a fiber which contracted to only a small fraction 
of its relaxed length. As mentioned previously, such 
muscles are found in some invertebrates. Such a muscle 
might be composed of a series of filaments arranged in 
staggered array, as in Fig. 2. Upon contraction, these 
filaments would simply slide up on each other like the 
units of a multiple or extension ladder. This provides a 
way in which the two-filament hypothesis can be elaborated
to account conceptually for some of the more
extreme examples of muscular contraction. There is no
experimental evidence, however, to support this model
for muscles which contract to only a small fraction of
their resting length.
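
The geometry of the extension-ladder model of Fig. 2 can be sketched numerically. The following is only an illustrative sketch, assuming identical rigid filaments and a uniform overlap between neighbors; the function name `assembly_length` and the sample values are hypothetical and do not come from the text.

```python
def assembly_length(n_filaments, filament_length, overlap):
    """End-to-end length of a staggered series of identical rigid filaments.

    Assumption (not from the source): every filament has the same length
    and overlaps each neighbor by the same amount `overlap`.
    """
    if n_filaments < 1:
        raise ValueError("need at least one filament")
    if not 0 <= overlap <= filament_length:
        raise ValueError("overlap must lie between 0 and the filament length")
    # Each of the (n - 1) junctions hides `overlap` units of length.
    return n_filaments * filament_length - (n_filaments - 1) * overlap


if __name__ == "__main__":
    # Hypothetical numbers: ten filaments of 10 units each.
    relaxed = assembly_length(10, 10, 1)      # slight overlap at each junction
    contracted = assembly_length(10, 10, 10)  # complete overlap
    print(relaxed, contracted, contracted / relaxed)
```

With complete overlap the assembly shortens to the length of a single filament, which is the limit stated in the caption of Fig. 2; no such restriction applies to how long the relaxed series can be.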

In electron micrographs, the two-filament
hypothesis is supported by the visualization in striated

FIG. 3. A diagram to show
some of the structural fea- 
tures of a typical striated 
muscle fiber, such as might 
be found in the limb muscle 
of a vertebrate. The figure 
represents a semitranspar- 
ent block cut out of the 
muscle. V designates extra- 
cellular space. The sarco- 
lemma S is the plasma 
membrane of the muscle 
cell. Many mitochondria / 
are interspersed between 
the myofibrils M. Within 
the myofibrils are the myo- 
filaments Æ, which are 
of molecular dimensions. 
Forming laceworks sur- 
rounding the myofibrils are 
elements of the sarcoplas-
matic reticulum R, also
called sarcotubules. 


muscle of two sets of filaments.’ In the muscles which 
Huxley studied, one type of filament is thicker than the 
other (Figs. 4 and 5). This is not an essential part 
of the concept, however, and it would not detract from 
the theory if, in some muscles, all of the filaments were 
found by electron microscopy to have geometrically 
similar dimensions. 

In addition to the contractile elements, muscles contain
a number of other important components. Some
of these are represented for one type of muscle in Fig. 3. 
The muscle is surrounded by a membrane which corre- 
sponds to the plasma membrane of the muscle and is 
called the sarcolemma (S). The concentrations of con- 
stituents of the fluid encompassed by the plasma mem- 
brane differ from those outside. If one places an elec- 
trode inside the muscle cell and connects it through a 
suitable recording device to an electrode in the fluid 
outside the muscle cell, one finds a potential difference 
between the inside and outside of the muscle. This 
potential difference has been related to the difference 
in ionic concentrations across the sarcolemma. Fairly 
satisfying relationships, based on the differential con- 
centrations of potassium and sodium, have been derived 
to account for this potential difference. In electron
micrographs, the sarcolemma (the structure responsible
for separating these two ionic compartments) is of the
order of 100 Å thick. A measured 100-mV potential
difference between the inside and the outside of the
muscle fiber means that there is a voltage gradient of
the order of 100 000 V/cm across the sarcolemma.
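
The arithmetic behind this estimate can be checked directly. The snippet below is a minimal sketch, assuming a uniform field across the membrane (potential difference divided by thickness); the function name `field_v_per_cm` is illustrative only.

```python
ANGSTROM_IN_CM = 1e-8  # 1 angstrom = 1e-8 cm


def field_v_per_cm(potential_mv, thickness_angstrom):
    """Average voltage gradient in V/cm, assuming a uniform field."""
    volts = potential_mv * 1e-3
    centimeters = thickness_angstrom * ANGSTROM_IN_CM
    return volts / centimeters


if __name__ == "__main__":
    # Values quoted in the text: 100 mV across about 100 angstroms.
    print(f"{field_v_per_cm(100, 100):.3g} V/cm")
```

For the quoted values, 0.1 V divided by 1e-6 cm gives a gradient of the order of 1e5 V/cm, in agreement with the figure in the text.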

Katz (p. 466) presents some of the features of mem- 
branes of this sort in his discussion of nerves and neural 
conduction. If one applies a quick transient electric 
pulse to certain portions of the sarcolemma of many 
muscles, one gets an electrical disturbance which 
sweeps in all directions over the muscle fiber without 
loss of amplitude. Such electrical excitability resides in 
large portions of the sarcolemma, but is not found in 
the immediate region where nerves reach the muscle, 
nor is it found in the region where the tendons attach 
to the muscle fiber. From this difference in electrical 
behavior, one can conclude that the molecular organiza- 
tion of the sarcolemma varies in different portions. 
There is, however, no satisfactory correlation of these 
presumed differences in molecular structure with images 
of the sarcolemma in electron micrographs, nor is it 
known in any detail what these differences might be 
physicochemically.

The sarcolemma can be traced continuously around 
both ends of the muscle. In general, in most vertebrate 
muscle and in some insect muscle, the fibers of the 
tendons are bonded in some way to the outer surface 
of the sarcolemma. Certain noncontractile intracellular 
filaments continuous with the myofilaments also connect 
with the sarcolemma, but with its inner surface, opposite 
the tendon attachments. This arrangement is capable of 
transmitting the tension generated within the fiber to 
some extrafibrillar unit, such as bone. 

Most muscle fibers, but not all, are equipped with a 
nerve supply which, in more elaborate fibers, serves a 
number of different functions. In some types of verte- 
brate muscle, such as skeletal muscle, there is a main 
motor nerve which supplies each fiber. When this nerve 
is excited, the muscle responds by contracting. In ad- 
dition, in many cases there are sensory organs located 
in the tendons. These sense organs are similar to strain
gauges in that they can respond to changes in tension 
and send impulses to the central nervous system. These 
signals can then be processed and correcting directions 
relayed back to the muscle fiber. This provides a type 
of feedback loop which regulates the strength and 
temporal nature of the muscular contraction. In certain 
muscles, much more elaborate types of regulatory 
devices are present. One is known as a muscle spindle. 
This is a complex structure embedded amongst ordinary 
muscle fibers and consisting of a very delicate muscle 
fiber inside a connective tissue capsule. This special fiber 
is fitted with a nerve supply containing motor and 
sensory components. The motor component can actuate 
the contractile mechanism of the very fine fiber within 
the spindle capsule and thus vary the bias on this sen- 
sory device. The sensory fibers reach nerve endings 
associated with the fiber and pick up signals which 
result from changes in tension. These are sent to the 
central nervous system. These examples do not exhaust 
the types of regulatory mechanisms which operate in 
muscle fibers. In certain cases, rather different mecha- 
nisms are found. 

One can divide the contents of the muscle cell into 
two components: the contractile portion and the non- 
contractile portion. The latter is called the sarcoplasm. 
This sarcoplasm contains mitochondria, nuclei, fat drop- 
lets, glycogen, and other structures. The fat and the 
glycogen serve primarily as fuel. Among the other 
structures, muscle contains an elaborate internal system 
of membranes (Fig. 3, R). These have been studied by 
Sjöstrand,’ by Andersson,’ by Porter and Palade,’ and
others. These membranes often define tubular struc- 
tures. This has led Sjöstrand to speak of them as 
“sarcotubules.” The structures were seen with the light 
microscope by Retzius (1881, 1890) many years ago 
and termed the “sarcoplasmatic reticulum.” Thus,
one can choose between two terms, each proposed by 
an eminent Swedish anatomist. 

This internal system of membranes has, in general, 
not enjoyed much attention. It has been ignored com- 
pletely by biochemists, and only recently has received 
the gaze of physiologists. Indications at the moment 
are that it may have a role in transmitting an excitatory 
signal from the sarcolemma to the contractile elements 
within the fiber. 

Many of the features introduced above are diagramed 
in Fig. 3, which represents a semitransparent block cut 
out of a piece of striated muscle. It shows, in a rather 
over-simplified way, many of the features of muscle 
structure discussed in the foregoing. V represents the 
intercellular space outside the fiber. The double lines S
represent the sarcolemma or plasma membrane of the
muscle. This has been severed by a knife in the directions
a′-a′. The flesh itself has been cut along b′-b′. The
myofilaments / are represented by the very small dots
in the cross sections at the ends of the block and by the
very fine striations seen on longitudinal section. In this
organized in groups, the
found in many muscles 
roximately 0.5 to 2 in 


FIG. 4. An electron micrograph showing a single sarcomere
from the striated muscle of a rabbit. This very thin section
is cut nearly parallel to the axis of the myofilaments. Thin
filaments T and thick filaments U can be distinguished and
their overlap is displayed. A, I, H, and Z designate the
corresponding cross bands of the striped muscle repeating
unit, the sarcomere. Electron micrograph courtesy of
H. E. Huxley.


repeating pattern, which F. O. Schmitt (p. 349) discusses 
earlier. This repeating pattern usually varies in relaxed 
muscle from about 1½ to 2 μ. However, in invertebrate
striated muscle, the length of the unit may extend to 

FIG. 5. A high-power detail of a portion of Fig. 4. The thin filaments T can be seen interdigitating with the thick filaments U,
overlapping throughout the A band, except for the region of the light H band in the center. Slender bridges can be seen connecting
thick and thin filaments. These are believed to represent interaction sites, such as those represented by vertical short dashed
lines in Figs. 1 and 2. Electron micrograph courtesy of H. E. Huxley.


10 or 15 μ, and in certain extreme cases (in certain
unusual worms) repeating patterns of 50 or more μ
can be encountered.

In most muscle, one finds many glycogen granules
scattered about in the sarcoplasm. These are not shown
here. In addition, particularly in certain phases of
physiological activity, one can find some intramuscular
fat. If the animal is starved somewhat, a good deal of 
fat will appear in heart and other muscles. Under those 
conditions, mitochondria are closely associated with the 
fat droplets as though they were burning the lipid as 
fuel. In migrating salmon and in migrating geese, which 
feed extensively before their long journeys, the muscle 
fibers and adjacent cells accumulate very large amounts 
of fat. It is thought that the distance which a flock of 
geese can cover in a single flight is determined by the 
amount of fuel they can carry. It is much more eco- 
nomical to carry it in the form of fat than in the form 
of glycogen, since they can burn the hydrogen in the fat. 
Turning back to Fig. 3, the mitochondria J can be 
clearly seen. Close to them and intertwining with the 
myofibrils are elements of the sarcoplasmatic reticulum 
of Retzius, R. In certain places, it can be seen that this 
reticulum comes in close contact with the mitochondria. 
It forms an extensive plexus here and there within the 
sarcoplasm of the muscle fiber. Thus, in gross outline it 
forms a network appearance. This gave rise to the con- 
cept that it was a reticulum. But if one cuts it in cross 
section, one sees that it is composed of membranes 
which form tubules which are very often flattened. 
Sjöstrand’s term of “sarcotubule” is also appropriate,
as it describes a second important feature of the struc- 
ture. Retzius suggested in 1881 that this reticulum 
might pick up a signal from the sarcolemma and convey 
that signal deep into the interior of the fiber, thus 
exciting myofibrils a long way from the surface. Experi- 
mental justification of this concept, however, was a 
long time in coming. Only within the last two or three
years A. F. Huxley and Taylor (see A. F. Huxley⁶) have
brought forth evidence that there is indeed a conducting 
forms to these specifications. Huxley 
exploring the possibilities that 

FIG. 6. An electron micrograph showing a cross section of a
portion of a myofibril of the muscle of a fly. The thick
myofilaments appear in cross section as circles arranged in
hexagonal array, connected to each other by bridges forming
equilateral triangles. In the middle of each bridge is a dense
spot which may represent cross sections through thin filaments.
Electron micrograph courtesy of A. J. Hodge.


the conduction might in fact reside within this system 
of tubules or membranes. 

Robertson”™ has studied the sarcolemma in the 
electron microscope at high resolution. He has observed 
that it conforms in structure to the “unit membranes” 
found in other cells, presenting two peaks in density 
about 55 Å apart, separated by a region of lesser density.
On the outer surface of the sarcolemmatic unit mem- 
brane is a cloud of material which gives the chemical 
reactions associated with polysaccharides. It is believed 
to contain some glycoproteins. This type of polysac- 
charide-rich coating is found very frequently in associa- 
tion with plasma membranes, so its presence in associa- 
tion with the sarcolemma of muscle is not exceptional. 

There is reason to think that there are probably two 
layers of lipid and some protein in the unit membrane 
comprising the sarcolemma. Various attempts have been 
made to fit these layers into the patterns of density 
seen in Robertson’s micrographs. As Sjöstrand and 
Hodge point out, precise information about this point
is not available. My own view is that all of the models 
concerned with the actual disposition of lipid within 
these membranes are so uncertain that one of our major 
tasks is to make a detailed map of the true arrangements 
of the proteins and lipid in biological membranes. I am 
doubtful of the usual presentation of the lipoprotein 
membrane structure as presented by Davson and 
Danielli,“ where the lipid molecules are presented with 
their polar groups out and their nonpolar groups in. 

Figure 4 is one of H. E. Huxley’s elegant electron 
micrographs representing the arrangements of myo- 
filaments in a myofibril. It provides some of the evidence 
for the hypothesis that there are two interacting types of
filaments differing from each other in structure. One, 
the thin filament T, is envisioned as starting at the 
Z band and extending a certain distance. The second 
type, the thick filament U, occurs in the middle of the 
repeating unit, which is called the sarcomere. Contraction
is envisioned as involving an interaction between
these two filaments, so that traction is exerted and the 
ends of the large filaments approach the Z band. 
Figure 5 shows the interdigitation of these filaments at 
higher power. In cross sections, the large filaments are 
arranged in a hexagonal pattern, and the small filaments 
are arranged in special relationship to them. In muscle 
from the rabbit, the small filaments are arranged so 
that each one is located in the center of a triangle formed 
by the large ones. One can see that, in such a case, if a 
section is cut through the muscle sufficiently thin to 
encompass only one layer, a plane can be found which 
will accommodate the thick filaments and two thin ones. 
Figures 4 and 5 are examples of such sections visualized 
in the electron microscope. Of importance are certain 
bridges which can be seen transversely extending be- 
tween the two types of filaments. These may represent 
sites of interaction between the two filaments. 

Hanson and Huxley’ have proposed that the thick 
filaments contain the protein myosin and the thin fila- 
ments contain the protein actin. Huxley interprets 
contraction of the muscle as a change in the bonding 
occurring through reaction sites at the bridges, taking 
place in such a way that the actin and myosin filaments 
slide along each other. 

Figure 6 is an electron micrograph by Hodge.” It 
represents a cross section of insect muscle. The large 
filaments and the cross bridges connecting them in a 
hexagonal pattern can be seen clearly. There are no 
small filaments lying in the center of the triangles 
formed by the large filaments in this particular type of 
muscle. However, in many cases, one can see a dense 
spot in the middle of the line connecting adjacent large
filaments. It can be shown that these represent cross 
sections of small filaments which, in this type of muscle, 
are arranged in the centers of lines connecting individual 
large filaments rather than in the middle of triangles 
bounded at the corners by the large filaments. Huxley, 
however, believes that this is a trivial difference and 
that in both cases the interactions between the small 
and large filaments are the basis of muscular contraction. 
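
The two cross-sectional arrangements described above imply different numbers of thin filaments per thick filament. As a rough check on the geometry alone (not a result stated in the text), the sketch below counts triangle centers and edge midpoints on a hexagonally packed lattice of thick filaments with periodic boundaries. The function name `thin_sites_per_thick` and the lattice size are illustrative assumptions.

```python
def thin_sites_per_thick(n=4):
    """Count candidate thin-filament sites per thick filament.

    Thick filaments sit on an n x n hexagonally packed (triangular) lattice
    with periodic boundaries. Returns a pair:
      (triangle centers per thick filament, edge midpoints per thick filament)
    """
    if n < 3:
        raise ValueError("n must be at least 3 so periodic images stay distinct")
    edges, triangles = set(), set()
    for i in range(n):
        for j in range(n):
            p = (i, j)
            a = ((i + 1) % n, j)            # neighbor along the first axis
            b = (i, (j + 1) % n)            # neighbor along the second axis
            c = ((i + 1) % n, (j + 1) % n)  # far corner of the unit cell
            # Three nearest-neighbor edge directions of the lattice.
            edges.update({frozenset((p, a)), frozenset((p, b)), frozenset((a, b))})
            # Each unit cell holds one upward and one downward triangle.
            triangles.update({frozenset((p, a, b)), frozenset((a, b, c))})
    thick = n * n
    return len(triangles) / thick, len(edges) / thick


if __name__ == "__main__":
    centers, midpoints = thin_sites_per_thick()
    print(centers, midpoints)
```

Under these assumptions the count gives two triangle centers per thick filament (the rabbit arrangement) and three edge midpoints per thick filament (the insect arrangement), provided every such site is occupied by one thin filament.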

Whether or not one thinks of contractility as residing 
in one filament or in the interaction between two fila- 
ments, these micrographs of Huxley and of Hodge have 
gone far in showing the geometrical framework within 
which the contractile mechanism resides. 

Motility of Cilia and the Mechanism of Mitosis*

THE movements of cilia and chromosomes are interpreted
in terms of the microscopic and fine
structures in cells. Section 1 discusses structure and
motility of cilia; Sec. 2, microscopic structure of the
mitotic spindle; Sec. 3, centers of organization in spindle
and cilia; Sec. 4, the physicochemical nature of
the spindle; and Sec. 5, relation of cilia and chromosome
movement to muscle contraction.


1. STRUCTURE AND MOTILITY OF CILIA 


Many small organisms—bacteria and protozoa, sperm 
and embryos of larger organisms—are propelled by beat- 
ing their thin whip-like cilia or flagella (used inter- 
changeably here). Also, ctenophores even a foot long 
and several inches wide can swim about or adjust their 
gravitational orientation by coordinated beating of 
their ciliary bundles. When the cell or organism is fixed, 
as in the gills of mussels and clams and in the human 
trachea, cilia create a current capable of pumping a 
considerable quantity of liquid or mucous material. 

The length of cilia may range from a few microns to
several millimeters, and there may be one to several
thousand cilia per cell, but their diameter is quite
constant, usually between a tenth and a half micron. Where
a number of cilia occur in a row, adjacent cilia beat 
slightly out of phase with each other and a regular 
propagating wave is observed. 

At the base of each cilium is found a characteristic 
bulbous enlargement, the basal granule. Further, proxi- 
mal to the basal granule, rootlets sometimes are found 
which may function as anchorage or serve to conduct 
impulses. The membrane of the cilium appears continu- 
ous with that of the cell body. 

Examined with an electron microscope (Fawcett and 
Porter,‘ and also discussed in the following), cilia reveal
a characteristic inner fibrillar pattern of amazing uni- 
formity. Near the periphery of the cilium, there are 
usually nine (or occasionally a multiple thereof) fibrils, 
each composed of two filaments (or tubules?) some 
100 to 200 Å in diameter. Surrounded by the outer
nine are two additional central fibrils of somewhat 
smaller diameters (Figs. 1 and 2). The nine outer fibrils 
reach the base of the cilium and appear to merge laterally
into a hollow tube to make up the basal granule.
The two central filaments also extend the whole free
length of the cilium but apparently do not reach the
basal granule. At the tip of the cilium, the outer nine
and inner two fibrils are said to merge. In cilia with 
distinct directional beating, the line intersecting the two 
central filaments lies at right angles to the direction 
of beat. 

The pattern of nine plus two has been found in cilia 
from a wide variety of cells, in fact, in practically all 
cilia observed with adequate resolution. The eleven 
fibrils are unlikely to be artifacts formed during prepara- 
tion for electron microscopy, as sperm-tail flagella 
macerated in distilled water also show the frayed eleven 
fibrils in dark field illumination. 

Given this structure, how does one explain the mecha- 
nism of ciliary beat? The pattern of beat may be rela- 
tively simple, as shown in Fig. 3(a). There is a recovery 
stroke in which the limp cilium stiffens from the base up, 
and an effective stroke where bending is mostly at the 
base and the rest of the cilium acts as if it were stiff. The 
same flagellate organism which swims forward by this 


FIG. 1. Electron micrograph of cross sections of rat-trachea
cilia. Compare with interpretive diagram, Fig. 2 [from J. Rhodin
and T. Dalhamn, Z. Zellforsch. u. mikroskop. Anat. 44, 345
(1956)].

beat may swim backward, also sidewise or circularly, as
shown in Fig. 3(b).?> Bradfield® postulates that waves 
of contraction proceed along the length of the outer nine 
fibrils with a message perhaps traveling in advance along 
the two inner ones. The waves of contraction may be 
started at the base by a commutator-like device which 
would result not in a synchronous contraction but in 
waves with various phase lags. It is more likely, how- 
ever, that the outer fibrils may be the conductive ele- 
ments, the inner two at least partaking a more active 
function in beating. For, in certain sensory cells (see 
following) and at the embedded base of each cilium*7 
where one would expect conduction but not contrac- 
tion, one in fact finds the outer nine fibrils and not the 
inner two. 

It was tacitly assumed in the foregoing that contrac- 
tion of the fibrils was the basis for cilia beat. Actually, 
the only evidence for contraction of cilia components
(at the molecular level) appears to lie in the x-ray
diffraction studies of Astbury et al. There they find in
flagella collected from bacteria, in addition to an α-protein
pattern (which is characteristic of many fibrous
proteins, such as keratin, myosin, elastin, etc., and is
believed to reflect the fundamental spacing of the polypeptide
backbone), a folded β pattern. This supercontracted
pattern they believe reflects the folding of a
fraction of the polypeptide chains responsible for contraction.
The protein isolated from this bacterial flagella


FIG. 2. Interpretation of electron micrograph (Fig. 1) showing
fine structure of cilia [from J. Rhodin and T. Dalhamn, Z. Zellforsch.
u. mikroskop. Anat. 44, 345 (1956)].

FIG. 3. Various patterns of beat of a Monas flagellum. Arrows
indicate directions of swimming [from M. Hartmann, Allgemeine
Biologie (Gustav Fischer Verlag, Stuttgart, 1953), fourth edition;
after Krijgsman].


preparation lacks sulfur-containing amino acids and 
appears dissimilar to any of the muscle components so 
far known. 

Aside from localized contraction, the mechanism of 
ciliary beat has been interpreted by other schemes also, 
such as local swelling, reciprocal pumping of a liquid in 
and out of the cilium, or discontinuous flow of material 
through the cilium.!*.° The exact site of the motor 
function in the cilium is also disputed, but nevertheless 
there exist certain observations (Secs. 3 and 5) which 
link the structure and function of these minute struc- 
tures to other specialized motile structures of the cell. 

DeRobertis, Sjöstrand, and others!" made an in- 
teresting discovery related to the structure of cilia in 
the filaments of the retinal-rod cells and of the sensory- 
hair cells of the inner ear. These fibrous elements, 
believed for some time to be derived from embryonic 
cilia, also show a fine structure similar to that of cilia 
described in the foregoing. In these apparently non- 
motile fibers, the same outer nine fibrils are found, but 
the inner two are missing. 


2. MICROSCOPIC STRUCTURE OF THE 
MITOTIC SPINDLE 


Cilia may beat as frequently as a hundred cycles per 
second. Chromosomes, on the other hand, move extremely
slowly, the maximum velocity being a few

FIG. 4. Schematic diagram of mitotic spindle, (a) with
centrioles and (b) without (modified from Schrader’).


microns per minute at anaphase (for reviews of mitosis 
see references 12 to 21). However, certain elements of 
the mitotic apparatus responsible for chromosome 
movement may be identical to, or at least have proper- 
ties common to, portions of cilia. Before discussing this 
problem in Sec. 3, the structures of representative mitotic
apparatuses are described.

Using Schrader’s description, the following terms 
are used. A dot or rod-like structure, the “centriole,” is
found at the poles of the mitotic spindle of many animal
cells and occasionally in plant cells. The centrioles are
morphologically the focal points for the spindle fibers
and the astral rays [Fig. 4(a)]. In an average plant cell
[Fig. 4(b)], both the centrioles and the astral rays are
missing and the spindle fibers show less tendency to
converge at the poles. Between the two spindle poles
lie “continuous fibers.” From the “kinetochores,” a
specific region of the chromosomes, “chromosomal
fibers” extend to or toward the spindle poles.

For half a century, the reality of these fibers in living 
cells has been disputed, for, with very few exceptions, 
they could be seen only in cells after fixation and stain- 
ing. Recently, however, with improvements of the 
polarizing microscope, the author has been able to show 
these fibers clearly in living cells of many animals and
plants by virtue of their positive birefringence (strength
of birefringence 10-*—10~*).”: In animal cells, the 
fibers converge and their birefringence is stronger ad- 
jacent to the chromosome kinetochores and near the 
centrioles [Fig. 5(a)]. In plant cells, the situation is 
similar at the kinetochore region, but toward the “poles” 
the birefringence is weaker and the fibers are more 
diffuse [Fig. 5(b)]. During anaphase movement, the
birefringence of the chromosome fibers persists and
is strongest adjacent to the kinetochores and the
centrioles. The birefringence of the continuous fibers falls
and then rises again after complete separation of
the chromosomes. The midregion containing fibers with
strong secondary birefringence becomes the phragmoplast
of plant cells [Fig. 5(c)]. Within the phragmoplast,
small granules align, fuse, and become the cell
plate, which divides the original cell into two. In animal 
cells, the cytoplasm generally cleaves inward at right 
angles to the spindle remnant forming two new cells with 
one nucleus each. Time-lapse motion pictures of these 
processes have been made by the author using a special 
polarizing microscope and were shown at the meeting.

The material of the mitotic apparatus has been iso- 
lated from sea urchin and other eggs by Mazia, Dan, 
and their collaborators in quantities sufficient for chemi- 
cal analyses.” Amino-acid composition of their pro- 
tein fraction (molecular weight ca 45 000) shows, unlike 
the bacterial flagella protein, a fair content of sulfhydryl- 
containing amino acids. 

Although the spindle isolated after alcohol treatment 
by Mazia and Dan is stable, the fibers of the mitotic 
spindle in living cells are apparently extremely labile. 
The spindle fibers may disappear by slight mechanical
agitation or by low-temperature treatment of the cell, 
only to reform in the course of a few minutes (see Sec. 5, 
also Carlson!:*7, Chambers,?5 Inoué,2? and Östergren"). 


3. CENTERS OF ORGANIZATION IN 
SPINDLE AND CILIA 


Although the apparent velocity and very probably
the stability of the spindle fibers differ by orders of
magnitude from those of the cilia, the fibrous elements of the
two structures may be formed or organized in a very
similar fashion. The argument follows.

(A) The birefringence of the spindle fibers and astral 
rays is strongest adjacent to the kinetochores and 
centrioles throughout metaphase and anaphase (Inoué” 
and Schmidt”!). The fibers are arranged radially (within 
restricted cones, in the case of the kinetochores) from 
these centers, and the growth of the spindle (at least 
in animal cells) takes place by lengthening of the fibers 
joining the centers. This strongly suggests that both 
the centrioles and the kinetochores are centers of fiber 
orientation. 

(B) Growth of the axial filament of the sperm-tail 
flagellum starts from the basal granule, which same 
structure during the last mitosis acted as a centriole of 
the spindle. In certain protozoa, flagella and the
mitotic spindle both grow simultaneously from common
giant centrioles.

(C) In abnormal divisions of snail spermatocytes, 
some chromosomes lose their kinetochores and cannot 
partake in mitosis. The Pollisters have shown that a
clear correlation exists between the number of such 
chromosomes and the number of supernumerary basal 
granules, which migrate to the cell periphery and form 
the same number of extra sperm tails. Also, those 
kinetochores earlier dissociated from chromosomes form 
small extra astral rays while the spindle for the next
division is formed.

(D) With the electron microscope, centrioles of the
mitotic apparatus have been shown to exhibit the same 
general structure and dimensions as that earlier de- 
scribed for the cilia basal granules, namely, a cylindrical 
structure containing nine groups of rods or tube-like 
elements. This same structure is observed for the
basal granule of sperm-tail flagellum.

Thus, it appears that basal granules of cilia, kinetochores
of chromosomes, and centrioles of the mitotic
apparatus all act as centers of fibrous organization in
cells. The basal granules, centrioles, and kinetochores
may in fact be identical structures, taking on different
functions at different loci within the cell (see also
Meves).


4. ON THE PHYSICOCHEMICAL NATURE OF
THE MITOTIC SPINDLE


Electron-microscope studies have revealed a charac- 
teristic fine structure in cilia (Sec. 1). The possible 
identity of their basal granules to centrioles of the 
spindle also was strengthened (Sec. 3). However, 
this technique as yet has revealed rather little of the 
structure and behavior of spindle fibers and kinetochores.
The lack of success, I believe, is attributable
to the difficulty or impossibility of preserving the
spindle material in a reasonably native form after 
fixation and electron bombardment. In contrast, the 
polarizing microscope enables one to observe birefrin- 
gence of spindle fibers in actively dividing cells (Secs. 2 
and 3). Although the fine structure cannot be resolved 
directly, measurement of birefringence allows one to in- 
terpret the changes taking place in the microscopically 
unresolvable domain. This section describes further 
polarization-optical observations which may shed some 
light on the physical chemistry of the mitotic spindle. 

The spindle of the egg cell (of a marine worm,
Chaetopterus) can be stretched if the cell is flattened
very gently. The length of the spindle is then found to
be strictly proportional to the diameter of the compressed
egg. This relation is explained by the attachment
of the spindle poles through astral rays to the
cortical-gel layer. Immediately upon stretching, the
spindle is thinner and more pointed at the poles, while 
in a minute or two it grows fatter and the birefringence 
increases. When the egg is compressed suddenly, the 
link between spindle poles and the cortical gel is ap- 
parently broken and the spindle shortens as it loses its 
birefringence (Fig. 6). 

At a given length, the birefringence of the spindle
fibers is a function of temperature. With abnormally
low temperature (4° to 6°C), the spindle birefringence
is abolished completely. When the temperature is raised,
the birefringence returns. The loss of birefringence with
low temperature is rapid (less than half a minute),
but when the temperature is raised the birefringence
and structure of the spindle fluctuate until, after several
minutes, they reach an equilibrium specific for the new
temperature. The native spindle is, therefore, in a
temperature-sensitive equilibrium.

FIG. 5. Birefringent spindle fibers in living cells. [...] are printed as negatives [...]. (a) Chaetopterus pergamentaceus, metaphase; (b) Lilium longiflorum, early anaphase; (c) the same, with early cell-plate formation (modified from Inoué).

FIG. 6. [...] of Chaetopterus spindle plotted against [...] [from S. Inoué, J. Exptl. Cell Research Suppl. 2, 305 (19...)].

The equilibrium birefringence at various temperatures
is plotted in Fig. 7. If it is assumed that the birefringence
of spindle fibers is directly proportional to
the amount ($B$) of material oriented in that region,
that only the equilibrium constant [$k(T)$] between
oriented and nonoriented material is influenced by
temperature, and that the total amount ($A_0$) of the
orientable material in the same region remains constant,
the equilibrium is expressed by

$$ (A_0 - B) \;\overset{k(T)}{\rightleftharpoons}\; B. $$

$A_0$ was assumed constant since the spindle in the
Chaetopterus egg is in metaphase equilibrium and also
because cells which already have entered metaphase can
go through division even in the presence of metabolic
inhibitors such as cyanide and carbon monoxide.

$A_0$ then is determined as the asymptote of the curve in
Fig. 7. Were these assumptions warranted, one should
expect a linear relationship between $\log [B/(A_0 - B)]$
and $1/T$ ($T$ in °K). Figure 8 shows this plot. From the slope
and intercept we (Morales and Inoué) calculate the
evolution of 28 kcal of heat per mole reacted, while at
25°C a free-energy change of -1.8 kcal/mole and an
entropy increase of 100 eu/mole is observed.

FIG. 8. Log plot of spindle reaction equilibrium vs inverse absolute temperature. Ordinate: $\log [B/(A_0 - B)]$; abscissa: $1/T \times 10^{3}$, from 3.25 to 3.55. Values marked on the plot: $A_0 = 6.0$; $\Delta F_{298} = -1.8$ kcal; $\Delta H = +28.4$ kcal; $\Delta S_{298} = +101$ eu.

The very low free-energy change agrees with the
proposed lability of the spindle structure (weak gel
with small number of active hydrogen bonds?), while
the high heat of reaction and the high positive entropy
which nearly cancel each other explain the apparent
decrease of entropy (increase of spindle birefringence)
at higher temperatures. At high temperature, the still-unoriented
protein molecules presumably absorb a considerable
amount of heat, thus, for example, releasing
bound water which could have prevented their orientation.
The melting and randomizing of the bound water
then could account for the large increase in entropy
(also see Anderson).
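
The thermodynamic bookkeeping behind this estimate can be sketched numerically. The following is a minimal illustration, not the authors' calculation: it assumes the standard relations $\Delta F = \Delta H - T\Delta S$ and $k(T) = \exp(-\Delta F/RT)$, takes $\Delta H$ and $\Delta S$ from the values marked on Fig. 8, and uses the definition $k = B/(A_0 - B)$ to obtain the oriented fraction $B/A_0 = k/(1+k)$. The function names and the gas-constant value are choices made here for illustration, and extending the straight-line relation to temperatures outside the measured range is an assumption.

```python
import math

# Gas constant in kcal/(mol*K); a standard value, not taken from the source.
R_KCAL = 1.987e-3

# Values marked on Fig. 8 of the source.
DELTA_H = 28.4    # kcal/mol
DELTA_S = 101.0   # eu/mol, i.e. cal/(mol*K)


def free_energy(delta_h, delta_s_eu, temp_k):
    """Return dF = dH - T*dS in kcal/mol (dS supplied in cal/(mol*K))."""
    return delta_h - temp_k * delta_s_eu / 1000.0


def equilibrium_constant(delta_h, delta_s_eu, temp_k):
    """Return k(T) = exp(-dF/RT) for nonoriented <-> oriented material."""
    d_f = free_energy(delta_h, delta_s_eu, temp_k)
    return math.exp(-d_f / (R_KCAL * temp_k))


def oriented_fraction(delta_h, delta_s_eu, temp_k):
    """Return B/A0 = k/(1+k), which follows from k = B/(A0 - B)."""
    k = equilibrium_constant(delta_h, delta_s_eu, temp_k)
    return k / (1.0 + k)


if __name__ == "__main__":
    # Illustrative temperatures only; 25 C is the one quoted in the text.
    for temp_c in (5.0, 15.0, 25.0):
        temp_k = temp_c + 273.15
        print(
            f"{temp_c:5.1f} C  dF = {free_energy(DELTA_H, DELTA_S, temp_k):+.2f} kcal/mol"
            f"  B/A0 = {oriented_fraction(DELTA_H, DELTA_S, temp_k):.2f}"
        )
```

With the rounded figure values, the free-energy change at 25°C comes out slightly smaller in magnitude than the quoted -1.8 kcal/mole, as expected from rounding of $\Delta H$ and $\Delta S$. The sketch also shows the qualitative point of the passage: because the large positive $\Delta H$ and $T\Delta S$ nearly cancel, the oriented fraction rises steeply with temperature.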

It appears that the spindle fibers are regions with
high degrees of orientation, although very labile, expressing
the orienting influence of the kinetochores and
centrioles. [...] earlier observation on the [...]
spindle fibers are undoubtedly almost fluid in nature and
are not stably crosslinked gels as the term “fibers” may
imply. With dehydrating agents (e.g., alcohol) and in
an acidic environment the crosslinking is probably
enhanced until the spindle is finally “fixed.”

On this basis and from observations described in 
Sec. 2, anaphase movement of chromosomes may be 
explained by local reduction in the quantity of oriented 
material and the consequent shortening of chromosomal 
fibers. The orienting forces of kinetochore and centriole 
must be just as actively at work throughout this 
process. The continuous fibers, either actively elongat- 
ing or at constant length, could function as supports 
to counteract the pulling action of the chromosomal 
fibers. This mechanism of contraction of the chromo- 
somal fibers is similar to that postulated by the author 
for the action of low concentrations (<10-*M) of 
colchicine. It is, however, in distinct contrast to the
mechanism suggested by Swann, whose microscope
lacked the power of resolution for detecting individual
spindle fibers.


5. RELATION OF CILIA BEAT AND MITOSIS 
TO MUSCLE CONTRACTION 


The sole evidence for molecular contraction in cilia 
appears to lie in the x-ray diffraction data on isolated 
bacterial flagella (Sec. 1). Hypotheses involving mecha- 
nisms other than contraction (e.g., differential swelling) 
also have been postulated, but discriminating experi- 
ments are lacking. 

Anaphase movement of chromosomes was explained 
by an orientation equilibrium of spindle fibers (Sec. 4). 
The action of centrioles and kinetochores (the centers
or foci of orientation in the spindle) appears similar
to that of basal granules of cilia during fibrogenesis 
(Sec. 3). 

Occasionally, cilia are resorbed or re-formed, but the 
process is much slower than the formation and dis- 
appearance of the mitotic spindle at each cell division. 
It appears that the cilia fibrils are quite stable while the 
molecules in spindle fibers probably are barely cross- 
linked (Sec. 4). In comparison, the contractile material 
in muscle may have a stability lying in between that of 
cilia and spindle fibers. The primary function of muscle 
and cilia is repeated rapid contraction, while with the 
spindle it is a single successful partition of the chromo- 
somes into two new cells. 

Mechanisms of muscle contraction are discussed in 
other papers of this symposium. It is interesting that 
one of the most widely discussed (and rather widely 
accepted) current hypotheses is that involving the 
creeping of two sets of filaments past each other.

Regardless of the exact mechanism of choice, one may
not overlook the muscle model systems which can con- 
tract and produce the same force per cross section as 
live muscles. This is true in muscle fibers extracted with
50% chilled glycerol, and in an oriented gel fiber formed
by mixing two purified muscle proteins, actin and
myosin. In either case, contraction is induced specifically
by the addition of ATP (adenosine triphosphate) in the
presence of magnesium and potassium ions.

Hoffmann-Berling has shown further that motility 
can be induced in glycerol-extracted cells other than 
muscle, again by the addition of ATP. Thus, he was able 
to induce glycerinated sperm-tail flagella to undergo 
prolonged beating, chromosomes to separate, and ex- 
tracted dividing cells to complete formation of their 
cleavage furrow (motion picture commercially available).
These models respond to approximately the
same concentration of ATP as muscle models. 

To what extent the movements induced by ATP in 
various cells reflect the same molecular mechanisms still 
is not clear. For example, the elongation of the central 
spindle, apparently responsible for the separation of 
chromosomes in the cell model, is not prevented by the 
same poisons to which the muscle and cilia models are
very sensitive. Furthermore, although the organic tri- 
phosphate ATP (and ITP) specifically induces move- 
ments in extracted cells, the responding proteins show 
significant differences in their amino-acid compositions
(see Secs. 1 and 2). It is nevertheless encouraging that 
movements closely resembling those found in living 
cells can be induced by the same reagent in cells from 
which much of the complex structures and materials 
have been removed. 

In conclusion, evidence for the long-sought molecular 
folding is still weak, and no unifying molecular mecha- 
nism has been found for cilia beat, anaphase chromo- 
some movement, and muscle contraction. Developments 
in recent structural and physicochemical analyses are, 
however, encouraging, and with a concerted intelligent 
approach, one may acquire before too long a much 
clearer understanding of the mechanisms underlying 
these cellular movements.

A fully coherent and integrated picture of the
structure of muscle in terms of its various components
and the changing relationships accompanying
or responsible for contraction is still a long way off, as 
becomes evident later. An adequate description of 
muscle contraction requires knowledge sufficient to 
answer questions such as the following: First, which are 
the components of the myofibril actually participating 
in the contraction? Second, how does the structure of 
the individual components and their interrelationships 
one with another change during shortening? Third, how 
is the mechanical work produced at the expense of free 
energy derived from the hydrolysis of adenosine tri- 
phosphate (ATP)? The first of these questions is, at 
present, the only one which can be answered with any 
certainty. Although many models have been proposed 
for the contraction process, the physical changes ac- 
companying contraction remain a subject for specula- 
tion, largely because of the difficulty in applying ap- 
propriate physical methods, such as x-ray diffraction 
and optical rotation, to the problem. The electron 
microscope has been employed with striking success in 
elucidating the details of muscle fine structure but, 
because of certain inherent limitations, it has been much 
less useful in following the structural changes accom- 
panying contraction. A final solution, it would seem, 
can be accomplished only by a careful and rigorous 
synthesis of data obtained from light and electron 
microscopy, x-ray diffraction, optical rotation, and 
other physical and chemical techniques. 

Before passing on to a consideration of the individual 
muscle proteins, it seems appropriate to review briefly 
some aspects of the progress which has been made since 
Engelhardt and Ljubimova demonstrated the ATPase
activity of “myosin.” Later, Szent-Györgyi and his 
collaborators showed that myosin is a complex of two 
proteins, actin and L-myosin, and that the association 
of these components is influenced by the presence of 
ATP. This discovery led to an extended period in
which the mechanical and other properties of acto- 
myosin threads were intensively studied. The discovery 
and development of the glycerine-extracted model sys- 
tem provided further stimulation for the study of 
artificial model systems in general. Further progress in
narrowing the gap between intact muscle and the vari- 
ous extracted actomyosin systems was made by the 
discovery of substances able to influence contraction in
one way or another. Of these, perhaps the best-known
is the “relaxing factor” of Marsh. The more-refined
physicochemical methods developed in recent years
have been applied to advantage in characterizing the
various muscle proteins, but they have been relatively
unsuccessful in elucidating the molecular changes accompanying
contraction phenomena. Furthermore, it is
difficult to extrapolate from the behavior of macromolecules
in solution to their behavior in a complex
paracrystalline multicomponent system like the myofibril.
There exists an urgent need for the development
of techniques which will allow the determination of
configurational changes both at the molecular and
macromolecular levels during contraction in intact
muscle. X-ray diffraction is the only method currently
applicable to this problem, and its application is fraught
with very great technical difficulties, stemming largely
from the short duration of a twitch and from the very
high source intensities required for an adequate diffraction
record at low angles. The application of optical-rotation
methods appears to be an attractive possibility
provided that the physical difficulties arising from the
small dimensions and high degree of orientation of the
contractile components can be overcome.

* This paper contains descriptions of original work by the author
(on tropomyosin and paramyosin) which was aided by a research
grant, E-1469 (C1), from the National Institute of Allergy and
Infectious Diseases, National Institutes of Health, Public Health
Service, U. S. Department of Health, Education, and Welfare.

Following the pioneering work of Astbury in classifying
the fibrous proteins into the α class (k-m-e-f group,
now known to possess the α-helical configuration),
and other classes which need not be considered here,
Bear showed that the x-ray diffraction patterns of
various types of muscle fall into two distinct groups as 
judged by their low-angle diffraction patterns: (1) the 
Type I or paramyosin pattern corresponding to an axial 
repeat of 725 A found only in certain invertebrate 
muscles possessing the so-called “catch mechanism”; 
and (2) the Type II diffractions corresponding to axial 
spacing of about 400 A which are found in all muscles. 
As becomes evident later, this classification based on 
x-ray spacings is probably far more reliable as a criterion 
than some of the biochemical properties such as solu- 
bility and amino-acid composition, which have been 
employed in recent years as bases for nomenclature of 
the muscle proteins. At first, it was thought that this 
400-A periodicity arose from a single component. 
However, as is seen, x-ray diffraction studies on purified 
components of striated muscle indicate that most, if 
not all of them, exhibit periodicities of about 400 A. 
Furthermore, the macromolecules of many of these 
Type II components have lengths of about this magni- 
tude, and electron micrographs of myofibrils frequently 
show an axial spacing of similar dimension. It seems 
very likely, therefore, that, in the myofibril, the various
[...] reactive sites repeating at distances of [...]
intermolecular rearrangement is an important part of [...].
However, the results obtained so far do not rule out the
possibility that configurational changes are involved as
well; the results of detailed correlative x-ray diffraction
and other measurements must be awaited before any
definite conclusions can be drawn, and in any case, the
results obtained with striated muscle are not necessarily
applicable to all types of contractile systems.

FIG. 1. Crystal of LMM showing the sharp axial banding observed in the electron microscope in the absence of an “electron stain.” The dense transverse striations spaced about 420 A apart are probably owing to binding of salts by the end regions of the LMM molecules [from D. E. Philpott and A. G. Szent-Györgyi, Biochim. et Biophys. Acta 15, 165 (1954)]. ×200 000.

FIG. 2. Electron micrograph of a myofibril isolated from formalin-fixed toad muscle and stained with phosphomolybdic acid, showing the regular 400-A cross-striation in the form of fine, transversely oriented, dense lines [from M. H. Draper and A. J. Hodge, Australian J. Exptl. Biol. Med. Soc. 27, 465 (1949)]. ×64 000.


ACTIN 


This component of the actomyosin complex was isolated
by Straub. It appears to be a necessary component
of the contractile system in muscle and comprises
15 to 20% of the structural proteins in rabbit
striated muscle. An outstanding characteristic of actin
is its ability to undergo polymerization (the G-F
transformation) to form well-defined fibrous elements, 
a process accompanied by the dephosphorylation of 
strongly bound ATP. The monomeric form of actin is 
one of the most difficult muscle proteins to characterize 
from a physicochemical point of view since the presence 
of salt favors the formation of the fibrous form. How- 
ever, it seems likely that the monomeric unit has a 
molecular weight of about 60 000 (Table I, from work 
by Cohen and A. G. Szent-Györgyi) with physical
dimensions rather more uncertainly defined. The work
of Astbury and of Selby and Bear indicates that
actin exhibits an axial-repeat period of about 400 A in
conformity with the other proteins of vertebrate striated
muscle and of smooth muscle, both vertebrate and
invertebrate. Actin has a pronounced tendency for
complex formation with myosin, but is devoid of enzymatic
properties.


TABLE I. Dimensions of fibrous-muscle proteins [from C. Cohen
and A. G. Szent-Györgyi, paper read at International Biochemistry
Congress, Vienna (1958)].

Protein        M.W.               Est. length (A)   Est. axial ratio
Actin          57 000-74 000      290 ?             12 ?
Myosin         450 000            1600              50
HMM            230 000-330 000    400               15-20
LMM            96 000-140 000     550               30-40
LMM fr. 1      110 000-120 000                      30-40
Tropomyosin    53 000             385               25
Paramyosin     134 000            1400              80
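
The length and axial-ratio columns of Table I can be combined into two rough derived quantities. The sketch below is only an illustration under stated assumptions: it treats each molecule as a rigid rod, so that the axial ratio is simply length divided by diameter (hydrodynamic axial ratios refer to an equivalent ellipsoid, so the diameters are order-of-magnitude figures), and it compares each length with the 400-A period discussed in the text. Only rows with single-valued entries are used; the function and variable names are hypothetical.

```python
# Single-valued rows of Table I: name -> (estimated length in A, estimated axial ratio).
TABLE_I = {
    "Myosin": (1600.0, 50.0),
    "Tropomyosin": (385.0, 25.0),
    "Paramyosin": (1400.0, 80.0),
}

# Axial period of the Type II system quoted in the text, in A.
TYPE_II_PERIOD_A = 400.0


def rod_diameter(length_a, axial_ratio):
    """Diameter in A of a rigid rod, assuming axial ratio = length / diameter."""
    return length_a / axial_ratio


def periods_spanned(length_a, period_a=TYPE_II_PERIOD_A):
    """Number of axial periods covered by a molecule of the given length."""
    return length_a / period_a


if __name__ == "__main__":
    for name, (length_a, ratio) in TABLE_I.items():
        print(
            f"{name:12s} diameter ~ {rod_diameter(length_a, ratio):5.1f} A, "
            f"spans {periods_spanned(length_a):.2f} periods of 400 A"
        )
```

Under these assumptions the myosin length corresponds to four 400-A periods and the tropomyosin length to just under one, which is the kind of dimensional agreement the text appeals to when relating molecular lengths to the Type II period.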


MYOSIN 


Myosin is the major constituent of most types of 
muscle, e.g., it accounts for 55 to 60% of the structural 
proteins of rabbit skeletal muscle. It has ATPase ac- 
tivity which is calcium activated and magnesium in- 
hibited. It is a globulin, precipitating at low ionic 
strength, and can be extracted from muscle at neutral 
pH with solutions of ionic strength higher than about 
0.6. Characterization of the myosin molecule has been 
a subject of controversy for many years, but there 
seems now to be general agreement that the molecule 
has a length of about 1600 A, an axial ratio of about 50, 
and a molecular weight of about 420 000 (Table I). 

Brief exposure of myosin solutions to proteolytic 
enzymes such as trypsin, chymotrypsin, and subtilizin
degrades the myosin molecule into two fairly
well-defined components, heavy meromyosin (HMM) 
and light meromyosin (LMM). The well-known proper- 
ties of the myosin molecule are divided between these 
fragments. LMM behaves physically very much as does 
myosin. It precipitates at low ionic strength and has a 
molecular weight of about 100 000. However, it has no 
ATPase activity and does not combine with actin. On 
the other hand, HMM has all of the ATPase activity 
of the myosin from which it was derived, combines 
with actin in the same proportions as does myosin, but 
differs from myosin in being soluble in solutions of low 
ionic strength. Reported molecular weights range from 
230 000 to 320 000 (Table I). HMM, like myosin, does 
not form ordered precipitates. LMM, on the other hand, 
under appropriate conditions, forms highly ordered 
needle-shaped crystals exhibiting a very striking axial 
period of about 420 A in the electron microscope, even in 
the unstained condition, as shown in Fig. 1 from work by 
Philpott and Szent-Györgyi. It seems very likely that
the fine cross-striations evident in these crystals arise
from a selective binding of inorganic ions at specific
sites located about 400 A apart (possibly at the ends of
the LMM “molecules”). This result is in good agreement
with the fact that isolated myofibrils, when
shadow-cast or stained with phosphotungstic acid
(PTA) or phosphomolybdic acid, exhibit a sharp and
regular cross-striation (Figs. 2 and 3) with an axial
period of about 400 A in the relaxed condition, and it
agrees with the evidence obtained by Draper and
Hodge concerning the distribution of bound mineral
in myofibrils after electron-induced microincineration
in the electron microscope.† These authors found that
the mineral residue from well-washed myofibrils obtained
from formalin-fixed muscles is present as fine
cross-striations with an axial spacing of about 400 A
(Fig. 4), a result suggesting that the various inorganic
and organic components of muscle interact one with
another by means of reactive sites or groups located
about 400 A apart. Further evidence for this view is
provided by the regular cross striations observed in thin
sections of muscle (Fig. 5). There is, in addition, some
electron-microscopic evidence to suggest that this fundamental
period may shorten during contraction, a result
which, if borne out by further investigation, strongly
suggests that contraction must involve a configurational
change in at least some of the macromolecular components
of the myofibril. On the other hand, the sliding
filament model, in which contraction is achieved by
progressive interdigitation of actin and myosin filaments,
does not require such configurational changes,
and it should be noted that no changes have so far been
observed in the wide-angle x-ray pattern of muscle during
contraction. However, such configurational changes
would be difficult to detect if, as seems possible, a
sequential shortening of small contractile units (perhaps
400 A long) is involved in the mechanism of contraction.

† [...] provides a rationale for the microincineration of organic material induced by high beam intensities in [...]

FIG. 3. Semicontracted myofibril of toad muscle, shadowed with platinum, showing the regular axial period apparently associated with the presence of transverse bridges (arrow) [from M. H. Draper and A. J. Hodge, Australian J. Exptl. Biol. Med. Soc. 27, 465 (1949)]. ×45 000.

FIG. 4. Myofibril after electron-induced microincineration in the electron microscope to illustrate the distribution of the [...] within the sarcomere in [...] regularly spaced, fine transverse striations spaced about 400 A apart [from M. H. Draper and A. J. Hodge, Nature 163, 576 (1949)].

FIG. 5. Thin longitudinal section of rabbit muscle fixed in buffered OsO4 and stained with PTA, showing the regular cross-striation present in all bands of the sarcomere [from A. J. Hodge, H. E. Huxley, and D. Spiro, J. Exptl. Med. 99, 201 (1954)]. ×40 000.

A number of recent experimental investigations point 
toward the rather disquieting possibility that the myosin 
“molecule” may be an artifact of the extraction methods 
used to isolate this protein. Thus, although it has so far 
proved necessary to employ proteolytic enzymes such 
as trypsin, chymotrypsin, and subtilizin in order to ob- 
tain the meromyosins from preparations of myosin, 
there is as yet no convincing evidence to indicate that 
the splitting of peptide bonds is necessary for the
liberation of the meromyosins from the parent macromolecule.
Middlebrook was unable to demonstrate
the presence of C-terminal groups in the ratios expected
from the known specificities of trypsin and
chymotrypsin. Furthermore, the degradation of the
meromyosins by these enzymes is a much slower process
than their production from the parent macromolecule,
a result indicating the presence of a highly sensitive
region within the myosin molecule, and the results
of turnover-rate studies indicate a metabolic independence
of the two meromyosins. Thus, Velick has
shown that the turnover rate of phenylalanine is about
five times more rapid in LMM than in HMM. Similarly,
Schapira et al. have observed that the incorporation
of glycine is more rapid for LMM than for
HMM. These results, taken at face value, may be interpreted
as indicating that the meromyosins represent
precursors of the myosin macromolecule. However,
there is also evidence derived from careful experiments
with fluorescent antibodies to suggest that at least a
part of the meromyosins in the myofibril are distributed
independently of each other and in a highly characteristic
pattern. This is discussed later. If nothing else,
the sum total of the foregoing results must sow a seed
of doubt concerning the almost universally accepted
assumption that myosin per se is a well-defined macromolecular
species which by some relatively simple
interaction with other species such as actin is able to
bring about the phenomenon of contraction.

FIG. 6. Electron micrograph of a crystal of rabbit tropomyosin deposited on a supporting film and stained with PTA in pH 4.2 phosphate buffer. The main spacing is about 200 A, and a definite intermediate line is present. ×125 000.

FIG. 7. Crystal of rabbit tropomyosin, stained with PTA, showing the crossed-grid appearance characteristic of many of these crystals in electron microscopic preparations. ×70 000.

FIG. 8. Rabbit tropomyosin crystal from an ammonium-sulfate suspension mounted on the supporting film without washing. [...] lines of high density probably correspond to regions where salt is accumulated. ×135 000.


TROPOMYOSIN


This protein component, which is universally distributed
in small concentrations in a wide variety of
muscles, was discovered by Bailey and is remarkable
in that it is the first and probably the only fibrous
protein thus far obtained in a truly three-dimensional
crystalline form. The crystals of tropomyosin are platelike
and have an unusually high degree of hydration
(80 to 90%). This latter property probably results from
the fact that tropomyosin, unlike many of the fibrous
proteins of muscle (which tend to form one-dimensional
paracrystalline arrays), exhibits a strong tendency for
the formation of three-dimensional crossed-grid networks.
A crystal of tropomyosin can, in fact, be regarded
as a very highly ordered gel. Tropomyosin has no enzy- 
matic activity and, as its name implies, was at one time 
thought to be a precursor of myosin itself. However, 
there is no substantial evidence for this view, and 
indeed, the role played by tropomyosin in muscle func- 
tion remains a subject for speculation. 

Tropomyosin, the water-soluble component originally
defined by Bailey, appears to be present in small 
amounts in all muscles (e.g., it accounts for about 4 to 
5% of the fibrous proteins of rabbit striated muscle) 
and is present even in those invertebrate muscles that 
exhibit Type I low-angle diffraction patterns (i.e., those 
that have paramyosin as well as the actomyosin system). 
It is of interest to note that the tropomyosin molecule 
with a molecular weight of about 53 000 (Table I), has 
a length of about 400 A, as estimated from physico- 
chemical evidence, a result in keeping with the lengths 
of all of the other components and subunits in the 
Type II system so far described. 

Examination of tropomyosin crystals in the electron 
microscope in this laboratory has proved instructive and 
the spacings observed appear compatible with physico- 
chemical estimates of the molecular length. A correla- 
tion with low-angle x-ray diffraction is currently being 
carried out. The results obtained when a preparation
of small tropomyosin crystals is placed on a grid, stained 
with buffered phosphotungstic acid (pH 4.2) or allowed 
to dry without staining are illustrated in Figs. 6-9. A 
frequently observed pattern is one comprising striations 
spaced about 200 A apart (often with intraperiod lines) 
or a crossed-grid network with periods of the same mag- 
nitude in two directions (Figs. 6 and 7). In unstained 
crystals (Fig. 8), the periodic structure appears rather 
as a pattern of dots, strongly suggesting that inorganic 
ions are occluded or bound at, or near, the sites of inter- 
action of the long thread-like macromolecules arranged 
in an ordered gel structure. The square two-dimensional 
pattern with spacings of about 400 A (Fig. 9) affords 
yet another example of the remarkable capacity of 
tropomyosin to form open, ordered network structures, 
a property which may well have some significance in 
relation to the transverse bridges known to connect the 
longitudinal filamentous elements in striated muscle at 
regular intervals along the fiber axis. The regular out- 
lines of the tropomyosin crystals become evident if 
they are fixed, embedded, and sectioned by conventional 
procedures (Figs. 10 and 11). However, dimensional 
analysis is rather more uncertain than in the cases 
already mentioned, since it is technically difficult to 
define the plane of the section in relation to a particular 
crystallographic plane. Nevertheless, the three-dimensional
ordered net structure of the crystals is often
clearly discernible (Fig. 12). This type of net structure 
is reminiscent of that observed in collagenous tissues 
such as Descemet’s membrane’? and in the pharyngeal 
structures of certain protozoa.® It may represent an 
important structural principle in ordered biological 
systems and may not be irrelevant to the process of
contractility itself.

FIG. 9. Open, two-dimensional net structure observed in a
rabbit tropomyosin crystal preparation. The spacing of about
400 A corresponds to the length of the tropomyosin molecule as
determined by physicochemical techniques (Table I). X75 000.


As has been seen, there exists a rationale for an
orderly interaction of the muscle components so far
described in terms either of their lengths being about
400 A or of there being subunits of this length. It seems
clear, therefore, that the total x-ray pattern (both
meridional and equatorial) of whole muscle must contain
information concerning the precise state of interaction
of the macromolecular components for any
given state of the contractile system, and offers the
possibility of following the physical changes accompanying
contraction. However, as already noted, there
are severe technical problems to be overcome in any
such investigation.

FIG. 10. Electron micrograph of a thin section of rabbit tropomyosin crystals, fixed in buffered osmium-tetroxide solution, stained
with PTA, and embedded in n-butyl methacrylate. Note the regular outlines of the crystals and the precise cross-striation corresponding
to a spacing of about 200 A. X20 000.

PARAMYOSIN
The term paramyosin was introduced by Hall et al.
to describe a major component of certain specialized
molluscan and annelid muscles (Table II),*® which
possess the ability to maintain a rigor-like contracture
for long periods of time and which were described in
the older literature as having a “catch mechanism.” It
is this protein that is predominantly responsible for the
Type I low-angle diffraction pattern obtained by Bear*
from these muscles, a highly characteristic pattern
comprising reflections corresponding to a fundamental
period of 725 A, with every fifth reflection highly
accentuated. The pattern can be interpreted formally
(Bear and Selby‘8) in terms of a net structure of the kind
shown in Fig. 13. “Paramyosin fibrils,” which can be
isolated easily in native form from the adductor
muscles of certain marine mollusks such as the clam,
Venus mercenaria, exhibit a characteristic band structure
with a period of 145 A in the electron microscope,
together with a spot pattern, the symmetry of which is 
such that the true repeat period of this composite 
pattern is 5X145 A=725 A (Fig. 13). Since this spot 


TABLE II. Relative intensities of the myosin and paramyosin
x-ray diffraction systems exhibited by various muscles [from F. O.
Schmitt, R. S. Bear, C. E. Hall, and M. A. Jakus, Ann. N. Y.
Acad. Sci. 47, 799 (1947)].a


Muscle                    Myosin diffractions    Paramyosin diffractions
Mytilus adductor ae +4444 
Venus adductor, w ++ HHHH 
Anodonta adductor, w ++ +++ 
Anodonta adductor, t +++ +++ 
Mya adductor ALALE Fedek 
Venus adductor, t +++- ms 
Pecten adductor, w +-+ HF 
Phascolosoma retractor + + 
Dog retractor penis ++ 
Pecten adductor, t dL 
Thyone retractor + 
Frog sartorius ate 


a Rough visual estimates of the relative intensities of the myosin and 
paramyosin diffractions on the patterns of the various muscles are indicated 
by +; w and t refer, respectively, to the “white” and “tinted” components 
of the muscles which possess them. 
FIG. 11. Higher magnification view of two tropomyosin crystals in the same preparation as Fig. 10, showing
the regular open-network lattice characteristic of these crystals. X95 000.


pattern is frequently absent in these native isolated 
fibrils, and because the major component of these fibrils 
can be extracted and reconstituted into fibrils in which 
no spots are visible but which exhibit band patterns 
with fundamental repeat periods of 725 A (Fig. 16), it 
seems likely that the spots reflect the presence of one or 
more additional components within the fibrils. In any 
case, it is proposed here to restrict the term paramyosin 
to the major component of these native fibrils. A com- 
posite structure for the native paramyosin fibrils is 
further indicated by some recent work*’ in which trans- 
verse sections of adductor muscle show that the fibrils 
are built up of a number of thin layers. 
In 1952, the author**” found that paramyosin could
be quantitatively precipitated from acid solutions in
the form of ordered tactoidal fibrils with an axial period
of about 1400 A and with symmetrical intraperiod band
structure (Fig. 14), rather than the “polarized” band
structure characteristic of the native 725-A period. At
that time, Schmitt et al. were in the process of analyzing
the band structures of the newly discovered segment
long-spacing (SLS) and fibrous long-spacing (FLS)
forms derived from soluble collagen (asymmetric and
symmetric band patterns, respectively),« and had deduced
that the collagen macromolecule was about
four times the length of the axial period
FIG. 12. Thin section of a rabbit tropomyosin crystal showing the characteristic open-network lattice. This type of
structure probably accounts for the very high degree of hydration of the tropomyosin crystals. X190 000.


displayed by native-collagen fibrils. According to their 
concept, the SLS forms arose by a parallel packing of 
the macromolecules with like ends in register, the FLS 
by an antiparallel packing, thus accounting for the 
symmetrical band structure of FLS. Consequently, by 
analogy with the collagen picture, it seemed likely that 
the macromolecules of paramyosin were about 1400 A 
long, and that they were packing in antiparallel array 
to form an FLS-type banded structure. This prediction 
was borne out*®-#? by the results of sedimentation, diffu- 
sion, viscosity, and light-scattering investigations, which 
were necessarily rather crude because of the low ionic 
strengths required to keep the protein in solution under 
these acid conditions. However, the data were sufficiently
good to indicate the presence in the solutions of
particles with lengths between 1200 and 1500 A and
with axial ratios of about 70. Thus, the combined electron-microscopical
and physicochemical evidence established
the length of the paramyosin macromolecule (or
at least of the kinetic unit in solution) with some certainty
as being about 1400 A. The more recent and
more accurate measurements of Kay,” carried out on
paramyosin in neutral solution at high ionic strength,
appear to confirm this value for the length of the paramyosin
macromolecule.
Bailey*! obtained paramyosin in purified form, and,
on the basis of similarities in the amino-acid compositions
of tropomyosin and paramyosin, was led to call
the latter “insoluble tropomyosin.”*? Similarly, Kominz
et al.* refer to paramyosin as “tropomyosin A” to distinguish
it from “tropomyosin B” (i.e., the original
tropomyosin of Bailey). This profusion of nomenclature
seems unwarranted, and it would seem advisable at
the present time to retain the term paramyosin, since
there exist definite differences in solubility properties,
crystalline form, and x-ray and electron-microscopic
periodicities which clearly differentiate this protein from
the original water-soluble tropomyosins of Bailey. Paramyosin
has no enzymatic activity, according to Szent-Györgyi,**
does not combine with actin, and there is as
yet no evidence to indicate a possible function for it in 
relation to the specialized properties of the “catch” 
muscles. 

A remarkable variety of fibrous structures can be 
formed from paramyosin solutions under various con- 
ditions of pH and ionic strength. Those so far obtained 
include fibrils with axial periods of ca 70, 145, 725, and 
1800 A. (This last, together with one of 360 A, was 
first observed by Locker and Schmitt.®*) In the case of 
collagen, it can be shown that the native type, 700-A 
FIG. 13. Native “paramyosin fibrils” isolated from the adductor muscles of Venus mercenaria in dilute salt solutions. These usually show
(a) a transverse cross-striation with an apparent period of 145 A,‘8 or (b) a pattern consisting of the 145-A cross-striation and a superposed
dot pattern of symmetry such that the true repeat period is 5X145 A=725 A [from F. O. Schmitt, R. S. Bear, C. E. Hall, and
M. A. Jakus, Ann. N. Y. Acad. Sci. 47, 799 (1947)]. (c) shows the node pattern derived from electron micrographs. This formal net
structure is in good agreement with the results of low-angle x-ray diffraction studies [from C. E. Hall, M. A. Jakus, and F. O. Schmitt,
J. Appl. Phys. 16, 459 (1945)]. (a) and (b) X190 000.


periodicity can be obtained by a parallel packing in 
which the tropocollagen macromolecules (2800 A long) 
are staggered with respect to one another by one-quarter 
of their length. The key to such a synthesis of band 
structure is the availability of the SLS band structure, 
which is in effect a “fingerprint” of the tropocollagen 
macromolecule. Attempts to produce segment-type 
structures from paramyosin solutions have so far proved 
unsuccessful, but it seems likely that parallel, anti- 
parallel, and staggered arrangements involving axial 
displacements of one-tenth the molecular length (i.e., 
1400 A/10) or integral multiples thereof could explain 
the various band patterns encountered thus far (Figs. 
13-18). A feature of interest in relation to these band 
patterns is the frequent occurrence of transitions from 
one type of packing to another. Figure 18 clearly shows 
such a transition from the 1800-A type to the 145-A 
spacing, Fig. 17 a rhythmic transition from the 145-A 
spacing to a ca 70-A spacing and vice versa. 
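
The packing arithmetic in this paragraph can be checked with a short sketch. It assumes only the figures quoted in the text: a tropocollagen length of 2800 A with a one-quarter stagger, and a paramyosin length of 1400 A with a proposed one-tenth displacement. The function name `stagger_period` and the dictionary of ratios are illustrative and not part of the original work; the ratios are printed for comparison and are not offered as a fit.

```python
# Illustrative arithmetic only. Names and structure are assumptions;
# the numerical values are those quoted in the text.

def stagger_period(molecular_length_A, fraction):
    """Axial displacement produced by staggering rods by a fraction of their length."""
    return molecular_length_A * fraction


collagen_period = stagger_period(2800, 1 / 4)    # quarter stagger of tropocollagen
paramyosin_unit = stagger_period(1400, 1 / 10)   # proposed one-tenth displacement

# Axial periods (in A) mentioned in the text for paramyosin fibrils.
observed_periods_A = [70, 145, 725, 1400, 1800]

# Express each observed period in units of the proposed displacement.
ratios = {period: period / paramyosin_unit for period in observed_periods_A}

if __name__ == "__main__":
    print(f"collagen quarter-stagger period: {collagen_period:.0f} A")
    print(f"paramyosin displacement unit:    {paramyosin_unit:.0f} A")
    print(f"5 x 145 A = {5 * 145} A")
    for period, ratio in ratios.items():
        print(f"{period:5d} A = {ratio:.2f} x unit")
```

Ratios close to whole numbers are the ones that a simple integral-multiple scheme would account for directly; the remainder would need a separate explanation.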
X-RAY DIFFRACTION AND ROTATORY
DISPERSION


All of the proteins considered up to this point, with
the exception of actin, give wide-angle x-ray diffraction
patterns in which the prominent feature is the 5.1-A
meridional reflection, indicating that at least a significant
proportion of the polypeptide chains possess the
α-helical configuration observed by Pauling and Corey.
This is true also of the meromyosins,"' LMM giving a
good pattern, with HMM being more variable since it
is rather easily denatured. However, the x-ray diffraction
method does not lend itself to estimation of the
noncrystalline but helical parts of the molecule. The
helix content of proteins in aqueous solution can best
be estimated at the present time by rotatory-dispersion
measurements, and the equation by Moffitt®* describing
the rotatory dispersion of the α-helix can be used for
this purpose with the muscle proteins, subject to certain
theoretical limitations, and on the assumption that the
entire helical content is in the right-handed α-helical
configuration. Table III, from the work of Cohen and
Szent-Györgyi,!' shows the estimated α-helix content in
aqueous solutions of the muscle proteins and fragments 
under discussion. It can be seen from this table also 
that the extent of the helical regions seems to be de- 
pendent upon the proline content. The proline contents 
of the highly helical muscle components (LMM fr. I, 
tropomyosin, and paramyosin) are very low, while 
those of myosin and HMM (which exhibit the lowest 
helix content) are correspondingly high. On a statistical 
basis, the data indicate that each proline residue pre- 


TABLE III. Helix content and proline and cystine concentration
of muscle proteins [from C. Cohen and A. G. Szent-Györgyi, paper
read at International Biochemistry Congress, Vienna (1958)].

Wt. % helix    Equ. cystine in 10^5 g    Wt. % proline    No. of nonhelical residues per proline residue
FIG. 14. Fibrils reconstituted from an acid solution of paramyosin
showing an axial repeat period of about 1400 A. Note that the
band structure within each repeat period is arranged symmetrically.
This is interpreted as arising from an antiparallel packing of
rod-like molecules about 1400 A long [from A. J. Hodge, Proc.
Natl. Acad. Sci. U. S. 38, 850 (1952)]. X70 000.


vents about 20 other residues from participating in 
α-helix formation.
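
The last column of Table III (nonhelical residues per proline residue) can be related to the weight-percent columns by a simple ratio. The sketch below is one plausible way to form that ratio and is not claimed to be the procedure used by Cohen and Szent-Györgyi. The function name, the mean residue weight of 115, and the proline residue weight of 97.1 are assumptions introduced here for illustration.

```python
# A minimal sketch, assuming helix and proline contents are given as weight
# percentages and that a single mean residue weight applies to the whole protein.
# The default residue weights below are assumptions, not values from the source.

def nonhelical_residues_per_proline(helix_wt_pct, proline_wt_pct,
                                    mean_residue_weight=115.0,
                                    proline_residue_weight=97.1):
    """Estimate the number of nonhelical residues per proline residue."""
    if proline_wt_pct <= 0:
        raise ValueError("proline content must be positive")
    grams = 1.0e5  # basis of 10^5 g of protein, as in the table heading
    nonhelical_residues = (100.0 - helix_wt_pct) / 100.0 * grams / mean_residue_weight
    proline_residues = proline_wt_pct / 100.0 * grams / proline_residue_weight
    return nonhelical_residues / proline_residues


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the function.
    print(nonhelical_residues_per_proline(helix_wt_pct=50.0, proline_wt_pct=2.0))
```

The count returned here includes the proline residues themselves, so the number of "other" residues excluded per proline would be one less.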


LOCALIZATION OF THE MUSCLE PROTEINS 
IN THE MYOFIBRIL 


The commonly accepted distribution of actin, myosin, 
and tropomyosin in striated muscle as deduced from 
extraction experiments of a more or less selective nature 
is shown in Fig. 19, reproduced from a review by Perry.” 
However, there are indications that the picture is not 
quite so simple, especially in relation to the distribution 
of myosin and to the changes in the density distribution 
within the sarcomere accompanying shortening. Accord- 
ing to the concept of Huxley and Hanson,’ the actin 
and myosin are in the form of two separate sets of 
interdigitating filaments. Shortening is accomplished 
not by contraction of the filaments themselves, but 
rather by a mutual sliding action. As the I bands disappear,
the ends of the myosin filaments pile up against
the Z band to produce the well-known contraction
bands (Cz bands). However, there are many indications
in the literature (e.g., Hodge®’) which suggest that at 
least some of the material contributing to A-band 
density in the relaxed myofibril is capable of actual 
migration to the Z bands (and possibly to the M bands) 


FIG. 15. Electron micrograph of a large flat paramyosin “crystal” obtained by reducing the ionic strength of an
approximately neutral paramyosin solution, stained with PTA. The axial period is about 145 A. X190 000.


independently of any sliding of formed myosin elements. 
Possibly, such substances act as moderators, activators, 
or inhibitors in controlling the orderly interaction of the 
actin, myosin, and other macromolecular components. 

There is evidence to indicate that the myosin content 
of the A band is not homogeneous with respect to its 
extractibility. Thus, on extraction with Guba-Straub- 
ATP solution, a band of appreciable density remains on 
both sides of the M band® (Fig. 20), which can be 
removed only by further extraction with a pyrophos- 
phate solution.” Furthermore, it has been shown that 
appreciable amounts of components other than myosin
are extracted from the A band by these procedures.*
The distribution of birefringence in the sarcomere as
found by Inoué and Szent-Györgyi™ using high-resolution
polarization-optical methods coupled with extraction
and replating experiments is also at variance with
the simple distribution of actin and myosin as postu- 
lated by Huxley and Hanson. 

The polarization-optical results are more in harmony 
with the very interesting results of Holtzer and 
Marshall® using the technique of “fluorescent antibody
analysis” to determine the localization of the various
muscle components. The distributions of actin and
myosin found with this technique are essentially in 
accord with that shown in Fig. 19. However, when using 
antibodies specific for LMM and HMM, the very 
curious result emerges that these two components ap- 
pear to be localized independently of each other within 
the A+ region of the sarcomere. The LMM appears 
to be located preferentially in the A bands proper, the 
HMM in a complementary pattern consisting of two 
bands, one on each side of and adjacent to the M band. 
More recently, Holtzer and Szent-Györgyi® have confirmed
these observations and demonstrated that there
is no physical hindrance to penetration of the antibody
into the myofibril, by carrying out experiments involving
extraction with KI solutions. In the controls,
this treatment removes most of the protein of the myofibril.
However, pretreatment of a myofibril with a
specific antibody prevents the extraction of the particular
protein under consideration. The independent but
somewhat overlapping distributions found for LMM
and HMM by the above methods have been confirmed
by Inoué and Szent-Györgyi, who exposed
myofibrils selectively extracted for myosin
to LMM and HMM solutions before polarization-optical
examination. They found birefringent
bands appearing in positions consistent with the fluorescent-antibody
results. This apparent independence
of at least some of the LMM and HMM within the
myofibril recalls the observations of metabolic independence
mentioned earlier, and raises the question of
whether or not some kind of interaction or reaction
of LMM with HMM is involved in the process of
contraction.

FIG. 16. Two fibrils from the same preparation as Fig. 15. The upper one (a) shows a simple period of about 145 A with a superposed
apparently isomorphous structure having a repeat of 725 A; the lower one (b) shows a more complex pattern in which the fundamental
repeat period is 5X145 A=725 A. Stained with PTA. Similar patterns have been observed by Hanson et al.5® X125 000.

PERSPECTIVES

The sliding filament model of muscle is based, to a
considerable extent, on the observation in the electron
microscope of two sets of filaments in the A bands,
identified by Huxley and Hanson as actin and myosin. 
However, these filaments have been observed, of neces- 
sity, only in thin sections of muscle subjected to the 
procedures involved in fixation, dehydration, and em- 
bedding, so that the original state in the living, resting 
muscle fiber is a matter of inference. The danger of 
extrapolation from such observations is illustrated by 
comparing the low-angle x-ray diffraction data with the 
results of electron microscopy for muscle fixed (a) in 
the fresh state and (b) after glycerination. Living muscle 
gives a low-angle equatorial x-ray diffraction pattern 
consisting of two reflections corresponding to rod-like 
elements about 450 A apart in a hexagonal array,” 
FIG. 19. Diagram illustrating the distribution of the protein
components of the myofibril within the sarcomere. The height
of the shaded areas represents the protein density at any point
along the myofibril axis [from S. V. Perry, Physiol. Rev. 36, 1
(1956)].


with a strong first order and with a relatively weak 
second order. If the ATP is removed from the muscle 
(i.e., in glycerinated muscle or muscle in rigor), a strik- 
ing reversal of the intensities of these two reflections 
takes place, the second order becoming very strong.*’ 
This indicates that, in the absence of ATP, a region of 
high electron density is present at a position inter- 
mediate between the primary rod-like elements, while, 
when ATP is present, the electron density between 
these elements is relatively uniform. The electron micro- 
scope results are at variance with this, for the two sets 
of filaments can be observed equally well in muscle 
fixed in the fresh state or after glycerination.*” It thus 
is necessary to conclude that one (or more) of the 
processes involved in fixation, dehydration, and em- 
bedding causes the muscle to go into a rigor-like state, 
and consequently, that the density distributions ob- 
served in electron micrographs do not correspond to 
those present in the living, resting muscle fiber. The 
results clearly demonstrate the need for a thorough 
correlation by x-ray diffraction of the procedures used 
in the preparation of specimens for the electron-micro- 
scopic examination of muscle. 
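
The geometry behind the two equatorial reflections can be sketched numerically. The example assumes, beyond what the text states, that the reflections index as the 1,0 and 1,1 planes of a two-dimensional hexagonal lattice whose lattice constant equals the 450 A separation of the rod-like elements. The function name `hexagonal_d_spacing` is illustrative.

```python
import math

# A minimal sketch, assuming a two-dimensional hexagonal lattice of rods.
# Indexing the two observed reflections as (1,0) and (1,1) is an assumption.

def hexagonal_d_spacing(lattice_constant_A, h, k):
    """Interplanar spacing d(h,k) of a 2-D hexagonal lattice with constant a."""
    if h == 0 and k == 0:
        raise ValueError("(0,0) is not a lattice plane")
    return (math.sqrt(3.0) / 2.0) * lattice_constant_A / math.sqrt(h * h + h * k + k * k)


if __name__ == "__main__":
    a = 450.0  # rod-to-rod separation quoted in the text, in A
    for h, k in [(1, 0), (1, 1)]:
        print(f"d({h},{k}) = {hexagonal_d_spacing(a, h, k):.1f} A")
```

On this reading, extra density at positions lying between the primary rods would strengthen the 1,1 reflection relative to the 1,0 reflection, which is the direction of the intensity reversal described for muscle depleted of ATP.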

In the sliding-filament model, the two sets of filaments 
are postulated to remain at constant length on the basis 
of evidence derived largely from observations of A-band 
lengths during shortening. Thus, this mechanism does
not require configurational changes of the protein components,
either at the α-helix or at the macromolecular
level. No direct evidence on this point has been
reported. However, Huxley”? found no change in the
axial spacing measured from the low-angle x-ray diffraction
pattern on passive stretching of the muscle, a
result which has been interpreted as possibly indicating
that no change would accompany shortening. On the
other hand, some as yet inconclusive evidence from
electron-microscopical observations??? suggests that
there is a variation in the axial period both with the
sarcomere length and with the type of band pattern
observed during shortening. Such a variation in the
axial period would strongly imply the occurrence of
configurational changes during shortening. It is clear
that further evidence of a more direct nature is urgently
needed in order to settle this matter.

FIG. 20. Electron micrographs of thin sections through (a)
intact glycerinated fibril and (b) glycerinated fibril after extraction
with Guba-Straub-ATP solution [from J. Hanson and H. E.
Huxley, Nature 172, 530 (1953)].

As has been seen, the application of analytical and 
degradative methods has proved to be of great impor- 
tance in solving many of the problems associated with 
muscle structure and function. However, it would seem 
that the time is ripe for an approach to muscle which 
considers the whole machine as a complex mixed 
macromolecular “crystal,” capable of changes in physi- 
cal state in order to convert available chemical energy 
into mechanical work. Any completely adequate theory 
of contraction must be able to take into account the 
interactions between the many and varied components 
of the myofibril and other contractile units. 
THIS article discusses current thought on the
fundamental problem of muscular contraction, 
i.e., on the nature of the molecular transducer which 
converts chemical free energy into mechanical work. 
Structural observations (Bennett, p. 394; Inoué, p. 402; 
Hodge, p. 409) have greatly influenced this thought, 
but most current hypotheses are based to an even 
greater extent upon the discovery by Engelhardt and 
Ljubimova that the essential features of muscle action
can be reproduced in vitro, with fairly pure materials
extracted from muscle. This work provided such a
far-reaching simplification that many researchers have
been impelled to discard the system as a whole and to
concentrate on figuring out how the in vitro system
works. Fortunately, not everyone feels this way. 

The simplest system which shows muscle-like con- 
tractility is one wherein a thread spun from a protein 
extracted from muscle is suspended in a neutral medium 
of low ionic strength. If one places, in this medium, 
mM Mg++ and mM ATP, the thread contracts and does
external work. Reciprocally, the thread catalyzes the 
hydrolysis of ATP to ADP and P—an exergonic re- 
action! which is considered to “drive” many biological 
processes. The question is: How—in a molecular sense 
—are the hydrolysis and the shortening “coupled”? To 
give an impression of current thought on this question, 
some theories or “models” of the process are summa- 
rized and passing mention is made of the observations 
which inspired them. These particular models were 
chosen, in part because they are among the more reason- 
able ones, and in part because they were developed 
from ideas and techniques discussed by Doty (p. 107), 
Zimm (p. 123), and Rice (p. 69). 

In 1948, Kirkwood? suggested that the contractile 
element in muscle might be a flexible polyelectrolyte 
whose charge is modulated by phosphorylation of its 
serine residues by ATP. Actually, his was a scheme for 
relaxation rather than one for contraction, but his basic 
idea was all-important. At the time, Botts was making 
a thermoelastic analysis of threads spun from contractile 

protein, and she came to the conclusion that both the 
energy and the entropy decreased with increasing 
length.®:7 This is the behavior expected of a polyelec- 
trolyte bearing a net charge. Later, impressed by the 
high charge which ATP has in neutral solution, by the
essentiality of Mg++ for contraction, and by the fact
that it could bind so tightly to the protein as to
shift its isoelectric point (I.P.) beyond pH 9, we suggested
that the contractile element of a thread might
be a positively charged polyelectrolyte whose charge
is neutralized merely by the adsorption of ATP
in the first step of the ATPase process (Fig. 1).
This neutralization would lead then to contraction, 
because, in shortening, the element would—for vari- 
ous reasons—gain entropy. Such a proposition rele- 
gated the hydrolytic step of the ATPase process to a 
minor role, a feature that greatly provoked biochemists ; 
however, a conjoint thermodynamic analysis with Hill$, 
and later a statistical-mechanical investigation of our 
model by Hill? convinced us that the model is quite 
acceptable from the point of view of energetics. 

The model in question purported to be a model only 
of the contractile element in muscle. To make the model 
fit the facts about the in situ organization, other assumptions
had to be made. In 1946, Schmitt and his
associates! provided evidence that, in the sarcomeres 
of uncontracted muscle, the filamentous elements ap- 
parently were already extended. Since it was known! 
that these elements exert no tension in passively 
stretched muscle, I suggested" that, at rest, the con- 
tractile elements from either end of the sarcomere must 
interdigitate, but that cross-bonds between such ele- 
ments must remain absent until the moment of excita- 
tion. The main concern, however, is—and was—to see 
if the model of the contractile element is correct, for it 
is in that device that transduction occurs. In its favor, 
we and others have mobilized additional evidence of 
which the following is representative. 

Consider the Michaelis-Menten equation which re- 
lates the steady-state rate of substrate degradation to 
the concentration of substrate. It has been shown" that 
the enzymatic activity of the muscle protein system 
obeys this equation very well. This makes it possible 
to extract from experimental data the two Michaelis- 
Menten parameters. The one of interest here is the 
reciprocal Michaelis constant,

$$K = \frac{k_1}{k_{-1} + k_2},$$

in the scheme,

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \overset{k_2}{\longrightarrow} E + P.$$

There exist simple ways of ascertaining whether
$k_{-1} \gg k_2$; i.e., whether K can be considered identical
with the equilibrium constant of substrate-to-enzyme
binding. When this identity is legitimate,

$$\Delta F^{\circ} = -RT \ln K.$$
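
The relations just stated can be written out as a small numerical sketch. It assumes the usual one-intermediate scheme with rate constants k1, k-1, and k2, and it treats K as a binding constant only in the limit where k-1 greatly exceeds k2. The function names and the default temperature of 298.15 K are assumptions introduced here for illustration.

```python
import math

# A minimal sketch of the Michaelis-Menten relations discussed above.
# Function names and the default temperature are illustrative assumptions.

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def reciprocal_michaelis_constant(k1, k_minus1, k2):
    """K = k1 / (k_-1 + k2), the reciprocal of the Michaelis constant."""
    return k1 / (k_minus1 + k2)


def steady_state_rate(substrate, v_max, K):
    """Steady-state rate written in terms of the reciprocal Michaelis constant K."""
    return v_max * K * substrate / (1.0 + K * substrate)


def standard_free_energy_of_binding(K, temperature_K=298.15):
    """Delta F = -RT ln K, meaningful only when K approximates the binding constant."""
    return -R_GAS * temperature_K * math.log(K)


if __name__ == "__main__":
    # Hypothetical rate constants chosen so that k_-1 greatly exceeds k2.
    K = reciprocal_michaelis_constant(k1=1.0e6, k_minus1=100.0, k2=1.0)
    print(f"K = {K:.1f}")
    print(f"rate at [S] = 1/K: {steady_state_rate(1.0 / K, v_max=1.0, K=K):.3f} x Vmax")
    print(f"Delta F = {standard_free_energy_of_binding(K) / 1000.0:.1f} kJ/mol")
```

Comparing K values obtained in this way with mechanical measures of contractant effectiveness is the kind of correlation the following paragraph describes.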


Many factors are known which influence the effective- 
ness with which the ATP structure can bring about 
contraction; for example, variations in ionic strength, 
in [Mg++], [Ca++], substitutions in the ring (thus
generating other members of the nucleotide family),
or elimination of the ring (TPP). As a rule, these varia- 
tions also influence the ATPase activity, and, therefore, 
one can obtain also a set of K values from chemical 
measurements alone. When one compares mechanical 
measurements of contractant effectiveness with values 
of K, there is a fairly good correlation, but in many 
instances there is no correlation whatever with maxi- 
mum rate of ATPase activity (as there should be if the 
tension generator were “geared” to ATP hydrolysis). 
Although there are other indications of polyelectro- 
lyte behavior in contractile protein—for instance, the 
fact that anions totally unrelated to ATP, e.g., I, 
SCN-, Fe(CN)* 6, also bring about contraction! 1’— 
one additional experimental foundation of the model 
can be discussed briefly. To do this, and to prepare for 
the description of other models, the term “contractile 
protein” has to be defined. The best preparation, from 
the point of view of work performance, is obtained by 
extracting directly from muscle mince, according to a 
well-known recipe, a system referred to as “myosin-B.”
A contractile system, however, can be obtained also by
complexing two separately extracted proteins. One
protein is fibrous, exhibits ATPase activity, and is
called simply “myosin.” The other is globular, contains
bound nucleotide, and is called “actin.” The contractile
complex of the two is called “actomyosin.” For the
moment, the question is left open as to whether or not 
myosin-B is identical with actomyosin. Up to this point, 
myosin-B has been considered in its precipitated (e.g., 
thread) form, at low ionic strength. If the ionic strength 
FIG. 1. Elements of a polyelectrolyte model of contraction. It
is assumed that the molecular chain has some internal flexibility
owing to a free rotation at at least some points (a). In the absence
of other effects, the chain then assumes a coiled form which maximizes
its configurational entropy (b). However, if the chain bears
a net electrostatic charge, it will tend to extend out owing to
repulsion between charges. Mg-myosinate might be so extended
by a net positive charge in some of its regions (c). If so, adsorption
of quadrivalent anions of ATP in the first step (complex formation)
of the enzymatic hydrolysis would discharge the chain. An effectively
neutral chain (d) would shorten toward the configuration
of the unperturbed coil (a).
FIG. 2. Zimm plot of the light scattered from a myosin-B solution
without and with ATP. Extrapolation from the points obtained
at small angles gives an intercept which does not change
on ATP addition, thus indicating an essentially constant weight-average
molecular weight. ATP addition does increase the slope,
however, thus indicating (see text) that at least some particles
are “inflating.”
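
The reading of a zero-concentration Zimm plot described in this caption can be sketched as a straight-line fit. The example assumes the standard small-angle form Kc/R = (1/Mw)(1 + q^2 Rg^2 / 3) with q = (4 pi / lambda') sin(theta/2), where lambda' is the wavelength in the medium; this form, the function names, and the use of an unweighted least-squares line are assumptions introduced here and are not taken from the work being described.

```python
import math

# A minimal sketch of extracting a molecular weight and a radius of gyration
# from zero-concentration light-scattering data. The functional form and the
# names below are assumptions; only low-angle points should be supplied.

def fit_line(xs, ys):
    """Unweighted least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def zimm_zero_concentration(angles_deg, kc_over_r, wavelength_in_medium):
    """Return (Mw, Rg) from Kc/R plotted against sin^2(theta/2)."""
    xs = [math.sin(math.radians(angle) / 2.0) ** 2 for angle in angles_deg]
    intercept, slope = fit_line(xs, kc_over_r)
    molecular_weight = 1.0 / intercept              # reciprocal intercept
    q_factor = (4.0 * math.pi / wavelength_in_medium) ** 2
    rg_squared = 3.0 * slope / (intercept * q_factor)  # from slope/intercept
    return molecular_weight, math.sqrt(rg_squared)
```

With such a fit, an unchanged intercept together with a larger slope after ATP addition corresponds to a constant weight-average molecular weight and a larger average particle dimension, which is the interpretation offered in the caption. The radius is returned in the same length unit as the wavelength.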


is raised to a few tenths molar, myosin-B dissolves, only 
then becoming accessible to the traditional tools of 
physical biochemistry. Electrophoretic examination 
shows that in such a state the myosin-B particles are 
anionic. The dissolved system still responds to ATP, by 
reducing its viscosity, and by lowering its optical tur- 
bidity, etc. It has been thought generally that, under 
these conditions, the underlying structural change is a 
dissociation of particles into actin and myosin. On this 
account, the interpretation which we placed on our 
early light-scattering experiments'7—that the myosin-B 
particles were inflating on ATP adsorption—met with 
something less than global enthusiasm. New investi- 
gations!®: were carried out in our laboratory, and also 
while guests in that of Schachman; this new work was 
sparked principally by Gellert and von Hippel. a 
Figure 2, from this work, shows a “Zimm plot” 
(cf. Doty, p. 61) of data obtained on pristine protein 
using the best techniques possible. Only the zero-con- 
centration points are shown because there is [...] 
evidence that the second virial coefficient is [...]. 
On a plot of this type, the reciprocal intercept [gives the] 
weight-average molecular weight (Mw) and [the] 
initial slope gives a higher order average of [the square] 
of a particle dimension. Accepting this data [at face] 

Fig. 3 (annotations on plot). Final Mm for myosin-B = (4.2 ± 0.2) × 10⁵; average Mm for myosin-A = (4.26 ± 0.09) × 10⁵. 

Fig. 3. Molecular weight of myosin (lower curve) and 5-hr 
extracted myosin-B (upper curve), measured at the meniscus by 
the “Archibald” approach to sedimentation equilibrium tech- 
nique, as a function of time of sedimentation at 4197 rpm and 
ca 5°C. Initial protein concentrations, 0.5 g/100 ml. 


value, it is concluded that the addition of a saturating 
amount of ATP to this system inflates particles at con- 
stant molecular weight. (The Yang plot, which is more 
sensitive to changes in Mw, leads to the same conclu- 
sion.) A similar result is obtained using OH⁻ instead of 
ATP. Thus, when the protein is in the precipitated 
form and is cationic, the addition of ATP anions causes 
contraction; but, when it is anionic, the addition of 
ATP anions causes inflation. I feel that this constitutes 
some evidence for the idea that the interaction of ATP 
with this protein is that of an ion and of a polyelectro- 
lyte. However, conclusions drawn from the light-scat- 
tering behavior of a polydisperse system can be very 
tenuous. It could be, for example, that the solution 
contains large ATP-inert particles which dominate the 
scattering at small angles, and smaller particles which 
simply dissociate (say, into myosin and actin), but 
whose influence is felt only at larger angles. This ac- 
tually is suggested in our extrapolations from high-angle 
measurements. If the amounts and dimensions of these 
two hypothetical classes of particles were just right, the 
surviving heavy fraction might be so enriched with inert 
particles that its average dimension might appear to 
increase, even if the particles themselves are not inflat- 
ing. To block this possibility some excursions in the 
ultracentrifuge were necessary. 
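
The reading of a Zimm plot described above can be made concrete with a short numerical sketch. It assumes the usual zero-concentration form Kc/R(θ) = (1/Mw)[1 + (16π²n²/3λ²)⟨R²⟩ sin²(θ/2)], so that the reciprocal intercept gives Mw and the ratio of initial slope to intercept gives a mean-square radius. The function names, the wavelength, the refractive index, and the synthetic data are all illustrative assumptions and are not values from the experiments reported here.

```python
import math


def fit_line(x, y):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def angular_factor(wavelength_nm, n_solvent):
    """16*pi^2*n^2 / (3*lambda^2) in nm^-2 (vacuum wavelength in nm)."""
    return 16 * math.pi ** 2 * n_solvent ** 2 / (3 * wavelength_nm ** 2)


def read_zimm_plot(angles_deg, kc_over_r, wavelength_nm, n_solvent):
    """Return (Mw, root-mean-square radius in nm) from zero-concentration points."""
    x = [math.sin(math.radians(a) / 2) ** 2 for a in angles_deg]
    intercept, slope = fit_line(x, kc_over_r)
    mean_square_radius = (slope / intercept) / angular_factor(wavelength_nm, n_solvent)
    return 1.0 / intercept, math.sqrt(mean_square_radius)


def synthetic_points(angles_deg, m_w, radius_nm, wavelength_nm, n_solvent):
    """Ideal Kc/R values for a given weight and radius (illustration only)."""
    f = angular_factor(wavelength_nm, n_solvent)
    return [(1 + f * radius_nm ** 2 * math.sin(math.radians(a) / 2) ** 2) / m_w
            for a in angles_deg]


# Hypothetical small-angle data: same weight, larger radius after "ATP".
ANGLES = [20, 25, 30, 35, 40]
before = synthetic_points(ANGLES, 1.0e7, 100.0, 436.0, 1.33)
after = synthetic_points(ANGLES, 1.0e7, 130.0, 436.0, 1.33)
mw_before, r_before = read_zimm_plot(ANGLES, before, 436.0, 1.33)
mw_after, r_after = read_zimm_plot(ANGLES, after, 436.0, 1.33)
```

In this idealized case the two fitted lines share an intercept (constant Mw) while the second has the steeper slope, which is the pattern the text interprets as inflation at constant molecular weight. With a polydisperse sample the fitted radius is a high-order average, so the caution expressed in the text applies equally to the sketch.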

With the ultracentrifuge, we soon found that the 
myosin-B system was readily separable into light and 
heavy components. This made it possible to apply the 
molecular-weight technique developed by Kegeles and 
Schachman on the basis of the Archibald equations. In 
Fig. 3 the Mw, from meniscus measurements, is plotted 
as a function of sedimentation time. It decreases from 
high values (such as are given by light scattering) 
to an asymptotic value of 4.2×10⁵ g. On the other hand, 
when the same technique is applied to purified myosin, 
from the start one obtains a constant value of 4.2×10⁵ g. 
This result immediately identifies the “light compo- 
nent” in myosin-B as myosin and reveals that the 
molecular weight of myosin is 4.2×10⁵ g (until now, 
this quantity has been very controversial). To explore 
the heavier and more polydisperse components which 
generate no schlieren boundary, we turned to the ultra- 
violet absorbance method. This method gives C(x) 
rather than C’(x). Figure 4 shows two sample graphs 
constructed from densitometer measurements on a cell 
containing myosin-B saturated with ATP; both are 
plots of absorbance against distance in the direction of 
centrifugation—the top one corresponding to some time 
during the acceleration phase and the bottom one to 
many minutes at speed. By the time the lower measure- 
ments had been taken, the heavier components had 
plummeted to the bottom of the cell, and myosin 
(the “light component”) was beginning to be swept out. 
From curves such as these, the concentration of myosin 
was measured. Extrapolating backward in time, allow- 
ing for sectorial dilution, the myosin concentration was 
subtracted from the total concentration to get the con- 
centration of the “heavy components” at the earlier 
time; for instance, the time at which the upper plot was 
made. By doing this at various times, the ‘‘clearance 
curve” of the heavy components was inferred, and from 
it the weight distribution of the various components 
also, both in the presence and in the absence of saturat- 
ing concentrations of ATP. The conclusion from these 
manoeuvers was that the (5-hr extracted) myosin-B 
system contains about 65% myosin and 35% heavier 
components, the latter distributed roughly into two 
classes. On addition of ATP (or PP), about one-third 
of the total heavy components depolymerizes (yielding 

Fig. 4. Absorption of ultraviolet light by an ultracentrifuge 
cell filled with a solution of myosin-B and ATP, observed at 
different stages of sedimentation. 

certainly myosin, and possibly actin), while two-thirds 
seem to remain intact. Based upon these determinations, 
the light-scattering data were re-examined to determine 
if the inflation suggested by those measurements might 
have been apparent only. It turns out very clearly that 
the inflation must be real, so I feel that one is entitled 
to say that the nature of the interaction between ATP 
and the myosin-B system in solution is understandable 
as an ion-polyelectrolyte effect. 
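
The subtraction used above to obtain the “heavy components” can be sketched numerically. The sketch assumes the standard radial-dilution law for a sector-shaped cell at constant rotor speed, c(t) = c₀ exp(−2sω²t), to carry the myosin plateau concentration back to an earlier time. The sedimentation coefficient, rotor speed, times, and concentrations below are hypothetical, and the function names are introduced only for illustration.

```python
import math

SVEDBERG = 1.0e-13  # seconds


def plateau_at_earlier_time(c_late, s_svedberg, rpm, seconds_earlier):
    """Back-extrapolate a plateau concentration, undoing radial dilution.

    Assumes constant rotor speed over the interval, which is not true
    during the acceleration phase mentioned in the text.
    """
    omega = 2.0 * math.pi * rpm / 60.0  # rad/sec
    return c_late * math.exp(2.0 * s_svedberg * SVEDBERG * omega ** 2 * seconds_earlier)


def heavy_component_concentration(total_early, myosin_late, s_svedberg, rpm, seconds_earlier):
    """Total concentration at the early time minus the back-extrapolated myosin."""
    myosin_early = plateau_at_earlier_time(myosin_late, s_svedberg, rpm, seconds_earlier)
    return total_early - myosin_early


# Hypothetical run: 6 S light component, 50,000 rpm, records 30 min apart.
heavy = heavy_component_concentration(1.0, 0.6, 6.0, 50000, 1800)
```

Repeating the subtraction at a series of early times would give the clearance curve of the heavy components. A fuller treatment would integrate ω² over the actual speed record instead of assuming constant speed.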

The polyelectrolyte model is a special case of a class 
of models in which it is assumed that the adsorption of 
ATP—and not its enzymatic cleavage—causes a change 
in the elasticity of a mechanically continuous protein 
element. In the same class, for example, are the pro- 
posals of Pryor, Flory, and Laki in which it is 
assumed that on adsorption of ATP a “crystallized,” 
perhaps helical, configuration becomes unstable and 
passes into a random coil. Also in the class is the pro- 
posal of Astbury, which envisions passage from the α- 
configuration into the “super-contracted” configuration. 

Another model which also could be constructed from 
materials found in muscle, but which would operate 
on a radically different principle has been described 
by H. Huxley and Hanson and by A. Huxley and 
Niedergerke and was cast very recently into chemical 
language by Weber. This model frankly is designed to 
agree, on the one hand, with certain observations on 
the organization of proteins within the fibril, and, on 
the other hand, with the widespread belief that the 
“coupling” of ATP hydrolysis for the purpose of 
“driving” another reaction always involves phosphoryl- 
ated intermediates. The central feature of the Huxleys- 
Hanson theory of the intrafibrillar organization of 
proteins is that it adopts, first of all, the “actomyosin” 
interpretation of the myosin-B system, and considers 
that over-all shortening is achieved by the motion of 
actin filaments relative to myosin filaments, without 

Fig. 5. Weber’s proposed chemical scheme to account for 
the sliding motion assumed in the Huxley-Hanson theory of 
contraction. 

invoking the intrinsic contraction of either type of fila- 
ment. How this sliding motion could result from the 
interaction of the filaments with ATP is shown in Fig. 5. 

In schemes of this type it is necessary to explain how 
it is that the filaments slide in one direction only and 
not in the other, since obviously the reaction scheme is 
quite symmetrical. Weber does not allude to this point, 
but it has been well recognized by Huxley, who 
formally invoked a spatially asymmetric probability of 
reaction without pretending to give it a chemical basis. 

Key support for the Weber scheme is considered to 
reside in certain tracer experiments of his colleagues, 
the Ulbrechts. They reported that myosin-B—but not 
reconstituted actomyosin, which is also contractile— 
under conditions in which contraction occurs, catalyzed 
the back incorporation of labeled phosphorus from 
ADP into ATP. This suggested existence of the reaction, 


ATP + actomyosin ⇌ actomyosin-P + ADP. 


Since the Ulbrechts found, however, that fibrils from 
which myosin had been extracted catalyzed the ex- 
change just as well, Weber reasoned that it must be the 
actin partner which is phosphorylated first; that is, 


ATP + actin ⇌ actin-P + ADP. 


To get the desired translation, he then assumed that P 
is extruded with the formation of an actin-myosin bond, 
which bond then migrates across the enzymatic site. 
At this point, water attacks the bond and the process 
starts anew with another ATP. This interesting scheme 
deserves close attention, for, if correct, it should harmo- 
nize much chemical and structural data. At the same 
time, some of its weaknesses must be considered. 
Confirmation of the basic Ulbrecht experiment is not 
yet forthcoming; indeed, in some quarters, there is con- 
cern lest the observed exchange might be the result of 
contamination by granular ATPases. The plausibility 
of actin phosphorylation by ATP is also in serious doubt. 
Strohman recently demonstrated phosphorylation of 
actin by the creatine kinase system, but attempts to 
phosphorylate actin with ATP itself have shown only 
exchange of whole nucleotide molecules, and no real 
phosphorylation. Likewise, Levy and Koshland failed 
to find O¹⁸ incorporation into ATP when ATP was 
incubated with actin in the presence of labeled water. 
Evidence for a transphosphorylation in just the reverse 
direction (i.e., myosin to actin) actually is suggested in 
the Levy-Koshland work. These authors found that the 
orthophosphate produced by myosin ATPase was en- 
riched twofold over what it should have been, had the 
incorporation of O¹⁸ occurred only at a single hydrolytic 
step. Still another purely biochemical difficulty arises 
from considering the speed required by the proposed 
hydrolytic reaction in order to account for the observed 
rate of shortening. From the model by Weber, one 
would expect an individual enzymatic site to be, say, 
5 Å, or 5×10⁻⁴ µ, long. For esterases which are actually 

Fig. 6. Scheme for the Podolsky model of contraction. The 
mechanism for phosphorylating the thin filament (e.g., the cre- 
atine kinase-creatine phosphate system) is not included in the 
diagram. 

much faster than myosin-B ATPase, (Vm/E₀) at 25°C is 
about 10² sec⁻¹. Therefore, the relative velocity of fila- 
ments moved by the esterase activity should be about 
5×10⁻² µ sec⁻¹. Since the myosin filaments are supposed 
to be closing on the actin filaments from either side, 
the predicted shortening velocity of the sarcomere is 
10⁻¹ µ sec⁻¹. For an unloaded 3-cm frog muscle contract- 
ing at 0°C, Hill reported a shortening speed of 5 cm 
sec⁻¹. In such a muscle, the sarcomeres, which are 2 µ 
long, would shorten at a speed of (5/3)×2 µ sec⁻¹, or 
3.3 µ sec⁻¹; at 25°C this speed easily could reach 10 
µ sec⁻¹, or 100 times the speed expected from a fast, 
totally unencumbered esterase. To these difficulties 
should be added two others which stem from the ob- 
served length—tension behavior of muscle and myosin- 
B threads. Both living muscle (single fibers) and 
myosin-B threads are able to contract until they are 
20 to 30% of their original length. But, assuming the 
forces which bring about the interfilament motion to be 
short-ranged, it is hard to see how the system ever 
could contract to less than 50% of its original length. 
At the other extreme, a living single muscle fiber which 
is passively stretched and then excited to contraction 
continues to generate tension until it has been stretched 
to about twice its rest length. Already at 1.4 times rest 
length the overlap between the presumably interacting 
filaments has ceased! 
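
The speed comparison made earlier in this paragraph is simple arithmetic and can be checked in a few lines. The sketch assumes a site length of 5 Å (5×10⁻⁴ µ), a turnover of 10² per second for a fast esterase, and one site length of relative travel per catalytic cycle. These are the values that reproduce the stated factor of 100, and the function names are illustrative.

```python
MICRONS_PER_ANGSTROM = 1.0e-4


def predicted_sarcomere_speed(site_length_angstrom, turnover_per_sec):
    """Sarcomere shortening speed (micron/sec) if each catalytic cycle
    advances a filament by one site length; doubled because the filaments
    close in from either side."""
    relative_speed = site_length_angstrom * MICRONS_PER_ANGSTROM * turnover_per_sec
    return 2.0 * relative_speed


def observed_sarcomere_speed(muscle_speed_cm_per_sec, muscle_length_cm,
                             sarcomere_length_micron):
    """Scale whole-muscle shortening (muscle lengths/sec) down to one sarcomere."""
    muscle_lengths_per_sec = muscle_speed_cm_per_sec / muscle_length_cm
    return muscle_lengths_per_sec * sarcomere_length_micron


predicted = predicted_sarcomere_speed(5.0, 1.0e2)        # micron/sec
observed_0c = observed_sarcomere_speed(5.0, 3.0, 2.0)    # micron/sec at 0 deg C
estimated_25c = 10.0                                     # micron/sec, figure used in the text
ratio = estimated_25c / predicted
```

The ratio is the discrepancy the argument turns on. A longer assumed step per cycle or a faster enzyme would reduce it in direct proportion.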
Although I have shown a measure of dissatisfaction 
with the sliding filament model, I wish at the same 
time to avoid an impression of blissful contentment 
with the coiling class of models typified by our poly- 
electrolyte scheme. To mention but two serious diffi- 
culties with coiling models, consider that if myosin is 
invoked as the polyelectrolyte, one must face up to the 
difficulty that present measurements indicate that one 
mole of ATP may influence maximally as much as 10⁵ g 
of myosin. This seems too large a sphere of influence, 
even for Coulombic forces; and it is the difficulty which 
has led Szent-Györgyi to invoke very unorthodox 
energy excitation and transmission. Another 
shortcoming of the coil model (a shortcoming 
removed for the sliding model) is that, until 
now, it has never been shown to be consistent with 
certain heat-work generalizations established by Fenn 
and Hill for living muscle. In this situation, the proposal 
recently put forth by Podolsky is most intriguing. His 
model is basically a variant of the polyelectrolyte model, 
but its modifications allow it to agree, on the one hand, 
with some structural observations, and, on the other, 
with the Fenn-Hill relationships. 

Podolsky commences with two empirical equations 
evolved by Hill, from measurements by Fenn and Hill. 
These equations describe the heat and work exchanges 
of a muscle contracting in the neighborhood of its rest 
length, bearing a load τ and shortening at the constant 
velocity v. The first equation is the “force-velocity” 
relation (Fig. 6): 

$$(\tau + a)(v + b) = 5ab; \quad a, b \text{ constants}. \tag{1}$$

The second equation asserts that $-\dot{Q}$, the rate at which 
the muscle produces heat, is a linear function of v; i.e., 

$$-\dot{Q} = ab + av. \tag{2}$$

From the first law of thermodynamics, it is known that 
for a system at constant pressure which performs work 
w in addition to work of expansion, 

$$\Delta H = Q - w. \tag{3}$$

If this be a chemical system proceeding under conditions 
in which ΔH can be expressed as the extent of reaction, 
ψ, multiplied by a (constant) “heat per equivalent of 
reaction,” h, then, 

$$h\dot{\psi} = \dot{Q} - \dot{w}. \tag{4}$$

At this point, one has two choices, since Eq. (1) can be 
solved for either τ(v) or v(τ). If one takes τ(v), com- 
bines it with Eq. (2), and substitutes into Eq. (4), 
one obtains, 

$$\dot{\psi} = \dot{\psi}(v, \text{constants}); \tag{5}$$

if one takes v(τ), combines it with Eq. (2), and sub- 
stitutes into Eq. (4) one obtains, 

$$\dot{\psi} = \dot{\psi}(\tau, \text{constants}). \tag{6}$$
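
As a check on the two routes just described, the substitutions can be written out explicitly. The sketch assumes that the rate of external work is the load times the shortening velocity, $\dot{w} = \tau v$, and it writes τ for the load and $\dot{\psi}$ for the reaction rate. These identifications are assumptions made for the illustration and are not spelled out in the text.

```latex
\begin{align*}
\text{Route (5):}\quad \tau(v) &= \frac{5ab}{v+b} - a, \\
h\dot{\psi} &= \dot{Q} - \dot{w} = -(ab + av) - \tau(v)\,v
             = -ab - \frac{5abv}{v+b} = -\frac{ab\,(6v+b)}{v+b}. \\[6pt]
\text{Route (6):}\quad v(\tau) &= \frac{5ab}{\tau+a} - b, \\
h\dot{\psi} &= -ab - (a+\tau)\,v(\tau) = -ab - 5ab + b(\tau+a) = -b\,(5a-\tau).
\end{align*}
```

Under these assumptions the two forms agree at the isometric limit (v = 0, τ = 4a), where both give $h\dot{\psi} = -ab$, and at zero load (τ = 0, v = 4b), where both give $-5ab$.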


Thus, Hill’s equations seem to force one to assume that 
the reaction rate of the net driving reaction depends 
either upon the shortening velocity or upon the load 
(analytically, of course, these are equivalent conclu- 
sions.) To Podolsky, it has seemed easier to interpret 
Eq. (5) physically, since a dependence of reaction rate 
on shortening rate can arise if the two participants in 
the chemical reaction are attached to structures that 
are in motion relative one to another. Specifically, he 
has elaborated the model shown in Fig. 6. At rest, the 
“thin filament” is supposed to be a positively charged 
polyelectrolyte at equilibrium length. On excitation, 
there goes into operation a system capable of depositing 
on this filament the negative ions of ATP. The thin 
filament thus develops tension and begins shortening, 
drawing inwards the two Z membranes. At the same 
time, the ATPase sites on the thick filaments are free 
to pick the bound ATP molecules off the thin filament, 
so that the net number of bound ATP molecules (and, 
therefore, the tension in the thin filament) will be a 
balance between the charging rate and the ATPase rate. 
The ATPase rate now must be considered more closely. 
The thick filaments do not move. There is, therefore, 
a relative velocity between every point on the thin 
filament and the point on the thick filament vertically 
opposite to it. This velocity is zero at the central ends 
of the thin filaments (where they are supposed to be 
attached rigidly to the thick filaments), and it rises to 
a maximum, say u, at the outer ends. If r is a fraction 
which rises linearly from 0 at the central (fixed) ends of 
the thin filaments to unity at the outer edges of the 
thick filaments, it is simplest to assume that, at any 
position along the thick filament, the relative velocity 
between opposite points on the two filaments is ru. 
Since both Z membranes are closing in on the center, 
each with velocity u, 

$$2u/\delta = v, \quad \text{i.e.,} \quad u = u(v, \text{constants}), \tag{7}$$

where δ is the sarcomere rest length and v is the shorten- 
ing velocity of the whole fiber, expressed in muscle 
lengths per second. The chemical basis for the depend- 
ence of reaction rate on shortening rate is introduced 
in this way (Fig. 7): It is assumed that, as an ATP 
molecule bound to a thin filament passes an ATPase 
site, there can be defined 
a probability of reaction, P, which depends essentially 
upon a velocity constant (conventionally defined) and 
upon the time spent in transit, which in turn depends 
upon ru. If the distance between bound ATP molecules 
on the thin filament be l, then an ATPase site, at 
position corresponding to r, will be presented with 
passing ATP molecules ru/l times per second; therefore, 
per second it will carry out (ru/l)P(ru) fruitful re- 
actions, and the entire filament will carry out, 

$$\int_0^1 (ru/l)\,P(ru)\,\lambda\,dr \tag{8}$$


fruitful reactions, where λ dr is the number of ATPase 
sites in the element dr. Multiplying expression (8) by 
the number of filament units per ml of tissue, $\dot{\psi}$(u, 
constants) is obtained, and, finally, by means of re- 
lation (7), 

$$\dot{\psi} = \dot{\psi}(v, \text{constants}). \tag{9}$$


Fig. 7. The motion of an ATP affixed to a thin filament 
relative to an ATPase site on a thick filament. 

Thus, on the basis of the Podolsky model, one can reach 
the functional dependence (9) which is that called for 
by the Fenn-Hill relations [Eq. (5)]; moreover, the 
explicit version of Eq. (9) can be fitted very closely to 
the experimentally determined version of Eq. (5). The 
obvious merits of the Podolsky suggestion are three: 
there is retained what seems to this author the best 
molecular transducer by far—i.e., the coiling polyelec- 
trolyte (or any of its thermodynamic analogs); there is 
invoked a structural organization consistent with cur- 
rently popular ideas; and it is shown that the heat-work 
behavior of the model would be very like that observed 
in living muscle. 
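
The dependence of reaction rate on shortening velocity in this model can be illustrated numerically. The text leaves the form of P open, so the sketch below adopts a first-order assumption, P = 1 − exp(−k·s/ru), where k is the velocity constant and s is a hypothetical length of the ATPase site. It then evaluates expression (8) by the midpoint rule. All names and parameter values are illustrative and not taken from Podolsky.

```python
import math


def reaction_probability(relative_velocity, rate_constant, site_length):
    """Assumed first-order form: P = 1 - exp(-k * transit_time),
    with transit_time = site_length / relative_velocity."""
    if relative_velocity <= 0.0:
        return 1.0  # unlimited dwell time; the encounter rate is zero anyway
    return 1.0 - math.exp(-rate_constant * site_length / relative_velocity)


def filament_reaction_rate(u, rate_constant, site_length, atp_spacing,
                           sites_per_unit_r, steps=2000):
    """Midpoint-rule estimate of expression (8):
    integral over r from 0 to 1 of (r*u/l) * P(r*u) * lambda dr."""
    dr = 1.0 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        velocity = r * u
        encounters_per_sec = velocity / atp_spacing
        total += (encounters_per_sec
                  * reaction_probability(velocity, rate_constant, site_length)
                  * sites_per_unit_r * dr)
    return total


def outer_end_velocity(shortening_velocity, sarcomere_rest_length):
    """Relation (7): 2u/delta = v, so u = delta * v / 2."""
    return sarcomere_rest_length * shortening_velocity / 2.0
```

With this assumed P, the rate rises in proportion to u at low speeds, where every passing ATP reacts, and saturates near λks/l at high speeds, where the transit time limits the reaction. Whether that shape matches the fitted form of Eq. (5) depends on the constants and is not claimed here.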

To what Podolsky actually has written, I should like 
to add some opinions of my own. If, indeed, his model 
is to enjoy consistency with current views on protein 
deployment, then it must be assumed that his thick 
filaments are myosin (or its subunits), and that his 
contractile thin filaments are actin. It is, therefore, 
perfectly reasonable to assume the existence of ATPase 
sites on the thick filaments. However, it is not at all 
obvious that actin could function in the polyelectrolyte 
sense required by the model in its present form. As 
already mentioned, actin is a globular protein of mono- 
mer weight about 6×10⁴ g. It has an interesting content 
of bound adenine nucleotide. It is well known that, 
under certain conditions, the globular (G) form of actin 
polymerizes into necklace-like aggregates of fibrous (F) 
actin. As normally prepared, the nucleotide of G-actin 
is ATP, and of F-actin it is ADP, but the precise 
mechanistic connection between dephosphorylation and 
the G-F transformation is unknown. Recently, Oosawa 
and his collaborators carried out an elegant study 
of the G ⇌ F transformations. They showed them to 
be cooperative processes, and concluded that the G → F 
transformation is assisted by (a) hydrolysis of bound 
nucleotide, i.e., H₂O + G(ATP) → F(ADP) + P, (b) 
addition of divalent cations such as Mg⁺⁺, and (c) in- 
crease of ionic strength. Naturally, these observations 
suggest that in G-actin the monomers are kept from 
aggregating by the electrostatic repulsion between their 
negative charges; in other words, the dispersed form of 
actin is charged more negatively than the aggregated 
form. To the extent that these transformations depend 
upon the state of the bound nucleotide, they could be 
driven by interaction with enzyme systems. On the 
basis of the recent work by Strohman, transphos- 
phorylation from the creatine-phosphate, creatine- 
kinase system would disperse the actin, and transphos- 
phorylation from the dispersed actin to myosin ATPase 
would condense the actin. But this is just opposite to 
what the Podolsky model requires, since in his model 
the form of actin with the bound ATP would be assumed 
to be the condensing form, while the form of actin from 
which the myosin ATPase has removed a P would be 
assumed to be the dispersing form. One way out of this 
difficulty might be to suppose that in situ the actin has 
sufficient adsorbed cations so that the G ⇌ F trans- 
formations are inverted from what they are in free 
solution, e.g., that the G → F condensation is prevented 
by electrostatic repulsions between positive charges on 
the monomers. Again, not enough is known about intra- 
cellular-ion distributions to reject this possibility. To 
complete this description on a favorable note, I will 
add that the Podolsky model enjoys a substantial ad- 
vantage over any coiling model in which it is assumed 
that myosin is the responding substance, since, if actin 
is assumed to be the responding substance, the volume 
of material which would have to be influenced by one 
molecule of ATP is much smaller and correspondingly 
more plausible. 

Nothing I have read or thought about has weakened 
my intuitive conviction that the tension-generating de- 
vice in excited muscle will prove to be a mechanically 
continuous structure, because any alternative device is 
apt to run afoul of vector analysis; nor am I ready to 
relinquish my faith in Coulombic interactions as the 
most rapidly generated and the most long ranged of the 
forces that the transducer could employ. On the other 
hand, I feel utterly dismayed over the experiments 
which have not been done in this field. For example, 
x-ray diffraction, intrinsic birefringence, and optical- 
rotation techniques have been available for some time, 
but one has been content to speculate about small-scale 
configurational change without trying to establish its 
occurrence or nonoccurrence in the in vitro system. 
Everything points to the transcending importance of 
both Ca⁺⁺ and Mg⁺⁺ in the fundamental process, 
indeed, to the likelihood that it is these ions, not ATP, 
which are moved in excitation; yet, there is very little 
certain information on where these ions reside in the 
fiber. One debates nowadays whether or not ATP or 
creatine phosphate are dephosphorylated in the twitch, 
but, so far as I know, little effort is being put into 
devising optical methods to follow dephosphorylation, 
and, at present, there are not even the means for block- 
ing one of these reactions and not the other. However, 
ignorance in these matters will be dispelled only by hard 
work and certainly not by prolonging this gloomy 
epilogue. 

Added note: Since this manuscript was prepared, 
W. T. Astbury has informed us that his associates have 
demonstrated the appearance of the cross-β x-ray 
diffraction pattern when ATP shortens a film of myosin- 
B, thus strongly suggesting a real configurational 
change. Also, F. Buchthal has come forth with electron 
micrographs which show a wide gap (of nonoverlap) 
between thin and thick filaments, at muscle lengths at 
which tension can still be readily demonstrated. 

The major advances of the science of genetics have 
come about through the use of two fundamental 
types of operations. The first and oldest consists in 
mating of selected, multicellular plants or animals to 
produce large numbers of offspring, among which the 
distribution of various inherited characters is examined. 
This procedure has produced that body of knowledge 
known as classical genetics, which, beginning 
systematically with the studies of Mendel, has deline- 
ated the processes which sustain and modulate bio- 
logical heredity in all living forms. This experimental 
approach, which essentially examines genetic mecha- 
nisms in the germ cells, has been extremely successful 
in elucidating hereditary phenomena in organisms as 
diverse as Drosophila and Zea mays, but is difficult to 
apply to the slowly and less extensively reproducing 
organisms like the mammals. Thus, while some notable 
advances have been accomplished, it has not been 
generally possible to obtain sufficiently large numbers 
of progeny from selected mammalian matings to provide 
populations adequate for measurement of many im- 
portant genetic events. 

The second major class of genetic operations involves 
a more recent technique, which consists in study of the 
independent micro- and ultramicro-organisms—the 
free-living cells and the viruses. Here the standard unit 
is a single cell or particle, and the fundamental operation 
involves examination of the distribution of genetic 
traits among the progeny of asexual reproduction. The 
principal power of this method lies in the vast multipli- 
cation factor which it affords. Instead of hundreds or 
thousands, millions or billions of progeny, all arising 
from a single individual, can be rapidly produced and 
quickly scanned for a large variety of genetic characters, 
so that even rare events can be quantitatively examined. 
In the last twenty years, such genetic studies on 
microorganisms have produced a whole new field of 
knowledge. Sexual methods of genetic analysis are not 
excluded from these operations but can sometimes be 
arranged in a step preceding the clonal multiplication 
of the single cells. In addition, however, new methods 
for nonsexual genetic analysis have been found which 
promise to be exceedingly productive. Among the 
fundamental results which have issued from use of 
single-cell techniques applied to microorganisms during 
the last two decades are included: (a) the first system- 
atic demonstration of the direct relationship between 
genes and enzymes; (b) delineation of many specific 
metabolic pathways involved in gene-controlled bio- 
syntheses; (c) demonstration of the direct exchange 
of genetic materials between cells previously considered 
to multiply only mitotically; (d) introduction of genes 
into cells by means of temperate viruses or pure 
deoxyribonucleic acid; (e) the most accurate measure- 
ment of gene-mutation rates; (f) demonstration and 
quantitative analysis of the processes involved in en- 
zyme induction among genetically competent cells in 
response to specific environmental stimuli; (g) delinea- 
tion of the characteristics of linear inheritance in bac- 
teria; (h) systematic use of mitotic crossing-over in 
certain organisms to localize genes on their chromo- 
somes; and (i) the most detailed mapping at the level 
of molecular dimensions of the linear sequence of genetic 
determinants, as accomplished in a bacteriophage 
chromosome. 

An index of the fruitfulness of this latter approach to 
the analysis of genetics and related metabolic processes 
is afforded by the fact that the great majority of the 
discussion devoted to genetic processes in this bio- 
physical conference has been concerned with develop- 
ments arising directly from microbial genetics. 

In our laboratory, a program was initiated, attempt- 
ing to develop a methodology that would make possible 
studies of mammalian cells by means of the techniques 
of microbial genetics. If successful, such a methodology 
would provide a new avenue for exploration of mam- 
malian genetic processes among the somatic cells of 
multicellular organisms, which would complement 
studies on the genetics of the germ cells achieved by 
conventional mating studies. In addition, it would 
afford all of the many kinds of analyses which an un- 
limited number of progeny provides. Such techniques 
might facilitate elucidation of the nature of any changes 
in the genome and phenome, respectively, which are 
responsible for the characteristic behavior of the various 
differentiated tissue cells; localization of different genes 
on the various chromosomes; provision of mutated cells 
containing specific biochemical blocks for the deline- 
ation of various enzymatic steps through quantitative 
study of specific biosyntheses, like that of hormones and 
antibodies in specialized cells; search for processes like 
sexual genetic exchange and transduction in somatic 
cells, and investigation of the genetic biochemistry of 
normal and malignant cells. 

Participating in this program have been a series [of] 
devoted co-workers: Steven Cieciura, Harold [...] 
and Philip Marcus carried out studies which fo[...] 
[...] Sato, and N. C. Webb joined this program in a [...] 
doctoral capacity. Chromosome studies [were] 

Fig. 1. (a) Petri dish which was seeded with 100 single cells of 
the S3 strain of the HeLa culture, an aneuploid which originated 
in a human cancer. (b) Petri dish seeded with single cells of a 
euploid human strain. These cells tend more to spread and migrate, 
and so the colonies tend to run together unless a smaller number 
of cells is plated or a larger plate is employed. 


carried out initially in a collaborative arrangement with 
Dr. Chu and Dr. Giles at Yale University, and more 
recently with J. H. Tjio, who joined our laboratory last 
year. Dr. Arthur Robinson has conducted those aspects 
of these studies which involved work on human patients. 

The first step in this program, which, of course, has 
drawn a great deal on previously existing tissue-culture 
methodology, was to develop means for plating single 
mammalian cells, under conditions such that every 
single cell develops into a discrete, macroscopic colony.¹ 
This is essential in order to quantitate growth of single 
cells of a large population, and to make possible isola- 
tion of mutant clones. A simple, quantitative means for 
growth of single cells into colonies was achieved by a 
procedure, identical in principle to, and only slightly 
more complex in practice than, the plating method 
which constitutes the basis of quantitative microbiology. 
Figure 1(a) illustrates a representative plating in which 
100 cells of the S3 HeLa clone, plated in a Petri dish 
in nutrient medium, have attached to the glass and 
produced, within the limits of sampling uncertainty, a 
discrete macroscopic colony from each cell inoculated. 
The counting of such colonies is completely objective 
and highly reliable, so that this methodology permits all 
of the types of experiments characteristic of quantita- 
tive bacteriology to be applied to mammalian cells. 
Cells taken from organs as diverse as skin, liver, spleen, 
bone marrow, testis, ovary, kidney, lung, and others, 
from man and other mammals, and from individuals of 
a large variety of ages, all respond similarly. It can be 
concluded that at least large numbers of cells exist in 
virtually every organ of the mammalian body which 
retain the potentiality of growth as independent micro- 
organisms, if provided with an adequate physical and 
chemical environment. 
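
The phrase “within the limits of sampling uncertainty” can be given a rough quantitative meaning. The sketch below assumes that the number of cells actually delivered in a nominal inoculum, and hence the colony count, scatters with Poisson statistics, so that about 100 ± 10 colonies would be expected from 100 seeded cells at full plating efficiency. The function names and the two-standard-deviation criterion are illustrative choices.

```python
import math


def plating_efficiency(colonies, cells_seeded):
    """Fraction of seeded cells that formed colonies, with a Poisson
    counting error on the colony count."""
    efficiency = colonies / cells_seeded
    standard_error = math.sqrt(colonies) / cells_seeded
    return efficiency, standard_error


def consistent_with_full_efficiency(colonies, cells_seeded, n_sigma=2.0):
    """True if the count lies within n_sigma Poisson standard deviations
    of the nominal number of cells seeded."""
    return abs(colonies - cells_seeded) <= n_sigma * math.sqrt(cells_seeded)


# Hypothetical counts from plates seeded with a nominal 100 cells each.
eff, err = plating_efficiency(96, 100)
```

On this criterion a plate showing 96 colonies is indistinguishable from one in which every cell grew, whereas a plate showing 60 is not.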

By and large, two general types of cellular and 
colonial morphologies result from such plating. That 
illustrated in Fig. 1(a) has been called ‘“‘epithelial-like” 
because the cells form compact, tight colonies with 
relatively smooth, well-defined edges. In our experience 
so far, the cells that tend to grow stably in this fashion 
have been polyploid, and usually aneuploid. The other 
type of colony, which has been called “‘fibroblast-like” 
but should perhaps be referred to simply as “stretched,” 
is illustrated in Fig. 1(b), which shows the tendency of 
the colonies to grow with rough edges, and of the cells 
to line up as elongated parallel structures. This stretched, 
needle-like conformation is more characteristic of the 
euploid cells grown by the methods developed in our lab- 
oratory. However, the molecular environment strongly 
influences the morphology of cells grown in this manner.’ 

The course of developments in microbial genetics 
has demonstrated the enormous utility of quantitative 
single-cell plating for the isolation of stocks with rare 
genetic markers that permit analytical experimentation. 
Unless each cell can grow in isolation to form a colony,
it becomes impossible to apply the highly discriminating
screening methods by which millions of cells are subjected
to a stressful situation permitting growth of only
occasional rare mutants. Otherwise, the exceedingly
rare mutant, though potentially capable of
reproduction, is not part of a large reproducing population
and may be unable to express its growth
potentiality. An effective single-cell plating technique
permits ready recognition of even very rare genetic 
events, and quantitation of their frequency. 

Isolation of mutant clones from such plates is a 
straightforward process for which a variety of different
mechanical methods have been developed. We have 
established mutant clones which have arisen spontane- 
ously or have been induced as a result of x-irradiation. 
Among these are included forms with divergent nutritional
requirements for colony formation, changed
colonial morphologies, resistance to destruction by 
specific viruses like Newcastle disease virus, and differ- 
ences in chromosomal constitutions (Fig. 2). These 
mutants have been grown for months or years in 
continuous culture, during which they have produced 
astronomical numbers of progeny, and have exhibited 
stability with respect to their identifying characteristics 
which is in every way comparable to that of the 
familiar mutants in bacteria and molds.?3 

Attention was next devoted to the development of a 
defined chemical medium which would promote growth 
in high efficiency of single mammalian cells, and so

Fig. 2. Demonstration of some representative mutant strains
isolated from the HeLa population. (a) Plating efficiency curves
in the presence of different amounts of human serum of 2 mutants
found to occur spontaneously in the original HeLa population.
The cellular and colonial morphologies of these two strains are
identical but their growth requirements are different. (b) Distribution
of chromosome number in 2 HeLa clones. The clone at the
top (S/NDV) has the same chromosome number as S3, but is
characterized by its resistance to destruction by Newcastle disease
virus.

make available means for isolating mutants with specific
nutritional markers. Many laboratories have contrib- 
uted studies elucidating small molecular requirements 
of mammalian cells in tissue cultures utilizing massive 
cell inocula.*~® In studying the growth requirements of 
individual, isolated cells, we found that single cells of 
the S3 clonal strain of the HeLa cell form colonies with 
100% plating efficiency, when a chemically defined 
small-molecular medium is supplemented with two 
macromolecular serum fractions each migrating as a 
single moving boundary in an electric field. The two
proteins of mammalian serum which, under the conditions
of these experiments, complete the growth requirements
of single cells are serum albumin and an α₁-globulin
with a molecular weight of approximately
45 000, which appears to be a glycoprotein in composition.
Both substances are necessary for growth of
single cells under the specific conditions of our procedures.
The functions of these proteins are still under
study, but that of the α₁-globulin has been found to be
fulfilled completely by the protein fraction named
fetuin, an α₁-globulin which constitutes 45% of calf
fetal serum protein.”

Albumin and fetuin have been purified by repeated 
precipitation to the point where preparations are 
approximately 98% homogeneous electrophoretically 
and ultracentrifugally, although the ultimate purity of 
such preparations is, of course, difficult to specify with 
certainty, particularly since fetuin has been demon- 
strated to be heterogeneous immunologically. A typical 
electrophoretic pattern is presented in Fig. 3(a), and
Fig. 3(b) demonstrates the colonial development achieved
from inoculation of single cells into a medium containing 
the purified albumin and fetuin, amino acids, growth 
factors, salts and glucose. The plating efficiency equals 
that in serum-containing medium. Experiments on 
mutants of S3 which exhibit different molecular nutri- 
tional requirements are now in progress. 

It seems likely that fetuin and albumin may not be 
required as true metabolites, but rather as conditioning 
factors which permit establishment of the necessary 
physicochemical relationships between the cell and 
surrounding medium. Thus, the fetuin fraction exercises 
a specific action on the cell membrane, since trypsinized 
cells added to a medium, complete except for the 
presence of fetuin, remain as rounded spheres, unat- 
tached to the glass surface. With the addition of fetuin 
to such a medium, the cells rapidly attach to the glass 
and stretch out in a highly characteristic configuration 
[Figs. 4(a) and 4(b)]. As little as 1 mg/cc of fetuin can
be detected by means of this action. Fetuin has been 
known to be a powerful antitryptic agent, and at least 
part of the function of fetuin seems to be antiproteo- 
lytic in nature, because this substance prevents the 
tryptic loosening and rounding of glass-attached cells. 
The participation in the attachment reaction of a
protein not included in the fetuin fraction has been
claimed by some workers.®

Fig. 3. (a) Electrophoretic pattern of fetuin purified by repeated, fractional (NH₄)₂SO₄ precipitation from fetal calf serum. (b) Colonies
developing on plate seeded with 100 S3 cells, in a medium containing known small molecular components, and purified fetuin and human
serum albumin.


In further development of tools for quantitative study 
of the growth and genetics of the mammalian cell 
in vitro, it became necessary to attempt to control the 
lability of karyotype which had been demonstrated to 
affect cells cultivated by standard tissue-culture tech- 
niques. Recent studies have shown that virtually every 
mammalian cell which has been established as a stable 
tissue-culture strain has a chromosomal constitution 
radically different from that of the parental species 
from which it originated.” Euploid cells were generally 
found either to stop growth after a very short period 
in culture, or else to proliferate, but with drastic changes 
in chromosome number and structure. In order to 
achieve conditions eliminating this proclivity for aneup- 
loid chromosomal constitution in cell cultures (which 
renders genetic studies of questionable significance), 
methodologies were required for simple and reliable 
monitoring of chromosomal constitution of cells grown 
in vitro. Such methods were developed by Axelrad and 
McCulloch," by Rothfels and Siminovitch,” and by Tjio 
and the author in our laboratories. When the karyotypes
of such cultures could be checked with ease and
accuracy, it was possible to find growth conditions
which permit constant karyotype integrity to be maintained.
Such conditions were secured by development of a
screening test for media and regulation of other conditions
so as to remove conditions leading to mitotic
inhibition, which might induce polyploidy. By careful
control of the processes involved in cell dispersion, by
regulation of the pH and temperature during cell
incubation, and by use of pretested media which 
incorporate all the needed nutritional requirements for 
cell growth and exclude toxic substances, it became 
possible to grow cells from normal tissues of man and 
other mammals, for periods now approaching a year, 
and for numbers of progeny which have exceeded 10% 
without change in chromosomal constitution." Figure 5 
shows the chromosome complement of typical human 
male and female cells. These methodologies have per- 
mitted characterization of each of the human chromo- 
somes, and particularly of the X and Y sex-determining 
chromosomes.1® 

In order to employ human genetic markers with 
biochemical, immunologic, and pathologic character- 
istics, a method was devised by which cells from any 
individual could be simply and reliably introduced into 
stable growth in vitro. A small piece of skin, approxi- 
mately 10 to 50 mg in mass, is excised from the ventral 
part of the forearm, dispersed by trypsinization, and 
grown in the medium which contains fetal calf serum, 
which was shown to maintain stable karyotype. This 
method has proved itself reliable in securing actively 
growing cultures‘ from any individual.“ It has made 
possible confirmation of the fact that the human 
karyotype does, indeed, consist of 46 chromosomes, as 
first reported by Tjio and Levan!’ Studies are now 
under way on cells of individuals with phenylketonuria
and other genetic diseases, in an attempt to establish

Fig. 4. (a) Photomicrograph demonstrating the rounded condition of cells, which will shortly be released from their bond to the
glass in the absence of any “stretching factor.” (b) Photomicrograph demonstrating the stretched condition of the cells in an adequate
concentration of purified fetuin.


cytogenetic and biochemical differences between these 
and normal cells in the in vitro cultures. These methodologies
are also being employed clinically to determine
the chromosomal constitution of persons with various 
degrees of clinical hermaphroditism and to compare 
the results so obtained with those utilizing the sex 
chromatin method devised by Barr and his associates." 
Thus, an anatomically female patient suffering from 
ovarian dysgenesis (Turner’s syndrome) was shown to 
possess only 45 chromosomes. 

In contrast with the normal human chromosome 
constitution shown in Fig. 5, Fig. 6 presents that of 
the aneuploid S3 HeLa cell, a clonal stock which we 
developed from the culture which Gey and his associates 
established from a human carcinoma of the cervix.'” 
The chromosome number of this clonal strain is 78, 
which represents a hypotetraploid condition. Studies 
are in progress comparing the different morphology and 
behavior of cells which differ in chromosomal number 
and constitution. Thus, mutant clones of animals like 
the Chinese hamster have been isolated, which have 
stemline chromosome numbers of 22 and 23, respec- 
tively. 

As one example of the kinds of quantitative studies 
made possible by these techniques, the remainder of 
this discussion is devoted to analysis of their application 
to study of the action of high-energy radiation on mammalian
cells. Ionizing radiations are capable of disrupting
any chemical bond in any molecule of any cell which
has absorbed such energy. On the basis of such qualita- 
tive considerations alone, one might expect an enormous 
variety of cellular pathologic actions. Radiobiologic 
studies on mammalian systems carried out by many 
laboratories over many years have more than borne out 
this expectation, demonstrating a bewildering variety of 
pathological consequences attending various kinds of 
exposure to radiation of mammals, including phenomena 
as diverse as gastrointestinal disturbance, drop in the 
level of blood white cells, epilation, and carcinogenesis. 
One of the most critical problems in this field involves 
determination of the degree to which chemical disorgani- 
zation attending exposure to a given dose of ionizing ra- 
diation affects the genome or the phenome, respectively, 
of the irradiated cell. A quantitative answer to this 
question would greatly simplify attempts to understand 
the enormously complex series of events attending irra- 
diation of mammalian cells and of whole animals. In 
addition, the answer to this question is essential in 
order to determine the degree to which exposure of 
mammalian populations to various doses of ionizing 
radiations will result in random mutations transmissible 
to succeeding generations. 


Physiologic damage to cells in the form of reversible
lag periods in cell multiplication, changes in cell permeability,
and inhibition of many specific enzymatic
activities, has been demonstrated in many organisms.

Fig. 6. Chromosome complement of a
typical S39 HeLa cell, a subclone of the
S3 strain, of human malignant origin,
with a stemline number of 78.


In a variety of nonmammalian forms it had also been 
shown that the cell nucleus is much more sensitive to 
damage by irradiation than the cytoplasm. Genetic and 
cytologic examination of radiation effects, particularly 
in the simpler organisms, has revealed that changes of 
the genetic structure arising from irradiation include 
single gene mutations, chromosome breaks, production 
of areas deficient in chromatin, and occasionally elon- 
gated and amorphous chromosomes, these latter pre- 
sumably arising from cells which had been irradiated 
during mitosis. At the site of fragmentation, a chromo- 
some exhibits sticky surfaces which can re-attach so 
that the chromosomes may reconstitute themselves into 
a configuration more or less closely resembling the 
normal. In all probability, however, a defect, even 
though invisible, will persist at the site of breakage 
unless this happened to lie in a genetically inactive 
region. All or part of any chromosome which has
suffered such a “hit” may, through imperfect restitution,
fail to be incorporated in the mitotic apparatus
and be lost in subsequent cell divisions. Such a condition 
can cause genetic imbalance that may destroy the 
ability of a cell exhibiting it to reproduce indefinitely.
In addition, if an unrestituted chromosome divides, the
two sister chromatids may unite at their broken ends 
to form an anaphase bridge that can prevent completion 
of mitosis. If multiple chromosome hits occur in the 
same cell, all of the foregoing damaging actions char- 
acteristic of single breaks are intensified. Moreover, 
abnormal restitution of the sticky, broken ends of 
different chromosomes may occur, producing a variety 
of complex translocations, such as dicentric chromo- 
somes which can destroy the cell’s reproductive 
capacity.}s 

Earlier studies of the relative sensitivities of the 
mammalian genome and phenome, respectively, to 
damage by high-energy radiations at first seemed to 
favor the thesis that for mammalian cells, in contrast 
to cells of other organisms, physiologic rather than 
genetic effects may be most important in the dose range 
up to and including the mean lethal dose for the whole 
organism, which is about 400 to 500 r. This supposition 
appeared to be supported by several kinds of evidence: 
studies in which massive cell inocula were irradiated in 
tissue culture with progressive doses of irradiation 
revealed that permanent loss of cell proliferation did

Fig. 7. X-ray survival curve of the ability of single S3 HeLa
cells to form colonies as a function of the dose, D. The points can
be fitted either by the equation $N/N_0 = 1 - [1 - e^{-D/96}]^2$ or by
$N/N_0 = e^{-D/77}(1 + D/77)$, the latter relationship being preferable,
since it is based on the model that reproductive death for this
polyploid cell requires two or more effective hits anywhere in the
chromosome complement. A more refined and comprehensive
equation which expresses the contributions to genetic death of
haploid, diploid, or polyploid cells of any species by single-hit and
multiple-hit radiation damages is presented elsewhere.


not occur until many thousands of roentgens had been 
administered. Presumably because of the way in which 
tissue-culture thinking in the past was dominated by 
the concept that cell proliferation is a property of a 
cellular community, experiments like this were inter- 
preted as an index of the dose needed to inactivate the 
cellular reproductive mechanism. Moreover, irradiation 
of single cells of bacteria, yeasts and protozoa, which 
might be considered as models for the mammalian cell, 
indicated that in all these forms, inactivation of reproduction
required thousands of roentgens.! Hence, since
death of a whole mammal can be effected by an exposure
of only 400 to 500 roentgens, it appeared possible that
mammalian-cell interaction with irradiation proceeds
in a manner fundamentally different from that of other 
forms. While this thesis has been vigorously opposed, 
especially by geneticists, among whom H. J. Muller 
has been particularly active, it is still occasionally 
affirmed in the current scientific literature that very 
high doses in the neighborhood of many thousands of 
roentgens are needed to achieve irreversible damage to 
mammalian cells in vitro” and the ability of ionizing
radiation to produce mutagenesis in mammals is still 
being challenged.” 

With the development of the techniques which have
been here described, it became evident that
cell proliferation from massive cultures cannot be used
as an end point to quantitate irreversible effects of
radiation on cell reproductive capacity unless the results
are calculated as a survival function based on the
number of cells irradiated. Since even a single cell
remaining intact can proliferate eventually to produce
any degree of outgrowth whatever, the dose at which a
large but unspecified population fails to produce
recognizable outgrowth is meaningless in terms of the
dose needed on the average to inactivate the cellular
reproductive mechanism. Moreover, since it is characteristic
of irradiated cells to exhibit reversible mitotic lags,
and to continue to multiply for one or even several 
generations, but then to produce a microcolony of sterile 
progeny, use of the mitotic index of the population as 
a measure of irreversible effects on cellular reproduction 
is not reliable. However, the use of the single-cell 
plating procedure makes possible precise determination 
of survival curves for mammalian cells by procedures 
exactly like those which had been used for cells of 
E. coli. In this way it is possible to measure accurately 
the dose required for inactivation of the reproductive 
mechanism of the individual cells constituting a popu- 
lation. The first such curve obtained is shown in Fig. 7, 
and approximates a two-hit relationship, with an initial 
shoulder, followed by a linear exponential drop, which 
is maintained for doses up to at least 1800 r. Of major 
importance is the fact that the mean lethal dose, D₀,
which is obtained from the slope of the linear part of 
the curve, was only 96 r. The determination of this 
curve is completely objective and has been verified by 
this time in a number of laboratories. 
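The two fits quoted for the survival curve, and the argument that outgrowth from a massive inoculum says little about single-cell sensitivity, can be explored with a minimal sketch. It rests on stated assumptions: the constants 96 r and 77 r are taken as read from the text and from the partly illegible Fig. 7 caption, and the function names and the bisection routine are introduced here only for illustration. This is not the author's own computation.

```python
import math

D0_TWO_TARGET = 96.0   # r; mean lethal dose quoted in the text for S3 HeLa
D_TWO_OR_MORE = 77.0   # r; assumed constant of the second fit in the Fig. 7 caption


def two_target_survival(dose, d0=D0_TWO_TARGET):
    """Surviving fraction N/N0 = 1 - (1 - exp(-D/d0))**2."""
    u = math.exp(-dose / d0)
    # u * (2 - u) is algebraically equal to 1 - (1 - u)**2 and avoids
    # cancellation when u is very small (high doses).
    return u * (2.0 - u)


def two_or_more_hits_survival(dose, d=D_TWO_OR_MORE):
    """Surviving fraction N/N0 = exp(-D/d) * (1 + D/d):
    a cell survives if it has received zero or one effective hit."""
    x = dose / d
    return math.exp(-x) * (1.0 + x)


def dose_for_one_expected_survivor(n_cells, survival=two_target_survival,
                                   upper=1.0e5):
    """Bisect for the dose at which n_cells * survival(dose) falls to one cell."""
    low, high = 0.0, upper
    for _ in range(200):
        mid = 0.5 * (low + high)
        if n_cells * survival(mid) > 1.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


for dose in (0, 100, 200, 400, 800):
    print(dose, two_target_survival(dose), two_or_more_hits_survival(dose))

for n_cells in (1e3, 1e6, 1e9):
    print(n_cells, dose_for_one_expected_survivor(n_cells))
```

With these assumed constants, a culture of a million cells still contains on the average one reproductively intact cell at roughly 1.4 × 10³ r, and the figure keeps rising with the logarithm of the inoculum size. The dose that suppresses visible outgrowth therefore reflects the number of cells irradiated as much as the sensitivity of the individual cell.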

Several features of the behavior of such irradiated 
cells pointed to damage to the genome as the primary 
process responsible for reproductive death. The small 
value of the mean lethal dose for the S3 HeLa cell made 
it possible to rule out of consideration single-gene 
mutation as the dominant radiobiological process. The 
hypothesis was advanced that cell death is a conse- 
quence of chromosomal damage resulting from irradi- 
ation, and this proposal appeared to give good quanti- 
tative fit with the behavior of the system. This inter- 
pretation was supported by the following facts. After 
x-irradiation of single cells with approximately 6 mean 
lethal doses, the survivors were allowed to form colonies 
which were then picked and grown into new cell stocks. 
Examination showed at least 4 out of 5 such strains to 
be mutated from the original form, exhibiting morpho- 
logical or nutritional differences, which persisted after 
long-term growth. Moreover, each of these four strains 
was revealed to possess a chromosomal constitution 
grossly altered from that of the original, unirradiated 
strain (Fig. 8). The 2-hit nature of the curve for the 
S3 cell was explained on the basis that reproductive 
death in such cells requires, in the main, at least two 
independent hits, and therefore appears to be a result 
of aberration produced by interaction between two 
separately damaged chromosomes. Other aneuploid 
cells of human origin, like S3, and with chromosome 
numbers far in excess of the normal 46, gave x-ray survival
curves similar to that of S3.²²

The fate of the cells whose ability to multiply 
indefinitely has been destroyed by irradiation is of 
interest. While some may multiply for a few generations, 
all appear to retain the ability to metabolize effectively 
and to carry out complex biosyntheses of macro- 
molecular structures like that of a specific virus, added 
to the system after irradiation. Under proper environmental
conditions, cells of either aneuploid or euploid
constitution may, after irradiation, produce giant forms.
Thus, these cells which have been destroyed reproduc- 
tively by irradiation, have maintained intact a large 
proportion of their other metabolic activities. At least 
small distortions of these functions probably also exist, 
but may require subtle means for their demonstration. 

A pattern of events similar in some ways, but different 
in others, was obtained on irradiation of human cells 
with normal chromosome complements. All of these 
cells originated in normal human tissues and possessed 
the typical, elongated configuration commonly associ- 
ated with euploid human cells.‘ Determination of 
chromosome constitution on several of these revealed 
them to have the normal diploid number. All of them 
behaved identically, and exhibited a mean lethal dose 
of about 50 r, corresponding to an even greater radio- 
sensitivity than the HeLa cell. The hit number of these 
survival curves is less than that of the aneuploid HeLa
cell, and lies somewhere between 1 and 2, a greater
accuracy not yet being available for these cells because
of their great radiation sensitivity. Hence, the conclusion
may be drawn that euploid human cells are even more
radiosensitive than the malignant, aneuploid HeLa.

As a further test of the hypothesis that cell repro- 
ductive death, as a result of x-irradiation, arises directly 
from chromosomal damage, a series of experiments was 
carried out in which euploid human cells were irradi- 
ated, after which the chromosomes were delineated and 
examined cytologically for direct evidence of aberra- 
tions. A preliminary report by Bender,” counting

Fig. 8. Chromosome number distribution of the clonal strain
S3 (top), and of 3 typical subclones isolated from among the
survivors of radiation with 500-700 r (S3R3, S3RA1, and S3RE1).
These chromosome analyses were carried out by Chu and Giles?

TABLE I. Chromosome aberrations at various radiation doses.
Single-hit aberrations are chromosomal defects caused by a single
ionizing event, and include a complete break in one chromatid
only; a break in both chromatids at the same point, presumably
reflecting a break in the chromosome before it has doubled or a
traverse of both chromatids by the same ionizing particle; an
achromatic region in which the continuity of the chromosome is
uninterrupted, but chromatin has disappeared from a particular
area; or the presence of one or more greatly elongated chromosomes
trailing “sticky” streamers. Multihit complexes comprise
chromosomal aberrations involving interaction between two or
more independent hits to 1 or more chromosomes, such as translocated
dicentrics and ring chromosomes.

                       Single-hit aberrations
             No. of    Single     Double                 Presence of one
Dose         mitoses   chromatid  chromatid  Achromatic  or more “sticky”  Multihit
(roentgens)  scored    break      break      regions     chromosomes       aberrations

0            116       22         1          1           1                 1
10           3         0          0          0           0                 0
20-25        33        6          1          0           0                 0
40-50        20        37         4          0           2                 1
75           101       113        23         7           2                 14
150          26        26         5          8           4                 10
Totals       299       204        34         16          9                 26


chromosome breaks induced in euploid human cells 
grown in vitro, suggested that approximately 300 r were 
required to produce an average of one visible chromo- 
some break per cell. This figure is almost six times 
higher than the mean lethal dose for these cells. In an 
attempt to obtain an estimate of the roentgen efficiency 
of chromosome breaks for euploid human cells, valid 
under our conditions of study, experiments were carried 
out in our laboratory in which the conditions of growth 
of the cells were brought as nearly as possible to 
optimal values in order to minimize some of the uncer- 
tainties owing to mitotic lags. In addition, however, 
an independent method of arriving at the chromosome- 
breaking efficiency was employed, based on the fact 
that the appearance of aberrations due to abnormal 
chromosome restitutions, can occur only in appreciable 
numbers at doses beyond the mean lethal dose. Such 
abnormal restitutions will not disappear, as can single 
chromosome breaks which can reseal without leaving 
any visible trace.” 

Figure 9 presents typical sets of mitotic figures 
obtained when cells with normal chromosome consti- 
tution taken from various tissues of normal human 
subjects are irradiated in vitro at different dose levels. 
It is obvious that, with cells irradiated with 50 r, single 
chromosome breaks are readily evident [Fig. 9(b)]. 
When the dose is increased to 75 r, two effects appear: 
the number of single breaks is increased; in addition, 
however, new types of aberrations appear which indi- 
cate that, at this dose, multiple chromosome hits in 
individual cells have become a frequent occurrence. 
Examples of such complex anomalies formed through 
interaction of several radiation-damaged chromosomal 
sites are presented in Figs. 9(c) and 9(d). A summary
of results obtained in a collaborative study with [...] appears in Table I. From these figures,




STUDIES ON MAMMALIAN CELLS IN VITRO 443 


Fig. 9. Typical kinds of chromosome abnormalities observed when euploid human cells are irradiated in vitro with 230-kv x-rays.
(a) Unirradiated cell. (b) Cell irradiated with 50 r. Breaks and deletions are indicated by arrows. (c) Chromosomes of cell irradiated
with 75 r, showing a translocation in the center, and a dicentric. (d) Portion of a mitotic figure of a cell after 150-r irradiation,
demonstrating formation of ring chromosomes.

one can safely conclude that the dose needed to intro-
duce an average of one visible chromosome break per 
cell is no more than 40 to 60 r, a value which agrees 
within the limits of experimental uncertainty with that 
obtained as the mean lethal dose for colony formation 
of these cells. 
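The counts in Table I lend themselves to a simple tally of aberrations per mitosis at each dose. The sketch below is offered only as an aid to reading the table: it transcribes the printed counts, checks them against the printed totals, and forms raw frequencies. It does not attempt to reproduce the reasoning by which the estimate of 40 to 60 r was reached, and the variable and function names are introduced here for illustration.

```python
# Each row: (dose in r, mitoses scored, single chromatid breaks,
#            double chromatid breaks, achromatic regions,
#            mitoses with "sticky" chromosomes, multihit aberrations)
TABLE_I = [
    ("0",     116,  22,  1, 1, 1,  1),
    ("10",      3,   0,  0, 0, 0,  0),
    ("20-25",  33,   6,  1, 0, 0,  0),
    ("40-50",  20,  37,  4, 0, 2,  1),
    ("75",    101, 113, 23, 7, 2, 14),
    ("150",    26,  26,  5, 8, 4, 10),
]
PRINTED_TOTALS = (299, 204, 34, 16, 9, 26)


def column_totals(rows):
    """Sum each numeric column; should reproduce the 'Totals' row."""
    return tuple(sum(row[i] for row in rows) for i in range(1, 7))


def aberrations_per_mitosis(row):
    """Return (single-hit aberrations per mitosis, multihit aberrations per mitosis)."""
    _, mitoses, single, double, achromatic, sticky, multihit = row
    single_hit = single + double + achromatic + sticky
    return single_hit / mitoses, multihit / mitoses


print("totals agree with table:", column_totals(TABLE_I) == PRINTED_TOTALS)
for row in TABLE_I:
    single_rate, multi_rate = aberrations_per_mitosis(row)
    print(row[0], "r:", round(single_rate, 2), "single-hit,",
          round(multi_rate, 2), "multihit per mitosis")
```

The small numbers of mitoses scored at some doses (only 3 at 10 r and 20 at 40-50 r) mean that the individual frequencies carry large sampling uncertainties and should be read accordingly.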

The conclusion may be drawn that the genome of the 
euploid human cell is extraordinarily susceptible to 
chromosome damage by ionizing radiation, and that the 
average dose needed to introduce one chromosome 
break per cell is similar to the mean lethal dose to the 
cell’s reproductive function, i.e., approximately 50 r. 
Since this value is far less than that required for the 
inactivation of various enzymatic activities in mam- 
malian cells,” it appears to establish the fact that the 
cellular genome of man is far more sensitive to radiation 
damage which is, at least in part, presumably irrevers- 
ible, than are other of its structures. While studies are 
currently in progress to determine whether cells in vivo 
will behave in the same way, experiments with radio- 
protective agents added to the medium suggest that 
only relatively small changes in the cellular D₀ value can
be accomplished by changes in the cellular environment. 

These considerations afford an understanding of the 
action of ionizing radiation on mammalian cells in terms 
of a primary effect—the initiation of a chromosome 
break, which requires on the average an exposure of a 
normal human cell to 50 r or less, and a secondary 
effect—the production of complex chromosomal aber- 
rations through abnormal restitution of the fragments 
resulting from multiple breaks occurring within the 
same cell. The first of these involves a variety of 
alternative possibilities: The cell may continue repro- 
duction with no recognizable cytogenetic change, though 
with perhaps a gene mutation at the site of the original 
chromosome break; it may suffer loss of all or part of a 
chromosome which could impair its ability to reproduce; 
or it may undergo formation of a chromosome bridge 
at anaphase which will certainly end its reproductive 
ability. When multiple chromosome fragmentations 
leading to complex translocations occur, the cell is much 
more likely to be incapacitated with respect to repro- 
duction. Different cells should display 1-hit, multiple- 
hit, or intermediate types of survival curves, depending 
on the degree to which these different processes operate 
in causing destruction of the ability to multiply. 

This analysis appears to offer an explanation for the 
difference in radiobiologic response exhibited by various
mammalian cells. For example, normal diploid cells of
different species, differing in their total mass of chromosomal
material, should be expected to exhibit differences
in radiation sensitivity roughly though not exactly
paralleling their chromosomal volumes, since the cell
with greater chromosomal mass will offer greater opportunity
[...]
cells of the chick, whose DNA content is less than
one-quarter that of human cells, possess a mean lethal 
dose for x-rays approximately 4 to 8 times greater than 
that of euploid cells of man. In Table II is presented a 
comparison of the D₀ values and DNA content of
diploid cells from a variety of different species. Despite 
relatively large uncertainties in the individual determi- 
nations of the mean lethal dose and of the DNA 
content, and the large range of values encompassed, it 
is evident that these very different diploid cells exhibit 
a remarkably constant value for the product of these 2 
variables. Other aspects of these relationships are 
considered elsewhere. 
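The constancy of the product can be checked directly from the entries of Table II. The sketch below is a reading aid under stated assumptions: the D₀ ranges, DNA contents, and products are transcribed from the table, the DNA content for the chick is not legible in this copy and is therefore left unset rather than guessed, and the names used are introduced here for illustration.

```python
# D0 ranges in roentgens, DNA content in picograms per cell, and the
# product as printed in Table II. None marks a value that could not be read.
TABLE_II = {
    "Yeast": {"d0_range_r": (8000.0, 8000.0), "dna_pg": 0.05, "printed_product": 400.0},
    "Chick": {"d0_range_r": (300.0, 400.0),   "dna_pg": None, "printed_product": 900.0},
    "Man":   {"d0_range_r": (50.0, 60.0),     "dna_pg": 8.3,  "printed_product": 500.0},
}


def product_range(entry):
    """Range of D0 x DNA content implied by the tabulated D0 range,
    or None when the DNA content is unavailable."""
    if entry["dna_pg"] is None:
        return None
    low, high = entry["d0_range_r"]
    return low * entry["dna_pg"], high * entry["dna_pg"]


def fold_spread(values):
    """Ratio of the largest to the smallest value."""
    values = list(values)
    return max(values) / min(values)


d0_midpoints = [sum(e["d0_range_r"]) / 2.0 for e in TABLE_II.values()]
printed = [e["printed_product"] for e in TABLE_II.values()]

print("fold spread of D0 values:", fold_spread(d0_midpoints))
print("fold spread of printed products:", fold_spread(printed))
for name, entry in TABLE_II.items():
    print(name, "implied product range:", product_range(entry))
```

On these figures the D₀ values differ by more than a hundredfold between yeast and man, while the printed products differ by a factor of 2.25, in line with the statement that the product varies by a factor of about two.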

These considerations afford insight into the effects 
of various chromosome conditions like polyploidy on 
the radiosensitivity of mammalian cells from the same 
species. With increasing numbers of chromosome sets 
within a cell, its sensitivity to killing by a 1-hit process 
should decrease. Thus, a haploid cell will lose its ability 
for indefinite multiplication as a result of damage to 
any single gene essential to replication. Diploid cells 
which have a double gene set presumably will require 
loss or aberration of larger portions of a chromosome 
before the genetic imbalance necessary for inhibition 
of multiplication results. Since such chromosome losses 
readily occur as a result of 1-hit processes following 
exposure to ionizing radiations, such cells will often 
exhibit curves with a hit number close to unity, but 
may in some cases be multiple hit, if the gene distribu- 
tion among the chromosomes is such that death rarely 
follows a single-hit process. The Do value of such curves 
will be influenced by the volume of each chromosome, 
the magnitude of the breaks which are produced, and 
the degree to which 1-hit lethal and 2-hit lethal processes 
occur when cells are exposed to various types of irradi- 
ation under various conditions. Cells with higher degrees 
of ploidy will be much less susceptible to reproductive 
killing by loss of part or all of any chromosome because 
of the smaller imbalance produced when larger multiples 
of the chromosome set are present. In such cells, the 
major lethal process will involve interaction between 


TABLE II. Comparison of radiation sensitivity and the DNA content 
for diploid cells of three different living forms. The data indicate 
that, despite individual variations in D0 values and DNA contents 
of over a hundred-fold, the product of these two variables remains 
reasonably constant, varying only by a factor of about two. 

Diploid      D0 values (a)          DNA content             Product: D0 x DNA content 
cell type                           (picograms per cell)    (roentgens x picograms) 
Yeast        about 8000 r (b)       0.05 (d)                400 
Chick        about 300-400 r (c)    (illegible)             900 
Man          50-60 r (c)            8.3 (d)                 500 

(a) The D0 value is taken as the dose needed to reduce the reproducing 
fraction of the cell population to 37%, in the region where the survival 
curve is linearly exponential. 
(b) See reference 27. 
(c) See reference 25. 
(d) See reference 28. 
two or more chromosomes to produce malformations 
like anaphase bridges. Hence, such cells will usually 
display a hit number in the neighborhood of 2. While 
the larger volume of the chromosome set in such cells 
might at first be considered to make them more vulner- 
able than diploid cells to the accumulation of hits that 
can lead to cell death, this factor may be more than 
counterbalanced by the relatively low probability with 
which two simultaneously broken chromosomes will 
restitute abnormally, so as to form a nondividing 
configuration (like a dicentric) as opposed to a union 
still permitting normal mitosis. Thus, the HeLa S3 
cell, an aneuploid human malignant cell with 78 
chromosomes, exhibits a survival curve which is approx- 
imately 2-hit, as opposed to the more nearly 1-hit 
curve of the euploid cell, and exhibits a Do value of 
96 r instead of 50 r characteristic of the normal diploid 
cell. These parameters—the degree of cell ploidy; the 
distribution among the different chromosomes of regions 
whose loss may lead to cell death through imbalance; 
and the separate probability of formation of chromo- 
somal aberrations which mechanically prevent un- 
limited division by 1-hit and 2-hit processes distributed 
among the various chromosomes—would appear to 
afford explanation of the difference in radiosensitivities 
of different cells. Quantitative evaluation of these pa- 
rameters in selected normal and malignant cell systems 
is now under study. 
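
The contrast between 1-hit and 2-hit curves can be made concrete with a short numerical sketch. It assumes the conventional multi-hit form S = 1 - (1 - exp(-D/Do))^n, where n is the hit number; that functional form is an assumption introduced here for illustration and is not quoted from this article. The Do values and hit numbers are those given above for the euploid cell (50 r, about 1-hit) and the HeLa S3 cell (96 r, about 2-hit); the function name is hypothetical.

```python
import math


def surviving_fraction(dose_r, d0_r, hit_number):
    """Fraction of cells retaining reproductive capacity after dose_r roentgens.

    Assumed model (not taken from the article): S = 1 - (1 - exp(-D/Do))**n.
    For n = 1 this reduces to the simple exponential exp(-D/Do).
    """
    return 1.0 - (1.0 - math.exp(-dose_r / d0_r)) ** hit_number


for dose in (0, 50, 100, 200, 400):
    euploid = surviving_fraction(dose, 50.0, 1)   # Do = 50 r, 1-hit
    hela_s3 = surviving_fraction(dose, 96.0, 2)   # Do = 96 r, 2-hit
    print(f"{dose:4d} r   euploid {euploid:.4f}   HeLa S3 {hela_s3:.4f}")
```

Under this assumed form the 2-hit curve shows an initial shoulder at low doses and then falls with a slope set by Do, whereas the 1-hit curve is exponential from the origin.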

These considerations also can offer an explanation for 
our recent findings that the S3 cell, which is enormously 
more sensitive than bacterial cells to killing by x-rays, 
is much more resistant to ultraviolet irradiation. Ultra- 
violet, while a highly effective lethal and mutagenic 
agent for bacteria, is known to be much less efficient 
than x-rays in producing chromosome breaks. While 
haploid bacterial cells can be killed effectively by point 
mutations and so are readily inactivated by ultraviolet 
light, the need for chromosome breakage to occur before 
the multiploid animal cell is inactivated would render 
it resistant to the action of ultraviolet irradiation. 

Application of these aspects of the action of ionizing 
radiation at the cellular level, to analysis of the under- 
lying mechanism involved in the pathologies arising 
from acute, whole body irradiation of animals leaves 
little doubt that many of the most important features 
of the mammalian radiation syndrome are due to the 
cumulative actions on the individual body cells of the 
same kind as those described in the preceding para- 
graphs: 

(a) Knowledge of the Do value for individual cells 
for the first time provides clear explanation why the 
mean lethal dose for total body irradiation of mammals 
should lie in the range of 400 to 600 r. Exposure of the 
whole body to a dose of this magnitude would inactivate 
the reproductive mechanism of more than 99% of the 
cells with radiosensitivities like those of the cells studied 
in vitro. The body might recover from smaller doses 
which would still leave sufficient intact cells which 
could reproduce at a rate rapid enough to compensate 
for the losses. The mean lethal dose to the whole animal 
represents a point of loss of reproducing cells so exten- 
sive as to provide widespread damage to the integrities 
of physiological function which depends on cell repro- 
duction and which constitutes a stress which cannot 
ordinarily be withstood. 
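
The figure of more than 99% in paragraph (a) can be reproduced with a brief calculation. This sketch assumes a purely exponential (1-hit) survival curve for euploid cells, which the text describes only as approximately true, and uses the Do range of 50 to 60 r quoted for human cells; the function name is illustrative.

```python
import math


def fraction_inactivated(dose_r, d0_r):
    """Fraction of cells losing reproductive capacity, assuming S = exp(-D/Do)."""
    return 1.0 - math.exp(-dose_r / d0_r)


# Least favorable case for the claim: lowest whole-body dose, highest Do.
low_estimate = fraction_inactivated(400.0, 60.0)
# Most favorable case: highest whole-body dose, lowest Do.
high_estimate = fraction_inactivated(600.0, 50.0)

print(f"400 r, Do = 60 r: {100 * low_estimate:.2f}% of cells inactivated")
print(f"600 r, Do = 50 r: {100 * high_estimate:.4f}% of cells inactivated")
```

Even at the lower end of the range the surviving fraction is well under 1%, in keeping with the statement in the text.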

(b) This formulation is in accord with the failure to 
find any specific biochemical lesion resulting from 
animal irradiation in the mean lethal range. Significant 
depressions of individual enzyme functions have been 
found to require much higher doses than that needed 
to kill the whole animal. While no specific enzyme 
function could be implicated, many studies showed 
that DNA synthesis was greatly reduced as a result of 
irradiation in the dose range here considered.29 The 
theory here considered would predict such behavior. 
DNA synthesis can proceed normally only when cell 
reproduction progresses. Destruction of this function 
through production of chromosome aberrations must 
eventually depress DNA synthesis, even though no 
enzyme system of the cells or body fluids has been 
significantly altered by the irradiation. 

(c) Similarly, the failure to find any accumulation of 
toxic material in the body fluids after exposure to 
radiation in the mean lethal range is to be expected. 
While at higher doses significant amounts of toxic 
products might be produced as a result of chemical 
structural alterations resulting from ionizations pro- 
duced in the tissue, doses of several hundred roentgens 
only will alter significantly structures with a size of the 
order of magnitude of the chromosomes, and will 
produce biological changes only if the uninterrupted 
integrity of such structures is vital to normal body 
operations as is indeed the case with the genetic 
apparatus of the individual cells. 

(d) It follows that one would expect those tissues to 
be most radiosensitive which must maintain the highest 
rate of cell mitosis, since these will suffer the most 
immediate loss of their specific functions, on which the 
body depends. This correlation between radiosensitivity 
and mitotic rate has been recognized as one of the 
earliest generalizations to emerge from radiobiologic 
experience. However, while the rapidly dividing tissues 
are indeed most radiosensitive from the point of view 
of their immediate contribution to embarrassment of 
the whole organism through failure to maintain their 
specific functions, it is necessary to regard virtually all 
of the body cells as equally sensitive to chromosomal 
damage. Each normal cell nucleus has an equal proba- 
bility of accumulating similar chromosomal injuries. 
Such aberrations will remain largely latent in each cell 
until it comes to reproduce. Hence, the cells of the 
more rapidly dividing tissues are simply the first to 
make evident the injuries which must be regarded as 
equally numerous in the more slowly dividing 
regions of the body. This fact is of vital importance in 
understanding genetic transmission of radiation dam- 
age, and the ability for tumors to develop many years 
after a radiation experience.30 

(e) Similarly, our picture accounts in straightforward 
fashion for the characteristic lag period between radi- 
ation exposure and the development of symptoms, 
which has long been recognized as typifying mammalian 
injury by ionizing radiation. The cell which has suffered 
damage which is primarily only chromosomal, will 
continue to function effectively for a period, since its 
full complement of enzymes and other structures pre- 
sumably remains largely unaltered. Not until the cell 
reaches the point where it should normally divide will 
it begin to exhibit grossly deviant behavior from the 
normal. Moreover, we have shown that such cells may 
even multiply for several generations, before they and 
their progeny cease reproduction.” 

(f) The sequence of events which ultimately leads to 
death of any acutely irradiated animal may thus be 
explained as follows. Chromosomal aberrations intro- 
duced among all of the cells exposed will first reveal 
themselves in those cells of the most actively mitotic 
tissues. Division will be prevented in most of such cells 
irradiated with several hundred roentgens, with the 
result that the functions supplied by these tissues will 
fail and the body will be threatened. As other more- 
slowly dividing tissues reach the point where they too 
should produce cell proliferation, they will add the 
results of their failure to those which have already 
introduced distortion of normal body physiology. In 
contrast to the actively dividing structures like the 
bone marrow, and the epithelial linings of the gastro- 
intestinal tract, tissues like nerve and muscle in which 
cell division rarely occurs, can continue to exhibit 
normal function even after exposure to many thousands 
of roentgens. It is of interest that cell reproductive 
death has also been identified as the major factor in the 
mammalian radiation syndrome by Quastler and his 
co-workers, using experimental approaches and tech- 
niques completely different from those employed here.31 

In this connection, an alternative hypothesis has 
been proposed that the more-rapidly dividing tissues 
are more radiosensitive only because cells in mitosis 
are in a more sensitive condition and hence are damaged 
more readily by irradiation. This proposal does not 
fit the facts, since even in the most rapidly dividing 
tissues, no more than 3% of the cells are in mitosis at 
any one time. Hence, if this were the true explanation, 
it could account at most for a negligible reduction in 
cell viability of such tissues. However, in early embry- 
onic development, where high mitotic frequencies often 
are achieved and where phased cell reproduction may 
occur to bring many cells into mitosis simultaneously, 
this factor might easily play an important role. 

(g) These considerations also explain quite naturally 
how it is possible to save animals irradiated with doses 
in the mean lethal range, by injection of viable bone- 
marrow cells, though not by cellular fragments or 
extracts. Such cells recolonize the irradiated tissue 
which is being depleted through impaired cell-repro- 
ductive function. While other tissues have presumably 
suffered equal diminution in the percent of cells capable 
of reproduction, these require replenishment less rapidly 
and hence provide more time for the surviving cells to 
recolonize the tissue, provided the bone-marrow func- 
tions can be maintained. It may be expected that 
animals might be saved from lethal effects of even 
higher doses, by additional inoculation of viable cells 
from other tissues which, as the dose increases progres- 
sively, would become critical in their failure to maintain 
functional integrity of the body. 

(h) Similarly, the interesting experiments first carried 
on by Patt and his co-workers,32 in which animals were 
subjected to low temperatures immediately after irradi- 
ation, become explicable. Such animals failed to develop 
any of the symptoms of radiation injury until after 
their temperature was restored, after which the entire 
sequence of pathogenesis was initiated as though the 
irradiation had just occurred at the time their body 
temperatures had been raised back to normal. At the 
low temperature, mitosis is inhibited so that the body 
does not produce the conditions which can bring into 
expression the latent damage to the cellular genetic 
apparatus. On rewarming, normal cell reproduction is 
again initiated, so that each time a genetically damaged 
cell comes to mitosis its injury becomes functional and 
contributes to embarrassment of a normal function. 

(i) One might expect, then, to find some rough corre- 
lation between the Do obtained for euploid cells of 
different animals as measured by the survival curves 
here described, and the mean lethal dose for the whole 
animal. While the latter figure must, of necessity, be 
influenced by many complex interactions of an unpre- 
dictable kind, it is of distinct interest to find that such 
a correlation is at least suggested by the small amount 
of currently available data, as shown in Table III. 

TABLE III. Data demonstrating that possibly some parallelism 
may exist between the mean lethal dose for whole warm-blooded 
organisms (LD50) and the D0 values of their single, euploid cells 
as determined in vitro. 

                     X-ray LD50 for      X-ray D0 of euploid 
                     whole animal        cell in vitro 
Man                  400-500 r           50-60 r 
Chinese hamster      825-1190 r (a)      160 r 
Fowl                 1000 r (b)          300-400 r 

(a) See reference 33. 
(b) See reference 34. 

Studies of the action of radioprotective agents on 
x-ray survival curves of S3 cells are completely con- 
sistent with the interpretations here developed. The 
compound, 2-mercaptoethyl guanidine, which has been 
demonstrated to raise the MLD for mice from 900 to a 
maximum of 1450 r,35,36 also exercises significant radio- 
protection on S3 HeLa cells plated in vitro. The shape 
of the survival curve remains approximately 2-hit as it 
was in the absence of the compound, but Do, the mean 
lethal dose for cell reproduction, is raised from 96 r to 
a maximum of 160 r, a value which agrees far better 
than perhaps could be expected with the degree of 
protection achieved in the whole animal. All of the data 
on the kinetics of the radioprotective action of this 
compound on single plated cells are consistent with the 
interpretation that the presence of this material lowers 
the effective dose of x-rays which reaches the sites 
within the cell whose inactivation results in loss of the 
ability to reproduce indefinitely. 
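
The agreement between protection in the whole animal and in the single cell can be expressed as a ratio of protected to unprotected dose. The sketch below uses only the four figures quoted above (900 and 1450 r for the mouse, 96 and 160 r for the S3 cell); the term "protection factor" and the variable names are introduced here for illustration.

```python
# Mean lethal dose for mice without and with the compound (roentgens).
mouse_unprotected, mouse_protected = 900.0, 1450.0
# Do for S3 HeLa cells without and with the compound (roentgens).
cell_unprotected, cell_protected = 96.0, 160.0

animal_factor = mouse_protected / mouse_unprotected
cell_factor = cell_protected / cell_unprotected
relative_difference = abs(cell_factor - animal_factor) / animal_factor

print(f"whole animal protection factor: {animal_factor:.2f}")
print(f"single cell protection factor:  {cell_factor:.2f}")
print(f"relative difference:            {100 * relative_difference:.1f}%")
```

Both ratios come out near 1.6, which is what one would expect if the compound simply lowers the effective dose reaching the sensitive sites by a fixed proportion.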

(j) Finally, these data support the cellular-genetic 
interpretation of the great effectiveness of relatively 
low doses of radiation in suppressing antibody formation 
in the mammalian body. The dose range needed to 
inhibit antibody production significantly lies in the 
region of 50 to 150 r,37 a value which, through its close 
correspondence with the mean lethal dose for euploid 
cell reproduction, suggests this action to reflect the 
effect of radiation in preventing multiplication of 
antibody-producing cells. By contrast, specific macro- 
molecular biosynthetic mechanisms appear essentially 
undamaged even after irradiation of mammalian cells 
with thousands of roentgens.!® Thus, the great radio- 
sensitivity of antibody production suggests that the 
mechanism of new antibody production involves the 
need for cell multiplication, rather than simply antibody 
synthesis by pre-existing cells. 

The fact that some mammalian cells exhibit survival 
curves for the reproductive function which are multiple- 
hit in character must not be taken as an indication that a 
threshold exists for the induction of mutations by high- 
energy radiations. On the contrary, all of the evidence 
here presented leads to the conclusion that, while a cell 
displaying a 2-hit survival curve does display a threshold 
for killing by high-energy radiation, the production of 
single chromosome breaks, and the attendant alteration 
of genetic material at the site of such a break, goes on 
even when doses insufficient to kill are applied. From 
the point of view of the entire organism and its progeny, 
it may be far better for any cell to be permanently 
inactivated by multiple chromosomal aberrations than 
to suffer only a single chromosomal change, which then 
may be handed on to the offspring, usually as a recessive 
genetic defect. 

The data here presented make possible an estimation 
as to whether background radiation may contribute to 
the processes of aging in man. Since 50 r is the average 
dose needed to produce 1 visible chromosome break 
per cell by the techniques here employed, one may well 
expect that damage on a smaller scale which cannot be 
seen by ordinary microscopy results from even smaller 
doses. Thus, one can calculate that, with a background 
exposure of 0.1 r per year, in 70 years the accumulated 
dose of 7 r would produce visible chromosomal 
damage in one-seventh of all of the body cells, and 
probably more subtle effects in a much larger number. 
This would appear not to be a negligible process, and 
thus makes it probable that the gradual attrition of 
body functions which constitute aging, may be, to a 
very significant degree, the reflection of accumulation of 
cellular genetic injuries in cells and their descendants 
originating from background irradiation. By extension 
of the methodologies here discussed, it is proposed to 
attempt analysis of physiologic and genetic differences 
in the behavior of cells taken from aging animals, and 
to examine as well the effects of the molecular constit- 
uents of body fluids from such subjects on physiologic 
and genetic behavior of standard cell strains in vitro. 
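
The estimate of one-seventh follows from simple proportion. The sketch below assumes, as the argument in the text implies, that the mean number of visible breaks per cell scales linearly with accumulated dose down to background levels; that linearity is an assumption of the illustration, and the variable names are hypothetical.

```python
background_r_per_year = 0.1     # background exposure quoted in the text
lifetime_years = 70
dose_per_visible_break_r = 50.0  # average dose for 1 visible break per cell

accumulated_dose_r = background_r_per_year * lifetime_years
breaks_per_cell = accumulated_dose_r / dose_per_visible_break_r

print(f"accumulated dose: {accumulated_dose_r:.1f} r")
print(f"mean visible breaks per cell: {breaks_per_cell:.2f}")
```

A mean of about 0.14 visible breaks per cell corresponds to roughly one cell in seven carrying such damage.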

Discussion of the use of quantitative methods in 
measuring mammalian cell growth and genetics would 
be incomplete without at least mention of the out- 
standing development by Dulbecco and his associates38 
in achieving a plaque technique for precise enumeration 
of single particles of mammalian viruses exactly as has 
been current in bacteriophage studies. Thus, it becomes 
possible also to quantitate virus and cell interaction in 
animal systems, so that one may expect equally pro- 
found insights to arise from such studies as has been 
the case in the interaction of bacteriophages with their 
bacterial hosts. 

The thesis which has been the purpose of this 
discussion is that newer developments promise to 
provide well-defined systems for quantitative explora- 
tion of physical and physicochemical mechanisms 
involved in growth, genetics, and differentiation in 
mammalian-cell systems. There is every indication that 
in the space of a few years it will be possible to come to 
grips with specific molecular aspects of these basic 
mechanisms in as intimate a fashion as is now current 
in the biophysical approach to microbial systems. 


BIBLIOGRAPHY 


1 T. T. Puck, P. I. Marcus, and S. J. Cieciura, J. Exptl. Med. 
103, 273 (1956). 

2 T. T. Puck and H. W. Fisher, J. Exptl. Med. 104, 427 (1956). 

3 T. T. Puck, J. Cellular Comp. Physiol. (to be published). 

4 H. Eagle, J. Biol. Chem. 214, 839 (1955). 

5 V. J. Evans, J. C. Bryant, W. T. McQuilkin, M. C. Fiora- 
monti, K. K. Sanford, B. B. Westfall, and W. R. Earle, Cancer 
Research 16, 87 (1956). 

6 J. F. Morgan, H. J. Morton, and R. C. Parker, Proc. Soc. 
Exptl. Biol. Med. 73, 1 (1950). 

7 H. W. Fisher, T. T. Puck, and G. Sato, Proc. Natl. Acad. Sci. 
U. S. 44, 4 (1958). 

8 I. Lieberman and P. Ove, J. Biol. Chem. 233, 637 (1958). 

9 F. C. Parker, L. N. Castor, and E. A. McCulloch, “Cellular 
biology, nucleic acids and viruses,” special publication of New 
York Acad. Sci. 5, 305 (1957). 

10 A. E. Moore, C. M. Southam, and S. Sternberg, Science 124, 
127 (1956). 

11 A. A. Axelrad and E. A. McCulloch, Stain Technol. 33, 67 
(1958). 

12 K. H. Rothfels and L. Siminovitch, Stain Technol. 33, 73 
(1958). 

13 J. H. Tjio and T. T. Puck, J. Exptl. Med. 108, 259 (1958). 




14 T. T. Puck, S. J. Cieciura, and A. Robinson, J. Exptl. Med. 
(1958). 

15 ... and T. T. Puck, Proc. Natl. Acad. Sci. U. S. 
(to be published). 

16 J. H. Tjio and A. Levan, Hereditas 42, 1 (1956). 

17 G. O. Gey, W. D. Coffman, and M. T. Kubicek, Cancer 
Research 12, 264 (1952). 

18 H. J. Muller in Radiation Biology, A. Hollaender, editor 
(McGraw-Hill Book Company, Inc., New York, 1954), p. 351. 

19 T. T. Puck and P. I. Marcus, J. Exptl. Med. 103, 653 (1956). 

20 C. M. Pomerat, Ann. New York Acad. Sci. 71, 1143 (1956). 

21 ...galls, H. R. Morrison, and L. I. Robbins, New Engl. 
..., 252 (1958). 

22 T. T. Puck, D. Morkovin, P. I. Marcus, and S. J. Cieciura, 
J. Exptl. Med. 106, 485 (1957). 

23 ... Chu and N. M. Giles, J. Natl. Cancer Inst. 20, ... 

24 ... Bender, Science 126, 974 (1957). 



25 T. T. Puck, Proc. Natl. Acad. Sci. U. S. 44, 772 (1958). 

26 Z. M. Bacq and P. Alexander, Fundamentals of Radiobiology 
(Butterworths Scientific Publications, London, 1955), pp. 228-262. 

27 C. A. Tobias, Natl. Acad. Sci.-Natl. Research Council, Publ. 
No. 18, 46 (1956). 

28 I. Leslie in The Nucleic Acids, E. Chargaff and J. N. Davidson, 
editors (Academic Press, Inc., New York, 1955), Vol. II, p. 1. 

29 Z. M. Bacq and P. Alexander, see reference 27, pp. 257-258. 

30 C. L. Simpson and L. H. Hempelmann, Cancer 10, 42 (1957). 

31 H. Quastler, Radiation Research 4, 303 (1956). 

32 H. M. Patt and M. N. Swift, Am. J. Physiol. 155, 388 (1948). 

33 G. Yerganian, Federation Proc. 14, 1371 (1955). 

34 Z. M. Bacq and P. Alexander, see reference 27, p. 220. 

35 D. G. Doherty (personal communication). 

36 R. Shapira, D. G. Doherty, and W. T. Burnet, Radiation 
Research 7, 22 (1957). 

37 H. T. Kohn, J. Immunol. 66, 525 (1951). 

38 R. Dulbecco and M. Vogt, J. Exptl. Med. 99, 167 (1954). 


REVIEWS OF MODERN PHYSICS    VOLUME 31, NUMBER 2    APRIL, 1959 


48 
Interactions between Cells 


PAUL WEISS 
The Rockefeller Institute, New York 21, New York 


INTERACTIONS among cells are the means by 
which the cell community of the organism estab- 
lishes and maintains its organizational harmony. They 
are so numerous and varied that it would take more 
space than is allotted here just to draw up a reasonably 
comprehensive list. Yet, knowledge about them is still 
very scanty. Fascinating progress has been made in the 
study of some of the biochemical and biophysical com- 
ponents of the living system. But knowledge seems to 
have grown the faster, the smaller the sample of the 
living system that was taken under investigation. The 
major advances were made down at the molecular level. 
The task of dealing with the larger cellular level, there- 
fore, involves a re-entry into areas of uncertainty, and 
the course from the introductory article on cell dy- 
namics (Weiss, p. 11) to the present article marks a 
full circle from relative ignorance through knowledge 
back to relative ignorance. The study of cells in inter- 
action deflates complacent notions that all the major 
facets of cell life are truly understood even in principle. 

“Interactions between cells” covers practically every- 
thing that is going on in organisms, and obviously this 
account must confine itself to a few crucial examples. 
There are essentially two ways for cells to interact: 
either a cell elaborates a diffusible substance which 
affects another cell at some distance, or a cell transmits 
an effect to another cell by direct contact. Since media- 
tion by diffusible agents, as in the hormone system, is 
more widely studied and better understood than are the 
contact interactions, the emphasis here is on the latter. 
A more detailed account can be found in the author’s 
recent review article on “Cell Contact.” 

First, a few words on what is meant by “contact.” 
It has been very easy to define contact between cells in 
terms of observations with the ordinary light micro- 
scope. If microscopically two cell borders came so close 
as to merge into a single line, there was contact, whereas 
a microscopic space in between signified lack of contact. 
Conversely, any interruption of the microscopic outline 
between two cells was interpreted as “protoplasmic 
continuity.” But evidently, such questions of “con- 
tiguity” vs “continuity” between cells are merely ques- 
tions of the resolving power of the instruments at hand. 
It is not surprising, therefore, that the introduction of 
electron microscopy has brought a marked change of 
outlook. 

In the first place, electron microscopy has revealed a 
greater degree of organization at cell surfaces than pre- 
viously assumed. As an example, one can cite the larval 
amphibian skin referred to in the introductory article 
which rest on the basement membrane, is dotted at regu- 
lar intervals with submicroscopic bodies, about 1200 Å 
high, consisting each of two electron-dense round plates 
connected by a lighter neck.? The outer one of the 
plates forms part of the cell surface. This surface itself 
is separated from the underlying basement membrane 
by a gasket-like granulated film of a few hundred 
Ångströms thickness. The cell surface thus is actually a 
mosaic of patches of supramolecular order and different 
chemical and physical properties, which have been 
partly identified. This was done by applying various 
enzymes to fragments of live skin before they were 
fixed in osmium tetroxide and prepared for electron 
microscopy. In pancreatic lipase, the dark plates lose 
their osmiophilia, indicating that normally they have a 
high lipid content. In distilled water, the neck of these 
bodies swells and breaks, leaving unconnected double 
plates. Thus, the neck region is hydrophilic. Salivary 
amylase dissolves the film through which the epidermal 
cell is attached to the basement lamella, suggesting 
abundance of carbohydrate. It also erodes the parts of 
the cell surface between the dense bodies, but leaves the 
latter intact and protruding, thus identifying them as 
solid bodies. 

Such observations demonstrate that it would be 
quite mistaken to consider a cell as being equal and 
uniform over its whole surface. Characteristically, this 
surface pattern is confined sharply to the portion of the 
cell that is in immediate apposition to the basement 
membrane, thus indicating a direct contact interaction 
between cell and substratum. This supposition is con- 
firmed by experiments in which the contact between the 
two structures was first broken and then restored, as fol- 
lows. When part of the skin is injured, the epidermal cells 
near the wound roll off. The dense bodies, which seem 
to serve as suckers to attach the cells to their substra- 
tum, become detached and are resorbed within a few 
days. They are re-formed during wound healing as the 
epidermal cells migrate over the defect, but are pre- 
cisely confined to that fraction of cell surface now in 
fresh contact with the substratum. In other words, the 
cell has a mechanism for producing this specialized 
apparatus which responds to the interaction between 
cell surface and the underlying carbohydrate-rich film. 
As a result, an “induction” on the submicroscopic level 
occurs. The cell is induced along a geometrically and 
physically defined surface to display a specific fraction 
of its synthetic repertory. Other instances of contact 
induction are given later in this paper. 

When two cells are in contact, the problem becomes 
more complicated. At the contact points between two 
epidermal cells, single dark plates are present, the well- 
known nodes of Bizzozero. Electronmicroscopically, 
they show as a single dark plate in each of the contact- 
ing cells with a lighter space in between.’ Odland’ has 
recently scanned this region densitometrically in su- 
perior electron micrographs and has found an addi- 
tional dark line in the center of the light interspace. 
This would indicate that the molecular array separating 
the two cell surfaces is not random, but has some degree 
of spatial order. It may be that the cell surfaces are 
linked by macromolecules, some 100 Å in length and 
oriented normal to the surfaces, which is the order of width 
of the light space. Gaps of similar dimensions have been found 
between practically all cells that make intimate con- 
nections with each other, including synapses between 
nerve cells. It would seem best, therefore, to define two 
cells as being in contact, operationally speaking, when 
they are separated by a space whose contents are not 
subject to random perturbation. Depending on the 
dimensions, several factors might operate to limit per- 
turbation of the interspace. For example, macromolec- 
ular compounds moving out of the cells may form 
bridges between organized parts and establish special 
submicroscopic attachments or interaction points be- 
tween the adjoining cell surfaces. Evidence from tissue 
culture, obtained over a decade ago,‘ indicates that 
certain cells, and perhaps all cells, when left to their 
own devices, surround themselves with characteristic 
colloidal exudates, which must be taken into account 
when cell-to-cell relations are analyzed. Such surface 
coronas, extending for unknown distances into the 
cellular environment, might be major factors in immune 
reactions, cellular aggregations into colonies, phagocy- 
tosis of specific substances, selective associations among 
nerve fibers, and the like. It is equally possible that cell 
surfaces react with each other specifically without such 
mediators. A few examples are cited in the following. 
In experiments on the development of nerve fibers, it 
has been found that fibers of the same type have a 
selective affinity to one another—motor fibers applying 
themselves to motor fibers, and sensory fibers to sensory 
fibers, and within each class, each subgroup to its 
corresponding type. Only by virtue of such specific 
grouping is it possible for a surgeon, for instance, to 
expose the spinal cord and sever discriminately a bundle 
of pain fibers or of fibers for deep perception. Such 
fibers would not run in common bundles unless they had 
grown out that way, and since they do not develop all 
at the same time, the older fibers must have served as 
guides to latecomers of the same character. 

The same principle applies, even more subtly, to the 
relations among the cells within the central nervous 
system on which the coordination of neural functions 
depends. Our current concern with the nervous system 
as a system centers mostly around such matters as 
geometric properties, electric parameters and time 
distribution of interconnections within the network, 
... of branches, temporal 
characteristics of the individual units, synaptic resist- 
ances, etc. We generally ignore the fact that there is 
great biochemical diversity within the system far be- 
yond the gross distinctions of cholinergic and adrenergic 
portions, which makes it work as something more than 
a monotonic network of conducting fibers. Developing 
nerve cells have highly selective discriminatory affini- 
ties by which they “recognize” each other and their 
surroundings. This introduction of the anthropomor- 
phic term of “recognition” is not anything one need 
apologize for so long as physicists speak of electrons 
“seeing” each other. Recognition of one cell by another 
may be based on conforming charge distributions, on 
conforming molecular groupings, or on as yet wholly 
unsuspected mechanisms. The problem of explaining 
such recognition is of course encountered in many other 
biological phenomena, e.g., in enzyme-substrate rela- 
tions and in antibody-antigen reactions. With a few 
notable exceptions,® affinity reactions among somatic 
cells, however, have received little attention and even 
less methodical study, even though there is ample evi- 
dence for the widespread occurrence of such selective 
behavior. 

To cite a specific example, cells can “locate” their 
proper destinations in the body even if they are de- 
prived of their customary routes for getting there. This 
has been shown’ by letting embryonic cells of known 
destination be distributed at random through the blood 
stream of a host embryo. In order to be able to identify 
the injected cells, one chooses cells which carry a 
marker. We chose the precursor cells of pigment cells 
which, when introduced into nonpigmented breeds of 
chicks, would reveal their origin by synthesizing black 
melanin. It was found that after an injection of such 
cells into the blood stream, about 8% of the host chicks 
later carried scattered patches of black pigment cells in 
their feathers and skin. These pigment cells were always 
situated in the precise positions where such cells would 
have resided in the donor animals, and in no case was a 
pigment cell ever observed to have become lodged 
where pigment cells do not normally belong. One must 
conclude, therefore, that donor propigment cells had 
colonized only specified areas in the host embryos where 
such cells normally end up, which, in view of the random 
dissemination of the dissociated cells in the host body, 
implies the existence of a mechanism for highly selective 
localization. As few as one or two propigment cells, 
when arriving by chance at an appropriate site, “recog- 
nize” it and settle down to proliferate and differentiate 
into pigment cells. Cells that miss encountering within 
the embryo an environment favorable to their type fail 
to become lodged and to differentiate, and presumably 
are resorbed. 

On the other hand, outside the embryo, on the yolk 
sac, where specific sites for embryonic cells are lacking, 
clumps of injected cells get stuck, and their fate led to 
a further significant extension of these observations. It 
was discovered? that, where random mixtures of 
embryonic cells from a variety of sources (nerve, muscle, 
cartilage, glands, etc.) had become lodged on the yolk 
sac, they gave rise not to indiscriminately mixed struc- 
tures, as might have been expected, but to quite 
harmoniously organized organ complexes. Some degree 
of self-sorting according to types must have taken 
place among these scrambled cells after they had be- 
come trapped. 

Further conclusive evidence of “self-sorting” was 
found in tissue cultures in which random mixtures 
of suspensions of cells of different embryonic origins 
—for instance, cartilage and kidney cells—had been 
combined.’ These cultures developed compound struc- 
tures in which islands of pure cartilage were sharply 
demarcated from blocks of pure kidney tissue, indi- 
cating that cells had become re-associated like-to-like. 
Thus, even in vitro cells, which have acquired in 
their prior embryonic history some biochemical differ- 
ential related to their subsequent development as either 
kidney cells or cartilage cells, can assort themselves 
according to type. Since the cells make contact only 
with their surfaces, one must assume that some property 
associated with differentiation into kidney cells or into 
cartilage cells has imparted type-specific markers to 
their surfaces for mutual recognition. But how such 
“recognition” leads to active sorting remains obscure. 

How do mixed cell populations achieve this orderly 
reassortment? Do like cells “attract” each other? Or 
do they “recognize” each other only after chance en- 
counters? A clue came first from some observations on 
selective wound healing after the grafting of various 
kinds of epithelia into skin gaps in the flanks of am- 
phibian embryos.” The first coverage of a wound is 
effected by the migration of epithelial cells of the wound 
edge over the raw surface. When these advancing cells 
reach a graft, fusion of the two fronts occurs only if the 
graft contains an epithelium of a type which normally 
borders on skin epidermis. In that event, the advancing 
cells stop in their tracks, so to speak, merge into a uni- 
form sheet with the graft, and no further proliferation 
occurs. Implants of skin, or of cornea or oral lining— 
which are normally continuous with skin—are accepted 
in this manner. But, when a fragment of lung or gall 
bladder or esophagus is implanted in a skin wound, the 
migrating skin epidermis is not halted on contact with 
those foreign-type cells. The epidermal edges glide over 
or under the graft and continue until they meet with 
the skin advancing from the opposite side. 

Results of this kind have long been known in surgery 
and lead to the conclusion that, whether we have a 
plausible explanation for the phenomenon or not, cells 
of identical or conforming constitution (as epidermis, 
cornea, and oral lining) will recognize each other as 
akin and stay put; whereas cells of unlike character 
develop a reaction which will cause them to separate. 

With these facts as background, we proceeded to 
study the contact reactions between different kinds of 
cells directly in vitro,’ by taking phase-contrast 
time-lapse motion pictures of encounters among cultured 
cells.* If the cells that make contact are all of the same 
kind—all liver cells, for example—they aggregate and 
stay together. The free borders of roaming cells are in 
constant motion; but wherever two like cells touch each 
other, that portion of their surfaces becomes quiescent. 
The bond between them, however, is not static; there 
is no cement which sticks them together as in an im- 
munological precipitin reaction. On the contrary, the 
cells continually glide over one another, constantly 
changing their positions relative to each other and, by 
the time others have joined them, their positions in the 
group. 

On the other hand, if cells of two different types are 
cultured together (e.g., lung and liver or lung and 
kidney), the result is quite different. As the loose cells 
stray about, they collide at random. There is not the 
slightest indication that they would react to each 
other’s presence unless or until their free borders touch. 
Neither do like cells “attract” each other, nor do unlike 
cells avoid each other. Mutual recognition and conse- 
quent discriminatory behavior are decidedly contact 
responses. All cells, whether alike or not, make primary 
contact indiscriminately. But, whereas like pairs there- 
after draw closer together and remain joined, unlike 
pairs separate secondarily by reciprocal withdrawal of 
those parts of their borders that had been in fleeting 
touch. While the mechanism of identification undoubt- 
edly resides in the surface, the interior of the cell seems 
to be engaged in the withdrawal reaction, giving the 
phenomenon the general appearance of a “reflex” re- 
sponse with a “sensory” and a “motor” component. 
The “reflex time” is of the order of 10° sec, varying both 
with the strangeness of the confronted cells and with 
their respective rates of motility. In conclusion, self- 
sorting of scrambled cell types results from chance 
collisions with matching combinations holding on to 
each other, whereas nonmatching ones do not last. 
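
The sorting rule just stated can be illustrated with a minimal sketch. It is not the method used in the experiments, only a toy model under stated assumptions: cells are reduced to type labels, a "collision" is a random pairing of two clusters, like clusters always remain joined, and unlike clusters always withdraw. The function name `self_sort` and its parameters are hypothetical.

```python
import random


def self_sort(cell_types, steps, seed=0):
    """Toy model of self-sorting by chance collisions.

    Assumption: each cluster is [type, member_count]; two clusters
    chosen at random "collide", merge if their types match, and
    separate unchanged if they do not.
    """
    rng = random.Random(seed)
    # Every cell starts alone, as in a dissociated suspension.
    clusters = [[t, 1] for t in cell_types]
    for _ in range(steps):
        if len(clusters) < 2:
            break
        i, j = rng.sample(range(len(clusters)), 2)  # chance encounter
        if clusters[i][0] == clusters[j][0]:
            # Like pair: the two clusters stay together.
            clusters[i][1] += clusters[j][1]
            del clusters[j]
        # Unlike pair: both withdraw, so nothing changes.
    return clusters


if __name__ == "__main__":
    mixed = ["cartilage"] * 10 + ["kidney"] * 10
    print(self_sort(mixed, steps=2000))
```

Because merging happens only between like clusters, every cluster stays pure by construction; the sketch shows that indiscriminate primary contact plus a type-specific contact response is enough to sort a mixture, with no attraction at a distance.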

“Matching” combinations involve either cells of the 
same type or of complementary types, the latter being 
types that normally cooperate in the building and 
functional operation of complex organs. For instance, an 
active mutual adhesion between nerve processes and 
enveloping sheath cells has been demonstrated," and 
our motion pictures reveal a similar marked tendency 
of macrophages to confine their excursions to within 
the borders of the large flattened lung cells with which 
they have been jointly explanted. 

The discriminatory response is strictly cell-type spe- 
cific, but it is not species specific. That is to say, carti- 
lage cells of the chick will reject association with liver 
or kidney cells of the same chick, but they will combine 
readily with cartilage cells of a mouse. This fact, 
derived earlier from reaggregation experiments,” has now 
been proven visually by the motion pictures. On the 
other hand, the same cell type may undergo progressive 
alterations with age, so as to make more mature and 
less mature cells of the same strain less acceptable to 
each other than cells of the same age would be. 
Furthermore, there are signs that the differentials among cell 
types, which underlie their discriminatory responses to 
each other, may be subject to gradations in accordance 
with their ontogenetic relationships. 

* These moving pictures, made in collaboration with A. C. 
Taylor and A. Bock, were shown at the Study Program in 
Biophysical Science in conjunction with the lecture on which this 
article is based. 

The details of this principle still remain to be worked 
out. But what has emerged thus far is enough to force 
a substantial revision of former excessively static pic- 
tures of the organism as a fixed framework of parts 
shifted into predestined places by a rigidly prescribed 
system of tracks and schedules and then immured by 
fibrous cements. Although, grossly, this picture still 
holds, it gains much greater flexibility in detail from the 
realization that the eventual pattern of combination, 
association, and segregation among cell types in typi- 
cally ordered arrays is not just passively arrived at, 
but is actively insured and guarded—and restored after 
disturbance—by a system of subtle mutual conform- 
ances and nonconformances with which the various cell 
types have been endowed as a corollary of their onto- 
genetic differentiation. 

The principle of the self-linking of cells into matched 
groups by contact affinities and disaffinities has an im- 
portant bearing on the explanation of the establishment 
and maintenance of order in the networks of the nervous 
system. The building up of central and peripheral nerve 
cables by the selective attachment of nerve fibers of a 
given type to others of like specificity has already been 
referred to in the foregoing. The same principle holds 
for interneuronal relations, as well as for the relations 
between neurons and their non-neural receptor and 
effector organs, on an even subtler scale. Since this is 
one of the most compelling, yet at the same time least 
recognized, instances of specificity and selectivity in 
cell-to-cell interactions, it is well to recapitulate briefly 
the relevant experiments, which I started almost forty 
years ago (see a recent review) and which have more 
recently been amplified by my student, Sperry.” 

During development, countless connections are 
established among individual neurons and between 
neurons and peripheral sense organs and muscles. The 
latitude inherent in the primary developmental mecha- 
nisms of neurogenesis’ is so great that it would rule out 
any such microprecision in the details of the neuronal 
circuitry as is usually postulated as basis for coordinated 
functions. On the other hand, there is equally conclu- 
sive evidence to show that the nervous system does 
emerge from its embryonic phase with a large list of 
ready-made coordinated performances, which, since 
learning by trial and error could have played no part in 
their formation, must be explicable in terms of developmental 
interactions. The dilemma was solved by the 
demonstration of secondary processes which adjust the 
details within the gross primary system of connections 
to the unique arrangements of each individual specimen. 
Although nerves from the same central sources 
may in different individuals end up in different muscles, 
the muscles themselves then send a specifying influence 
back into the centers with the information on just what 
muscle of what name lies at the end of what line. 

The test experiment consisted of transplanting a 
supernumerary muscle (or a whole set of limb muscles) 
near a normal limb and hitching it to a random branch 
of the local nerve supply. This amounted to “tapping” 
the communication network by inserting an extra re- 
ceiver. What was observed then was that the trans- 
planted muscle responded in the course of normal ac- 
tivities of the animal always at the precise time and 
with the same strength as the muscle bearing the same 
name in the normal limb. Thus, each muscle is called 
up, as it were, by the central system by its proper name, 
and if there are several muscles of the same name 
present in a common innervation district, they all re- 
spond simultaneously when the particular code is called. 
Since the test muscle was inserted arbitrarily, it is clear 
that it must have established its name-specific corre- 
spondences in the nerve centers secondarily. The same 
correspondence between periphery and centers was also 
confirmed for the sensory part of the system, first for 
proprioceptive! and tactile!® sensations and later for 
the visual field." 

It all boils down to the following. Even though it is 
impossible to distinguish different skeletal muscles by 
present biochemical methods, the nervous system can 
tell them apart very well. Whatever the telling differ- 
ence between muscles, it must be something that can 
make itself known in a retrograde manner over the 
motor nerve fiber in such a way that the central appara- 
tus thereby gains exact information about its peripheral 
terminations. Motor fibers are initially blank and un- 
specified. They acquire specific identity only after they 
penetrate identified muscles at the periphery, and the 
same holds for sensory neurons in regard to their end- 
ings. Somehow, the acquired detailed specificity is then 
transmitted still further back from the primary to 
secondary neurons, and so on up the lines. 

By this device, the neuronal population becomes 
progressively specialized into subunits whose member 
cells can henceforth regulate their own mutual relations 
by virtue of their acquired distinctive properties. 
Neurons of corresponding or complementary specifici- 
ties thus become joined into cooperative systems. The 
proposition that this implies only linkage of a static 
morphological order all the way through the nerve 
centers does not seem to go far enough if one is to 
explain either the aforementioned experimental results 
or the general problems of neural coordination.“ But, 
if one adds the assumptions: that synaptic contact be- 
tween neurons is merely an enabling, but not a decisive, 
condition for impulse transmission; further, that actual 
transmission requires specifically matched states of two 
apposed surfaces; and, finally, that the specificities 
underlying this conformance are subject to modification 
by both peripheral and central influences; one comes 
closer to a satisfactory concept. The nervous system 
would then emerge as a vast system of resonance cir- 
cuits, the elements of which would be linked by con- 
forming molecular surface configurations, partly per- 
manently, partly variably in response to changing 
central and peripheral influences. 

From these remarks, it can readily be seen that 
compatibilities and incompatibilities along the bound- 
aries between neurons are as significant as discrimina- 
tory devices as are the signs of surface “recognition” 
described earlier in this article for other cell types. 
Indeed, the direct demonstration of cell discrimination 
in the motion pictures should help to make the corre- 
sponding property of neurons acceptable, if not more 
palatable, to those who had hoped to be able to 
maintain their oversimplified faith in the essential identity 
of all neurons so long as they were faced only with the 
more indirect earlier evidence. In reopening the whole 
question of specificity in the nervous system, which of 
late has been lying dormant, our observations on cell 
encounters in tissue culture thus assume added signifi- 
cance, as they also open the way to a more direct 
practical study of the nature of the specificities 
concerned. 

The reference to the progressive diversification within 
nerve-cell populations points logically to the more ele- 
mentary question of how any primitive cell types, 
including neurons, acquire their distinguishing charac- 
teristics in the first place. To discuss the problem of 
divergent differentiation would go far beyond the scope 
of this article. There is one aspect, however, that fits 
into the context. Differentiation occurs essentially by 
dichotomies in the courses of transformation of cells 
with basically identical physical and chemical endow- 
ments.'7 Any cell strain is, by virtue of such endow- 
ments, initially capable of effecting a limited variety of 
qualitatively different reactions. Every step of differ- 
entiation implies that from that multiple repertory one 
definite course has become activated in some members 
of the group, and an alternative course in the other 
members, the two courses being mutually exclusive. 
Sooner or later, the different reactions lead to manifest 
diversity. Development is composed of long series of 
such steps. Present evidence is that each step has its 
own mode of triggering mechanism, and that, contrary 
to earlier illusions, there is no single master agent that 
could be held generally accountable for the various and 
successive dichotomous changes. Formally, however, 
one can distinguish two classes—one in which the 
differentiating activity plays entirely within the bounds 
of a group of similar cells, referable perhaps to differ- 
ential interactions among the members of the group 
depending on their relative positions; and another in 
which the reaction of a given cell group is decisively 
influenced by an extraneous cell group. 

This second class is usually termed “induction”; 
again, different steps of “inductive” influences need 
have little in common but the name. Some of them 
operate over distances, others only between adjacent 
tissues. Evidently, these latter raise the issue of whether 
true contact interactions are involved. The answer 
seems to be that in some instances and for certain steps 
of differentiation, intimate contact between the inter- 
acting cells is essential, whereas, in other cases, such 
direct contact is dispensable, the interaction being 
mediated by diffusible substances. 

Some cases in which direct contact is necessary may 
be cited as examples. When devitalized (frozen-dried 
and rehydrated) cartilage is transplanted into the 
vicinity of a limb bone in amphibian larvae, the implant 
can induce the formation of new cartilage in contact 
with its surface.1 Thus, competent living host cells 
build on cartilage to the dead cartilage model as bees 
would build on new honeycomb to an artificial wax 
honeycomb presented to them. This is a particularly 
interesting case, because the bodies of the cells in the 
implant cannot have been the inducing agents directly, 
as they are enclosed and insulated in the cartilaginous 
matrix; therefore, the inductive stimulus to surrounding 
host cells must have originated in the exposed surface 
of that matrix. Another set of experiments showed the 
same thing for bone.! A frozen-dried and rehydrated 
metatarsal bone of a rat, inserted into the leg of a host 
rat, induced there new bone, including new bone mar- 
row, in contact with the old bone. That contact is 
essential has been confirmed from other sources.!® Such 
observations support the conclusion that extracellular 
materials may play important roles in certain inductive 
interactions.” 

The next case is of special interest to students of 
collagen. The stroma of the cornea consists of layers of 
collagen in a characteristic regular arrangement. When 
frozen-dried cornea was transplanted into a corneal 
defect in rabbits, the grafted stroma as such persisted 
for many months. In addition, however, it induced 
along its inner side the formation of several new layers 
of typical stroma. Thus, the architecture of the dead 
stroma matrix has in some fashion been transmitted to 
the induced layers, so that they assumed the charac- 
teristic pattern of collagenous corneal lamellae. This 
result again proves that not only cells but cellular 
products likewise can influence other cells inductively. 

The intimacy of submicroscopic cell contact, in the 
sense of the introductory remarks, which may be re- 
quired for “contact inductions” has not yet been clari- 
fied. One of the few clues is the fact that, during the 
period of inductive interaction between eye cup and 
prospective lens epithelium, the attachment between the 
interacting cell layers is so firm as to resist mechanical 
separation.’ But, it is still undecided whether the spe- 
cific transmission across the cell borders is simply an 
orienting or sorting influence on pre-existing molecular 
populations” or whether it involves the actual passage 
of substance,” perhaps prepared by molecular 
reorientation.! There is a vast field here for exact studies on the 
relations between physical configuration and chemical 
activities. These relations are bound to remain in ob- 
scurity so long as one focuses attention on those cell-to- 
cell influences that are mediated not by contact, but by 
remote control through diffusible agents. The latter, 
although equally real, can tell only part of the story of 
cellular interaction. 

In summing up, to state that all of the specific interac- 
tions among cells exemplified in this article are based 
on specific properties of the interacting cells is a truism. 
But, it explains why so little is known about the molec- 
ular phenomena governing cell-to-cell relations. There 
just has not been enough preoccupation with the molec- 
ular basis of cellular specificity. Unquestionably, cell 
“recognition” and selective response must eventually be 
reduced to terms of properties of molecules and molec- 
ular populations. But, so far, there has been very little 
tangible progress in that direction. There have been a 
few hypothetical suggestions to explain cell-to-cell con- 
formances, none of them quite tenable in the light of 
recent observations. I had proposed*4)”> that some sort 
of steric fitting (e.g., by corresponding charge distribu- 
tions) between complementary molecules might produce 
bonds or linkages between like cells, whereas unlike 
cells without conforming molecules in their surfaces to 
interlock could establish no connections. Yet, this 
supposition is now clearly contradicted by the fact that 
like cells, while remaining together, are not fastened 
but move continuously around each other. This leaves 
us for the time being without substantial explanation 
of the phenomena of selective aggregation as shown in 
the films. 

The only thing that seems clear is that these phenom- 
ena point to the same general area of specificity that 
covers immune reactions, enzyme-substrate interac- 
tions, pairing of chromosomes, fertilization, parasitic 
infection, and phagocytosis. The latter clearly in- 
volves selective uptake of particles of suitable molec- 
ular surface organization. The problem is no different 
from that met in cell-to-cell encounters, except for the 
great disproportion of size between the members of the 
pair in phagocytosis. Presumably, the selective inges- 
tion of macromolecules belongs in here, too, with the 
discrepancy of size even more pronounced. The opposite 
extreme is found in a cell which spreads out selectively 
over a large body of proper substratum, where the cell 
is very small in comparison to the partner which it 
tries to engulf. So, from phagocytosis and the uptake of 
macromolecules, through cell-to-cell contacts with spe- 
cific “recognitions,” up to the problem of selective 
adhesion of a cell to its substratum, one deals with a 
continuous spectrum of problems of the same nature. 

This very fact holds promise of progress, as advances 
in any one sector of that broad area will shed light on 
the other sectors. On the other hand, cleverly getting 
around the acknowledgment of specificity in any one 
sector, whether by theoretical constructs or by 
unrealistic models, will not strip the other sectors of their 
aspect of specificity, and, hence, will not relieve the 
intellectual discomfort engendered by our inability to 
squeeze the broad subject of specificity into the limited 
conceptual framework which we have erected from the 
study of fragmentary vital phenomena lacking that 
aspect. Specificity, as a real and basic control mecha- 
nism of cell behavior, can no longer be relegated to a 
corner, but must be placed in a central position in 
cellular and molecular biology. The less that is known 
about it, the greater is the challenge. If there has been 
some dodging in the past, it was because of some 
concealed hope that the whole thing might yet in the 
end turn out to have been an illusion born of inadequate 
penetration. Exactly the opposite has happened. The 
more penetrating the analysis, the more cogent has the 
evidence of specificity become; witness the motion pic- 
tures of cell discrimination. There seems little doubt 
that once the reality and universality of the problem 
are generally acknowledged, and the promising tech- 
niques at our disposal for its disciplined study are rec- 
ognized, progress can be rapid. But there is even less 
doubt that, if the existence of the problem continues to 
be widely ignored, or even denied, some of the major 
clues to the understanding of living systems will remain 
missing. 

In the context of these studies in the biophysical 
sciences, it is the purpose of this contribution to 
identify some of the basic problems of nerve that may 
be effectively studied at the molecular level by the 
methods of biophysics and biophysical chemistry, and 
to describe certain nerve structures, such as the myelin 
sheath, which provide great scope for the application of 
such methods and which have important application 
also to such general biological problems as the nature 
of cell membranes. 


The primary function of neurons is the transfer of 
information from one region to another, the message 
being propagated along the fiber from cell body to axon 
terminals by means of a local disturbance or propagated 
bioelectric impulse. To many, particularly to those 
studying the activity of the larger nerve units in the 
processing of information by the central nervous system 
or in the behavior of the organism as a whole, the in- 
dividual neuron is but the structural and functional 
unit in a highly complex network whose basic function 
involves the all-or-nothing, on-or-off responses of the 
units. Thus, in neurophysiology, as well as in bioelec- 
trical studies generally, the most elementary problem 
which could be posed to the molecular biologist or bio- 
physicist would doubtless be the elucidation of the 
mechanism by which the impulse is propagated along 
nerve fibers—i.e., the nature of the changes in the 
properties of the surface film of the axon that underlie 
excitation and propagation of the action wave. 

If this were in fact the only problem to be dealt with, 
the present contribution would be very short indeed. 
Little is known about the molecular composition of the 
axon surface membrane and even less is known about 
changes that occur in it during propagation of the im- 
pulse. However, for several reasons, I do not wish to 
take such a limited point of view. In the first place, the 
axon surface membrane does not exist as a separate 
entity. It is but part of the neuron which, as a complex 
reacting system, contains many constituents. Even if 
one were interested only in bioelectric processes, it 
would be impossible to understand the phenomena going 
on in the surface membrane without reference to the 
structural and biochemical properties of the fiber as a 
whole. Moreover, nerve performs many functions other 
than that of propagating impulses. Some of these 
functions are known, some can be guessed, while others are 
completely unknown. 

For example, somewhere in the neuron there are 
synthesized certain highly active substances, such as 
acetyl choline and adrenergic substances, which have 
transmitter functions; i.e., they transmit the impulse 
across the junction from the nerve fiber to the muscle 
fiber or other end organ. The site of synthesis of this 
material in the neuron, its mode of transfer from this 
site to the nerve endings (possibly enclosed there in 
vesicles) and its transmission to the innervated tissue 
are at present very poorly understood. Certain neurons 
also produce neurosecretions that can be identified 
cytologically and histochemically and can be demon- 
strated to pass from the nerve cell body down the axon 
to the axon terminals. Possibly other hormonal sub- 
stances are similarly produced and transmitted, but 
have thus far escaped detection. 

Neurons also manifest certain properties, especially 
during embryogenesis or regeneration, that are highly 
specific. Thus, each motor fiber grows out and eventu- 
ally innervates its appropriate muscle. This innervation 
is not only highly specific with respect to the muscle 
fiber which is “sought out” and innervated, but also it 
is necessary for the stability and maintenance of the 
normal properties of the muscle fiber. Cutting the nerve 
fiber to a muscle may cause the latter to change its 
physiological and pharmacological properties markedly, 
or even to degenerate. 

Finally, it is probable that certain aspects of nerve 
function may involve chiefly the nerve cell or axoplasm 
without obvious relationship to the axon surface film. 
Such processes would probably not lend themselves to 
study or detection by bioelectric methods. For example, 
as is discussed in more detail below, certain constituents, 
such as the fibrous protein, seem to be present in all 
neurons and to represent one of the characteristic 
features of neurons generally. However, there is still 
not the foggiest notion as to the biochemical and physio- 
logical function of this protein or protein complex. 

It is desirable, therefore, that the biophysical and 
biochemical investigation of the neuron be not limited 
to bioelectric processes, but be systematic and exhaus- 
tive so as to reveal “what is there,” particularly at the 
molecular and macromolecular level. One has only to 
consider the history of other fundamental biological 
problems, such as that of muscle contraction, to realize 
the importance of such a systematic biochemical and 
biophysical approach. Only when significant advances 
in knowledge of the composition and molecular 
organization of muscle and its fibrous proteins were obtained, 
could the fundamental problem of the molecular mecha- 
nism of contractility be fruitfully attacked. By com- 
parison with muscle, despite much research extending 
over a century, knowledge of the composition and 
molecular organization of nerve is still primitive. More- 
over, impulse propagation in the nerve fiber is a subtle 
phenomenon which, unlike contractility in muscle, is 
not revealed in structural alterations thus far demon- 
strable even in the electron microscope. 
The investigation of the molecular biology of nerve 
is, therefore, very challenging, particularly to young 
biophysical scientists not yet steeped in traditional 
concepts of neurophysiology. This article, which is 
necessarily limited in space, is meant to be provocative rather 
than exhaustive in its treatment of basic matter. For 
more extensive information, the reader is referred to 
reviews by Schmitt. 


I. NEURON 


The unit of the nervous system, the neuron, consists 
of a nerve cell body with its dendritic arborization and 
synaptic connections with the terminal twigs of other 
neurons; of an axon which is the thin, cylindrical out- 
growth of the nerve cell; and of the endings or terminal 
twigs which make apposition with the innervated cell or 
tissue through the synapse (see Fig. 1). The neuron has 
physiological as well as morphological polarization. The 
impulse passes from presynaptic fiber to nerve cell, 
down the axon and across the terminal synaptic mem- 
brane to the innervated tissue, but not in the reverse 
direction. For present purposes, it is convenient to 
consider the cell body, axon, and endings separately. 


(A) The Cell Body 


FIG. 1. Diagrammatic representation of a myelinated
nerve fiber. Note cell body with dendrites and cytoplasmic
organelles, axon, myelin sheath with Schwann cells and nodes
of Ranvier, and terminal twigs making synaptic connections
with peripheral tissue. (Figure label: Endings.)

The nerve cell body has the task of spinning out the
axon in development and of regenerating it in the event
of injury. Since the volume of axoplasm to be biosynthesized
in a long fiber may be hundreds and, in some cases,
thousands of times that of the cell body, it is not
unexpected that the metabolism of the nerve cell is very
high. To accomplish its impressive task of biosynthesis,
the cell is equipped with the cytological differentiation
found by electron microscopists characteristically in 
highly active, secreting gland cells. Particularly promi-
nent are the membrane-limited structures (endoplasmic
reticulum, ergastoplasm) richly studded with RNA-rich 
ribosomes (see Palay). These basophilic structures are 
identical with the Nissl substance, which has long been 
studied by cytological methods and by microabsorption 
spectroscopy and which is known to reflect the synthetic 
activity of the nerve cell. With improved spectrophoto- 
metric methods, such as those developed by Hydén, it 
has been possible to follow quantitatively the changes 
in RNA, protein, lipid, and dry weight of regenerating 
neurons and to correlate these changes with the periph-
eral growth of the axon (see Brattgård et al.). Despite
the increase in protein, lipids, and water during the 
axon outgrowth, the amount of RNA appears to remain 
constant. However, changes occur in the state of aggre- 
gation of RNA; small, finely dispersed particles are 
formed in the process of “chromatolysis,” long known
to occur during regeneration. When contact is re-estab- 
lished between the outgrowing axon and the peripheral 
innervated cell, there is characteristically an increase 
of RNA and of cell volume until the original values are 
restored. According to these authors, the only type of 
cell that can compete with the nerve cell in biosynthetic 
activity and with the degree of cytological differentia- 
tion for such processes, particularly that involving the 
RNA system, is the exocrine cell of the pancreas of small 
animals. The neuron is visualized as “an enormous gland
cell structure whose lively protein metabolism serves
the specific nerve function.”

The high metabolic activity of nerve cells (as dis- 
tinguished from nerve fibers), long known to be charac- 
teristic also of normal, unregenerating cells (as in the 
brain), may be owing in part to the constant synthesis 
of axoplasm which, according to Weiss and Hiscoe
and supported recently by Waelsch and by Ochs and
Burger, passes continuously, during the life of the
neuron, as a plastic, gelled mass down the axon at a 
slow rate (ca 1 mm/day). The manner in which axo-
plasm disappears peripherally, e.g., by oxidation or
other metabolic process, is not known. If nerve cells are
in fact constantly synthesizing axoplasm, it becomes
important to establish the purpose served by such a
metabolically expensive process. It is possible that one
such purpose may be the maintenance of the turgor
of the axon over its long extent to the endings. One
constituent that would also be constantly synthesized
in such a process is the fibrous protein of the axon. When
the function of this protein is discovered, light may be
shed also on the apparent necessity for its constant
synthesis.
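
As a rough illustration of the slow axoplasmic flow described above, the following Python sketch estimates how long material would take to travel the length of an axon at the quoted rate of about 1 mm/day. The axon lengths and the assumption of a constant, uniform rate are hypothetical, chosen only for illustration; they are not taken from the studies cited.

```python
# Rate of axoplasmic flow quoted in the text (ca 1 mm/day).
FLOW_RATE_MM_PER_DAY = 1.0


def transit_time_days(axon_length_mm, rate_mm_per_day=FLOW_RATE_MM_PER_DAY):
    """Days for axoplasm to travel the full axon length at a constant rate."""
    if rate_mm_per_day <= 0:
        raise ValueError("rate must be positive")
    return axon_length_mm / rate_mm_per_day


# Hypothetical axon lengths (mm), chosen only for illustration.
for length_mm in (10.0, 100.0, 1000.0):
    days = transit_time_days(length_mm)
    print(f"{length_mm:7.0f} mm axon: {days:6.0f} days (~{days / 365.0:.2f} years)")
```

On these assumptions, a fiber one meter long would need on the order of a thousand days to be traversed, which gives some feeling for why the purpose of such continuous synthesis is worth establishing.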

The metabolism and the cytological differentiation
of the nerve cell are responsive not only to the need for
active biosynthesis, as in the regeneration of a severed
axon, but also to physiological activity of the neuron.
This apparent energy dissipation is in contrast to the
highly conservative nature of the process of impulse
propagation in the peripheral fiber.

The evidence for the highly developed membrane-
limited structures in the cytoplasm of living nerve cells
and for the presence of fibrous protein oriented with
long axes parallel to the distinguishing directions of the
nerve cell is not merely cytological and electron micro-
scopical, but derives also from polarization-optical
studies of fresh nerve cells.


(B) Axoplasm and the Axon


As a differentiated type of nerve-cell cytoplasm, axo- 
plasm is a highly complex system. Noteworthy is the 
low absorption at about 2600 A which characterizes the 
region peripheral to the axon hillock, roughly at the 
junction of the nerve-cell cytoplasm with the axon. This 
is in agreement with the relative scarcity in the axoplasm 
of ribosomes, as seen in the electron microscope, and of 
basophilic material, as seen cytologically. Membranous
material, possibly consisting chiefly of fragments of the
membrane-rich cytoplasmic system, has been observed
in squid axoplasm in this laboratory by Maxfield
during the process of protein fractionation and physico-
chemical analysis of axon proteins. Peripheral axo-
plasm is low in materials absorbing at 2600 A (see
von Muralt), consistent with the apparent lack of sub-
stantial biosynthesis in this region of the neuron.

Among the particulates commonly present in cyto-
plasm, only the mitochondria of axoplasm have been
subjected to special study. These have properties similar
to those isolated from other tissues (Foster). In thin
sections of lobster fibers, they have been observed to
occur preferentially near the Schwann-cell surface, sug-
gesting the possibility that metabolic energy utilization
may occur in this region (Geren and Schmitt).

Although axoplasm is a highly complex, specially
differentiated form of nerve-cell cytoplasm, it is highly
desirable that the main features of its composition and
macromolecular organization be established, not merely
for the light it might throw upon the energy coupling
involved in active ion transport and in other processes
related to impulse propagation, but also as clues to the
discovery of other processes carried out by nerves, pos-
sibly unrelated to, and not predictable from, those
which may be studied by electrical methods.

For such studies of axoplasm, the squid giant fiber is 
uniquely suited. Axoplasm, relatively uncontaminated 
by sheath constituents and other nonaxonic material,
can be readily obtained from these fibers. From axo-
plasm obtained from Loligo pealii, the New England
squid, analyses have demonstrated (Koechlin) the
presence of previously unknown substances, particularly
isethionic acid, which turns out to be the major anion
in the acid-base balance of the squid nerve fiber. The
fibrous protein of axoplasm has also been studied in this
material (Maxfield; Maxfield and Hartley; Schmitt).

More recently, the large squid, Dosidicus gigas, that 
abound in the waters of the Humboldt current off the 
western coast of South America, have been utilized for 
chemical investigations. Deffner and Hafter have dem-
onstrated the presence in this axoplasm of some fifteen
free amino acids, as well as cysteic amide (a sulfonate 
closely related to isethionic acid and taurine, which are 
also present in relative abundance) and other low molec- 
ular weight constituents (see Table I), including some 
five peptides whose composition is now being investi- 
gated and whose function is not known. Using high- 
capacity methods of electrophoretic fractionation and 
large amounts (10 to 20 g) of freeze-dried axoplasm, 
these substances are being isolated in quantities suffi- 
cient to permit determination of their composition and 
biological properties. 

Among the macromolecular constituents of axoplasm, 
the fibrous protein is of special interest. Polarization- 
optical analysis (Bear et al.) demonstrated (1) that
the fibrous protein exists in the normal fresh axon where 
it is oriented with long axes parallel to the fiber axis, 
(2) that the intrinsic birefringence of the protein macro- 
molecules is similar in magnitude to that of other fibrous 
proteins, and (3) that the partial volume occupied by 
the oriented fibrous protein is probably relatively small. 
Electron-microscopic observations revealed that the
fibrous protein occurs characteristically as thin (ca 100 
to 200 A) filaments of indefinite length and with no 
demonstrable axial discontinuities or periodic structure. 
P. F. Davison and E. W. Taylor in this laboratory have 
made recent physicochemical studies (unpublished) on 
fresh axoplasm of Dosidicus shipped iced from Chile to 
Cambridge. From these, it appears that the fibrous pro- 
tein may have a molecular weight in the order of millions 
and may be very long and thin. However, the possibility 
cannot be excluded that the protein as it occurs in the 
axoplasm so far available (i.e., suspended in cold saline 
and studied some days after removal from the squid) 
may in fact be a high polymer, of which the monomer 
may have a length of the order of 2500 A. In addition, 
the material which has so far been isolated may repre-
sent a complex of two or more proteins. Only when
this protein or protein complex has been isolated and
characterized chemically and structurally will it be
possible to determine its biological role. Meanwhile, as
it is purified and fractionated, it will be subjected to
screening tests intended to provide clues regarding its
biological activity.

TABLE I. Dialyzable substances in squid axoplasm, expressed as micromoles per gram dry weight.


(C) The Axon Surface Membrane 


Electron microscopy permits one to visualize directly 
the axon surface membrane, though admittedly this is 
possible only in fibers which have been fixed, embedded, 
and cut into ultrathin sections. This membrane, or some 
component of it, presumably corresponds to the irritable 
membrane whose physical properties before, during, and 
after the passage of the nerve impulse have been studied 
by electrophysiologists. Some of the functions that must 
be subserved by membrane constituents are shown 
schematically in Fig. 2. 

FIG. 2. Diagrammatic illustration of ion movements through
the nerve membrane [after A. L. Hodgkin and R. D. Keynes,
J. Physiol. 128, 28 (1955)]. Ion movements occurring during the
action wave are shown at the right; active transport, involving
coupling of metabolic energy, is shown on the left. The excitable
membrane pictured is presumably identical with the limiting
surface membrane of the axon, or some portion thereof.
(Figure labels: Outside; Inside, low [Na], high [K].)

At high resolution, the membrane appears as a double-
edged structure with two dark edges separated by a less
dense area (see Robertson). The total thickness is
[illegible] A, the dark and light lines having about
[illegible]. The apparent thickness of the dense
regions in the membrane may be reduced as methods are
improved. A satisfactory interpretation of this structure,
in terms of the lipid and protein components of the
membrane, must await a detailed study of model sys-
tems of lipids and lipid-protein systems. The consensus
at present seems to be that the dark regions represent
the aqueous, protein-containing interfaces. The less
dense regions represent the lipid phase with its pre-
ponderant aliphatic hydrocarbon chains. This point of
view is not shared by all. Stoeckenius believes that
osmium tetroxide deposits electron-dense material
in the region of the unsaturated double bonds of the
lipids rather than at the protein, aqueous interface.
Whatever may be the eventual interpretation of the
structure of the axon surface membrane, it is perhaps
safe to say that the membrane includes a bimolecular
leaflet of mixed lipids and at least a monolayer of protein
or other hydrophilic, osmiophilic material on either side 
of the lipid layer. This represents the “unit” membrane 
structure, thought by many competent electron micros- 
copists to be characteristic of biological membranes 
generally. In addition to this structure, which might be 
thought of as the “floor space” of the cellular factory, 
one supposes that there are mounted upon or within 
this membrane the enzymes responsible for the special 
molecular and biochemical processes which underlie the 
rapid change in cation permeability at the time of im- 
pulse propagation and the “pumping” of sodium ions 
from axoplasm against an activity gradient into extra- 
cellular space. Obviously, much research is required 
before the details of the molecular organization of the 
membrane can be established. Meanwhile, in describing 
the membrane structure, it is perhaps desirable to avoid
terms such as “triple” membrane (so-called because of
the two dense and one less dense regions); this might
cause investigators of the nature of the true “physio-
logical” membrane (i.e., that deduced from indirect,
chiefly electrical, methods) to identify one or another 
component of the fixation artifact seen in the electron 
microscope as a self-sufficient entity, ignoring the fact 
that the membrane must almost certainly be considered 
as a complex system of interacting, interdependent 
components. Precisely in such situations are the methods 
of molecular biology different from those of descriptive 
morphology. 


(D) Nerve Endings and the Synapse 


At axon endings, electron microscopy reveals an ac- 
cumulation of particulates, particularly mitochondria. 
Also characteristic is the presence of small (ca 200 to 
500 A) vesicles, appearing sometimes in large quantity. 
It has been suggested that these vesicles contain acetyl-
choline, adrenalin-like materials, and other physiologi-
cally and pharmacologically active substances such as 
histamine and serotonin. If transmitter substances occur 
in packets and if such packets were presented at the 
end organ, perhaps as a secretion across the membrane, 
certain electrophysiological studies of synaptic poten-
tials might thus find explanation (Katz, p. 524). At
present, the origin of the vesicles is still unknown, as
are the details of the mechanism by which they or their
contents find their way into the endings.

Of considerable theoretical significance is the problem
of the nature of the relationship between the limiting
membranes of the nerve terminals and of the muscle or
other innervated cell. If, as current consensus indicates,
there is a space of at least a few hundred Angstrom
units between these membranes, the problem of the
mechanism whereby transmitter substances pass from
neuron to innervated tissue is a different one than if the
two membranes were strictly in molecular contact. This
is still a matter for discussion. The factors involved in
determining membrane separation by electron micros-
copy are considered below in more detail.


II. SATELLITE-CELL DIFFERENTIATIONS AND
RELATIONSHIP TO THE AXON 


Extending to the farthest reaches of the body, por- 
tions of the nerve axon may be very remote from the 
metabolic center near the nucleus of the cell body. This 
would constitute an inherently very unstable system. 
However, the axon is probably seldom actually remote 
from metabolic centers because it is everywhere sur- 
rounded by thin satellite cells, the Schwann cells, which 
are themselves metabolically active. These cells may 
well have intimate biochemical relationship with the 
axon. Such cell-to-cell biochemical relationship and in- 
terdependence have in fact been demonstrated in vitro. 
Puck, for example, has shown that a monolayer of 
cells of a certain type, which are themselves unable 
to obtain all of their nutritional requirements from the 
medium, can be stabilized by depositing upon them a 
layer of cells which can provide or utilize the required 
factor. These act as “feeder” cells to the underlying 
cells and constitute a stable system. It seems possible 
that the evolutionary advance to the higher organisms, 
with their extensive nervous system, depended upon 
the development of the process whereby satellite cells 
come to envelop outgrowing axons, thus permitting 
the fibers to extend to the extremities of the body as 
stable neurons. 

The satellite cells may function in relation to the 
axon in other ways also. In any homogeneous group of 
nerve fibers, the velocity of impulse propagation is a 
direct function of the diameter of the fiber. In inverte- 
brate fibers, to achieve relatively high velocity of propa- 
gation (30 to 40 m/sec), the axon diameter becomes 
relatively enormous (0.5 to 1.0 mm in the squid). How- 
ever, in myelinated fibers in which the axon is covered 
to some 99% by myelin, impulse-propagation velocity 
may be high (50 to 100 m/sec) in fibers of modest 
diameter (10 to 20 μ). In these cases, impulse propaga-
tion is believed to be saltatory in nature rather than
continuous; i.e., the membrane changes responsible for 
excitation occur in very restricted regions (at the nodes 
of Ranvier), and internodal segments respond as entire 
units. The investigation of the molecular organization 
of the myelin, which acts as internodal insulation, and 
of its mode of formation by the Schwann cell, presents a
most rewarding opportunity for the methods in the
molecular biology of nerve.
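
The comparison between unmyelinated and myelinated fibers can be made concrete with a little arithmetic. The Python sketch below takes the velocity and diameter ranges quoted in the paragraph above and computes the propagation velocity obtained per micron of diameter. Using the midpoint of each range is an assumption made here purely for illustration, as is treating the quoted myelinated diameter as directly comparable with the squid axon diameter.

```python
# Ranges quoted in the text: (min, max) propagation velocity in m/sec and
# (min, max) diameter in microns (0.5 to 1.0 mm = 500 to 1000 microns).
FIBERS = {
    "squid giant axon (unmyelinated)": {
        "velocity": (30.0, 40.0),
        "diameter": (500.0, 1000.0),
    },
    "myelinated fiber": {
        "velocity": (50.0, 100.0),
        "diameter": (10.0, 20.0),
    },
}


def midpoint(bounds):
    """Midpoint of a (low, high) range; an illustrative simplification."""
    low, high = bounds
    return (low + high) / 2.0


def velocity_per_micron(fiber):
    """Midpoint velocity divided by midpoint diameter, in (m/sec) per micron."""
    return midpoint(fiber["velocity"]) / midpoint(fiber["diameter"])


ratios = {name: velocity_per_micron(f) for name, f in FIBERS.items()}
for name, value in ratios.items():
    print(f"{name}: {value:.3f} (m/sec) per micron of diameter")

advantage = ratios["myelinated fiber"] / ratios["squid giant axon (unmyelinated)"]
print(f"myelinated / unmyelinated: about {advantage:.0f}-fold")
```

With these midpoints the myelinated fiber comes out roughly a hundredfold more effective per unit diameter, which is one way of stating the advantage that saltatory conduction confers.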


(A) Myelin 


From polarization-optical analyses, it was shown by 
Schmidt that the lipid molecules in the myelin sheath
are oriented with their paraffin chains extending radi- 
ally, whereas the protein components are oriented with 
their distinguishing directions perpendicular to this 
direction, i.e., tangential. The lipid-protein layers must 
be thin with respect to the wavelength of light, con-
stituting a Wiener mixed body.


X-ray diffraction studies supported this view, and
small-angle diffraction showed that the lipid-protein
layers have a characteristic thickness of about 171 A in
cold-blooded animals and about 185 to 190 A in warm-
blooded animals (Schmitt et al.). On the basis of ex-
tensive studies of the individual lipid types both dry
and in aqueous systems, these authors showed that the 
identity period in the radial direction must include two 
bimolecular leaflets of mixed lipids, and that the protein 
and water components must occupy approximately 25 A 
each. Several alternative models involving bimolecular 
leaflets flanked by monolayers of protein were proposed 
(see Fig. 3). It was pointed out that nerve myelin,
being susceptible to x-ray diffraction analysis and having 
a lipid-protein architecture similar to that of cell mem- 
branes, might provide a model for the study of such 
membranes which, because of their thin paucimolecular 
nature, could not be similarly studied. 

The concentric-layered structure of myelin was con-
firmed by direct visualization in the electron microscope
(see Fernández-Morán and Sjöstrand) after develop-
ment of the technique of thin sectioning for electron
microscopy. The thickness of the layers was somewhat
less than that demonstrated by x-ray diffraction for
fresh fibers. This difference was shown to be owing to
the fixation, dehydration, and embedding required in
the electron-optical technique; nerves were subjected
to diffraction analysis after each step in the preparative
technique, and the thickness of the layers was deter-
mined by electron-microscopic examination of thin sec-
tions of the same specimens.

FIG. 3. (a) Schematic representation of molecular organization of
the nerve myelin sheath (after Schmitt, Bear, and Palmer). A,
concentric lipid-protein layered structure deduced from polariza-
tion optics and from small-angle x-ray diffraction; B, unit lipid-
protein structure in the radial direction (perpendicular to the
planes of the layers); C, sections through the molecular chains in the
planes of the layers. (b) Diagram summarizing molecular organiza-
tion of nerve myelin from x-ray diffraction data, as suggested by
Finean. (Figure labels: cholesterol; cerebroside; sphingomyelin;
region of hydrophilic groups.)

After the main features of myelin architecture had 
been determined, the question arose as to how such a 
structure is produced by the Schwann cell, in collabora- 
tion with the axon. A clear-cut answer to this question 
was provided by Geren, at least in the case of pe-
ripheral fibers. From observations of embryonic sciatic
nerves in chick embryos, Geren proposed her mem-
brane theory which is schematically illustrated in Fig. 4.
According to this ingenious theory, now verified by the
observations of many investigators (e.g., Robertson),
the Schwann cells, after enveloping the outgrowing 
axons and causing their respective surface membranes 
to adhere firmly, wrap themselves many times around 
the axon by a continuous infolding of the outer Schwann- 
cell surface membrane. The mechanism of myelination 
in the central nervous system is less clear. Although
various authors have seen the myelin layers as a spiral
(in transverse section), as in peripheral fibers, others
(Luse and DeRobertis et al.) have obtained evidence
which they interpret as indicating that lipid-protein 
layers are formed in the cytoplasm of the oligodendro- 
cyte, perhaps by the fusion of vesicles 200 to 800 A in 
diameter, which condense into the compact myelin 
layers. The myelin layers are hence not concentric but 
are spirally wrapped about the axon. After many wrap- 
pings (or few, depending upon the fiber type) the 
Schwann-cell cytoplasm is pressed from among the 
layers which become condensed into the smectic, para- 
crystalline material of the finished myelin. This process 
requires a prolific biosynthesis of lipid-protein material. 
Where this is synthesized in the cell and how it comes 
to be incorporated in the cell membrane is not yet 
known. The myelination process would seem to provide 
an ideal system in which to study the molecular mecha- 
nism of lipid-protein synthesis and of membrane forma- 
tion in general. 

FIG. 4. Schematic representation of the membrane theory of
the origin of nerve myelin (after Geren). Four stages in the en-
gulfment of the axon by the Schwann cell and the wrapping of
many double layers which, after condensation, form the compact
myelin.

FIG. 5. Illustrating enfolding of Schwann-cell surface membranes
to form compact myelin, after the theory of Geren. Note differ-
entiation between inner and outer surfaces of enfolding membrane;
in the double membrane, this gives rise to Finean’s “difference
factor” and explains why the full repeat distance in the radial
direction involves two bimolecular leaflets of lipo-protein (two
membranes).

From Geren’s membrane theory, it immediately be-
comes obvious why there are two, rather than one, bi-
molecular layers of lipids in the x-ray identity period
in the radial direction (see Schmitt, Bear, and Palmer).
As the outer Schwann-cell membrane infolds, the outer
surfaces of the membrane remain in contact. As the
layers condense, with the expulsion of Schwann-cell
cytoplasm, the membrane surfaces which faced the
Schwann-cell cytoplasm also adhere, fuse, and form the
darkly staining line seen in the electron micrographs
(see Fig. 5). The alternate less-dense line represents the
fusion of membrane surfaces which originally faced out-
ward to extracellular space. Finean, who has continued
the x-ray analysis in a systematic fashion over the last
decade, has proposed a molecular model of myelin in
which the cholesterol molecules fit in between the phos-
pholipid and cerebroside molecules in characteristic
fashion (see Fig. 3). Further refinements in the x-ray
and electron-microscopic techniques, together with the
study of various lipid-protein models, may provide im-
portant new information about membrane structure
which will be of vital importance in the rapidly expand-
ing field of molecular biology.
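
The geometric consequence of the membrane theory can be sketched numerically. The Python fragment below takes the radial identity periods quoted earlier (about 171 A in cold-blooded and 185 to 190 A in warm-blooded animals), divides each by the two membranes per repeat implied by the theory, and estimates how many repeats a sheath of given thickness would contain. The sheath thickness of 1 micron is a hypothetical value chosen only for illustration, and the per-membrane figure is simply half the period (it therefore includes the associated protein and water, and refers to fresh rather than fixed material).

```python
ANGSTROMS_PER_MICRON = 1.0e4

# Radial identity periods from small-angle x-ray diffraction, as quoted in the text (A).
PERIOD_COLD_BLOODED_A = 171.0
PERIOD_WARM_BLOODED_A = (185.0, 190.0)

# According to the membrane theory, one radial repeat contains two apposed membranes.
MEMBRANES_PER_PERIOD = 2

# Hypothetical sheath thickness, chosen only for illustration.
HYPOTHETICAL_SHEATH_MICRONS = 1.0


def membrane_thickness(period_a):
    """Average radial thickness per membrane (A), taken as half the period."""
    return period_a / MEMBRANES_PER_PERIOD


def repeats_in_sheath(sheath_thickness_microns, period_a):
    """Number of radial repeats that fit in a compact sheath of the given thickness."""
    return sheath_thickness_microns * ANGSTROMS_PER_MICRON / period_a


periods = [PERIOD_COLD_BLOODED_A, *PERIOD_WARM_BLOODED_A]
for period in periods:
    repeats = repeats_in_sheath(HYPOTHETICAL_SHEATH_MICRONS, period)
    print(
        f"period {period:5.1f} A: {membrane_thickness(period):5.2f} A per membrane, "
        f"{repeats:5.1f} repeats ({repeats * MEMBRANES_PER_PERIOD:5.1f} membranes) "
        f"per {HYPOTHETICAL_SHEATH_MICRONS} micron of sheath"
    )
```

Because each wrapping of the Schwann cell contributes one double membrane, the repeat count is also a rough estimate of the number of turns in such a hypothetical sheath.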

As Schwann or satellite cells envelop the outgrowing 
axon and start to wrap their surfaces in multiple folds 
about it, adjacent Schwann cells come into contact and 
form the nodes of Ranvier. This nodal region of the 
finished, myelinated fiber has not yet been exhaustively 
studied in the electron microscope. In view of the critical 
nature of this region, suggested by the theory of salta- 
tory conduction, its ultrastructure merits careful study. 
According to a preliminary report by Robertson, the
Schwann cells on either side of the node form finger-like 
processes which interdigitate closely over the node. The 
region between the processes, presumably containing 
extracellular fluid, is of the order of several hundred 
Ångström units thick. To this extent, the axon surface 
membrane is exposed to extracellular fluid at the node, 
according to Robertson. 

The incisures of Schmidt-Lantermann presented a 
problem because, if they are continuous channels be- 
tween the axon and the exterior through the myelin, 
they would offer conducting paths for current. However, 
electron microscopy has demonstrated that this is not 
the case. The incisures are produced by a systematic
shift in the relationship of all of the helically wrapped 
membranes, but each membrane is continuous across 
the incisural gap. There are thus as many layers tra- 
versing the incisures as occur in the compact region of 
the myelin, and the incisures are therefore not regions
through which ions and currents can freely flow into
the axon. 


III. SATELLITE-CELL-AXON RELATIONSHIPS 
IN “UNMYELINATED” FIBERS 


It has long been known to neurohistologists, particu- 
larly to those of the Spanish school, that nerve fibers 
are everywhere covered by satellite cells. The fact that 
these cells and their possible functional role—in the 
mature as well as in developing and regenerating fibers 
—have not greatly interested physiologists, or even 
histologists, is to some extent owing to their small size 
(ca 0.1 to 1 μ in most invertebrate fibers) or to the fact
that myelin (which has been much studied by histolo- 
gists and physiologists) was not recognized as satellite- 
cell substance (i.e., a compact helical layering of double 
surface membranes). The fact that, early in the develop- 
ment of histology, a sharp distinction was made between 
“myelinated” and “unmyelinated” fibers (primarily on 
the basis of staining reactions), led some physiologists 
to the mistaken notion that a substantial fraction of the 
animals’ nerve fibers are in effect naked axons. From 
this, it is obvious why the cellular covering of the axon 
attracted little interest except to those studying nerve 
development or regeneration, where the satellite-cell 
covering is unquestionably of functional significance. 

Polarization-optical studies (see Schmitt and Bear),
based largely on a reinterpretation of the mechanism of
the so-called metatropic reaction discovered by Göth-
lin, strongly suggested that at the surface of all nerve
fibers, vertebrate or invertebrate, myelinated or unmye-
linated, oriented lipid-protein material is present which
has qualitative similarity of molecular organization
though quite variable quantitatively from one fiber
type to the next. This led to the notion that “all nerve
fibers are to some extent myelinated.”

This matter could be clarified only when the region 
at the surface of nerve fibers—i.e., the monolayer of 
satellite cells—could be observed at high resolution by 
the electron microscope. Details of the discoveries thus 
brought to light are beyond the scope of this paper, but 
a few of the more significant aspects at the molecular 
level, particularly insofar as they bear on basic physio- 
logical concepts, are considered. 

Investigation with the electron microscope of the de- 
tailed structure of the satellite cells and their relation 
to the axon is not yet a decade old, and it is, therefore, 
not unexpected that knowledge of the subject is still 
very primitive. No doubt, special methods of fixation 
and preparation will have to be developed to permit 
investigation of the most fundamental aspects of the
[illegible], such as the possible variation of intercellular
[illegible] as manifested morphologically by the ap-
parent spaces or channels between axon surface and
satellite cells and between satellite cells themselves,
[illegible].

Invertebrate, unmyelinated (but not un[illegible])
fibers, such as those of the lobster leg
or claw nerve, are surrounded by satellite cells which
are very thin (ca 0.1 μ). Except at the region of the
nuclei, there is relatively little space between the two 
parallel surfaces of the cells (i.e., facing the axon surface 
membrane and that facing the basement membrane, 
connective tissue investment), although mitochondria 
and some of the typical equipment of the cytoplasm of 
active cells may be seen. The axon surface is charac- 
teristically highly contorted, and axoplasmic mitochon- 
dria are seen with the long axes of their ellipsoids ori- 
ented into the folds of these contortions, suggestive of 
energy utilization at this interface (see Fig. 6). The 
outpocketings of the satellite cells into the axon may 
manifest all of the stages to be expected if particulates 
(mitochondria?) were actually being formed and ex- 
truded into the axon by filling such outpocketing with 
satellite-cell cytoplasm (including its particulates) and 
pinching them off into the axoplasm (see Geren and
Schmitt).

FIG. 6. Diagrammatic illustration of thinness of satellite cell in
lobster fiber and of intimate relationship between mitochondria
and axon surface membrane (after Geren and Schmitt).
(Figure labels: connective tissue; mitochondria.)


Such very thin cells would be expected to show only 
a very weak birefringence and metatropic effect, since 
only two lipid-protein surface membranes are involved. 
Nevertheless, this optical effect was observed and cor- 
rectly interpreted, except that the layers were simply 
assigned to the region at the surface of the axon, rather 
than to the membranes of the satellite cells.

The satellite cells of squid giant fibers are somewhat 
thicker (ca 1 μ) than those of lobster, though the ratio
of thickness to axon diameter may actually be less. In 
any histological section of the giant fibers, as many as 
two or three nuclei may be present. If one were to chart 
the boundaries and domains of the satellite cells upon 
the axon surface membrane (e.g., by slitting the fiber 
and flattening out the membrane), one would see that 
many cells are involved in the formation of the cellular 
monolayer that surrounds the axon. 

The most distinguishing feature of the cytoplasm of
these cells is the presence of many (a half-dozen or more)
osmiophilic, dense membranous layers lying in planes
predominantly parallel with the cell surface. Whether
these layers are related to those present in typical,
metabolically active cells (“endoplasmic reticulum”),
or whether they have a special structure and function
in satellite cells remains to be established by more
detailed examination. However, if one assumes that the
layers have the composition and ultrastructure typical
of lipid-protein membranes generally (hence, also of
myelin), it is possible to account for the polarization-optical
properties described by Bear et al. and ascribed
by them to a metatropic layer at the outer surface of the
satellite cells, a kind of myelin-like layer. Geren and
Schmitt have shown that the optical effects can satisfactorily
be accounted for on the basis of the intracytoplasmic
dense layers present in the satellite cells. There
is no differentiated myelin-like layer at the outside of
these cells, only the amorphous basement membrane.

FIG. 7. Relationship of axon (Ax) and Schwann cell (S.C.) in the
squid giant fiber (after Schmitt and Geschwind). A, cross section of
giant fiber showing very thin Schwann cell surrounding axon, ×125.
B, enlarged segment of Schwann cell, ×3000; C, same as B, showing
intracytoplasmic layers, ×9000; and D1 and D2, possible relationships
at the molecular level of axon surface membrane and Schwann-cell
membrane. D1 presumes layers are in molecular contact. (Water layer
does not exceed about 15 Å in thickness.) D2, showing thicker water
layers (100 to 200 Å) between adjacent membranes, ×150 000.

One of the structural details that must await further
investigation for clarification, but that may be of great
ultrastructural and perhaps functional significance, is
the nature of the boundaries between satellite cells and
the axon surface membrane, between adjacent satellite
cells, and between intracytoplasmic membranes. Of particular
importance is the thickness of the aqueous regions
between these membranes, because ions passing from
extracellular space to axoplasm and in the reverse direction
must traverse such channels.

The situation is illustrated schematically in Figs. 7
and 8. If the adhesion between the surface membranes
were as close as in the myelin sheath, there would be
no substantial channels for ion diffusion. The aqueous
phase would have a thickness of the order of 12 Å according
to the x-ray diffraction analysis of myelin structure.
On the other hand, if these intercellular relationships
are similar to those found by Robertson and by
electron microscopists to obtain between cells generally,
the aqueous channels, communicating with the frank
extracellular fluid of the ground substance of the surrounding
connective tissue, would have a thickness of
about 150 Å.

FIG. 8. Possible diffusion pathway for solutes passing between
axoplasm and extracellular fluid (after Schmitt). (a) Bare-axon
membrane is exposed to extracellular fluid. (b) Diffusion channel
is 150 to 200 Å thick, filled with extracellular fluid. Active
interchange occurs only between axon surface membrane and fluid in
channels; passive diffusion occurs in other channels. (c) Same as (b)
except that active interchange of metabolites and solutes occurs
across all membrane surfaces; satellite cells participate actively in
processes involving interchange between axon and extracellular fluid.
(d) Satellite cells and axon membranes are in molecular contact;
water channel not more than 10 to 15 Å thick with structure
comparable to that of compact myelin.

FIG. 9. Diagrammatic representation of relationship of very
thin C fibers to Schwann cell (after Robertson).

Robertson has found such a membrane relationship
in the case of the vertebrate “C” fibers. In
these nerves, the very thin (ca 0.1 μ) axons pass through
the satellite (Schwann) cell tube enclosed in a fold of
the surface membrane of the satellite cell in a manner
analogous to that assumed by Geren’s membrane theory
to explain the origin of myelin (see also the earlier
works of Gasser).

In analyzing the functional significance of the inter- 
membrane relationship and the thickness of the aqueous 
regions between limiting membranes in life, several 
factors must be considered. Indeed, this situation illus- 
trates certain indeterminacies that characterize the elec- 
tron microscope approach to molecular biology generally. 

From the properties of the constituent membranes,
including the charge on the membrane molecules, a
separation of the order of 100 to 200 Å between the
membranes might be expected. This equilibrium
distance will vary with the ionic strength (a statistical
physicochemical concept that is difficult to apply to
such thin capillary spaces). If the equilibrium distance
between surface membranes is responsive to the ionic
strength of the environment, it is obvious that a factor
of importance is the ionic strength of the fluid used to
[...]. Another factor is the osmotic pressure
[...] the intercellular fluid. If this is
not identical with that of the intracellular fluid, then
when the system is placed in the fixative (which presumably
quickly destroys membrane semipermeability
and such vital equipment as “ion pumps”), an osmotic
transfer of water may be expected, and this might importantly
influence the distance between surface membranes
as observed in the electron microscope after
fixation and sectioning.

It has become evident from electron-microscopical 
studies of metabolically active cells not only that the 
metabolic activity involves the presence of membranes 
separated by small distances, but also that the products 
of the metabolism, presumably including ions, must find 
their way into the aqueous space between the mem- 
branes. The presence of these metabolites in turn must 
affect the equilibrium separation between the mem- 
branes and the properties of the intercellular fluid within 
these thin capillary spaces. Such factors must be taken 
into consideration in dealing realistically with these 
spaces which are the channels of communication be- 
tween the axon surface membrane (presumably the seat 
of the rapid alterations responsible for the propagation 
of the impulse in the nerve fibers) and frank extracellular
space. According to Frankenhaeuser and Hodgkin,
if such intercellular channels are the route of diffusion 
of potassium ions from axoplasm to the exterior, the 
electrical data would require a channel thickness of 
about 300 Å in a structure similar to that shown by
Geren and Schmitt, assuming a process of passive diffusion
calculable on the basis of properties characteristic
of macroscopic bulk phases. Whether or not additional 
properties, deriving from the microscopic dimensions of 
the intermembrane system and from the presence of 
solutes, particularly metabolic products of the satellite 
cells, require modification of this view remains for 
further analysis to determine. That the satellite cells of 
the squid giant fiber are indeed metabolically active 
has been shown by the measurements of oxygen consumption
of the sheath cells after removal of the axoplasm,
and from an assay of enzyme activities of
sheath cells and axoplasm.


INTRODUCTION


THE mechanism by which signals are rapidly transmitted
over long distances in the body has
formed one of the principal preoccupations of biophysicists
for well over a century. Since the days of Du Bois-Reymond
(about 1850),¹ it has been known that nerves not
only are excited by electric currents but also produce,
in the course of their activity, electric currents of their
own. His pupil, Hermann,² suggested that the propagation
of a signal along a nerve fiber (or axon) involves
a process of recurrent electric stimulation from point
to point. Hermann also called attention to the apparent
cable properties of nerve fibers, although he recognized
that, by passive cable transmission alone, nerve signals
could not travel over any appreciable length. Later,
under the influence of Nernst, Bernstein³ developed the
basis of the modern membrane theory. According to
Bernstein, the key to our understanding of the nerve 
impulse is to be found in the properties of the surface 
membrane of the axon, and especially its selective and 
changeable permeability (or conductance) to ambient 
ions. His theory has been modified in important details, 
but in essence it still stands, and the advances which 
have been made in this field during the last two decades 
have confirmed the usefulness of Bernstein’s basic 
concept. 
The view which now has emerged is (a) that the 
electrical events in a nerve fiber are governed by the 
differential permeability of its surface membrane to 
sodium and potassium ions, and (b) that these perme- 
abilities themselves depend upon the electric field across 
the surface. The interaction of these two factors leads 
at a certain critical threshold level to excitation, that 
is, to a regenerative release of electrical energy from 
the axon membrane, and the propagation of this change 
along the fiber in the form of a brief, all-or-none, elec- 
trical impulse (the so-called spike or action potential). 
As F. O. Schmitt has pointed out (p. 455), when a 
physiologist speaks of the axon membrane, he usually 
has in mind an abstraction rather than a microscopically 
identified structure. This point, viz., the relation be- 
tween fine structure and functional properties, is dis- 
cussed later in this paper; here the main concern is to 
indicate what evidence there is for such physiological
“abstractions.”


RESTING NERVE FIBER


For the purpose of discussing the principal features
of the nerve signal, the structural picture can be reduced
to a cylindrical tube with a surface membrane
which separates two aqueous solutions of equal
osmolarity but of different chemical composition. In the 
external medium, more than 90% of the osmotic balance 
is made up of sodium and chloride ions; while inside the 
cell, these ions account for less than 10% of the solutes, 
potassium taking the place of sodium, and various 
organic anions (which presumably are synthetized with- 
in the cell itself) taking the place of chloride. Our main 
concern is with the concentration differences of sodium 
(about 10 times higher outside) and potassium (about 
30 times higher inside) across the cell surface. With the 
use of very fine KCl-filled micropipettes, it has been 
possible to penetrate the fiber surface without serious 
damage and to measure the electric PD between inside 
and outside. There is a potential difference across the 
membrane of some 60 to 90 mv (inside negative), while 
there are no detectable potential differences within the 
interior of the normal resting cell. 

Observations of this kind have been made on a variety 
of nerve and muscle fibers. Most of the evidence de- 
scribed in the following has been obtained from the 
giant axon of the squid whose large size (nearly 1 mm 
thick) allows one to introduce an assembly of fine elec- 
trodes along the inside to analyze the ionic content of a 
single fiber and to observe the movement of radio-tracers 
within selected regions of such fibers (see Hodgkin,⁴,⁵
also Huxley⁶ and Katz⁷ for review and references).

It should be mentioned perhaps that the nerve fibers 
of squid and other cold-blooded animals can be isolated 
and kept in a suitable salt solution. They remain in a 
functioning condition for many hours during which they 
can be made to propagate many thousands of impulses 
of approximately the same voltage and velocity as 
initially, in situ and with the blood circulation intact. 
Such fibers remain capable, to some extent, of replenish- 
ing the losses from their chemical stores which are drawn 
upon during periods of activity. Nevertheless, it is true 
that isolated tissues and, in particular, nerve fibers 
(which, after all, are only peripheral stumps of a cell) 
are no longer in a steady state, and their electrochemical 
accumulator gradually runs down. 

It is of interest that muscle fibers can be used to study 
propagation of impulses as well as nerve. Muscle fibers, 
unlike severed nerve axons, are self-contained cell units 
which possess a built-in apparatus of distributed nuclei 
(and which, incidentally, do not possess investing layers 
of satellite cells). The surface membrane of a muscle 
fiber has certain important properties in common with 
that of a nerve: it provides a selective barrier between 
cytoplasm and surroundings, and it serves to conduct 
electric excitation rapidly over the whole length from
the nerve-muscle junction to the tendon. This is an
essential mechanism for the skeletal muscle, for the
action potential provides the local stimulus to the con- 
tractile process at each point of the fiber, and a high 
speed of propagation is needed to elicit an efficient 
synchronous twitch. As far as has been ascertained, the 
properties of the vertebrate-muscle impulse are exactly 
analogous to those of the nerve action potential. 


MAINTENANCE OF THE STEADY STATE 


The first question to consider is: What are the mechanisms
responsible for the maintenance of a steady state
in a resting (i.e., nonstimulated) nerve or muscle cell?
In particular, how are the electrical and ionic concentration
differences kept up between interior and
surroundings?

Is potassium chemically bound? It has been suggested
from time to time that one does not require selective 
membrane properties in order to explain the preferential 
accumulation of potassium in the cytoplasm. It would 
be sufficient, for example, if the cellular proteins had a 
special chemical affinity to potassium rather than to 
sodium. This idea of chemical binding of potassium 
seemed unlikely on several grounds: it would be difficult 
to account for the osmotic pressure and electric con- 
ductivity of the cell contents unless at least a large 
proportion of the intracellular potassium was present 
in the form of free ions. The most direct evidence was 
obtained by Hodgkin and Keynes⁸ in the following
experiment. A region of an isolated Sepia axon (about
150 μ thick) was bathed in a droplet containing radiopotassium.
After a quantity of the labeled ions had
entered the axoplasm, the external droplet was washed 
away. It may be noted that the mixing between applied 
tracer and intracellular potassium takes many hours, 
but the exposure time was sufficient to build up an ade- 
quate amount of radioactivity within the axon. Subse- 
quently, the spread of this patch by diffusion along the 
interior of the fiber was determined; at the same time, 
a potential difference was applied to the ends of the 
fiber, and the speed with which the labeled patch moved 
toward the cathode was measured. The experiment 
showed that the diffusion coefficient and the electric 
mobility of the tracer ions were only slightly less inside 
the cell than in free aqueous solution. There was clearly 
a highly resistive surface barrier, for the entry of the 
externally applied potassium into the axoplasm was 
very slow; but once the tracer ions had passed this 
barrier, they continued to behave as free ions. 
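
The agreement between the two measurements can be checked against the Nernst-Einstein relation, which ties the electric mobility of an ion to its diffusion coefficient. The sketch below is illustrative only: the relation and the free-solution diffusion coefficient of potassium (about 1.96 × 10⁻⁵ cm²/sec near 25°C) are standard textbook values, not figures from the experiment, and the applied field strength is a hypothetical number chosen for the example.

```python
R = 8.314      # gas constant, J/(mol K)
F = 96485.0    # Faraday constant, C/mol


def mobility_from_diffusion(d_cm2_per_s, temp_c=25.0, z=1):
    """Electric mobility in cm^2/(V sec) from a diffusion coefficient
    in cm^2/sec, using the Nernst-Einstein relation u = z*F*D/(R*T)."""
    return d_cm2_per_s * z * F / (R * (temp_c + 273.15))


D_FREE = 1.96e-5   # K+ in dilute aqueous solution (textbook value, assumption)
FIELD = 0.5        # hypothetical longitudinal field, V/cm

u_free = mobility_from_diffusion(D_FREE)
drift = u_free * FIELD   # drift velocity of the labeled patch, cm/sec

print(f"mobility       = {u_free:.2e} cm^2/(V sec)")
print(f"drift velocity = {drift * 3600:.2f} cm per hour")
```

If the tracer ions were largely bound, the measured diffusion coefficient and the measured drift velocity would both fall far below the values this relation gives for free solution.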


BERNSTEIN MEMBRANE THEORY 


The experiment of Hodgkin and Keynes seems to rule
out the suggestion of chemical binding of potassium and
provides direct evidence for the existence of a resistive
surface membrane as envisaged by Bernstein more than
50 years ago. He postulated that the nerve membrane
was selectively permeable to potassium alone, and impermeable
to sodium, chloride, and the internal anions.
This was an attractive hypothesis, for it would explain
not only the maintenance of steady ionic concentration
gradients, but also the existence of the resting potential
as resulting from a potassium-concentration cell. Moreover,
it was possible to predict an inverse logarithmic
relation between the external potassium concentration
and the membrane PD, and over a fairly wide range
there was satisfactory agreement between theory and
experiment. Bernstein went one step further. He postulated
that the membrane permeability to the other ions
increased during electric excitation so that the impulse
should be accompanied by a sharp increase of the membrane
conductance and a drop toward zero of the membrane
potential (or rather to the low level of a liquid
junction PD between axoplasm and external fluid). For
many years, these ideas were accepted as a reasonable
explanation, and the evidence, rather inaccurate and
indirect as most of it was, tended to support it. More
recently, it was strengthened by the observation of
Cole and Curtis⁹ that the rising phase of the action
potential indeed is accompanied by a rapid 40-fold increase
of the ionic membrane conductance.

FIG. 1. Action potential recorded with an internal electrode
from a squid giant axon. The scale shows the internal potential
in millivolts, relative to the outside bath. The time marks are at
2-msec intervals [from A. L. Hodgkin and A. F. Huxley, Nature
144, 710 (1939)].
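
The potassium-concentration-cell idea can be made concrete with the Nernst relation. What follows is a minimal sketch, assuming a temperature of 20°C and the approximate concentration ratios quoted earlier (potassium about 30 times higher inside, sodium about 10 times higher outside); the function name is illustrative and not taken from the source.

```python
import math

R = 8.314      # gas constant, J/(mol K)
F = 96485.0    # Faraday constant, C/mol


def nernst_mv(ratio_out_over_in, temp_c=20.0, z=1):
    """Equilibrium potential of the inside relative to the outside, in
    millivolts, for an ion with the given outside/inside concentration ratio."""
    t_kelvin = temp_c + 273.15
    return 1000.0 * R * t_kelvin / (z * F) * math.log(ratio_out_over_in)


e_k = nernst_mv(1.0 / 30.0)    # potassium: about 30 times higher inside
e_na = nernst_mv(10.0)         # sodium: about 10 times higher outside
per_decade = nernst_mv(10.0)   # change in PD per tenfold change in ratio

print(f"E_K  = {e_k:.0f} mv")
print(f"E_Na = {e_na:.0f} mv")
print(f"slope = {per_decade:.0f} mv per tenfold change in concentration")
```

Under these assumptions the potassium equilibrium potential comes out near the upper end of the observed resting range of 60 to 90 mv (inside negative), while the sodium equilibrium potential is of opposite sign, so that a membrane permeable to potassium alone could never show an inside-positive potential.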

The flaws began to appear soon after, when increasing 
use was made of the critical methods introduced by 
Cole and by Hodgkin, viz., the single axon technique 
and the use of intracellular recording, and also when 
improved chemical and tracer methods were employed 
in observing ion movements. In 1939, Hodgkin and
Huxley¹⁰ (see also Curtis and Cole¹¹) discovered that the
membrane potential during the nerve impulse does not
move simply toward the zero level (Bernstein’s predicted
“depolarization”), but reverses substantially to
a level of 40 to 50 mv, inside positive (see Fig. 1).
This was clearly incompatible with the second part of
the Bernstein theory just now outlined. Two years later,
Boyle and Conway¹² demonstrated that muscle fibers
are permeable to chloride as well as to potassium, and
shortly afterward it became clear that even sodium can
enter the fibers, though apparently it encounters a much
higher membrane resistance than does potassium.


SODIUM PUMP


This raised an altogether new problem, for the demonstration
of even a small permeability to sodium made it
impossible to regard the physical properties of the surface
barrier as sufficient to maintain the concentration
and potential gradients in a steady state. Potassium
and chloride appear to be very nearly in electrochemical
equilibrium, their large concentration differences being
balanced approximately by the membrane potential.
But, in the case of sodium, both driving forces act in
the same direction and tend to shift the ion into the
interior. If the membrane resistance toward sodium is
not infinite, then some mechanism entirely different
from the ones discussed so far is required to keep its
internal concentration at a steady low level.

It was at this stage that the postulate of a metabolic
sodium pump was introduced, that is, some secretory
process in which an energy-yielding reaction of the cell
is utilized to perform the work of a Maxwell demon,
moving sodium ions uphill and outward through the
membrane as fast as they leak into the cell in the direction
of the electrochemical gradient. There was nothing
intrinsically implausible about the suggestion, for the
permeability of the resting cell surface to sodium is
normally so low that the leakage rate remains very
small, and the work required of the pumping process
amounts to only a small fraction of the energy which is
continuously being made available by the metabolism
of the cell.¹³

That there is, in fact, a direct relation between cellular
metabolism and sodium extrusion was first shown by
Hodgkin and Keynes¹⁴ who observed a cessation of the
efflux of tracer sodium when the axon was treated with
certain metabolic inhibitors (dinitrophenol, azide, cyanide).
This was a reversible change and one which specifically
affected the extrusion rate of sodium, not its
rate of leakage into the cell. More recently, Caldwell
and Keynes¹⁵ have shown that “pumping” is temporarily
resumed by the cyanide-inhibited axon if a dose of
adenosinetriphosphate or arginine phosphate is injected
into the axoplasm.

At one time, it was thought that the sodium pump
operated by expelling sodium alone, and in doing so
was capable of separating electric charges at the fiber
surface, moving them against the existing potential gradient,
and so contributing directly to the build-up
of the resting emf. This idea has been made unlikely
by the findings (a) that the extrusion of sodium is
unaffected by the potential gradient against which
an electrochemical pump would have to work, and
(b) that blocking sodium extrusion by metabolic
poisons has no immediate effect on the resting (or
action) potential, which runs down very slowly as a
result of the gradual diminution of the ionic concentration
gradients. It appeared, therefore, that the ion
pump works on a principle of electroneutrality, either
taking sodium out in company with some anion (e.g.,
bicarbonate or phosphate), or through a cation-exchange
process. There is evidence that the latter is the prevailing
mechanism, and that extrusion of sodium and internal
uptake of potassium are linked closely in the
pumping process. The most suggestive piece of evidence
is the finding that the extrusion rate of sodium becomes
reduced greatly when potassium is withdrawn from the
bath solution, and that, in general, the rates of potassium
entry and of sodium efflux change simultaneously
and in a parallel manner.

To summarize present views of the ionic maintenance
mechanisms of the resting cell, one imagines that the
cellular metabolism is responsible for the upkeep of ionic
concentration differences between cell and surroundings,
(1) by synthetizing large organic anions which cannot
diffuse through the membrane, and (2) by providing
the energy for an active ion-exchange mechanism, expelling
sodium and accumulating potassium ions, possibly
one for one. The cell membrane has a generally
extremely low ionic permeability so that even in the
complete absence of pumping action it will take many
hours before the Na/K concentration gradients run
down. The ionic permeability of the resting membrane,
apart from being small, is also differentially very selective,
that to potassium being much higher than to
sodium, so that the PD across the resting membrane
approximates to that of a potassium-concentration cell,
though it does not quite reach this level (it would do
so only if the sodium permeability were negligible).

The evidence for the existence of a cation pump is
impressive; there is little doubt that the required Maxwell
demons are being paid for; the account seems
satisfactory not only in terms of available calories, but
even in the form of the available currency (made up
of ATP and other acceptable phosphates). But, as in
most instances of specific cellular utilization of metabolic
energy yields, there is no theory at present to
explain how the so-called driving reaction is geared to
the particular purpose of expelling sodium and piling
up potassium. This is one of the familiar challenging
gaps waiting to be bridged, even if only by a temporary
working hypothesis.


CABLE PROPERTIES OF NERVE


It has been seen from the tracer experiments that the
surface membrane presents a formidable barrier to the
diffusion of ions. This is consistent with the conventional
picture of the membrane consisting of, or containing, a
thin layer of insulating lipid material. It may correspond
to the “double contours” of 50- to 100-Å thickness which
have been seen in electron micrographs of various cellular
or intracellular surface structures.

The conclusions which have been reached from chemical
and tracer measurements are in general agreement
with what is known about the electrical properties of 
nerve and muscle fibers. By studying alternating-current 
impedance, or by analyzing the attenuation and tempo- 
ral distortion along the fiber surface of a small voltage- 
step signal applied at one point between inside and 
outside, it had been shown that the interior of axons 
and muscle fibers behaves as a conducting cylinder of 
slightly higher electrolytic resistivity than the outside 
fluid but is separated from the outside by a surface layer 
of very low conductance (of the order of 10⁻³ to 10⁻⁴
ohm⁻¹ cm⁻²) and of high capacitance. The fibers possess,
therefore, electrical properties which are analogous to 
those of a cable, and one may ask to what extent these 
properties can be used for the purpose of signal trans- 
mission. Now, taken as a passive transmission line, the 
axon would be of little use, for its cable losses are very 
great: its surface leakage and the resistivity of its core 
are some 10⁸ times greater, and its sheath capacity
about 10⁶ times higher than those of an ordinary commercial
cable. In fact, a weak applied square-pulse signal
(i.e., an electric pulse of subthreshold intensity which 
fails to excite the inherent relay mechanism of the axon 
membrane) fades out and becomes badly blunted within 
a few millimeters of its origin. It is clear, therefore, that 
the cable properties of the nerve are quite insufficient 
to serve the propagation of a message over the required 
distances, unless the deficiencies of the cable are made 
up by a special boosting process. This is indeed the case, 
and the sole purpose of the excitatory mechanism is to
regenerate and to reamplify the signal at each point of
the line, and so to insure its forward passage without
attenuation or distortion of its brief wave form.

FIG. 2. Diagram illustrating the response of a nerve or muscle
fiber to an electric stimulus applied at a distant point. The inset
diagram shows the arrangement of stimulating and recording
electrodes. In (a), a recording micropipette is inserted into the
fiber interior, showing the resting potential across the cell membrane.
Vertical scale indicates level of potential at the tip of the
micropipette relative to the outside bath. Current pulses are
applied through the distant stimulating electrodes, first inward,
then outward through the membrane. No response is produced
until, in (b), the outward pulses exceed a critical threshold when
an all-or-none action-potential wave is recorded.

FIG. 3. Local and propagated potential changes, recorded at
the site of stimulation from a single crab nerve fiber. Inset diagram
shows arrangement of recording (1 and 2) and stimulating
(3 and 4) electrodes, all four being placed on the axon surface, 1
and 3 close together. The records show, superimposed, the local
potential changes produced by 6 successive pulses of threshold
strength (duration of pulse ca 2 msec, beginning and end marked
by brief artifacts). The potential change either died out locally
or flared up after a variable delay into a large propagating action
potential (about 100 mv peak-to-peak). Its diphasic nature is due
to the wave of surface negativity passing first point 1, then point 2.
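
The statement that a subthreshold signal fades within a few millimeters can be illustrated with the steady-state length constant of a leaky cable. The sketch below rests on assumptions that are not in the text: the standard core-conductor formula with external resistance neglected, a membrane conductance of 10⁻³ ohm⁻¹ cm⁻², an axoplasm resistivity of 30 ohm cm, and an axon radius of 0.25 mm. All three numbers are illustrative.

```python
import math


def length_constant_cm(g_m, rho_i, radius_cm):
    """Steady-state length constant of a cylindrical cable, in cm.

    g_m       membrane conductance per unit area, ohm^-1 cm^-2
    rho_i     resistivity of the core, ohm cm
    radius_cm fiber radius, cm
    Uses lambda = sqrt(R_m * a / (2 * rho_i)) with R_m = 1 / g_m and
    neglects the resistance of the external fluid.
    """
    r_m = 1.0 / g_m
    return math.sqrt(r_m * radius_cm / (2.0 * rho_i))


G_M = 1e-3       # assumed membrane conductance, ohm^-1 cm^-2
RHO_I = 30.0     # assumed axoplasm resistivity, ohm cm
RADIUS = 0.025   # assumed radius of a large squid axon, cm

lam = length_constant_cm(G_M, RHO_I, RADIUS)
print(f"length constant = {10.0 * lam:.1f} mm")
```

Because the length constant grows only as the square root of the radius, even the largest fibers gain little, and thinner fibers fare correspondingly worse.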

It is worth emphasizing that the presence of a cable- 
like mechanism, insufficient though it is for long-distance 
signaling, forms an essential link in the process of im- 
pulse propagation: it allows the action potential which 
has arrived at one point to impart a stimulus to the 
next region and to excite its latent relay mechanism. 
Moreover, the passive cable spread is of great importance 
for the integration of converging messages which takes 
place in the cells of the central nervous system. These 
cells become excited, or inhibited, by the interaction 
and summation of many subthreshold electrical changes 
which are imposed on different points of their surface 
membrane, all within a short range of the central cell 
body. The effective summation of such converging elec- 
trical influences is possible, because over short distances 
(a fraction of a millimeter) the cable properties of the 
cell are quite adequate, and the subthreshold potential 
changes do not suffer much attenuation. They can, 
therefore, sum up to the firing point at which an impulse 
emerges, capable of traveling the whole length of the 
axon process. 


PROCESS OF EXCITATION 


What is the nature of the excitatory relay mechanism 
by which the impulse is boosted to its full strength? 
The process can be demonstrated by recording the elec- 
trical events at the point of the fiber at which an impulse 
is initiated by an applied electric stimulus (Figs. 2-4). 
When a square pulse of current is passed inward through
the axon membrane, the membrane potential is displaced
from its resting level of, say, -60 mv inside to a higher
level of internal negativity. The potential change takes
a certain time to develop and to decay; this time course
depends upon the resistance and capacity of the fiber,
as in an ordinary leaky and capacitative cable structure.
Currents of this direction, which increase the membrane
PD, do not excite, but simply produce local potential
changes approximately proportional to current strength,
and fading out along the fiber with a decrement of
about 50% per 1 or 2 mm, characteristic of its cable
coefficients.

FIG. 4. Diagram showing local potential changes across axon
membrane, due to rectangular current pulses (4-msec duration,
varying direction and amplitude). At threshold, the membrane
potential is displaced to a level at which it is in a state of unstable
equilibrium, either flaring up into a propagating signal or subsiding
locally.
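
The quoted decrement can be turned into numbers with a short sketch, assuming a purely exponential steady-state decay along the fiber (an idealization, not a claim made in the text). The function names and the 10-mm test distance are illustrative choices.

```python
import math


def length_constant_from_half_distance(half_mm):
    """Length constant (mm) of an exponential decay that halves every half_mm."""
    return half_mm / math.log(2.0)


def remaining_fraction(distance_mm, half_mm):
    """Fraction of the original potential change left after distance_mm."""
    return 0.5 ** (distance_mm / half_mm)


for half in (1.0, 2.0):
    lam = length_constant_from_half_distance(half)
    frac = remaining_fraction(10.0, half)
    print(f"50% per {half:.0f} mm: length constant {lam:.2f} mm, "
          f"{100.0 * frac:.2f}% left after 10 mm")
```

On this reckoning only a few percent of a passively conducted signal, at best, would survive a single centimeter of nerve.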
When the polarity is reversed and weak current in- 
tensities are employed, very similar local potential 
changes of the opposite direction (reducing the mem- 
brane PD, or partially depolarizing) are observed. How- 
ever, as the current strength is increased the picture 
alters. The membrane potential changes progressively, 
more than in proportion to the applied current, and a 
point is reached at which the depolarization becomes 
regenerative and, after withdrawal of the current, the 
decay of the effect greatly retarded. A slight further 
increase in current strength causes a flare-up of the 
potential change which now passes through an auto- 
matic cycle; it rises to the full height of the propagating 
action potential and then returns to the initial level 
within 1 or 2 msec. This event (known as the spike) 
travels at constant speed (about 25 m/sec in the squid 
fiber at 20°C) down the axon, without loss of amplitude 
anywhere; it leaves behind a short refractory period of 
a few msec, after which the axon is again excitable and 
capable of repeating the event. It is interesting that 
even below the firing point (or threshold), some local 
“smouldering” seems to go on. Indeed, there is a good 
analogy between the local electrical events in a nerve 
fiber and any other form of regenerative chain reaction, 
in which the initially applied stimulus is reinforced by 
the reaction which it produces. It would be possible, 
for instance, to obtain a very similar family of kinetic 
curves, if one used an explosive gas mixture (say hydrogen 
and oxygen), applying or withdrawing heat instead 
of current, and plotting on a suitable time scale the 
temperature of the gas instead of membrane potential. 
Well below the ignition point, the temperature changes 
along an exponential time course and equilibrates at a 
level determined by a balance between heat leakage to 
the surroundings and rate of heat supply. Close to the 
ignition point, some gas molecules combine and produce 
extra heat, but not at a sufficient rate to keep the re- 
action going when the applied heat source is switched 
off. At the ignition point, the rate of reactive heat pro- 
duction just balances the leakage to the cooler surround- 
ings and, for a while, maintains the system in unstable 
balance from which either an explosive flare-up or a 
return to the ambient stable temperature occurs. Apart 
from this formal analogy, there is, of course, little 
resemblance between an explosive exothermic process 
and the much more subtle and repeatable release of 
electric energy by the nerve membrane. 


PROPERTIES OF THE NERVE IMPULSE 


Several questions arise at this point: (1) What is the 
nature of the regenerative factor in a nerve fiber which 
causes an initially applied potential change to amplify 
itself? (2) What brings the action potential down to the 
original baseline? (3) What causes the system to become 
temporarily inexcitable? (4) How does the axon recover 
from its impulse activity? 


Regenerative Factor 


It has been shown that the sodium conductance 
of the membrane depends upon the electric field across 
it; the sodium conductance is very low at the normal 
level of the resting potential, but increases, temporarily, 
when this potential difference is reduced. How this 
comes about is not known and remains one of the out- 
standing problems, but the statement that sodium 
permeability is raised by a depolarization of the axon 
membrane is based on very strong evidence. The consequences 
of this effect are far-reaching: suppose a partial 
depolarization is produced by passing current outward 
through the membrane. The consequent rise in sodium 
permeability causes an increased leakage of these ions 
down their concentration gradient into the fiber. As a 
result, positive charge is transferred to the interior and 
the membrane potential is diminished further. This is 
the regenerative step which causes the permeability to 
sodium to rise again and more sodium ions to enter. At 
a certain level of the membrane potential, the threshold, 
this process becomes progressive and flares up into the 
action-potential wave. Like the ignition point of the 
explosive-gas analogy, threshold is a point of unstable 
equilibrium at which restoring and regenerative tend- 
encies just balance each other. The regenerative factor 
responsible for the automatic ascent of the action potential 
is the progressive relation between sodium-conductance 
change and fall of membrane potential; the restoring 
factor is to be found in the potassium (and 
chloride) conductances which tend to reset the mem- 
brane potential to the initial stable baseline. 
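
The notion of threshold as a point of unstable equilibrium can be illustrated with a deliberately crude caricature. The sketch below is not the Hodgkin-Huxley formulation: it assumes a single dimensionless variable v (0 = resting level, 1 = full depolarization) obeying dv/dt = v (v - a) (1 - v), in which the cubic lumps together the restoring and regenerative tendencies and a is a hypothetical threshold. All names and values are assumptions chosen for illustration.

```python
def simulate(v0: float, threshold: float = 0.2,
             dt: float = 0.01, steps: int = 5000) -> float:
    """Integrate dv/dt = v * (v - threshold) * (1 - v) by forward Euler.

    v is a dimensionless displacement of the membrane potential
    (0 = rest, 1 = fully regenerative state). Returns v at the end.
    """
    v = v0
    for _ in range(steps):
        v += dt * v * (v - threshold) * (1.0 - v)
    return v


# Displacements just below and just above the assumed threshold of 0.2.
for start in (0.19, 0.21):
    print(f"start at {start:.2f} -> ends near {simulate(start):.3f}")
```

A displacement slightly below the threshold subsides toward rest, while one slightly above it flares up toward the fully depolarized state; at the threshold itself the two tendencies just balance.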

Once the threshold level is exceeded, the sodium 
conductance becomes rapidly the dominant factor and 
the regenerative entry of sodium would continue until 
the membrane potential swings well to the other side, 
and the interior attains sufficient positive charge to 
balance the inward concentration gradient and prevent 
any further net influx of sodium ions. In other words, 
if the initial excitatory process continued, a new equi- 
librium level would be reached corresponding to that 
of a sodium-concentration cell. This is, indeed, the 
accepted explanation for the reversal of the membrane 
potential during the peak of the impulse. The level 
which is attained falls somewhat short of the sodium 
potential, because the initial permeability change is only 
transient: within about a millisecond, the high, specific 
sodium conductance is switched off and apparently is 
converted into a state of high potassium conductance. 
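
The equilibrium level of a sodium-concentration cell follows from the Nernst relation, E = (RT/zF) ln(c_out/c_in). The sketch below is illustrative only; the concentrations are assumed round figures of the order usually quoted for the squid axon and are not taken from this paper.

```python
import math

R = 8.314      # gas constant, J / (mol K)
F = 96485.0    # Faraday constant, coulombs per mole


def nernst_mv(c_out: float, c_in: float,
              temp_c: float = 20.0, z: int = 1) -> float:
    """Equilibrium potential (inside minus outside) in millivolts."""
    temp_k = temp_c + 273.15
    return 1000.0 * R * temp_k / (z * F) * math.log(c_out / c_in)


# Assumed concentrations in mM (hypothetical round figures).
e_na = nernst_mv(c_out=440.0, c_in=50.0)   # sodium: inside positive
e_k = nernst_mv(c_out=20.0, c_in=400.0)    # potassium: inside negative
print(f"E_Na = {e_na:+.1f} mv, E_K = {e_k:+.1f} mv")
```

With these assumed ratios the sodium level lies some tens of millivolts positive inside and the potassium level some tens of millivolts negative, so that a swing from one toward the other amounts to a reversal of the membrane potential.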


Restoring Factor 


This secondary changeover, from sodium to potassium 
permeability, ensures a rapid return of the potential to 
its original baseline, for potassium ions now will leave 
the fiber rapidly in the direction of their electrochemical 
gradient. Such electric restoration would be effected 
even without a secondary increase of potassium con- 
ductance, provided the initial sodium-permeability rise 
is cut off (a process known as inactivation). In this 
event, however, the decline of the action potential would 
take a relatively long time and proceed with the time 
constant of the resting-fiber membrane. In the squid 
axon, it would take several milliseconds at the ordinary 
leakage rates of potassium (and chloride) to restore the 
membrane potential to the resting level. By opening up 
the potassium channels, this process becomes greatly 
accelerated, and a few milliseconds later the axon is 
ready to fire again. 

The work of Hodgkin and Huxley has shown that a 
depolarization imposed on the axon membrane (by the 
so-called voltage-clamp method) produces these two 
successive permeability changes (Na, then K), the in- 
tensity of both changes increasing similarly with the 
amplitude of the enforced voltage step. During the 
normal action potential, the sodium-conductance rise is 
regenerative and self-reinforcing, while the potassium- 
permeability change is restorative and gradually cuts 
itself down as the membrane potential is caused to 
return to the original stable level. 

It should be noted that the electrical return of the 
system during the impulse is not brought about by a 
reversal of the ionic movement (which would mean a 
forced expulsion of the quantity of sodium which had 
entered during the ascending phase of the action poten- 
tial), but by the leakage of an equivalent quantity of 
Fig. 5. Diagram of action potential traveling (in direction of 
arrow) along the axon. During the rise of the wave, sodium ions 
enter the fiber and charge the interior positively. During the 
decline, potassium ions leave the fiber and restore the initial 
membrane potential. 


potassium ions (see diagram in Fig. 5). Both sodium 
and potassium are moving downhill during the impulse; 
the quantities, however, are so small that no detectable 
change of the internal concentration would occur during 
one impulse, and there is plenty of time for the pumping 
process to restore the chemical situation in resting 
periods later on. 


Nature of the Refractory Period 


The secondary permeability effects which bring about 
the decline of the action potential are also responsible 
for the short period of inexcitability (refractoriness) 
which follows the propagation of each impulse. Clearly, 
during the state of inactivation, i.e., while the regenerative 
dependence of sodium permeability on membrane 
potential is in abeyance, an applied electric current 
fails to excite; secondly, a state of very high potassium 
permeability implies that the restoring factor is greatly 
strengthened and tends to oppose any displacement of 
the membrane potential from the stable potassium-equi- 
librium level. Hodgkin and Huxley have determined 
the rate constants of these changes and their time 
course of decline following a restoration of the normal 
membrane potential. It was found that the period of 
recovery of excitability after the impulse is simply a 
measure of the time by which the return of the normal 
permeability relations lags behind the return of the 
membrane potential. 


Chemical Recovery 


There are several phases of restoration: the recovery 
of the membrane potential is followed by a return of 
excitability, but the system has not yet reverted completely 
to its initial state; in fact, a minute fraction of 
the intracellular potassium store has been lost and replaced 
by sodium. The downhill movement of sodium 
and potassium ions across the membrane has provided 
the energy for the nerve signal. The amount of ions 
which exchange during the impulse is extremely small; 
it has been determined for a series of impulses in tracer 
experiments, and by chemical analysis of single fibers, 
and was found to exceed the minimum coulomb require- 
ment by a factor of 3. The transfer of charge needed to 
alter the membrane potential by 100 mv, with a capacity 
of 1 μf/cm², is 10⁻⁷ coul/cm², which requires a net 
transfer of about 10⁻¹² mole/cm² of univalent ions: the net 
exchange of sodium and potassium per impulse is about 
3 to 4 times as much. These quantities produce only a 
minute internal concentration change per single impulse 
(about 0.0001% in the large squid axon, but considera- 
bly more in small fibers whose volume/surface ratio is 
correspondingly less). Even if the ion pump were not 
active, a squid axon could conduct some hundred 
thousands of signals before its accumulated ion store 
would be exhausted. As in many other forms of brief, 
high-intensity biological action, a period of impulsive 
activity is completed at the expense of small but cumu- 
lative chemical losses; these are made good during 
periods of rest by continuous chemical recovery mecha- 
nisms which Connelly discusses (p. 475). 
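
The arithmetic of the preceding paragraph can be checked in a few lines. The capacity, the voltage swing, and the factor of 3 to 4 are taken from the text; the axon diameter (0.5 mm) and the internal potassium concentration (0.4 M) are assumed round figures for a large squid axon, introduced here only to estimate the order of magnitude of the concentration change.

```python
FARADAY = 96485.0  # coulombs per mole of univalent ions


def charge_per_cm2(capacity_f_per_cm2: float, delta_v_volts: float) -> float:
    """Charge (coul/cm^2) needed to shift the membrane potential."""
    return capacity_f_per_cm2 * delta_v_volts


def moles_per_cm2(charge_coul_per_cm2: float) -> float:
    """Univalent ions (mole/cm^2) carrying that charge."""
    return charge_coul_per_cm2 / FARADAY


def percent_change(moles_cm2: float, diameter_cm: float,
                   conc_molar: float) -> float:
    """Relative internal concentration change (%) for a cylindrical fiber.

    Volume per unit surface area of a cylinder is diameter / 4 (cm).
    """
    litres_per_cm2 = (diameter_cm / 4.0) / 1000.0
    return 100.0 * (moles_cm2 / litres_per_cm2) / conc_molar


q_min = charge_per_cm2(1e-6, 0.100)   # 1 uf/cm^2 charged through 100 mv
n_min = moles_per_cm2(q_min)          # minimum coulomb requirement
n_net = 3.5 * n_min                   # text: 3 to 4 times the minimum

# Assumed values (not from the text): 0.5-mm axon, 0.4 M internal K.
pct = percent_change(n_net, diameter_cm=0.05, conc_molar=0.4)
print(f"charge  = {q_min:.2e} coul/cm^2")
print(f"minimum = {n_min:.2e} mole/cm^2")
print(f"change  = {pct:.1e} % per impulse")
```

Under these assumptions the change per impulse comes out at the order of 0.0001%, in agreement with the figure quoted above, and the inverse dependence on diameter shows why small fibers are affected considerably more.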

A rough qualitative picture has been given here of 
the conclusions derived from the extremely accurate 
quantitative work of Hodgkin and Huxley and others, 
and reference is made to their papers for a description 
of the experimental evidence on which these conclusions 
are based. Summarizing briefly, it has been found 
that the resting potential is related to the external 
potassium but independent of sodium concentration, 
while the peak of the action potential depends on the 
sodium-concentration ratio in the way predicted by the 
Nernst theory. In experiments in which the axon mem- 
brane was subjected to a voltage clamp, it was found 
that a sudden stepwise depolarization (e.g., to zero 
membrane potential) was followed by a surge of positive 
inward current followed by a maintained outward cur- 
rent. The first, transient, component (in which current 
flows “in opposition to Ohm’s law” indicating the pres- 
ence of the required negative resistance, or regenerative 
factor) could be identified as owing to the inrush of 
sodium, for it could be made to reverse in polarity by 
reversing the concentration gradient of sodium ions. 
With a variety of sodium concentrations, the reversal 
was shown to occur invariably at the electrochemical 
equilibrium potential for sodium ions. The second, maintained, 
component was identified by tracer-efflux measurements 
as a maintained high-intensity outward current 
of potassium ions. The rate constants of the several 
conductance changes were determined, and 
from this information the properties of the impulse, its 
threshold, its amplitude, duration, and conduction 
velocity, could be synthesized with a high 
degree of precision. 
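
The reversal of the early current at the sodium equilibrium potential is what one expects if the current through each ionic pathway is proportional to the driving force, I = g (V - E). The sketch below assumes this simple ohmic form; the conductance and the equilibrium potentials are hypothetical values chosen only to show the change of sign, not measurements.

```python
def ionic_current(g_ms_cm2: float, v_mv: float, e_rev_mv: float) -> float:
    """Current density (uA/cm^2) through one ionic pathway.

    Positive values are outward, negative values inward.
    """
    return g_ms_cm2 * (v_mv - e_rev_mv)


G_NA = 20.0     # mS/cm^2, assumed sodium conductance just after the step
V_CLAMP = 0.0   # mv, clamped membrane potential (inside minus outside)

# Lowering the external sodium shifts E_Na; three assumed cases.
for e_na in (55.0, 0.0, -20.0):
    i_na = ionic_current(G_NA, V_CLAMP, e_na)
    if i_na < 0:
        direction = "inward"
    elif i_na > 0:
        direction = "outward"
    else:
        direction = "zero"
    print(f"E_Na = {e_na:+6.1f} mv: I_Na = {i_na:+8.1f} uA/cm^2 ({direction})")
```

The current is inward so long as the clamped potential lies below the sodium equilibrium potential, vanishes when the two coincide, and turns outward when the concentration gradient is reversed.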


While this series of investigations has been brought 
to a conclusion, this should, of course, not 
obscure the fact that the nature of the permeability 
relations and the reason why membrane potential and 
ionic conductances are linked in such a specific manner 
remains an altogether unsolved problem at the present 
time. 


SOME SPECIAL CASES 
Medullated Nerve Fibers 


The picture presented so far describes the situation 
in the giant axon of the squid, on which most of the 
crucial experiments have been carried out. But sub- 
stantially the same conclusions apply to all other types 
of nerve fibers and to most types of skeletal-muscle 
fibers which have been studied. There is, however, an 
important structural difference between the so-called 
nonmedullated nerves, and the medullated (myelinated) 
axons of vertebrates which possess relatively thick, 
segmented sleeves of myelin broken at regularly spaced 
intervals (the nodes of Ranvier, about one or a few 
millimeters apart) at which a small area of axon mem- 
brane is exposed to the outside fluid. In the ordinary 
nonmedullated fibers, the propagation of the signal de- 
pends upon the existence of a continuous relay mecha- 
nism which is built in along the whole line at many 
scattered random points of the axon membrane. This 
makes up for the deficiencies of the poor cable transmission: 
without it, the action potential would be attenuated 
rapidly in the region ahead. Experimentally, this 
can be shown to be the case by poisoning or anaesthe- 
tizing a narrow region of the axon cylinder and so 
abolishing locally the regenerative mechanism. Pro- 
vided, however, the attenuated signal which spreads to 
the other side of the blocked region exceeds the threshold 
depolarization (which amounts to only 3} to 4 of the full 
amplitude of the spike), the relay process will start 
again and the impulse will jump the block. 

In medullated nerves, transmission of signals normally 
occurs in jumps. The segmented myelin sheath 
provides a relatively good, low-capacity insulation 
around the axon, and the relay mechanism is concen- 
trated and restricted to the discrete nodes of Ranvier 
where the axon membrane is exposed. There is, in other 
words, a rather close miniature analogy to a submarine 
cable with relay stations interposed at the nodes. In the 
internodal region, the fiber behaves as a relatively good 
cable with only slight capacitative losses which are made 
good at the next node. The final result is a great im- 
provement in speed and economy of signaling: the ionic 
losses are restricted to relatively few microscopic areas 
of the fiber surface, and the propagation velocity of the 
impulse is some 10 times greater than for a nonmedul- 
lated axon of the same size. To achieve the speed which 
medullated axons attain without the provision of a 
myelin sheath, the only alternative way for nature 
would have been to reduce the internal core resistance; 
that is, to increase the fiber size. Considering the need 
for a large number of separate signaling channels within 
a confined space (there are one million medullated axons 
in the human optic nerve alone), the great advantages 
of myelination are obvious. 
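
The cost of gaining speed by size alone can be put in rough figures. The sketch below assumes the usual cable-theory result that the conduction velocity of a nonmedullated fiber varies as the square root of its diameter; this scaling law is an assumption brought in from outside the present text, and the function name is illustrative.

```python
def diameter_factor_for_speedup(speedup: float, exponent: float = 0.5) -> float:
    """Factor by which the diameter must grow if velocity ~ diameter**exponent."""
    return speedup ** (1.0 / exponent)


speedup = 10.0                           # gain attributed to myelination in the text
d_factor = diameter_factor_for_speedup(speedup)
area_factor = d_factor ** 2              # cross-sectional area goes as diameter squared
print(f"{speedup:.0f}x faster needs {d_factor:.0f}x the diameter "
      f"and {area_factor:.0f}x the cross-section")
```

On this assumption a tenfold gain in velocity would demand a hundredfold increase in diameter, which could hardly be reconciled with packing a million fibers into one nerve.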


Other Tissues 


Most of the tissues, nerve or skeletal muscle, whose 
function depends upon the rapid propagation of an 
all-or-none impulse, appear to make use of a regenera- 
tive entry of sodium ions in the course of excitation; 
there is, however, some doubt as to whether the second- 
ary increase of potassium permeability is a similarly 
wide-spread phenomenon, or whether it may be a more 
or less dispensable effect. In some tissues, e.g., certain 
medullated axons, the potassium resistance and the 
electric time constant of the resting membrane appear 
to be sufficiently low to insure a rapid decline of the 
action potential once the sodium conductance has 
dropped back towards the initial low level. 

In heart muscle, the action potential has a different 
shape, of “flat top” and long duration. The initial fast 
rise is probably governed by a sodium mechanism just 
as in nerve, but the subsequent changes differ at least 
in quantity. It may be that the maintained plateau of 
depolarization arises from the increased sodium conductance 
not being switched off completely, and from a 
failure of the potassium efflux to rise above its steady 
rate. It is interesting to consider the usefulness of this 
long flat-topped heart potential: its duration apparently 
governs the systolic contraction of the heart muscle, 
and because of the relatively low frequency of its regular 
beat, there is no need for rapid electric recovery of the 
membrane potential. 

In skeletal muscle, a quantitative difference from 
nerve was found in the large capacity of the surface 
membrane (5 to 10 μf/cm² as against 1 μf/cm² in 
nerve). There is a correspondingly larger transfer of 
charge during the action potential, and recent tracer 
measurements by Hodgkin and Horowicz* have shown 
that the amounts, per unit surface area, of sodium and 
potassium exchange are about 3 times greater than in 
the nerve fibers. There appears to be less temporal 
overlap of the two ionic permeability changes; indeed, 
resistance measurements indicate that there are two 
discrete phases of high membrane conductance during 
the action potential. This would enable the sodium ions 
which enter to produce a relatively larger net effect. It 
also seems that the delayed rise of potassium conduc- 
tance is shortlived and stops before the action potential 
has returned to the baseline, giving rise to a character- 
istic long tail (the so-called negative afterpotential) of 
the muscle spike. 

There are finally some tissues (e.g., crustacean muscle 
and certain conducting plant cells) in which action 
potentials of the usual propagating all-or-none type can 
be elicited by electric stimulation, but sodium entry 
does not seem to be responsible for their regenerative 
ascent. In many of their features, these action potentials 
resemble those of the nerve axon very closely. It appears 
that the underlying events are similar and involve 
ionic permeability changes, but that the specific chan- 
nels which are being opened during the depolarization 
of the membrane are different. For example, in certain 
algae, a regenerative exit of chloride from the cytoplasm 
takes the place of sodium entry, while in crustacean 
muscle, entry of divalent cations possibly may be re- 
sponsible for the rise of the action potential. 


Structure and Function 


To return to some of the problems raised by F. O. 
Schmitt (p. 455): Can one try now to identify our 
physiological mechanisms with any one of the structures 
revealed by the electron microscope? The question has 
been raised, for instance, whether ion permeabilities are 
entirely to be attributed to the axon membrane, or could 
they be properties of Schwann cells which envelop the 
axon? Insofar as the observations have been repeated 
on skeletal-muscle fibers, which are not invested with 
satellite cells, and insofar as they corroborate the main 
conclusions derived from the nerve experiments, one 
may feel that the presence of a Schwann-cell envelope 
does little to mask or mimic the electrical properties of 
the axon membrane. It is important also to recall that 
the cellular investment of the axon is not complete, 
but there are gaps through which the axon surface 
communicates with the extracellular fluid. Although 
these channels may be narrow (only a few hundred 
Ångström units wide), they are also short (a few microns 
long) and probably add only a small resistance in series 
with the axon membrane. There is evidence for such an 
external nonreactive resistance of about 5 ohms in series 
with a resting membrane of approximately 1000 ohms 
× cm². Recent work of Frankenhaeuser and Hodgkin 
suggests that the presence of this external bottleneck 
gives rise to a slow accumulation of potassium ions on 
the surface of the squid axon during a period of pro- 
longed impulse activity. This is associated with cumu- 
lative changes in the membrane potential, but these are 
rather minute by comparison with the main features of 
the spike. Nevertheless, one must bear in mind the 
possibility that the distribution of current flowing 
through the axon membrane may be modified locally by 
Schwann cells which cling to its surface and by the 
presence of discrete channels between them. 

It may seem a little disappointing that, with all of 
the high-resolving power which has been attained in 
recent years, both in the study of fine structure and of 
the functional mechanism of nerve, there is still such a 
conspicuous lack of correlation between these two fields 
of work. One of the reasons is undoubtedly that the 
details which are revealed in their most striking form 
by such different techniques are not necessarily related 
to the same features. In the electron micrograph, the 
almost complete investment of an axon by a Schwann 
cell, or the almost intimate contact between the two 
surfaces, may be the most impressive phenomenon, 
while from the point of view of electric current distribution 
the appearance of a fine gap or crack in this 
investing layer (which is difficult to find and might 
easily be discarded as an artifact) may be a much more 
important feature. 

A second reason for the apparent discrepancy is that 
there must be structural counterparts of functional 
aspects other than those with which the student of the 
nerve impulse is directly concerned. The purpose of 
this paper is to describe some efforts that have been 
made to elucidate the nature of the nerve signal. But, 
in addition, there is built into the nerve a machinery 
for chemical recovery processes, and for the mainte- 
nance of the axon and its enzymic apparatus, which one 
associates with the so-called trophic influence of the 
nerve-cell body and its nucleus. Little is known about 
this process, but there are many indications that satel- 
lite cells are actively involved, both in trophic mainte- 
nance and in the disintegrative processes which follow 
the severance of the axon from the cell body. 

The elementary membrane structure which has been 
seen in electron micrographs, e.g., of permanganate-fixed 
axons, is a double-contoured layer 50 to 100 Å 
thick. It is not known what fraction of this layer is 
composed of the insulating lipid material which we hold 
responsible for the ion-impermeable and capacitive 
properties of the axon membrane. Nor can one be certain 
that improvements in microscopic resolution will not 
show further subdivisions and even thinner membrane 
elements. And ultimately, the physiologist is interested 
not so much in the lipid bulk phase of the membrane 
which occupies most of it, and which seems never to 
change nor to take part in the transport of ions, but in 
those very sparse molecular patches—possibly at con- 
tinuously varying sites, at which ionic gaps or channels 
are thought to be formed, and where “carrier molecules” 
are believed to be active in transferring sodium 
and potassium ions from one side to the other. 
INTRODUCTION 


The prime function of peripheral nerve is the 
conduction of impulses. In biological terms, the 
nerve impulse is the means of transmission of an item 
of information from one point in an organism to another. 
In terms of physics and chemistry, the impulse is a 
spatially propagating, transient disturbance in the state 
of a complex reaction system; it might be detected by, 
or it might be described in terms of, any of those conceivably 
measurable changes which are a part of it or to 
which it gives rise. In thermodynamic terms, the impulse 
is a dissipative process in which the free energy of the 
nerve and its environment is decreased, principally by a 
flow of ions down gradients of their electrochemical 
potential. In this discussion, attention is focused upon 
processes by which resting nerve maintains itself in a 
steady state, ready to function, and upon processes 
which serve to restore nerve to its resting state after it 
has conducted impulses. A central question is how 
energy-yielding processes may be coupled to those re- 
quiring energy. In an approach to a more specific 
formulation of this question, the discussion emphasizes 
correlations between changes in rate of oxidative me- 
tabolism and electrochemical manifestations of ionic 
movement. 

In resting nerve there are at least two processes which 
may be presumed to require energy. These are the 
maintenance of structural integrity and the maintenance 
of the ionic distribution characteristic of the resting 
state. About the former very little can be said. Perhaps 
the cell bodies from which the excised nerve fibers have 
been severed are the primary site of synthetic reactions 
underlying such maintenance. During activity and re- 
covery therefrom energy demand may be increased in 
two ways: there must be an acceleration of ion transport 
processes in order to reverse the exchange of ions which 
occurs during the passage of impulses and, secondly, 
there may be a dissipation of chemical energy associated 
with the permeability cycle that allows the ionic ex- 
change to take place. 
Insofar as nerve at rest is in a steady state and 
following activity returns to the same steady state, the 
over-all process of impulse conduction involves no net 
external work and results only in the conversion of 
chemical energy to heat. 

The following discussion indicates that oxidative 
metabolism in nerve can serve as an adequate energy 
source. Brink¹ has recently reviewed the evidence that 
the biochemistry of peripheral nerve of the frog re- 
sembles that of other animal cells; the nerve appears to 
contain and utilize the Meyerhof-Embden pathway of 
carbohydrate breakdown, the Krebs tricarboxylic acid 
cycle, and the cytochrome chain of enzymes. The energy 
turnover appears to be mediated by the usual system of 
phosphate compounds (adenosine phosphates and crea- 
tine phosphate) which are replenished principally by 
oxidative phosphorylation carried out by mitochondria. 
In what follows, it will be assumed (1) that any proc- 
esses which rely upon metabolic energy do so by chemical 
reactions coupled to the breakdown of adenosine tri- 
phosphate, or its equivalent, and thereby increase the 
intracellular level of phosphate acceptor; (2) that the 
kinetics of oxygen uptake by nerve are some reflection or 
measure of the changes in concentration of phosphate 
acceptors at the mitochondria. A particularly dramatic 
experiment directly supporting the first of these assumptions 
is that in which Caldwell and Keynes² injected 
ATP into a metabolically poisoned squid axon and 
observed a partial restoration of the rate of sodium 
extrusion. 

Firstly, the kinetics of the increases in oxygen utilization 
associated with the conduction of impulses are 
described and compared with the heat measurements of 
Hill, and then it is shown how the kinetics are modified 
under circumstances in which rates of ion transport 
have presumably been affected. Secondly, data on the 
ionic fluxes across the nerve surface are examined and 
the estimated energy requirements of transport processes 
are compared with energy available from oxidative 
metabolism. Thirdly, some observations are described 
of prolonged positive afterpotential (post-tetanic hy- 
perpolarization) which appears to be closely related to 
ionic transport processes. 

Most of the observations which follow come from 
experiments on the excised sciatic nerve of the frog. 
This preparation has some advantages over the giant 
axon of the squid in that its metabolic and electrical 
properties remain more nearly constant over an experimental 
period of ten to fifteen hours. It has, however, 
the disadvantage of being a bundle of fibers of several 
different types and there are complications arising from 
the existence of an appreciable extracellular space. The 
minimum environment required by frog nerve to maintain 
function and a satisfactory steady state at 20°C is 
a balanced bathing solution containing sodium, potassium 
and calcium salts, and dissolved oxygen. It should 
be emphasized that many or most of the electrical and 
metabolic properties of nerves appear to be basically 
similar from one species of animal to another, and 
similar also, in fact, to the properties of other excitable 
tissues such as skeletal and cardiac muscle. 
OXIDATIVE METABOLISM AND 
HEAT PRODUCTION 


The oxygen uptake by resting frog nerve is usually 30 
to 40 mm³/g(wet)/hr or about 1.5 μmoles/g(wet)/hr. 
When a nerve is tetanized, its rate of oxygen uptake 
increases and approaches a new steady level in about 
30 min, closely following an exponential time course 
with a time constant of 5 to 8 min (Fig. 1, upper curve). 
The amplitude of the increase is greater the higher the 
frequency of tetanus, at low frequencies, and approaches 
a maximum limiting value (of about 1 μmole/g(wet)/hr) 
at about 100 volleys/sec, as shown in Fig. 2. At the end 
of tetanus, the rate of uptake declines slowly toward the 
resting value along an exponential whose time constant 
is usually 15 to 25 min. [After tetani at frequencies lower 
than about 10/sec, recoveries may be more rapid than 
this (see Fig. 11 in an article by Connelly et al.).] 
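
The figures of this paragraph can be related to one another by simple arithmetic. The sketch below assumes that the gas volumes are expressed at standard conditions (22.4 mm³ of oxygen per μmole) and that the rise in uptake is a single exponential approach to the new steady level; both are assumptions made for the purpose of illustration.

```python
import math

MM3_PER_UMOLE = 22.4  # molar volume of an ideal gas at standard conditions


def mm3_to_umoles(volume_mm3: float) -> float:
    """Convert a volume of oxygen (mm^3) to micromoles."""
    return volume_mm3 / MM3_PER_UMOLE


def fraction_of_rise(t_min: float, tau_min: float) -> float:
    """Fraction of the final increase reached after t_min of tetanus."""
    return 1.0 - math.exp(-t_min / tau_min)


# Resting uptake: 30 to 40 mm^3/g(wet)/hr.
for uptake in (30.0, 40.0):
    print(f"{uptake:.0f} mm^3/g/hr = {mm3_to_umoles(uptake):.2f} umoles/g/hr")

# Approach to the steady level for time constants of 5 and 8 min.
for tau in (5.0, 8.0):
    print(f"tau = {tau:.0f} min: {fraction_of_rise(30.0, tau):.1%} "
          f"of the rise is complete at 30 min")
```

The conversion brackets the quoted value of about 1.5 μmoles/g(wet)/hr, and with either time constant the rise is virtually complete after 30 min of tetanus.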
Fig. 1. (Upper) Relative time courses of the increases in 
oxygen uptake resulting from activity in intact frog sciatic 
nerve (perineurium not removed) and in stripped nerve 
(perineurium removed). Measurements made in a flow 
respirometer with a polarized oxygen cathode. Polarographic 
current is a linear measure of dissolved oxygen remaining in 
solution after it has flowed past the nerve, a downward 
deflection in the trace, as shown, indicating an increase in 
oxygen uptake. (In all tetani in this and the following 
figures, only type A fibers stimulated.) (Lower) Time courses 
of changes in amplitudes of compound action potentials of the 
two nerves, measured during tetanus and occasionally by test 
shocks during recovery. Figure labels: A fibers, R. pipiens; 
nerves in 2.1 mM K⁺; pH 7, 2% CO₂, 20°C; filled circles, 
intact nerve; open circles, stripped nerve. 



The over-all increases in heat production by tetanized 
frog nerve show kinetics similar to those of the increases 
in oxygen consumption. The quantitative comparison 
of steady-state heat rate and oxygen consumption as a 
function of frequency is shown in Fig. 2. The ratio of the 
right-hand to the left-hand scale is within 15% of the 
accepted value of the calorific equivalent of oxygen, 5 
cal/cc. The agreement is quite satisfactory, in view of 
the difference in species. 
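
The comparison rests on the calorific equivalent of oxygen. As a minimal sketch, taking the value of 5 cal/cc quoted above and, as an assumption, the standard molar volume of 22.4 mm³ per μmole, the oxygen figures of this section can be converted into expected rates of heat production; the function names are illustrative.

```python
CAL_PER_CC_O2 = 5.0     # calorific equivalent of oxygen, from the text
MM3_PER_UMOLE = 22.4    # assumed: gas volumes at standard conditions


def heat_from_mm3(o2_mm3_per_g_hr: float) -> float:
    """Heat rate (cal/g/hr) equivalent to an oxygen uptake in mm^3/g/hr."""
    return o2_mm3_per_g_hr / 1000.0 * CAL_PER_CC_O2


def heat_from_umoles(o2_umoles_per_g_hr: float) -> float:
    """Heat rate (cal/g/hr) equivalent to an oxygen uptake in umoles/g/hr."""
    return heat_from_mm3(o2_umoles_per_g_hr * MM3_PER_UMOLE)


resting = heat_from_mm3(35.0)      # mid-range resting uptake
extra = heat_from_umoles(1.0)      # maximum increase during tetanus
print(f"resting heat rate       ~ {resting:.3f} cal/g/hr")
print(f"maximum extra heat rate ~ {extra:.3f} cal/g/hr")
```

A measured steady-state heat rate within about 15% of such a figure is what the agreement between the two scales of Fig. 2 amounts to.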

A detailed analysis of the heat production of nerve 
during short tetani reveals the presence of a component 
whose onset and termination are abrupt and correspond 
to the beginning and end of the tetanus. This “initial 
heat,” only a few percent of the total heat associated 
with the tetanus, appears to have no counterpart in the 
time course of the increase in oxygen uptake, the curve 
of which rises linearly from the beginning of the 
tetanus. In recent elegant work, Hill and his co-workers 
have resolved the initial heat associated with a 
single volley in crab nerve into a positive phase (heat 
production) followed by a smaller negative phase (heat 
absorption). These phases may result from the heats of 
dilution or mixing of the sodium and potassium ions 
exchanged during the impulse, but other events of the 
permeability cycle have not been excluded as con- 
tributing causes. 

The second curve of Fig. 1 illustrates the time course 
of increase in rate of oxygen uptake by a nerve trunk 
from which the perineurium, or connective-tissue sheath, 
has been removed. The striking difference between this 
and the upper curve is that the recovery is characterized 
by two components, the first lasting no more than about 
30 min and the second extending over several hours. 


The lower part of Fig. 1 shows the accompanying
changes in the amplitudes of the compound action po- 
tentials of the two nerves, the stripped nerve showing 
much less rapid recovery than the unstripped nerve. 
The possibility that these differences in the metabolic 
and electrical behavior of stripped and intact nerve 
might result principally from differences in the extra- 
cellular level of potassium ions was tested by the series 
of experiments illustrated in Figs. 3 and 4. These com- 
pare the kinetics of the changes in oxygen uptake and 
action potentials, respectively, of stripped nerves bathed 
in solutions containing different concentrations of po- 
tassium ion. The three lower curves of Fig. 3 possess fast 
components of about the same relative magnitude and 
time course whereas the slow components of these 
curves show gradation from effectively no recovery in 
K-free solution, to a slow almost linear recovery in
2.1 mM K+, to a somewhat more rapid curvature
characteristic of an exponential of about a 40-min time
constant, in 5 mM K+. In the top curve (8.5 mM K+),
the over-all recovery is even more rapid than any ob- 
served in intact nerve (in 2.1 mM K+), with an ex- 
ponential time constant of only 11 min instead of 15 to 
25. It is not possible to distinguish two phases of re- 
covery in this case and it is not clear whether the slower 
phase observed in the lower curves has been speeded up 
to merge with the rapid phase or whether it has de- 
creased in magnitude to zero. In Fig. 4, one sees that 
the lower the concentration of potassium in the bathing 
solution, the more rapid is the decline in height of the 
action potential during tetanus and the less rapid is its 
recovery. Recovery of action potential is incomplete in 
K-free solution, as the recovery of oxygen uptake was 
seen to be. In 8.5 mM K+, however, not only does the
action potential not decline in amplitude during tetanus 
but it shows about 13% increase in amplitude during the 
post-tetanic period.

Fig. 3. Relative time courses of the increase in oxygen uptake
by stripped nerve bathed in Ringer’s solutions containing different
concentrations of K+. Ordinate as described for Fig. 1.
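
The time constants quoted for the recovery of oxygen uptake can be compared directly if each recovery is idealized as a single exponential. This is only a rough sketch: the text itself notes that some curves have two components, and the 30-min reference time and the function name are illustrative assumptions, not values from the experiments.

```python
import math

def fraction_remaining(t_min, tau_min):
    """Fraction of the post-tetanic excess still present after t_min minutes,
    assuming a single-exponential recovery with time constant tau_min."""
    return math.exp(-t_min / tau_min)

# Time constants quoted in the text (minutes)
time_constants = {
    "stripped nerve, 8.5 mM K+": 11.0,
    "intact nerve, 2.1 mM K+ (fastest quoted)": 15.0,
    "intact nerve, 2.1 mM K+ (slowest quoted)": 25.0,
    "stripped nerve, 5 mM K+ (slow component)": 40.0,
}

REFERENCE_TIME_MIN = 30.0  # arbitrary comparison point, chosen for illustration

for label, tau in time_constants.items():
    left = fraction_remaining(REFERENCE_TIME_MIN, tau)
    print(f"{label}: {left:.2f} of the excess remains after 30 min")
```

On this idealization the ordering of the curves is set entirely by the time constant, which is why the 8.5 mM K+ curve stands out as faster than any recovery seen in intact nerve.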


A possible key to the interpretation of these observa- 
tions is the finding of Hodgkin and Keynes? (see also 
Hodgkin’s Croonian Lecture” for an extended discussion 
of the movements of ions in giant nerve fibers) that the 
efflux of sodium from Sepia fibers is increased in a solu- 
tion containing more potassium than does sea water and 
is depressed by K-free solution to one-third or
one-quarter of the efflux into sea water. If, in frog nerve, the
rate of sodium extrusion is similarly dependent upon the 
level of potassium in the external solution, the slow 
component of recovery respiration may be interpreted 
as reflecting the energy demand of the transport ma- 
chinery. The fast component, on the other hand, might 
result either from cessation at the end of tetanus of 
energy demand by the excitability cycle, or from a rapid 
reduction in rate of ion transport, at the end of tetanus, 
because of relaxation by diffusion of the internal ionic 
gradients maintained near the nerve surface by the 
entrance of sodium and by the exit of potassium during 
each impulse. The following considerations suggest that 
additional factors may affect the kinetics of the fast 
component. 

The principal locus of ionic flow during an impulse in 
a myelinated fiber appears to be across the membrane 
at the node."-” If ionic recovery takes place principally 
across the nodal membrane, the question arises as to 
whether or not phosphate compounds participating in 
this recovery are rephosphorylated by mitochondria 
close to the node or by those farther down the internodal 
space. Thus, longitudinal diffusion of phosphate acceptors
in the internode may determine in part the
kinetics of changes in rate of oxygen uptake. These 
considerations apply, as well, to the kinetics of the 
rising phase of uptake that occurs during the tetanus. 
The fact that the time constant of this rising phase in 
intact nerve is as long as it is (5 to 8 min) and does not 
vary markedly with tetanus frequency‘ is consistent 
with the idea that the kinetics of this phase are re- 
stricted in part by configurational parameters. 

The differences in the kinetics of the changes in 
oxygen uptake and in the action potentials of intact and 
stripped nerves in 2.1 mM K+ Ringer’s appear, then, to
be due to restrictions imposed by the perineurium on the 
diffusion of ions between the extracellular space and the 
bathing solution. The differences arise from the second- 
ary effects of a changing ionic composition around the 
fibers of the intact nerve, especially as regards the 
concentration of potassium. 

A most important and revealing experiment is that of
measuring the respiration of nerve bathed by a solution
in which lithium has replaced sodium ion. It is known
that lithium can substitute for sodium in the conduction
process. As soon as lithium Ringer’s is substituted for
sodium Ringer’s, the resting respiration begins a slow
decline. Tetanization produces almost no increase in
rate; there may result a slowing or interruption in the
decline of the resting respiration, but over all the effect
of activity during exposure to lithium Ringer’s is certainly
not more than 5% of the effect measured in
sodium Ringer’s. The observation suggests strongly that
lithium ions are extruded from frog nerve very slowly
or not at all. Swan and Keynes have reported that
lithium ions are extruded from frog muscle much less
rapidly than are sodium ions. Another conclusion from
this experiment is that energy demand originating in the
excitability cycle is less than 5% of the total. Therefore,
it appears unlikely that the fast recovery components of 
the three lower curves of Fig. 3 result from the cessation 
of such a demand. 


ION TRANSPORT AND ENERGY REQUIREMENTS 


In Table I are shown data from the work of Hurlbut 
and Asano on water distribution, ion distribution and 
fluxes, and the net ion changes which result from ac- 
tivity or exposure to an altered environment. The water 
distribution refers to intact nerve. The assignment of 
water and ions is based upon the assumption that the 
nerve consists of two compartments, intracellular and 
extracellular, the latter occupied by bathing solution. 
The internal concentrations of ions are based upon meas- 
urements in which the extracellular space had been 
washed free of sodium and potassium ions and include a 
small correction for sodium lost during this washing." 
The fluxes of sodium and of potassium were measured as 
outfluxes from resting nerve which had previously been 
equilibrated with radioactive ions. 

At the bottom of Table I the sodium gained and the 
potassium lost during activity and during exposure to 
oxygen-free and K-free environments are compared. It 
is apparent that in each case the sodium gain is nearly 
equal to the potassium loss. Asano and Hurlbut’? have 
shown that ionic recovery does take place in the three 
or four hours following activity (50 volleys/sec, 1 hr, 
in 2 mM K+ Ringer’s). Figure 5 indicates that recovery
from rather large ionic shifts takes place in such a way 
that the potassium movement in one direction balances 


TABLE I. Frog nerve.

Extracellular water content     45% wet weightᵃ
Intracellular water content     29% wet weightᵃ
Dry weight                      26% wet weightᵃ

             Concentration                              Flux
             Intracellular            Extracellular
             (mmole/kg intracell      (mM)              (mmole/kg intracell
             water)                                     water, hr)
Sodium       47                       116               23ᵇ
Potassium    159                      2                 10-28ᵇ

Net changes resulting from
                                 Activity           Asphyxia    K-free Ringer’s
                                 50 volleys/sec,    5 hr        5 hr
                                 1 hr
Sodium gain                      18ᶜ                35          31ᵇ
(mmole/kg intracell water)
Potassium loss                   21ᶜ                41          34ᵇ
(mmole/kg intracell water)

ᵃ Reference 14.
ᵇ W. P. Hurlbut and T. Asano (unpublished observations).
ᶜ Reference 15.

the sodium movement in the other. The rates of ionic
recovery during the first hour or two (Fig. 5) produce 
concentration changes of about 12 mM (intracellular) 
per hour. This net ionic exchange that is measured 
during recovery is presumably superimposed upon the 
normal fluxes, measured in resting nerve, of about 23 
mM (intracellular) per hour for sodium and probably 
about the same for potassium. 

For a two-compartment system, the thermodynamic
expression for the energy required to move one mole of
sodium from inside (i) to outside (o) and one mole of
potassium in the opposite direction is

$$RT\left(\ln\frac{\mathrm{Na}_o}{\mathrm{Na}_i}+\ln\frac{\mathrm{K}_i}{\mathrm{K}_o}\right),$$

where $\mathrm{Na}_o$, etc., are ion concentrations (strictly
activities). This expression has a value of about 3000
cal/mole for the concentrations of sodium and potas- 
sium given in Table I. If the energy available from the 
hydrolysis of one mole of ATP to ADP, under intra- 
cellular conditions, is 7000 to 12 000 cal, then energeti- 
cally it should be possible for some 2 to 4 sodium ions to 
be extruded for each molecule of ADP produced. Assum- 
ing, conservatively, that one ADP is produced for each 
sodium ion transported, it may then be asked whether 
the observed oxygen consumption appears to be capable 
of providing sufficient energy, via phosphorylated inter- 
mediates, to transport ions at the rates observed. For 
frog nerve at rest, the oxygen uptake is about 1.5
µmole/g (wet)/hr and the sodium flux about 6.6 µmole/
g (wet)/hr. Thus the ratio Na:O is 6.6/(2×1.5)=2.2.
If, in the functioning cell, mitochondria maintain an 


[Figure 5. Panel labels: R. pipiens sciatic nerve, perineurium removed; pH 7, 2% CO2, 20°C; Ringer’s: 2.0 mM K+. Ordinate: average change in content (µM/g dry); abscissa: hours. Open circles, in O2-free Ringer’s; filled circles, in K-free Ringer’s; followed by recovery in Ringer’s.]

Fig. 5. Time courses of the changes in ionic contents of stripped
nerves during and after exposure to oxygen-free or potassium-free
Ringer’s solution. (Partially unpublished observations of W. P.
Hurlbut and T. Asano.) Ordinate scale symbol µM means
micromole.

average P:O ratio (phosphate acceptors phosphorylated
to oxygen atoms reduced) of about 3, as isolated mito- 
chondria do,'® then it would seem that oxidation pro- 
vides adequate energy for transporting one sodium ion 
per ATP broken down. If the Na:P ratio is actually 
unity, the comparison implies that only about 30% of 
the resting respiration is available for processes other 
than ion transport. 
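
The arithmetic behind these resting-nerve estimates can be retraced in a few lines. The sketch below is only a check of the numbers quoted in the text; it assumes the Table I intracellular concentrations read as 47 (sodium) and 159 (potassium) mmole/kg intracellular water, a temperature of 20°C, and variable names chosen here for illustration.

```python
import math

R_CAL = 1.987        # gas constant, cal/(mole K)
T_KELVIN = 293.15    # 20 °C, the temperature given in the figure labels

# Concentrations as read from Table I (assumed readings of the printed values)
NA_IN, NA_OUT = 47.0, 116.0   # intracellular, extracellular sodium
K_IN, K_OUT = 159.0, 2.0      # intracellular, extracellular potassium

def exchange_energy(na_in, na_out, k_in, k_out, temp=T_KELVIN):
    """Energy (cal) to move 1 mole Na+ outward and 1 mole K+ inward."""
    return R_CAL * temp * (math.log(na_out / na_in) + math.log(k_in / k_out))

energy = exchange_energy(NA_IN, NA_OUT, K_IN, K_OUT)
na_per_atp = [atp_cal / energy for atp_cal in (7000.0, 12000.0)]

# Resting flux converted from an intracellular-water basis to a wet-weight basis
INTRACELLULAR_WATER_FRACTION = 0.29             # of wet weight (Table I)
flux_wet = 23.0 * INTRACELLULAR_WATER_FRACTION  # µmole Na/g (wet)/hr

OXYGEN_UPTAKE = 1.5                     # µmole O2/g (wet)/hr
na_to_o = 6.6 / (2.0 * OXYGEN_UPTAKE)   # text uses the rounded flux 6.6
transport_share = na_to_o / 3.0         # assumes P:O = 3 and Na:P = 1

print(f"energy per mole exchanged: {energy:.0f} cal")
print(f"Na+ ions per ATP allowed energetically: {na_per_atp[0]:.1f} to {na_per_atp[1]:.1f}")
print(f"resting Na flux: {flux_wet:.1f} µmole/g (wet)/hr, Na:O = {na_to_o:.1f}")
print(f"share of respiration left for other processes: {1.0 - transport_share:.0%}")
```

With these inputs the energy comes out close to the 3000 cal/mole quoted, and the share of resting respiration left over is a little under 30%, in line with the statement in the text.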

An estimate of the increase in rate of sodium extrusion 
during a period of activity can be made from other 
observations of Asano and Hurlbut.” They found that 
after one hour of activity at 50 volleys/sec the net gain 
of sodium averaged 32 µmole/g (dry) when the nerve
was bathed with K-free solution and only 16 when the
Ringer’s contained 5 mM K+. This is an equivalent
difference in rate of 4.2 µmole/g (wet)/hr. If, in the two
solutions, about the same amount of sodium entered the
nerve during each impulse, the observed difference may
be attributed to effective, restorative transport taking
place during activity. The ratio of this difference in rate
to the increase in respiration during activity in 5 mM K+
is approximately Na:O=4.2/(2×0.8)=2.6. This figure
suggests that, in the case of active nerve also, there may 
be little energy utilized for purposes other than ion 
transport. 
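
The conversion from the measured sodium gains to the Na:O ratio for active nerve can be checked the same way. The sketch assumes the dry-weight fraction of 26% from Table I and takes the increase in respiration as 0.8 µmole O2/g (wet)/hr, the value implied by the ratio in the text; the variable names are illustrative.

```python
DRY_WEIGHT_FRACTION = 0.26   # dry weight as a fraction of wet weight (Table I)

gain_k_free = 32.0   # µmole Na/g (dry) gained in 1 hr of activity, K-free solution
gain_5mm_k = 16.0    # µmole Na/g (dry) gained in 1 hr of activity, 5 mM K+
ACTIVITY_HOURS = 1.0

# Difference in net gain, expressed per gram wet weight per hour
extra_transport = (gain_k_free - gain_5mm_k) * DRY_WEIGHT_FRACTION / ACTIVITY_HOURS

# Increase in oxygen uptake during activity, as implied by the ratio in the text
respiration_increase = 0.8   # µmole O2/g (wet)/hr

# Each O2 molecule supplies two oxygen atoms
na_to_o_active = extra_transport / (2.0 * respiration_increase)

print(f"extra transport during activity: {extra_transport:.1f} µmole/g (wet)/hr")
print(f"Na:O for active nerve: {na_to_o_active:.1f}")
```

The result is close to the resting value of 2.2, which is the basis for the suggestion that little of the extra respiration is spent on anything but ion transport.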

It should be emphasized that only one plausible line 
of thought has been followed in making these estimates 
and comparisons and that assumptions without direct 
experimental support have been invoked. Alternative 
lines of reasoning may ultimately prove to be more 
acceptable than the one outlined here. 


POST-TETANIC HYPERPOLARIZATION 


Experiments which indicate that there are measurable 
electrical changes associated with the ionic transport 
events of recovery" are now described. 

A stripped frog nerve is mounted in a plastic chamber 
so as to pass through five compartments (separated by 
grease-seals) containing oxygenated solutions. The first 
and fifth compartments are connected via calomel half- 
cells to a stable dc amplifier (chopper amplifier, time 
constant about 1 sec). The first compartment usually, 
and the second, fourth, and fifth compartments always 
contain a 5 mM K+ Ringer’s, in the experiments to be
described. The nerve is stimulated before it enters the 
first compartment and the impulses are blocked in the 
third which contains a choline chloride (sodium-free) 
Ringer’s. By this arrangement, it is possible to follow 
the changes in potential developed by an active or 
recovering region of nerve, the steady potential of an 
inactive region serving as a reference. 

Figure 6 shows the potential changes observed during 
and following a 25-min period of activity at 50 
volleys/sec. The downward deflection at the beginning 
of the tetanus is the time average of the negative action 
potentials. Once each minute the tetanus was inter- 
rupted for 5 sec and the recorded potential showed a 
positive deflection which reached its peak within the
interval. The amplitude of this hyperpolarization (net
increase in membrane potential) reached a maximum
after about 4 min and thereafter changed little. After
the end of the tetanus, the hyperpolarization lasted for
more than an hour; its recovery time course cannot be
described as exponential.

[Figure 6. Panel labels: A-fibers, R. pipiens; stripped nerve in 5 mM K+; pH 7, 2% CO2, 20°C; 50 volleys/sec. Ordinate: change in potential (mv); abscissa: minutes.]

Fig. 6. Changes in average membrane potential recorded
during and after 25-min tetanus. Tetanus interrupted for 5
sec each minute [from C. M. Connelly, “Post-tetanic
hyperpolarization in frog nerve,” in Proceedings of the National
Biophysics Conference, 1957 (Yale University Press, New
Haven, to be published)].

Figure 7 shows how the prolonged positive after-potential
is affected by the level of K+ in the bathing
solution (in the first compartment). A stimulation of 25
min in a solution containing 5 mM K+ was first carried
out as a control, and after recovery the solution was
changed either to 8.5 mM K+ (upper) or to K-free
(lower). The potential changes recorded during tetanus
under the modified conditions have been superimposed
on the control observations. The amplitude of the
hyperpolarization varies appreciably with the level of
potassium; recovery is rapid in high potassium and
appears to be very slow and prolonged in K-free solution.
During recovery from activity, the kinetics of
oxidative recovery and the kinetics of the positive after-potential
are both affected in much the same way.

Fig. 7. Superimposed tracings of changes in average membrane
potential recorded during and after 25-min tetani in Ringer’s
solutions containing different levels of K+. Traces shown during
period of tetanus are envelopes of changes similar to those shown
in Fig. 6. In each of the two experiments the first tetanus was in
Ringer’s with 5 mM K+, the second tetanus carried out after
[...] second solution. Zeroes of potential [...]

Other characteristics of this after-potential have been
examined. It has been found that the observed amplitude
of hyperpolarization approaches a maximum, or
“saturation level,” as the frequency of tetanus is increased
above about 25/sec.17 This maximum is about
2 mv. Depolarization produced by introducing isotonic 
potassium chloride into the first compartment averages 
10 mv. Thus, maximum hyperpolarization corresponds 
to an increase in membrane potential of about 20%. 
Post-tetanic recovery may be rapid or long-lasting 
depending on duration and frequency of tetanus. At 
frequencies above about 25/sec the effect of duration is 
striking, as illustrated in Fig. 8 which shows, superim- 
posed, the recoveries from four tetani of different 
durations, at 50 volleys/sec. The longer the tetanus, the 
longer recovery takes, as if continued activity resulted 
in an accumulation of something, the final dissipation of 
which has associated with it an emf. Areas under the 
total hyperpolarization-time curves (i.e., area above the 
zero-potential line, including the periods of tetanus and 
recovery) have been measured as a function of duration 
of tetanus. Within experimental limits, area varies 
linearly with duration suggesting that the magnitude of
hyperpolarization is approximately a linear measure of
the rate of a recovery process.

[Figure 8. Panel labels: A-fibers, R. pipiens; stripped nerve in 5 mM K+; pH 7, 2% CO2, 20°C; 50 volleys/sec. Ordinate: change in potential (mv); abscissa: hours.]

Fig. 8. Superimposed tracings of changes in average membrane potential recorded during and after a series of tetani of different
durations. Traces shown during period of tetanus are envelopes of changes similar to those shown in Fig. 6 [from C. M. Connelly, “Post-tetanic
hyperpolarization in frog nerve,” in Proceedings of the National Biophysics Conference, 1957 (Yale University Press, New Haven,
to be published)].


A variety of observations appears to be consistent 
with the idea that the recovery process involved is the 
outward transport or extrusion of sodium ions. One 
experiment which has a direct bearing on the question 
is illustrated in Fig. 9. The positive afterpotential as- 
sociated with a tetanus is effectively eliminated upon 
complete substitution of lithium for sodium in the 
bathing Ringer’s. The parallelism between this result
and that described earlier on the effect of lithium
Ringer’s on activity respiration furnishes strong support 
for the idea that the activity respiration and the positive 
after-potential both have their origin in the process 
involving the outward transport of sodium coupled to 
the inward movement of potassium. 

[Figure 9. Panel labels: A-fibers, R. pipiens; stripped nerve in 5 mM K+; pH 7, 2% CO2, 20°C; 25 volleys/sec. Record labels recovered: Ringer’s; 100% Lithium. Ordinate: change in potential (mv).]

Fig. 9. Changes in average membrane potential recorded during and following tetani in solutions having different proportions of
sodium and lithium ions. Potential zero common to the three records. During intervals (about 30 min) between records, nerve
bathed in the next solution. First and second records, 25-min tetani interrupted for 5 sec each minute. Third record, 2( -min tetanus
with 13 interruptions of 5 sec and 3 interruptions of 1 min each [from C. M. Connelly, “Post-tetanic hyperpolarization in frog nerve,”
in Proceedings of the National Biophysics Conference, 1957 (Yale University Press, New Haven, to be published)].

Fig. 10. Changes in average membrane potential recorded during and after
2-hr tetanus at 10 volleys/sec. Interrupted 5 sec each minute. Abscissa: hours.

[Figure 11. Panel labels: claw nerve, Libinia; 20°C; crab Ringer’s; 1 volley/sec. Ordinate: change in potential (mv); abscissa: minutes.]

Fig. 11. Changes in average membrane potential of spider-crab
nerve recorded during and following 43-min tetanus at 1
volley/sec. Not interrupted during tetanus. Measured
depolarization of spider-crab nerve in isotonic potassium chloride
about 30 mv.

One further point should be discussed. Ritchie and
Straub, in studying the positive afterpotential of
mammalian C-fibers, came to the conclusion that their
observations could be explained solely by an electroneutral
pump operating to restore the normal sodium-potassium
balance following activity. The after-positivity
was described as a variation of membrane potential
in response to the variation in the concentration of
potassium in the fluid immediately outside the nerve
membrane. This happens as follows: during recovery, the
pump produces a net flow of potassium inward across
the membrane. By this action, the concentration of
potassium at the outside surface of the fiber is lowered
below its value in the body of the bathing solution.
Since it is known that a decrease in external concentration
of potassium does produce an increase in membrane
potential, the conclusion appears to be unassailable,
qualitatively. One must agree that any ion
pumping system, whether electroneutral or inherently
electrogenic, must have this property. It is to be taken
for granted that the transport system in frog nerve
produces some potential by this mechanism. On the 
other hand, Fig. 10 illustrates an experiment in which 
the Ritchie and Straub mechanism is not sufficient to 
explain the observed variations of potential. During a 
tetanus, the average membrane potential (exclusive of 
the action potentials) should be either negative or zero, 
according to the Ritchie and Straub scheme. The 
average should be negative during periods in which 
there is a net loss of potassium from the fibers and would 
approach zero if the system approached a steady state 
(i.e., K+ pumped back inside during pulse interval
= K+ lost/impulse). The lower envelope of the record in
Fig. 10 shows that the membrane became hyperpolarized 
four minutes after the beginning of the tetanus and 
remained so for the remainder of it. With correction for 
the apparent depolarization introduced by averaging 
the action potentials, the statement could be made that 
hyperpolarization began at the beginning of the tetanus 
and reached a final average level (in the 100-msec 
interval between pulses) of almost one millivolt. Thus, 
the ionic transport system of frog nerve appears to be 
inherently electrogenic (positive outward). 

A similar statement, based on similar evidence, applies
to unmyelinated limb nerve of the spider crab, shown in 
Fig. 11. Here the membrane also develops a net 
hyperpolarization during low-frequency tetani. 

If one takes as a working hypothesis the proposition 
that the magnitude of hyperpolarization is proportional 
to the rate of sodium extrusion, how can the shapes of 
the recovery curves of Fig. 8 be explained? The kinetic 
behavior of a simple reaction mechanism showing satu- 
ration kinetics is portrayed in Fig. 12. It is the classical 
enzyme-substrate reaction of Michaelis, the rate of 
which is given by the second expression in the figure, 
where K_M is the concentration of substrate that produces
half-maximal rate. The curve describes the rate
of reaction as a function of time as an initially large 
concentration of substrate decreases. If nerve were to 
extrude the sodium remaining within it with the kinetics 
of this mechanism, the hyperpolarization should follow 
this curve exactly, starting at a point appropriate to the
amount of sodium accumulated during the tetanus. But
the recovery curves of Fig. 8 cannot be superimposed by
displacement along the time axis, and, therefore, do not
conform to this model.

Fig. 12. Calculated time course of rate of disappearance of
substrate with Michaelis type of enzyme-substrate reaction mechanism.
K_M is concentration of substrate producing half-maximal
rate [from C. M. Connelly, “Post-tetanic hyperpolarization in frog
nerve,” in Proceedings of the National Biophysics Conference, 1957
(Yale University Press, New Haven, to be published)].
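
The superposition test used in this argument follows from the fact that a Michaelis-type rate depends only on the amount of substrate present, not on how that amount was reached. The sketch below illustrates this with the integrated rate law; the parameter values (maximum rate and K_M both set to 1, loads of 10 and 5) and all function names are hypothetical choices for illustration, not values fitted to the nerve.

```python
import math

def mm_rate(s, vmax=1.0, km=1.0):
    """Michaelis-type rate of disappearance at substrate level s."""
    return vmax * s / (km + s)

def time_to_reach(s_start, s_end, vmax=1.0, km=1.0):
    """Time for substrate to fall from s_start to s_end (integrated rate law)."""
    return (km * math.log(s_start / s_end) + (s_start - s_end)) / vmax

def substrate_at(t, s_start, vmax=1.0, km=1.0):
    """Substrate level at time t, found by bisection on time_to_reach."""
    low, high = 1e-12 * s_start, s_start
    for _ in range(200):
        mid = 0.5 * (low + high)
        if time_to_reach(s_start, mid, vmax, km) > t:
            low = mid    # mid is reached later than t, so s(t) lies above mid
        else:
            high = mid
    return 0.5 * (low + high)

large_load, small_load = 10.0, 5.0

# Time at which the heavily loaded system has fallen to the lighter load
shift = time_to_reach(large_load, small_load)

times = [0.5 * i for i in range(21)]
rate_small = [mm_rate(substrate_at(t, small_load)) for t in times]
rate_large_shifted = [mm_rate(substrate_at(t + shift, large_load)) for t in times]

max_difference = max(abs(a - b) for a, b in zip(rate_small, rate_large_shifted))
print(f"time shift = {shift:.3f}, largest mismatch after shifting = {max_difference:.2e}")
```

Under this mechanism the two recovery curves coincide once one is displaced along the time axis, so the failure of the curves of Fig. 8 to superimpose in this way is what rules the simple model out.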

If sodium ions enter the active fiber at the nodes and 
are also extruded principally from the nodes, then ions 
not extruded immediately after entry must tend to 
diffuse into the myelin-insulated internodal space. To 
determine the possible effect of diffusion in the internode 
on the kinetics of ionic recovery, the analog model 
shown in Fig. 13 was constructed. The model corre- 
sponds to one-half a node and to the adjacent one-half 
internodal space. A pentode, whose plate current-plate 
voltage curve is approximately a hyperbolic saturation 
curve of the Michaelis type, simulates the ion-extruding 
mechanism at the node; a filter network of 10 RC units 
is the analog of the diffusion field in the internode. 
Positive charges correspond to sodium ions; potentials 
at various points in the plate and filter circuits to sodium 
concentrations; plate current to rate of sodium ex- 
trusion; and injection of a constant current to the plate 
circuit simulates a tetanus. Seconds in the model correspond
to minutes in the nerve. Figure 14 shows superimposed
records of the output of the analog circuit, with
parameters chosen to duplicate as closely as possible the
observations of Fig. 8. The time constant of the network
in this case is about twice as large as the corresponding
time constant for the diffusion of sodium chloride in the
internodal space (estimated assuming free aqueous
diffusion and node-to-node spacing of 2 mm). It is
encouraging that the shape of the recovery curve
changes more or less in the proper manner as the
duration of tetanus is increased. This oversimplified
analysis tends to support the hypothesis that the positive
after-potential has its origin in a saturable, ion-extruding
mechanism operating at the nodes.

Fig. 13. Analog circuit [from C. M. Connelly, “Post-tetanic
hyperpolarization in frog nerve,” in Proceedings of the National Biophysics
Conference, 1957 (Yale University Press, New Haven, to be published)].
Circuit labels recovered from the figure: to d-c amplifier; 1000; 1 µf.

Fig. 14. Superimposed records of changes in plate current of analog
circuit of Fig. 13, during and after 25-µA injections of different
durations. Grid bias adjusted to give maximum plate current of 10 µA
[from C. M. Connelly, “Post-tetanic hyperpolarization in frog nerve,” in
Proceedings of the National Biophysics Conference, 1957 (Yale University
Press, New Haven, to be published)]. Ordinate: current (µA); abscissa: minutes.
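
A numerical counterpart of the analog circuit can be sketched as a saturable extrusion step at the node coupled to a chain of ten compartments standing in for the internode. This is a rough illustration, not a reconstruction of the circuit: the injection of 25 and the maximum rate of 10 echo the values in the caption of Fig. 14, while the half-saturation level, the coupling constant, the time step, and all names are hypothetical.

```python
DT = 0.02  # integration step in model minutes (hypothetical)

def simulate_node(tetanus_min, total_min=200.0, dt=DT, n_cells=10,
                  inject=25.0, vmax=10.0, km=5.0, coupling=1.0):
    """Explicit-Euler model of a node with saturable extrusion and a
    sealed chain of n_cells internodal compartments.

    Returns (rates, injected, extruded, stored)."""
    node = 0.0
    chain = [0.0] * n_cells
    injected = 0.0
    extruded = 0.0
    rates = []
    steps = int(round(total_min / dt))
    for k in range(steps):
        t = k * dt
        rate = vmax * node / (km + node)   # Michaelis-type extrusion at the node
        rates.append(rate)
        source = inject if t < tetanus_min else 0.0
        # Flux from each compartment to the next one along the internode
        levels = [node] + chain
        flux = [coupling * (levels[i] - levels[i + 1]) for i in range(n_cells)]
        node += dt * (source - rate - flux[0])
        for i in range(n_cells):
            outflow = flux[i + 1] if i + 1 < n_cells else 0.0  # far end is sealed
            chain[i] += dt * (flux[i] - outflow)
        injected += dt * source
        extruded += dt * rate
    stored = node + sum(chain)
    return rates, injected, extruded, stored

def minutes_to_half_rate(rates, tetanus_min, dt=DT, vmax=10.0):
    """Delay after the tetanus at which extrusion falls below half maximum."""
    start = int(round(tetanus_min / dt))
    for k in range(start, len(rates)):
        if rates[k] < 0.5 * vmax:
            return (k - start) * dt
    return None  # not reached within the simulated period

for duration in (5.0, 10.0, 25.0):
    rates, injected, extruded, stored = simulate_node(duration)
    delay = minutes_to_half_rate(rates, duration)
    print(f"tetanus {duration:4.0f} min: half-rate delay = {delay}, "
          f"balance error = {injected - extruded - stored:.1e}")
```

Because material that is not extruded at once spreads into the chain and returns slowly, longer injections leave the extrusion rate elevated for longer, which is the qualitative behavior the analog was built to show.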


CONCLUDING REMARKS 


The events of recovery may be described tentatively
as follows: Sodium ions enter a fiber during activity.
The increase in internal concentration accelerates a
saturable ion-transport process in which sodium ions and 
high-energy phosphate compounds are obligatory par- 
ticipants. Sodium ions are liberated to the exterior, and 
potassium in the exterior medium participates in the 
over-all cycle, being transferred to the interior. The 
operation of the mechanism generates an emf, directed 
positively outward. Oxidative metabolism maintains the 
supply of high-energy phosphate compounds. 

The observations outlined in this discussion have 
emphasized the coupling between oxidative metabolism 
as an energy source and ionic transport processes as an 
energy sink. The molecular mechanisms of active ionic 
transport across membranes are almost a complete
mystery. Proposed mechanisms range from shuttling
carriers to micro-pinocytosis to enzymatic modification 
of pore configuration or charge distribution. This is an 
outstanding problem in cellular and molecular biology 
today.

This paper is concerned with those aspects of
organized neural complexity that permit organ- 
isms, such as man, to deal successfully with their sensory 
environment. Examples presented below illustrate how 
man, as an organism, detects, orders, and identifies 
events in his environment to which his sense organs are 
sensitive. The roles that stimulus intensity and time 
play in these operations are emphasized, as opposed to 
those aspects of sensory quality (pitch and hue, for 
example) for which one might expect the underlying 
mechanisms to differ rather considerably in going from 
one sensory modality to another. 


Studies of communication processes in the nervous 
system should be based on a realistic view of the way 
in which the total organism reacts when it reaches 
selectively into its surroundings—be it under self-instruction
or under instructions from others—to process
stimuli that have informational value. This emphasis
upon the organism’s behavior in communication
tasks leads to a preoccupation with certain dependent 
and independent variables. It leads to an inquiry into 
the “operating characteristics” of the organism in order 
to be able to specify, albeit in a statistical manner, its 
maximum sensitivity, its resolving power, its dynamic 
range, its characteristics in the frequency and time 
domains, its information-handling capacity, and so 
forth. 

The data presented in this paper come mainly from 
contemporary psychophysics. Subjects are not asked to 
introspect, but, rather, to indicate by standardized 
motor responses (most often verbal responses) whether 
they judge a stimulus to be present or not, whether 
they judge two stimuli to differ, whether they can order 
a set of stimuli or identify members of the set. 

Assume that one is dealing with well-instructed 
cooperative subjects who are skilled in the execution of 
the expected response and who are familiar with the 
set of stimuli that will be presented to them. 

Assume also that there is agreement on a program of 
stimulus presentation and on a method of response 


* This work was supported in part by the U. S. Army (Signal 
Corps), the U. S. Air Force (Office of Scientific Research, Air 
Research and Development Command), and the U. S. Navy 
(Office of Naval Research).

† Experiments parallel to those considered here could be conducted
with monkeys or even with rats or pigeons, if appropriate
behavioral techniques were used. However, few experimenters¹
have been willing to train their animals thoroughly enough, or to 
reformulate the required response in a way that was compatible 
with the animal’s behavioral repertory. The short-cut of employing 
highly motivated and intelligent subjects permits experimenters 
to bypass a lengthy learning process.

analysis. Thus, one must decide whether the admissible
response categories shall be simply: “yes” and “no,”
“same” and “different,” “more” or “less,” or whether
they should include the set of natural numbers. There 
must also be rules for dealing with false responses 
(false-alarm rate), and, finally, one must decide whether 
or not to quantify the temporal aspects of responding 
in addition to recording the mere emission of responses. 

Such considerations may seem unnecessary details 
that should be left to methodologists, but unless there 
is a realistic understanding of the measurement and 
quantification problems in fields like psychophysics and 
neurophysiology, one can hardly evaluate the store of 
knowledge they have produced, or understand the 
relation of such knowledge to the knowledge that exists 
in the several areas of biophysics. 


INFLUENCE OF INSTRUCTIONS UPON THRESHOLD 
OF DETECTION 


The task of detecting the presence of a stimulus 
includes the classical absolute threshold as well as the 
generalized “masked” threshold—i.e., detection of the 
stimulus against a background of sensory stimulation. 
Such “background noise” refers necessarily to the sense 
modality under study. Even the best isolation booths 
do not reduce to zero all unwanted sensory influx other 
than that under the experimenter’s control. 

It has long been known that instructions signifi- 
cantly influence both the shape and the anchor points 
of “frequency-of-seeing” or “frequency-of-hearing” 
curves. A recent study by Smith and Wilson² has forcefully
illustrated this dependence upon the judgmental
criteria that are suggested to observers. Ten observers 
were asked to report the presence of a tone in noise. 
The three curves of Fig. 1 show how the identical group 
of observers performed this task under three different 
sets of instructions. Examples of the instructions that 
were designed to induce a “conservative” or a “liberal”
attitude in the reporting of signals follow: 


Conservative groups: 


“Keep in mind that it is important for you to be sure 
you hear the tone. In many cases, there will be no tone, 
and you will be making a mistake if you indicate that 
you hear one. None of the tones is really easy to hear, 
but don’t push the switch unless you are sure that you 
heard the tone. If you are in real doubt as to whether
or not you heard a tone, assume that there was none.”

Fig. 1. Cumulative percentage of instances in which the
presence of an 800-cps tone was reported as a function of signal/noise
ratio. Attitude towards listening tasks is varied by means
of instructions (see text). Each of 10 observers made approximately
150 judgments per point.²


Liberal groups: 


“Keep in mind that all of these tones are hard to hear, 
and that you will rarely be absolutely sure that you 
heard something. But if you think you heard something, 
probably you did—report it. If you are very sure that 
you didn’t hear anything—then don’t touch the switch.”

The results illustrate clearly that one cannot hope to
predict accurately from a knowledge of signal-to-noise
ratio alone the performance of a group of observers, 
even if the task calls for nothing more than the reporting 
of the presence of a single class of signals against the 
background of an unvarying noise. 

Statistical models of the detection process have been 
formulated? in which performance is predicted not only 
on the basis of stimulus values and organismic variables;
there are additional parameters that are sensitive to 
instructions and to amounts of reward or punishment. 
Such models may account for behavior such as is 
depicted in Fig. 1. A physiological interpretation of 
this sensitivity of the organism to nonstimulus variables 
remains to be given, though some physiological phenomena
that provide grist for speculation have been
reported in recent years.
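
The way in which such a model can separate the observer's sensitivity from his criterion may be sketched as follows. The sketch assumes an equal-variance Gaussian decision model; this model, the function names, and the values chosen for `d_prime` and for the two criteria are illustrative assumptions and are not taken from the studies cited above.

```python
import math
from typing import Tuple


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def report_rates(d_prime: float, criterion: float) -> Tuple[float, float]:
    """Return (hit rate, false-alarm rate) for an observer who reports a
    signal whenever an internal variable exceeds `criterion`.

    Assumption of this sketch: noise alone is distributed N(0, 1) and
    signal plus noise is distributed N(d_prime, 1).
    """
    hit = 1.0 - normal_cdf(criterion - d_prime)
    false_alarm = 1.0 - normal_cdf(criterion)
    return hit, false_alarm


# Same stimulus (same d_prime) under two hypothetical sets of instructions.
conservative = report_rates(d_prime=1.0, criterion=1.5)
liberal = report_rates(d_prime=1.0, criterion=0.0)

print("conservative: hits %.2f, false alarms %.2f" % conservative)
print("liberal:      hits %.2f, false alarms %.2f" % liberal)
```

With the signal-to-noise ratio held fixed, moving the criterion alone changes both the proportion of reported signals and the false-alarm rate, which is the qualitative pattern that instructions produce in Fig. 1.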

When tasks that involve some ordering of stimuli are
presented to organisms, three problems are encountered.
These stimulus-ordering operations may be listed under
the headings of (a) differential sensitivity, (b) identification,
and (c) psychophysical scaling.


DIFFERENTIAL SENSITIVITY


Such terms as just-noticeable difference (jnd), difference
thresholds, and difference limens have been [...]
with various degrees of certainty, between two stimuli.
All of these measures of differential sensitivity are
statistical in character: the jnd is larger if
a listener is required to identify correctly the more
intense of two sounds in 90% of the trials than if he is
required to be correct in only 75% of the trials.

More than a century ago, Weber formulated, on the 
basis of his own experiments, a generalization according 
to which the jnd between two stimuli gets larger as the 
stimuli get larger. This assertion has been called Weber’s 
law, and the expression‡


$$\Delta I / I = k \tag{1}$$


has been written to express this relative constancy of
human differential sensitivity. Here, $I$ is the intensity
of the reference stimulus, $\Delta I$ represents the difference
along the intensity dimension that is discriminable (in
75% of the trials, for example), and $k$ is a constant.
Weber’s law states, therefore, that the jnd will remain 
proportional to the intensity of the reference stimulus. 
Today, there exist Weber functions (see Fig. 2) that 
cover a wide enough dynamic range to provide a fair 
test of the validity of Weber’s law. The data indicate 
that, at least for auditory and visual stimuli, this rela- 
tion breaks down as the absolute threshold is ap- 
proached. Both frequency and intensity discrimination 
worsen as the stimulus becomes weaker. The existence 
of the so-called achromatic and atonal intervals’ is 
further evidence along these lines. A better fit to the 
data of Fig. 2 is then provided by an equation of the
form

$$\Delta I = k(I + N_i); \tag{2}$$


$N_i$ has such a value that for low values of $I$, $\Delta I$ depends
primarily upon $N_i$, while for medium and high values
of $I$, $N_i$ becomes negligible.
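
The behavior of the two expressions can be compared numerically. The following sketch reads Eq. (2) as a jnd equal to k times the sum of the reference intensity and a noise term, which reduces to Eq. (1) when the noise term is zero. The values of `K` and `N_I` and the intensity units are hypothetical and serve only to show the trend.

```python
def jnd(intensity: float, k: float, n_i: float = 0.0) -> float:
    """Just-noticeable difference, delta_I = k * (I + N_i).

    With n_i = 0 this is Weber's law, delta_I = k * I.
    """
    return k * (intensity + n_i)


def weber_fraction(intensity: float, k: float, n_i: float = 0.0) -> float:
    """Relative jnd, delta_I / I."""
    return jnd(intensity, k, n_i) / intensity


K = 0.1    # hypothetical Weber constant
N_I = 1.0  # hypothetical noise term, in the same units as the intensity

for intensity in (0.1, 1.0, 10.0, 1000.0):
    print(
        "I = %8.1f   Weber: %.3f   modified: %.3f"
        % (intensity, weber_fraction(intensity, K), weber_fraction(intensity, K, N_I))
    )
```

Under these assumptions the modified fraction approaches the constant k at high intensities and grows without bound as the intensity approaches zero, which is the breakdown near threshold described in the text.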

The introduction of $N_i$ can be interpreted in different
ways. Years ago, both Fechner and Helmholtz suggested
that Weber’s law be modified to acknowledge the
presence of intrinsic interfering stimulation that could
not be eliminated. In contemporary jargon, one might
say that $N_i$ corresponds to the “internal noise level” of
the organism. This expression should, however, not be 
taken too literally. Several theoretical models exist in 
which a generalized noise term is a crucial feature of 
the discriminating organism. Thus far, modifications of 
Weber’s law have aimed at agreement with the data 
near the absolute threshold. There is some evidence 
that the presence of extremely intense sensory stimula- 
tion leads again to some deterioration of discrimination. 
It is easy to see that the model could be further modified
by the introduction of another term corresponding to
an equivalent noise level that would only come into
play for large values of $I$.


‡ The symbolic notation employed in writing this Weber fraction
or Weber ratio seems to emphasize the intensitive aspects of
a stimulus. Actually, what is involved is not necessarily some
aspect of energy flux, but rather loose usage of the term “intensity.”
Thus, fractions have been determined for lifted weights,
length of line, pressure on the skin, and so forth, and comparisons
between sense modalities have been made on the basis of these
numbers.

One further comment concerns the amount of invariance
that it is realistic to expect in psychophysical
data. Figure 3 summarizes the findings from more than 
half a dozen investigators, all of whom were trying to 
measure man’s ability to discriminate pitch. In different 
experiments, different subjects, several psychophysical 
methods, several methods of stimulus presentation, and 
even several response criteria were employed. While it 
does not seem possible to read from this graph a value 
for the jnd for frequency, it is possible to point out that 
practically all of the data lie within the two dashed 
lines; i.e., their total spread is approximately a decade. 
The compilers of handbooks or of “Critical Tables on 
Sensory Performance” would thus be well advised to 
emphasize those aspects of psychophysical judgments 
that exhibit what might be called “relative invariance.” 
Instead, they often insist upon quoting numerical values 
(sometimes with several significant figures), perhaps 
with the forlorn hope that the users of such tables will 
take these values with a grain of ceteris paribus. 


SPAN OF ABSOLUTE JUDGMENT 


From data on just-noticeable differences, estimates 
have been made of the total number of distinguishable 
sounds, lights, or smells. Usually, the authors of these 
estimates have simply divided the range over which an 
organism is sensitive by the corresponding average jnd. 
Thus, the “auditory area”? has been estimated to 
represent several hundred thousand distinguishable 
sounds, and the number of distinguishable colors has 
been estimated to be of the same order. 

Several unjustified assumptions underlie these extra- 
polations. Subjects may find it quite easy to detect a 
difference between two stimuli without being able to 
identify either of them. In everyday situations, an 
individual rarely has difficulties in distinguishing be- 
tween two other individuals even though he cannot tell 
which one is Jones and which one is Smith. Absolute
identification requires much more information than
differential discrimination. 

Under the influence of the mathematical theory of 
communication," with its concept of channel capacity, 
many experiments have been carried out to determine 
man’s capacities to process selective information. By 
considering man as a communication channel with 
stimuli for inputs and responses for outputs, it has been 
possible to estimate maximal rates of transmission 
through him. The amount of input information can be 
varied either by varying the amount of information per 
stimulus or by varying the rate at which stimuli are 
presented. As Quastler¹² has pointed out, the maximal
rate of transmission is limited by two factors: (a) people 
cannot emit more than 5 to 9 responses per second, and 
(b) people get confused when they have to discriminate
rapidly among too many alternatives. Thus, even under
optimal conditions, transmission rates of more than 25 
bits per second are hardly ever observed. The type of 
motor response required and the degree to which stimuli 
and responses are “compatible”¹³ determine the particular
figure that will be obtained for the transmission
rate. 

A different aspect of man’s information handling 
capacity comes into focus if subjects are no longer 
required to respond as quickly as possible. They now 
need simply to tell which of several alternative stimuli 
occurred on a particular stimulus presentation. Given 
knowledge of the probability of occurrence for the dif- 
ferent stimuli and of the accuracy with which the sub- 
jects perform, one can calculate the amount of informa- 
tion in these absolute judgments. This experimental 
paradigm has been much used in connection with 
“simple” stimuli, such as pure tones or monochromatic 
lights. The results from these experiments in several 
sense modalities have led to the emergence of what 
Miller¹⁴ has called “the magical number 7 plus or minus
2.” This number represents the number of alternative 
stimuli that subjects can accurately identify as long as 
only one aspect (“dimension”) of the stimulus is varied. 
This magical number thus corresponds more or less to 
the upper and lower bounds for the “span of absolute 
judgment.” More precise estimates of performance
figures will need to take account—in addition to indi- 
vidual differences—of such experimental conditions as 
the range of stimulus variation, the spacing of stimuli 
over a given range, and the number of available response 
categories. Subjects seem also to perform better if 
knowledge of results is fed back to them after each 
response. 
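
The calculation of the amount of information in a set of absolute judgments can be sketched as follows. The function estimates the information transmitted from a table of counts whose rows are stimuli and whose columns are responses. The plug-in estimate used here, the function name, and the two small tables are assumptions made for illustration; they are not data from the experiments discussed.

```python
import math
from typing import Sequence


def transmitted_information(counts: Sequence[Sequence[int]]) -> float:
    """Estimate the information transmitted, in bits per judgment, from a
    stimulus-by-response table of counts (rows: stimuli, columns: responses).
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    bits = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:
                # p(s, r) * log2[ p(s, r) / (p(s) * p(r)) ]
                bits += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return bits


# Hypothetical tables: four equally likely stimuli identified without error,
# and two stimuli to which the responses are unrelated.
perfect = [[10, 0, 0, 0], [0, 10, 0, 0], [0, 0, 10, 0], [0, 0, 0, 10]]
unrelated = [[5, 5], [5, 5]]

print(transmitted_information(perfect))    # 2 bits: four alternatives, no errors
print(transmitted_information(unrelated))  # 0 bits: responses carry no information
```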

Sensory communication in everyday situations is, 
however, not restricted to making unidimensional 
judgments. The sounds, shapes, tastes, and smells upon 
whose successful recognition one’s sensory commerce 
with the environment depends are basically “multidimensional
displays.” Laboratory experiments with
such displays indicate that the span of absolute judgments
increases in size as an increasing number of stimulus
aspects are varied. Pollack and Ficks¹⁵ obtained 7.2
bits per judgment when they presented to their subjects
sounds that varied along six different acoustic “dimensions.”
This performance corresponds to correct identification
of approximately 150 different categories.
Comparison of the Pollack-Ficks data (1.2 bits/judgment/dimension)
with the characteristic figure for
unidimensional judgments (2.3 bits/judgment) shows
that, while each new dimension helps to transmit more
information, these increments tend to diminish as the
dimensionality of the display increases. But it is not
known where the asymptote lies for this relation between
bits per judgment and stimulus dimensionality.
Thus, organisms seem much better in absorbing information
from the environment by making several crude
distinctions simultaneously, instead of by making extremely
precise discriminations of a single aspect of a
sensory stimulus. This finding will not surprise those 
who are familiar with the facts of speech communication 
where the precise spectral composition of speech sounds 
is rather unimportant in conveying information. In- 
stead, man’s ability to recognize speech sounds seems 
to depend upon the detection of the presence or absence 
of “distinctive features”¹⁶ that are thus rather analogous
to the dimensions of the acoustic display. 
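
The arithmetic that connects bits per judgment with numbers of categories in the preceding paragraphs can be checked with a short sketch. The figures of 7.2 bits, six dimensions, and 2.3 bits are those quoted in the text; the function name and the hypothetical "independent" total are introduced here for illustration only.

```python
def equivalent_categories(bits: float) -> float:
    """Number of equally likely categories identified without error that
    would carry the given amount of information."""
    return 2.0 ** bits


BITS_SIX_DIMENSIONS = 7.2   # Pollack-Ficks figure quoted in the text
DIMENSIONS = 6
BITS_ONE_DIMENSION = 2.3    # unidimensional figure quoted in the text

bits_per_dimension = BITS_SIX_DIMENSIONS / DIMENSIONS
# Hypothetical total if every dimension contributed the full
# unidimensional figure, for comparison with the observed 7.2 bits.
bits_if_independent = BITS_ONE_DIMENSION * DIMENSIONS

print(round(equivalent_categories(BITS_SIX_DIMENSIONS)))    # roughly 150 categories
print(round(bits_per_dimension, 2))                         # 1.2 bits per dimension
print(round(equivalent_categories(BITS_ONE_DIMENSION), 1))  # about 5 categories
print(round(bits_if_independent, 1))                        # 13.8 bits
```

The gap between 7.2 bits and the hypothetical 13.8 bits expresses the diminishing increments noted in the text.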


PSYCHOPHYSICAL SCALING 


Among all the attempts to establish orderly relations 
between verbal responses and sensory stimuli, Fechner’s 
law is probably the best known. By the use of indirect 
methods in which he used the just-noticeable difference 
as his scale unit, Fechner concluded that psychological 
magnitude varied as the logarithm of stimulus magni- 
tude. There has been much argument about the meas- 
urability of sensation, about the validity of Weber’s 
law, about the legitimacy of the Fechnerian integration 
of Weber’s law which Fechner used to derive his own 
law; and yet, by and large, Fechner has been correct in 
predicting that his psychophysical edifice would stand 
because his detractors would not be able to agree on 
how to tear it down. In recent years, S. S. Stevens,¹⁷
whose concern with measurement and scaling had led 
him to investigate systematically many of the funda- 
mental assumptions in psychophysics, has produced a 
series of painstaking and ingenious studies of how sub- 
jects order stimuli by direct methods. Stevens did not 
accept the assumption according to which the unit of 
resolving power§ is the natural unit for experiments in 
which observers are asked to order a stimulus con- 
tinuum by assigning numbers to it and in which these 
numbers are to reflect the judged or subjective magnitude
of the stimuli. Stevens¹⁸ divides perceptual continua
into two general classes: continua that have to do
with how much are called prothetic; continua that deal
with what kind and where (position) are called metathetic.
Well-known examples of prothetic continua are
loudness and brightness; pitch is an example of a 
metathetic continuum. On the basis of experiments 
involving more than a dozen prothetic continua, Stevens 
concludes: “. . . there is a general psychophysical law 
relating subjective magnitude to stimulus magnitude
. . . this law is simply that equal stimulus ratios produce
equal subjective ratios. On numerous perceptual
continua, direct assessments of subjective magnitude
seem to bear an orderly relation to the magnitude of
the stimulus. To a fair first-order approximation the
ratio scales constructed by “direct” methods (as opposed
to the indirect procedures of Fechner) are related to the
stimulus by a power function of one degree or another.”¹⁸


§ This indirect derivation of the scale of psychological magnitude
parallels in some sense the extrapolated number of distinguishable
sounds or colors. In both instances, the just-noticeable
difference is the key concept of quasi-theoretical deductions
without regard for operational compatibility.


Figures 4 and 5¹⁹,²⁰ present two samples of the numerous
results obtained. In Fig. 4 there are plotted the
magnitude estimates of loudness made by listeners who
had not been given standard sounds of fixed intensity
for comparison purposes. The eight different intensities
were presented irregularly, and the subject estimated
the loudness by numbers of his own choosing. These
numbers were multiplied by appropriate factors to make
each subject’s estimate at 80 db the same, namely, 10.
The points represent the medians of these “normalized”
judgments, and the vertical lines represent the interquartile
range of the judgments. When the estimated
magnitudes are plotted on a logarithmic scale against the
logarithm of stimulus intensity, the slope of the straight
line is the exponent of Stevens’ power function


$$\Psi = kS^{n}. \tag{3}$$


Here, $\Psi$ is the psychological magnitude, $S$ is the stimulus
magnitude, and $k$ and $n$ are constants. For the
stimuli that have been studied, the approximate value
of $n$ varies from 0.3 for loudness and brightness‖ to
4 to 5 for electric shocks. The exponent for vibratory 
stimulation applied to the finger has a value of approxi- 
mately 1. It is interesting to speculate that these ex- 
ponents stand in inverse order to the dynamic ranges 
for the relevant sense modalities. The dynamic range 
for hearing and vision spans approximately 12 logarith- 
mic units, the range for tactile vibration spans perhaps 
3 to 4 logarithmic units, and the range for electric shocks 
spans only about 1 logarithmic unit. Does the fact that
the product of the exponent multiplied by the dynamic
range yields numbers of the same order of magnitude
have any deeper significance? Does it reveal anything
about the functioning of the sense organ, about the
functioning of the nervous system, or does it rather
reveal a certain constancy in verbal behavior—i.e.,
about the way in which people use numbers between
the threshold of detection and the threshold of pain?


‖ The exponent is 0.6 if stimulus magnitude is expressed in
terms of sound pressure instead of sound energy.


Fig. 4. Magnitude estimates of loudness made with
no fixed standard.


Fig. 5. Magnitude estimates of brightness of a luminous spot;
the median estimates of ten observers are plotted.²⁰

Another recent study by Stevens! deserves to be 
reported in this context. Once scales of subjective 
intensity have been established for several sensory 
continua, can these scales be used to predict what 
people will do when asked to match directly the loudness 
of a noise against the subjective intensity of a vibratory 
stimulus or an electric shock to the finger? Cross- 
modality matches were made between each modality 
and each of the other two. To a good approximation, 
the predicted functions were confirmed by direct experi- 
mentation, and the internal consistency of the direct 
approach to scaling has been further validated by this 
cross-modality study. 
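
The prediction tested in such a cross-modality experiment follows directly from the power function. If the subjective magnitudes on two continua are each a power function of the stimulus, then stimuli judged equal should be related by a straight line in log-log coordinates whose slope is the ratio of the two exponents. The sketch below illustrates this reasoning; the function names and the constants `k` are hypothetical, and the exponents 0.3 and 1 are the approximate values quoted in the text for loudness (sound energy) and for vibration.

```python
import math


def subjective_magnitude(stimulus: float, k: float, n: float) -> float:
    """Power function: subjective magnitude = k * stimulus ** n."""
    return k * stimulus ** n


def log_log_slope(s_low: float, s_high: float, k: float, n: float) -> float:
    """Slope of log magnitude against log stimulus between two stimuli."""
    y_low = subjective_magnitude(s_low, k, n)
    y_high = subjective_magnitude(s_high, k, n)
    return (math.log10(y_high) - math.log10(y_low)) / (
        math.log10(s_high) - math.log10(s_low)
    )


def matching_stimulus(s_a: float, k_a: float, n_a: float, k_b: float, n_b: float) -> float:
    """Stimulus on continuum B whose subjective magnitude equals that of
    stimulus s_a on continuum A."""
    return (subjective_magnitude(s_a, k_a, n_a) / k_b) ** (1.0 / n_b)


N_LOUDNESS = 0.3   # approximate exponent quoted in the text
N_VIBRATION = 1.0  # approximate exponent quoted in the text

# Vibration amplitudes predicted to match two sound energies (arbitrary units).
v_low = matching_stimulus(10.0, 1.0, N_LOUDNESS, 1.0, N_VIBRATION)
v_high = matching_stimulus(1000.0, 1.0, N_LOUDNESS, 1.0, N_VIBRATION)
matching_slope = (math.log10(v_high) - math.log10(v_low)) / (
    math.log10(1000.0) - math.log10(10.0)
)
print(matching_slope)  # equals N_LOUDNESS / N_VIBRATION up to rounding
```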


TEMPORAL CHARACTERISTICS OF SENSORY
PERFORMANCE


Since Helmholtz’s time, experiments on reaction time
have been used to make inferences on the processing of
sensory signals in the nervous system and on the duration
of so-called mental processes such as discrimination,
recognition, and choice. It was established that reaction
times to sounds, electric shock, and touch were shorter
than reaction times to visual stimuli; also, that reaction
times decrease—at first rapidly and then more slowly—
as the intensity of a sensory stimulus is raised above
threshold (Fig. 2).


Fig. 6. The straight line represents the best fit to data obtained
by Hyman²² from one of his subjects in response to visual stimuli;
the open and closed circles represent reaction-time data obtained
by Albert²³ from one of his subjects in response to auditory stimuli;
here, the stimulus set was composed of either two or four elements.

In addition to these data on simple reaction times, 
other experiments have attempted to assess the effect 
of task complexity on the speed of more discriminative 
reactions. It has been shown that reaction time increases 
as the stimulus ensemble increases in size and as its 
members become harder to distinguish from each other. 
This whole topic had lain more or less dormant for 
several decades until it became obvious that the appli- 
cation of certain concepts from information theory 
might make it possible to explore another aspect of 
man’s capacity to handle information. A series of experiments²²⁻²⁴
indicated that, at least over a certain range,
choice time (or disjunctive reaction time) was propor- 
tional to the average amount of information transmitted 
per stimulus-response event. This finding applies equally 
well to visual as to auditory stimuli (see Fig. 6), and 
there is sufficient agreement between the data obtained 
by different experimenters that it is possible to talk 
about a characteristic figure for the “rate of gain of 
information” whose value lies near 6 bits per second. It 
is not possible to account for these time intervals in 
terms of delays on either the afferent or the motor sides 
of the nervous system. Enough is known about the 
latencies of evoked electric activity to permit one to
say that these events consume only a fraction of the 
time involved in even the fastest reaction. The dis- 
crepancy becomes much greater, of course, upon con- 
sidering sensory stimuli that are barely above the 
absolute or masked threshold or reaction times that
involve complex discrimination. Thus, one must infer
that a large fraction of the total reaction time must be spent
in “central information processing.”
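
The proportionality between choice time and information can be illustrated with a short sketch. It assumes equally likely alternatives and errorless responses, so that the information per stimulus-response event is the base-2 logarithm of the number of alternatives. The rate of 6 bits per second is the figure quoted in the text; the intercept `base_time_s` is a hypothetical value, and the function name is introduced here for illustration.

```python
import math

RATE_BITS_PER_SECOND = 6.0  # "rate of gain of information" quoted in the text


def choice_reaction_time(
    n_alternatives: int,
    base_time_s: float = 0.2,  # hypothetical intercept, not a value from the text
    rate: float = RATE_BITS_PER_SECOND,
) -> float:
    """Choice reaction time in seconds, taken as linear in the information
    (bits) carried by one of n equally likely alternatives."""
    bits = math.log2(n_alternatives)
    return base_time_s + bits / rate


for n in (1, 2, 4, 8):
    print("%d alternatives: %.3f sec" % (n, choice_reaction_time(n)))
```

Under these assumptions each doubling of the number of alternatives adds one bit, and therefore about one-sixth of a second, to the reaction time.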

Recent experiments by Davis²⁵⁻²⁷ and others throw
new light on these inferences. Davis presented to
his observers two visual stimuli in succession and asked
them to respond to each stimulus in turn. He then
varied the interval between his two stimuli randomly
between 50 and 500 msec. As long as the separation was
greater than 250 msec, the reaction time to either stimulus
was approximately 200 msec; but when the time interval
between the two stimuli fell below 250 msec, the second 
reaction time increased sharply. The second reaction 
time reaches its maximum value, approximately 400 
msec, for an interval of 50 msec. The same phenomenon 
occurs when one stimulus is visual and the other is 
auditory, no matter what their order. Next, the subjects 
were instructed not to respond to the first of two lights. 
The results show little improvement over the delays 
that occurred when motor responses were made to both 
signals; the only difference was a slight shortening of 
the second reaction time, by about 20 to 30 msec. In a 
final experiment, the subjects were asked to make a 
“response” without a signal having occurred. Following 
the pressing of the key, a light came on (after an interval 
that was again randomly varied between 50 and 500 
msec) to which the subjects were required to respond. 
This time there was no additional delay for any time 
interval. The subject was just as efficient in responding 
to a signal that came 50 msec after he pressed the first 
key as he was in dealing with a signal that came 500 
msec afterwards. It would thus appear that the actual 
performance of the response does not contribute much 
to the delays that are observed, but that these delays 
are imputable to the processing of a signal that is still 
going on when the second signal is presented.

There is further evidence that emphasizes the amount of time
that central processes take in the handling of sensory
information. In hearing and in vision, there is almost
perfect integration of energy at the absolute threshold, 
as long as the stimulus is being presented for time 
intervals that are shorter than 200 msec. There is a 
whole class of phenomena in both hearing and vision in 
which a strong succeeding stimulus interferes with the 
“proper” analysis by the nervous system of a preceding 
weaker stimulus. The shorter the time interval between 
the two stimuli and the greater the discrepancy in 
intensity, the more severe is the interference. Since the 
delay times on the afferent side of the nervous system 
are longer for weak stimuli than for strong ones, the 
more intense stimulus has a chance to “catch up.” 
However, the time intervals involved are sufficiently 
long that these differences in delay times along the 
ascending pathway are hardly the whole story. There 
is the further fact that interference with the more com- 
plex discrimination seems to be effective over longer 
time intervals than with more primitive tasks (such as 
simple detection of a preceding stimulus). 
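
The figures reported for the two-stimulus experiment of Davis (a reaction time of about 200 msec, delays that begin when the interval falls below 250 msec, and a maximum of about 400 msec at an interval of 50 msec) are consistent with a simple picture in which the central mechanism is occupied for a fixed time by the first signal. The sketch below encodes that picture. The linear form between the two end points and the parameter names are assumptions made here; the text reports only the end points.

```python
def second_reaction_time_ms(
    interval_ms: float,
    base_rt_ms: float = 200.0,  # approximate reaction time quoted in the text
    busy_ms: float = 250.0,     # interval below which delays appear in the text
) -> float:
    """Reaction time to the second of two signals, assuming that the second
    signal must wait until central processing of the first is finished."""
    waiting_ms = max(0.0, busy_ms - interval_ms)
    return base_rt_ms + waiting_ms


for interval in (50, 150, 250, 500):
    print("interval %3d msec -> second reaction time %.0f msec"
          % (interval, second_reaction_time_ms(interval)))
```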

The preceding illustrations all point to the conclusion 
that the analysis of sensory information is a time- 
consuming task that occupies an organism’s central 
nervous system in relation to the amount of information 
to be processed. This is hardly surprising; yet, it is 
often overlooked in practical situations. 

The foregoing briefly reviews man’s sensory perform- 
ance as it appears when examined by the quantitative 
methods of contemporary psychophysics. Organisms
are able to deal with the sensory bombardment that
impinges upon them by being satisfied to make crude 
discriminations, unless specifically instructed to make 
precise ones (at the cost of spending more time making 
them and of neglecting other sensory events than those 
under analysis). The flexibility of organisms in switching 
from one sense modality to another, from one aspect of 
sensory display to another, is bought at the expense of 
relatively low rates for the transmission of information. 
The mere existence of organisms seems to entail a high
internal noise level that prevents responses to environmental
stimuli from being anything but statistical in
character. As long as thresholds of tolerance are not 
approached, the more energy a sensory stimulus de- 
livers to an organism, the higher is the probability that 
the signal will be processed in the central nervous system 
with both discrimination and dispatch. One may para- 
phrase this situation by saying that the higher the 
energy level of a sensory signal, the greater the redun- 
dancy with which the representation of this signal will 
be handled in the central nervous system. These are 
some of the considerations that led us to select certain 
aspects of the electrical activity of the nervous system 
for quantitative study.


In this discussion of biological transducers and coding,
no attempt is made to describe in detail the specific
codes in which diverse animal (and plant) transducers
operate, for this attractive topic in comparative physiology
and biophysics is discussed at length in other sections
of this book by Bullock, Hartline, Katz, and
others. Nor is an attempt made here to correlate over-all 
organismic behavior with transducer functions for that 
is, to a large extent, the burden of the communication 
by Rosenblith. Instead, the niche between organismic 
behavior and detailed transducer codings is examined 
to determine whether or not it is possible to develop 
new and useful analytical tools by which one can better 
formalize the behavior of individual transducer elements 
and their interrelationships, so as to approach a deeper 
and more intuitive understanding of the resulting whole 
organism’s stability, decision making, intelligent be- 
havior, and homeostatic performance. 

While it is impossible to develop the point of view 
adequately in a few pages of text, it is the firm convic- 
tion of the author that biophysical theory, where it has 
developed at all, has developed too closely in the foot- 
steps of physical theory and has not yet evolved mathe- 
matical models and analytical techniques specifically 
tailored to fit biological phenomenology. Instead of 
adapting physical-science terminology and theoretical 
formulations to its needs, it has adopted them as though 
they were “true” rather than merely convenient ap- 
proximate descriptions of experimental experience. This 
has been especially true in the studies of biological 
transductive processes where the biologist, armed only 
with experimental facts, has for the first time come up 
against the new and highly formalized jargon of information
theory and automatic-control engineering. Through
modesty or lack of mathematical dexterity, he usually
tries to make do with these mental tools as furnished
him ready-made by physics and engineering. It is not 
the author’s purpose to deny that one must start with 

available techniques, but rather to propose that one 
must quickly amend and supplement these with uniquely 
biophysical hypotheses and an appropriate calculus, if 
the problems of this field are to be reduced to a complexity
where intuitive insight into them can readily [...]

It is necessary, however, to set up some ground rules
before launching into a discussion of biological trans- 
ducers, their transfer functions and other possible means 
for studying their organization and coding. As there 
are no uniformly adopted definitions for transducers in 
biology and biophysics, an explanation is probably 
necessary for the term biological transducers. The usage 
of the author may be somewhat different from that of 
others in the biophysical field, and perhaps different 
even from that of other authors contributing to this 
volume. 

What, then, is a biological transducer? The term transducer
is borrowed from the engineering sciences, where
it is usually defined so as to include those devices which 
“lead across” or transform energy from one form into 
another, and thus translate signals from one energy 
modality—such as sound—into another—such as a 
varying electrical current. While the physical-science 
definitions do not usually specifically exclude gross- 
energy converters such as steam boilers, storage bat- 
teries, and electric light bulbs from being considered as 
transducers, it is tacitly assumed that only devices 
which serve a communication or signal-converting func- 
tion will be called transducers. Thus, a motor turning 
the rudder or shifting the ailerons of an airplane in 
response to instructions from the autopilot is clearly re- 
garded as a transducer; one driving a washing machine 
is not. Similarly, a resistance-wire strain gauge is con- 
sidered a transducer in its function of converting me- 
chanical position changes into electrical-current varia- 
tions, but its incidental function of converting electrical 
power into heat is not usually thought of as transduction.

Biological transducers, then, should logically be the 
biological counterparts of the physical sciences’ trans- 
ducers and should comprise all of those biological units 
and systems which perform transformations of energy 
modality within the context of a communication or in- 
formational transformation function. Thus, biological 
transducers should include not only the familiar sensory 
transducers such as the rods and cones of the retina, 
the thermal, olfactory, and gustatory receptors, the 
vestibular and cochlear receptors, and the many pro- 
prioceptive organs, but they should also include numer- 
ous synaptic and related intermediary units involved in 
biological-data processing and storage, and the output 
or motor transducers as well. Nor is there any reason 
to exclude humorally mediated transducers. The con- 
cept of a transducer fits the function of a cortical neuron 
or of the motor end plate at least as well as it does that 
of the muscle-spindle strain gauge. 

It is convenient to use the terms afferent and efferent
in reference to biological transducers to distinguish incoming
from outgoing transduction functions, and it
seems desirable to adopt directly from engineering the 
terms “passive” and “active” transducers, to distinguish 
between those where the energy of an incoming signal 
provides directly the output energy and those trans- 
ducers, very common in biological systems, where the 
incoming energy triggers or controls the release of locally 
supplied energy into an output modality. 

Transducers that operate in one direction only are 
called unilateral transducers, while those that operate 
in either direction are called bilateral. The quite special 
transducers that not only operate in either direction, 
but also obey the reciprocity theorem, are called re- 
ciprocal transducers. 

The linear transducer is one where the output can be 
related through linear-differential or algebraic equations 
to the input function. It is a type of transducer not 
often found in biological nature, but a form to which 
most experimental transducer results are approximated, 
as its behavior is easily formulated and understood 
mathematically. The linear transducer does deserve 
special mention, for it is out of the notion of a linear 
transducer that the transducer transfer-function idea 
arose. 

The transfer function of a linear transducer is usually 
expressed simply as the ratio of its output to its input 
signal with frequency as a parameter. This ratio is 
usually a complex number which may be dimensionless 
in the case of a transduction between similar modalities, 
or may carry within it the ratio of the dimensions in 
which output and input are expressed. It is valuable to 
remember that, around any properly described, closed, 
biological control loop, transfer functions must cancel 
dimensionally. Thus, a mathematical description of a 
biological loop system can be tested much as elementary 
physics equations are tested dimensionally by cancella- 
tion of dimensions and by reduction to mass, length, 
and time units. 
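
The two ideas in this paragraph, the complex output-to-input ratio and the dimensional cancellation around a closed loop, can be sketched numerically. The following is only an illustration under stated assumptions: the resistance-condenser low-pass stage, its component values, and the three-stage pressure loop with its unit names are hypothetical and are not taken from the text.

```python
import cmath
import math
from collections import Counter


def rc_lowpass_transfer(freq_hz, r_ohm=1.0e3, c_farad=1.0e-6):
    """Complex output/input voltage ratio of an assumed RC low-pass stage."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / (1.0 + 1j * omega * r_ohm * c_farad)


def magnitude_and_phase(freq_hz):
    """Return (|H|, phase in degrees) of the assumed stage at one frequency."""
    h = rc_lowpass_transfer(freq_hz)
    return abs(h), math.degrees(cmath.phase(h))


def loop_dimensions(stages):
    """Net unit exponents left after multiplying the transfer ratios of all stages.

    Each stage is (output_units, input_units), both dicts of unit -> exponent.
    An empty result means the stages cancel dimensionally.
    """
    total = Counter()
    for output_units, input_units in stages:
        total.update(output_units)      # numerator units of this stage
        total.subtract(input_units)     # denominator units of this stage
    return {unit: power for unit, power in total.items() if power != 0}


# Corner frequency of the assumed stage, where omega * R * C = 1.
corner_hz = 1.0 / (2.0 * math.pi * 1.0e3 * 1.0e-6)
for f in (10.0, corner_hz, 10_000.0):
    mag, phase = magnitude_and_phase(f)
    print(f"{f:10.1f} cps  |H| = {mag:.3f}  phase = {phase:7.2f} deg")

# Hypothetical closed loop: pressure receptor, central relay, pressure effector.
impulse_rate = {"impulse": 1, "s": -1}
hypothetical_loop = [
    (impulse_rate, {"mm_Hg": 1}),   # receptor: impulses/sec per mm Hg
    (impulse_rate, impulse_rate),   # relay: dimensionless ratio
    ({"mm_Hg": 1}, impulse_rate),   # effector: mm Hg per impulse/sec
]
print("left-over dimensions around loop:", loop_dimensions(hypothetical_loop))
```

A loop description that leaves any unit uncancelled in the last line is, by the argument above, improperly described.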

In studying biological transducers, it is often desira- 
ble to regard the transfer function as a much more 
general operator than merely a linear complex ratio, 
for many of the merits of transfer-function analysis re- 
main even where the transfer operator is a very elabo- 
rate time- and pattern-dependent code. 

It is evident from these definitions and the associated 
discussion that the foregoing concept of biological 
transduction is very inclusive. The term “biological 
transducer” is used here to include all of those organs and 
processes wherein the purpose of the reaction involved 
is to convey, store, or process information about the 
environment and organism for its preservation and effec- 
tive operation. Included, therefore, are all of those 
processes whereby information is transduced or led 
across some kind of a transformative exchange, with no 
restriction, for example, to sensory receptor organs. In 
this definition, genetic determination, many processes 
of differentiation and growth, as well as reflex and 
higher-level abstraction mechanisms, become specifiable in 
transductive terms. It may be felt that this is too in- 
clusive a definition, for it obviously includes a great deal 
of biochemistry, hormonal control, enzyme dynamics, 
growth processes, and central nervous whole-organism 
behavior control. It can be limited, however, and made 
meaningful by adding the loose criterion that has entered 
into some European engineering literature to distinguish 
between what are called large energy processes and small 
energy processes, and by recognizing that this distinc- 
tion is one of convenience rather than one of funda- 
mental importance. 

Behind this categorization is the idea that, while every 
energy turnover process, whether power-plant genera- 
tion of electricity or ATP utilization in muscle, involves 
a certain amount of direction, and while every informa- 
tional biological activity, even so explicitly directive 
as the axonal propagation of an impulse, requires the 
expenditure of energy, one can usually choose one of 
these as being of dominant importance in the particular 
process in question. 

One often finds in complex control systems that up 
to about 10 or 15% of the total energy consumed will go 
into processes that may legitimately be considered as 
primarily informational. This fits reasonably well with 
biological systems where, if one considers the human as 
an example, something like 10 to 15 w of power goes 
into the total nervous-system maintenance and opera- 
tion, whereas the total average power turnover for a 
normal-size person of average activity is somewhat over 
100 w. It is sound economical biological design that one 
should put perhaps 10 to 20% of the available metabolic 
energy to work for learning about the environment, 
deciding what to do about it, and issuing proper in- 
structions to do something appropriate. 

Now, to turn to a different aspect of biological trans- 
ducers and coding, it is rather hard to break from tradi- 
tion to frame variables conforming to codes in which 
biological processes are described by the organism, as 
against those terms in which one is used to describing 
similar phenomena in the physical sciences. In engineer- 
ing analysis, one very firmly separates time as a vari- 
able apart from space coordinates, whereas one is forced 
to conclude that the organism uses in almost every 
transducer a conveniently mixed code and pays very 
little attention to whether it separates out temporal 
and spatial aspects of pattern. 

Figure 1 illustrates one of the classical sensory-receptor 
code patterns (that of the carotid-sinus pressure re- 
ceptor) to show that these receptors do not slavishly 
report on the value of a function being transduced or 
on the time derivative of this function exclusively. In- 
stead, they answer the biologically important questions: 
“What is the present state of affairs?” “What is 
changing?” and “What has been the recent past 
experience?” These reports correspond loosely to 
engineering transducer language in which function, time 
derivative, and integral transducers are involved, but in 
engineering transducers these factors usually are rather 
carefully separated out or very cautiously combined. 

Fig. 1. Carotid sinus pressure transducer response (dog). Central record represents multifiber nerve response. Upper curve envelope 
is directly measured blood pressure referred to lower base line as zero. Mean blood pressure 150 mm Hg, interruption rate on pressure 
measuring trace 60/sec. Each pattern represents one heart beat (records by courtesy John W. Trank). 

Figure 1 is a record of the action potentials recorded 
from a few nerve fibers leading from the carotid-sinus 
end organs in a dog. This organ is concerned with re- 
porting on the mean blood pressure in the animal and 
also on the details of pressure during the heartbeat. In 
this case, the average pressure is an important item and 
its absolute value remains important. Therefore, this 
must be a relatively nonaccommodating organ in the 
sense that it must continue to report the present state 
of affairs. 

On a short time basis, it has a large “derivative” term, 
and thus is modulated from a high frequency almost to 
cutoff during each beat. Its steady-state response as 
reflected in the mean frequency, and especially in static 
records like those of Fig. 2 for a similar organ in a rabbit, 
is shown to convert static pressure almost linearly into 
pulse frequency. In this static sense, one might call this 
carotid-sinus organ a linear pulse frequency-modulation 
transducer. One must, however, point out the danger of 
specifying a transducer as “linear” or “nonlinear” on 
this basis by referring to the two graphs accompanying 
Fig. 2 where data from the same impulse records are 
plotted in (a) as a frequency vs pressure-transfer rela- 
tionship and in (b) as a pulse interval vs pressure graph. 
Note how the choice of variable determines “linearity” 
or “nonlinearity” in this casual sense. One must, there- 
fore, define quite carefully on a theoretically sound basis 
what kind of nonlinearity is meant when that term is 
used. 
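
The dependence of apparent “linearity” on the chosen variable can be checked with a short calculation. The sketch below assumes a hypothetical receptor whose pulse frequency rises exactly linearly with pressure; the slope, threshold, and pressure values are invented for illustration and are not the data of Fig. 2. Equal pressure steps then give zero second differences in frequency but nonzero second differences in pulse interval.

```python
def second_differences(values):
    """Second differences of equally spaced samples; all zero for a straight line."""
    return [values[i + 1] - 2.0 * values[i] + values[i - 1]
            for i in range(1, len(values) - 1)]


# Hypothetical static characteristic (not measured data).
pressures_mm_hg = [60.0, 80.0, 100.0, 120.0, 140.0]
slope_per_mm_hg = 0.5      # impulses/sec per mm Hg, assumed
threshold_mm_hg = 40.0     # pressure at which firing starts, assumed

frequencies = [slope_per_mm_hg * (p - threshold_mm_hg) for p in pressures_mm_hg]
intervals = [1.0 / f for f in frequencies]   # seconds between impulses

print("frequency (impulses/sec):", frequencies)
print("second differences      :", second_differences(frequencies))
print("interval (sec)          :", [round(t, 4) for t in intervals])
print("second differences      :", [round(d, 4) for d in second_differences(intervals)])
```

The same records are thus “linear” as frequency against pressure and “nonlinear” as interval against pressure, since the interval is the reciprocal of the frequency.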

This transducer an engineer might still consider 
reasonably well-behaved, but now consider the 
photoreceptors of Fig. 3, which use an amplitude-nonlinear, 
heavily time-influenced code. These transducers operate 
on a pulse-code modulation characteristic much like 
that which would be anticipated for a leaky 
resistance-condenser coupling decay characteristic. Frequency is 
very high initially and decays for constant illumination, 
so that frequency increases, but not linearly, with 
intensity, and decays with time for any given intensity. 
In addition to this general pattern of response, some 
animal photoreceptors also give oppositely polarized 
responses so as to remark pointedly that the light is 
decreasing or going out; an event worthy of mention, 
[...] the negative sense. Some sensory transducers, 
such as those in the tongue concerned with 
measuring temperature, use their nerve frequency 
spectra doubly, once in an ascending sense, once in a 
descending, the range in use being determined by a 
separate signal. 


It was only about fifteen years ago that it was dis- 
covered that “contrast” or accentuation of detailed 
changes in a one- or multi-dimensional field of transduced 
data could be greatly enhanced by systematic suppres- 
sion in a readily specifiable manner by each region of 
every adjacent region. In fact, it was only in about 
1945 that spatial maximum- and minimum-finding 
networks were first produced, temporal peak predictors 
having come only a little earlier. 

Now it is found that a very similar system exists in 
optical, acoustic, and probably other sense-modality, 
biological transducer systems, so that much of the 
pattern finding is done before data reach the central 
nervous system. 
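
The contrast-enhancing effect of mutual suppression between adjacent regions can be illustrated with a one-dimensional sketch. The rule used here, in which each unit's output is its own input less a fixed fraction of its two neighbors' inputs, and the suppression fraction of 0.2 are assumptions chosen for simplicity; they are not offered as the mechanism of any particular sense organ.

```python
def lateral_inhibition(signal, suppression=0.2):
    """Each unit reports its input minus a fraction of its two neighbors' inputs.

    End units reuse their own value for the missing neighbor.
    """
    n = len(signal)
    response = []
    for i, centre in enumerate(signal):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        response.append(centre - suppression * (left + right))
    return response


# A step in stimulus intensity across a row of ten receptors.
stimulus = [1.0] * 5 + [2.0] * 5
response = lateral_inhibition(stimulus)

for i, (s, r) in enumerate(zip(stimulus, response)):
    print(f"unit {i}: stimulus {s:.1f}  response {r:.2f}")
```

In the uniform regions the response is merely scaled down, whereas the two units flanking the step undershoot and overshoot their neighbors, so that the edge is accentuated before any central processing takes place.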

Where there is ample signal energy to operate many 
transducers at substantial output, as in foveal bright- 
light vision, this detailed differential discrimination is 
common, while in threshold signal performance, adja- 
cent units are often pooled to improve statistics. There 
is an interesting auditory case reported where noise 
heard in one ear can be used to subtract out coherent 
noise from noise plus signal in the other ear, to permit 
recognizing signal otherwise submerged below recogni- 
tion. In this case, two ears cooperate in a field-subtrac- 
tive type of selective filtering. 

It is often stated that the sensory codes are essentially 
logarithmic in the sense that equal fractional increases 
in intensity of stimulation rather than absolute increases 
are required to evoke correspondingly detectable in- 
creases in response. While this generalization has been 
elevated to the status of a “law” and is often roughly 
true over a decade or two, it is usually noticeable that 
there is a threshold region where logarithmicity is not 
obeyed and that there is a bending over of even the 
logarithmically plotted line at extremely high intensities. 
In long-range senses, there will usually be one or more 
automatic-gain control systems operating to extend the 
working range. It will interest engineers and physicists 
to know that biological transducers often operate in 
pseudo-instability, much like that of the superregenerative 
radio receiver in the absence of signal, and share 
with the superregenerative receiver exceedingly high sensitivity near 
threshold and a tendency to lock onto any furnished 
rhythms. This point again deserves attention in 
connection with biological rhythmic processes, mentioned 
briefly in a later section. 

Fig. 2. Steady-state response of carotid sinus pressure 
transducer [from D. W. Bronk and G. Stella, Am. J. Physiol. 110, 708 
(1934)]. Data from original records of this figure are replotted 
in Fig. 2(a) as frequency vs pressure and in 2(b) as pulse interval 
vs pressure to emphasize importance of descriptive variable chosen 
in determining whether transducer is “linear” or “nonlinear.” 

It is perhaps safer to generalize that the sense-organ 
receptors which report in neural pulse-code modulations 
usually spread the available frequency band width 
roughly equally, in terms of importance, over the 
continuum of intensity time space in which they have to 
report, than it is to assume that they adhere to the 
Weber-Fechner ideal of true logarithmic response. The 
accuracy with which a sense organ can report over a 
given neural channel depends on the length of time in 
which it is allowed to report for a given discriminating 
mechanism. In general, the organism must have its 
report quickly, to avoid vacillatory behavior and 
sluggishness, yet reasonably accurately over a wide range. Thus, 
the neural codes turn out, for the narrow band width 
in frequency of pulses available, to be apt compromises 
between adequate and quick reporting. 

Fig. 3. Record showing adaptation of a sense organ 
(photoreceptor). Stimulus was maintained at constant strength 
throughout, but interval between discharges steadily diminished. Signal 
indicates duration of stimulus application and time is in 0.2-sec 
intervals [record of H. K. Hartline cited from D. W. Bronk, 
Research Publs., Assoc. Research Nervous Mental Disease 15, 
60-82 (1934)]. 

One is familiar with the interestingly interchangeable 
way in which temporal and dimensional variables enter 
into the ordinary wave equation describing the 
progression of a pattern or wave of form f in space x and time t. 
Equations (1) show this relationship for one dimension: 

\[
\begin{aligned}
F &= f(x - ut) \\
\partial F/\partial x &= f'(x - ut) \\
\partial F/\partial t &= -u\,f'(x - ut) \\
\partial F/\partial t &= -u\,\partial F/\partial x,
\end{aligned}
\tag{1}
\]


but it is easy to see how it can be generalized for more 
dimensions. There is a linear relationship between the 
position of a particular feature of the wave in space 
and the passage of time. 
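
The interchangeability of the space and time derivatives for a wave of fixed form can be verified numerically. In the sketch below the Gaussian pulse shape, the velocity, and the sample point are arbitrary assumptions; the derivatives are estimated by central differences, and the quantity printed is the left-over amount by which the time derivative fails to equal minus the velocity times the space derivative.

```python
import math


def pulse(s):
    """Assumed fixed wave form f(s); any smooth shape would serve."""
    return math.exp(-s * s)


def travelling_wave(x, t, u):
    """F(x, t) = f(x - u t): the form f moving at velocity u without distortion."""
    return pulse(x - u * t)


def residual(x, t, u, h=1.0e-5):
    """Central-difference estimate of dF/dt + u * dF/dx at the point (x, t)."""
    dF_dt = (travelling_wave(x, t + h, u) - travelling_wave(x, t - h, u)) / (2.0 * h)
    dF_dx = (travelling_wave(x + h, t, u) - travelling_wave(x - h, t, u)) / (2.0 * h)
    return dF_dt + u * dF_dx


u = 10.0            # velocity, arbitrary units
x, t = 0.3, 0.02    # sample point, so that x - u*t = 0.1
print("dF/dt + u dF/dx =", residual(x, t, u))
```

The residual differs from zero only by the small error of the finite-difference estimate, which is the numerical counterpart of measuring a nerve potential at one place as a function of time and picturing it as a fixed pattern moving along the fiber.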

As a practical illustration of this equivalence, it is 
recognized that one frequently interchanges mentally 
the picture of a nerve action potential in the space-time 
world. Usually, one measures the potential variations 
at one place along the nerve as a function of time, but, 
without feeling any strain, one substitutes the picture 
of a wave of identical form progressing smoothly down 
the length of the nerve as a fixed pattern. 

It will certainly be important to discover the mecha- 
nisms within neural net systems whereby such inter- 
changes become easy and in which this kind of spatial- 
temporal pattern analysis can be accomplished with the 
available neuronal networks and fields. It must be ex- 
pected in this seemingly messy and complicated situa- 
tion that the usual biological optimization of representa- 
tion will apply. One must expect near-logarithmic 

compression of past and future in this representation, 
and similarly one must expect that spatial features will 
be compressed in their representation as they become 
farther removed pattern-wise. 

It is attractive to think of the neural net as a 
multidimensional analytical space in which distance along 
[...]cited, short neuron extensions corresponds 
to [...] or spatial separation. Figure 4 shows how 
the phase velocity of an unpropagated nerve impulse 
[...] at near-normal speed in a passive axon, ready 
to be reinforced into partial or full excitation 
by surrounding fields. A corresponding lower-velocity 
process easily can be anticipated for the short neuronal 
extensions in the central nervous system. In this 
connection, it is worth mentioning that application of 
readily available, distributed negative-resistance 
networks to in vitro nerve preparations allows exploration 
of this type of reinforcement. 

Consider now some basic differences in approach to 
informational coding that must apply in biological 
studies as contrasted to the conventional engineering 
analysis of comparable systems. In an engineering sys- 
tem, the coding is generally superimposed from without 
by the machine design, so that there is real justification 
for dissociating the message from the code. In biological 
systems, the organism must create the complex of a 
code plus the messages in this code with which it will 
function. 

It consequently becomes possible in the biological 
case to compare meaningfully the cost of a very 
elaborate code, in which a one- or two-bit message can tell 
one to record the complete contents of the International 
Critical Tables or of the Encyclopedia Britannica, with 
a very elaborate message in a simple code whereby one 
dictates one or the other of these works at length. In a 
genetic plus sensory-motor biological code complex, the 
organism seeks to minimize the combination of these 
two aspects. 

Where there is to be very frequent repetition of similar 
messages, it becomes economical to devise fairly 
elaborate codes in which these messages can be shortened 
and sharpened in discrimination. For the very rare or 
one-time situations, it is economical to spell out the 
message in detailed, simple, versatile code so long as it 
is not essential that the communication be completed 
in a very short time. Undoubtedly, some of the complex 
innate behavioral responses of lower and higher 
organisms have to be spelled out rather expensively in 
informational, genetic, and developmental symbols, as they 
are essential to survival and can be left only slightly 
sketchy. 

Fig. 4. Phase velocity of an unpropagated subthreshold 
sinusoidal wave in living axon. Data for squid axon, 
mean diameter 0.41 mm, 300 cps, 21°C. Phase velocity 
10 meters/sec, varying slightly with taper of fiber. Record 
is directly traced from machine-drawn curve without smoothing. 
Axes: mm along fiber to probe electrode; msec delay to time 
of zero phase. 

It is an almost universal observation that, wherever 
a specific biological organic structure or organization 
functionally need not be built to a firm pattern, it will 
be left to grow as dictated by local circumstances. This 
relieves the informational system of that much superflu- 
ous burden. If it is necessary only that there be a sub- 
stantial number of muscle fibers or connective-tissue 
units in a particular structure, it is very improbable 
that an exact number will be present, while it is equally 
certain that a nine-plus-two or a correspondingly 
important topological or chemically significant structure 
will be firmly designed-in. The informational design, 
once having been paid for genetically, will be used 
widely, thus minimizing the necessity for superfluous 
duplicate instruction. Biological organisms are proba- 
bilistic mechanisms, and their transducer codings reflect 
their willingness to take a statistical chance for survival 
in a risky world, especially if the odds are good. Where 
an item of information is especially important, it will 
usually be protected by replicate and nonreplicate re- 
dundancy. Such redundancy may become too expensive 
in some cases and may have to be sacrificed. This may 
be the case in some genetic transformations where cells 
and indeed the continuity of life for the organism be- 
come susceptible, briefly, to minor external mutational 
influences. 

To avoid further digression from specific, experi- 
mentally accessible material, some of this free-wheeling 
thinking should be tied down to mechanisms and bits 
of analysis where there is a chance for specific solutions 
to specific problems. This set of analytical tools is still 
rather dull, but undoubtedly can be sharpened by use 
and extended to the effective analysis of the behavior 
of complex informational systems prevalent in animals. 

As a starting point, consider some of the ways in 
which one can elaborate on the standard notion of a 
transfer function, which forms the usual stock-in-trade 
of the servo-engineer and electronic designer when he is 
trying to understand the behavior of control-communi- 
cation and computer systems. The transfer function is 
usually represented as a black box into which a signal 
is put and out of which a related signal emerges. 
Normally, it is very tightly hemmed in with formal 
restrictions of allowable behavior not unrealistic, in 
most cases, for amplifiers and physical transducers, but 
usually very unbiological. As a result, many a biologist 
has been led down the garden path when he has 
attempted to use this concept, for, unlike the engineer, 
he is unable to make his system conform to the 
limitations of the model on which he bases his reasoning. 

In the conventional cases, a transfer-function black 
box will be classed as either passive or active. If passive, 
it will require that there be put in as a signal all of the 
power required to actuate the box. It merely gives out 
a proportion of this power modified in some internally 
determined manner. Alternatively, it may have a local 
power source, as in an amplifier, and thus actively give 
an enlarged or otherwise modified more-powerful or 
effective output. In either case, the transfer box is 
usually assumed to be linear, except for special cases. 
The output is, thus, in terms of magnitude at any one 
frequency, a kind of phase-shifted, differently scaled 
replica of the input, although the relationship may be 
different at different frequencies. For the simple cases 
then, the performance of the box is adequately repre- 
sented by describing the ratio of the output to the input 
as a function of frequency when driven by a sine-wave 
input variation. Output is presumed proportional to 
input, but is acknowledged to vary in relative magni- 
tude and also to advance or retard in relative phase as 
a function of frequency. Consequently, for many pur- 
poses, a complex plane phase-amplitude plot of this 
transfer function is considered to convey adequately 
the behavior of the box. 

Usually, the transfer-function box is used as a build- 
ing block to construct functional block diagrams, often 
including loops like those in Fig. 5(a) to confer a measure 
of stability absent in a system that is not back-compared. 
Occasionally, a long, compound loop system like that 
of Fig. 5(b) will be used, because it has still greater 
stability than the simpler loop systems for a given avail- 
able gain, but it is also more likely to become violently 
unstable if certain limits are exceeded. 

Note that the feedahead loop illustrated in Fig. 5(c), 
while not intrinsically as stable as the feedback loop, 
has an advantage in being capable of introducing ex- 
actly compensatory control corrections without having 
infinite gain, whereas the feedback loop can only ap- 
proach this ideal asymptotically. Feedahead can be 
combined with feedback to give a very desirable com- 
pound operation and is now gaining widespread accept- 
ance in engineering applications. 
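
The contrast between the asymptotic correction of feedback and the exact correction of feedahead can be put in the simplest static terms. The sketch below rests on assumptions not stated in the text: the disturbance is taken to add directly to the output, the feedback loop is represented only by its steady-state loop gain, and the feedahead path is taken to measure the disturbance directly. No dynamics or stability questions are represented.

```python
def feedback_residual(disturbance, loop_gain):
    """Steady-state output error left by a feedback loop of finite loop gain."""
    return disturbance / (1.0 + loop_gain)


def feedahead_residual(disturbance, compensation_gain=1.0):
    """Output error left when a measured disturbance is subtracted ahead of time."""
    return disturbance - compensation_gain * disturbance


disturbance = 5.0
for gain in (1.0, 9.0, 99.0, 999.0):
    print(f"feedback, loop gain {gain:6.0f}: residual {feedback_residual(disturbance, gain):.4f}")

print("feedahead, exact compensation:", feedahead_residual(disturbance))
print("feedahead, 10% miscalibrated :", feedahead_residual(disturbance, 0.9))
```

The feedback residual shrinks toward zero only as the gain grows without limit, while the feedahead path cancels the disturbance exactly at unit gain. The last line shows the price: feedahead has no means of correcting its own miscalibration, which is one reason the two are combined.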

Figure 6 illustrates a more typical, biological-transfer 
loop system which characteristically includes feedahead 
as well as feedback loops and almost always displays 
control parameter dependency upon signal. It should be 
emphasized that the traditional transfer-function box 
clearly carries with it the notion of signal pathways 
firmly fixed spatially; that is to say, there is some kind 
[...] the system. There may be switches, but these merely 
direct activities between alternative discrete pathways. 
Each transfer box is a point-to-point function; whatever 
enters, comes in from a source at a specifiable place in 
the diagrammatic model, and what comes out, comes 
out at a specified place or places. Such function 
diagrams are hardly ever generalized to anything more 
complicated than a substantial interconnected block of 
such units, possibly with systems of switching arrayed 
in a logical, or occasionally in a statistically determined, 
pattern. 


In all fairness, it should be pointed out that there is 
another and quite different set of packages, sometimes 
also called transfer functions, where the black box, 
instead of performing linear operations, now performs 
logical operations. These units, which figure prominently 
in digital-computer design, do something specific at an 
output when received signals in some predetermined 
logical set appear or fail to appear at its input. Typical 
signal sets might represent both, neither, or, a before b, 
time coincidence, equality, inequality, etc. These two 
quite different kinds of transfer-function black boxes 
can, in fact, be considered as epitomes, respectively, of 
the functional organs of analogical-computer systems 
and of digital-computer systems. 

Biology does not have a priori knowledge of a real 
difference between analogical and digital computers, 
and consequently is not hampered by this distinction. 
In fact, this difference is much more of the experimenter’s 
making than it is an intrinsic property of the system 
being analyzed. 

Digital systems are really those where enough proba- 
bilistic configurations of a system are lumped redun- 
dantly into a few significant categories to make the 
whole system approximate the all-or-nothing pattern 
characteristic of Aristotelian logical arguments. Analog 
systems are those where the available system configu- 
rations are dispersed over a significance range and, if 
numerous enough, resemble a value continuum. In the 
systems where energy redundancy is reduced to a 
minimum and nearly all of the available time band width of 
an information-handling system is utilized, the conven- 
tional thinking about digital and analog distinction 
becomes hazy and a new and more general type of 
analysis has to be substituted. 

It is perhaps unfortunate that the term “analog 
system,” which originally implied that one physical 
variable was being substituted for another for compu- 
tational purposes, has come also to be used to describe 
any system utilizing a continuous range of variables in 
its operation. One is thus left with the anomaly that an 
analogical computer may be digital in makeup and that 
circuits performing quite direct control operations, and 
not particularly analogical physically to anything else, 
will be called analog-control systems. 

It would be unfair to suggest that the modern engi- 
neer is not aware of the importance of nonlinear systems 
or to accuse him of thinking that digital and analogical 
systems are intrinsically different. He does, however, 
tend to build apparatus so as to conform essentially 
with one or the other of these because these analyses 
are familiar and easy, and he most likely has not been 
forced to cope with distributed transfer-function sys- 
tems, particularly those that are highly re-entrant. 

A little further on, the behavior of two importantly 
different types of re-entrant systems is discussed, but 
first some places in biology where one can use fragments 
of available analyses are examined and embroidered on 
so as to make them more biological in their character- 
istics. In this way, they may be made applicable to the 
increasingly important cases where there are biological- 
transducer elements, particularly neural nets, which are 
simultaneously a continuum and a system of tight topo- 
logical connectivities. This is a complicated way of 
saying that they are to an extent switchboard con- 
nected, but are simultaneously a system of diffuse field 
interactions. In this case, these field interactions mean 
simultaneously the chemical field, e.g., the hormonal or 
other chemical concentration-gradient field, and equally 
important, the electric field contributed by the 
surrounding active elements. 

There will almost certainly be found in this complex 
an understanding of the efficient large-scale behavioral 
control of higher organisms, and mathematical 
techniques will have to be expanded to give facility and 
simplicity in handling systems which are hopelessly 
complex if treated as telephone exchanges gone berserk 
or soaked in electrically leaky saline. 

One cannot offer more than an introduction to this 
kind of thinking at this time—first because it is compli- 
cated, and second because most of the answers are still 
unknown—but several extensions of ordinary field theory 
can be stated that are not presently being used in 
neural-net analysis. 

There is one experimental and theoretical approach 
to this area that should be explored. This is the approach 
based on cardiac transfer-impedance theory, a technique 
used successfully in our laboratory in examining the 
superficially accessible action potentials of heart muscle, 
i.e., the spatial electrocardiogram. This theory, de- 
veloped to handle biophysical heart data, will in all 
probability prove even more valuable in examining 
neural field interaction than it has in the purpose for 
which it was intended. Strangely enough, the key notion 
of reciprocity inherent in this theory was beautifully 
stated by Helmholtz, one of the early biophysicists, just 
about a century ago. It was not applied to biophysical 
problems until quite recently, and then it was intro- 
duced in another disguise. 

The following demonstrates how the transfer-function 
notion can be spatialized. Consider Eq. (2), the defini- 
tion of transfer impedance in its usual engineering form 
for a linear passive network: 


\[
\text{output voltage} = \text{input current} \times \text{transfer impedance}.
\tag{2}
\]


Such a network may include any combination of capaci- 
tive, resistive, or inductive components that do not 
contribute external energy. For any combination of such 
linear components, one can determine the current ap- 
plied between any two terminals in the system and the 
resulting potential difference developed between any 
other two terminals. The ratio of this voltage to the 
current will be dimensionally an impedance, and will 
have a constant value at any specific frequency if the 
system is linear. 

Substitute for the passive circuit a biological tissue, 
nervous or otherwise, which is electrically at least ap- 
proximately isotropic and linear although not neces- 
sarily homogeneous. These qualifications are included 
to avoid tensor notation with its complications when 
simple vector expressions are probably adequate for the 
present state of the art. In the low-frequency range 
which is of interest neurophysiologically, the brain 
approximates such a system of linear electrical 
components, except for microregions. Consequently, if any 
two points are picked as terminals through which to 
apply current, and any other two points are picked to 
read off the potential difference, one finds that the 
voltage and current will be in a constant linear 
relationship which is formally described by their 
transfer-impedance ratio. If one now brings the two current 
points close together, they can be treated as a dipolar 
source-sink combination, and such a dipole is the kind 
of source that is encountered in biological tissue. 

Because of the relatively short length constants 
involved in muscle, nerve, and other biological 
current-generating systems, one can generally assume quite 
safely that source and sink are dipolar on a millimeter 
basis, and that macroscopic linearity prevails. In an 
isotropic medium, another generalization can be added. 
One can observe that, so long as the source dipole re- 
mains essentially in one region, it with its orientation 
becomes vectorial in the sense that the voltage trans- 
ferred to a specified pair of terminals will obey a cosine 
relationship with respect to some specifiable preferred 
axis. 

Thus, the dipole can be thought of as projecting itself, 
to a constant scale, on an effective axis not necessarily 
going through the recording points. In the diagram 
(Fig. 7), lead-off points c and d may be thought of as 
effectively represented by a vector on an axis c’d’ at 
the source. The magnitude of potential realized at the 
lead-off electrodes will be proportional to the magnitude 
of the current source times its distance of separation 
(i.e., its current dipole moment) multiplied by the cosine 
of the spatial angle between it and a reference line c’d’. 
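
The cosine relationship just described can be checked numerically. The following is a minimal sketch, not part of the original lecture: the lead-vector components, the dipole strength, and the 60 degree tilt are hypothetical values chosen only for illustration, and the function names are assumptions.

```python
import math


def lead_voltage(dipole_moment, lead_vector):
    """Scalar lead voltage as the dot product of dipole moment and lead vector."""
    return sum(p * z for p, z in zip(dipole_moment, lead_vector))


def norm(vector):
    """Euclidean length of a 3-component vector."""
    return math.sqrt(sum(x * x for x in vector))


# Hypothetical lead vector (ohms per centimeter) along the effective axis c'd'.
lead_vector = (0.0, 0.0, 50.0)

# Hypothetical dipole of 1e-6 ampere-centimeters, tilted 60 degrees from c'd'.
moment = 1e-6
angle = math.radians(60.0)
dipole = (moment * math.sin(angle), 0.0, moment * math.cos(angle))

# Two routes to the same lead voltage (volts):
v_dot = lead_voltage(dipole, lead_vector)                 # vector dot product
v_cos = moment * norm(lead_vector) * math.cos(angle)      # magnitude times cosine
```

Both routes give the same voltage, which is the sense in which the lead-off pair c, d is represented by a single reference vector at the source.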

It is seen at once that this has become a vector dot
product system and that the c’d’ reference has become
a reference vector characteristic of a particular lead
system. One, therefore, vectorializes Z in the alternating-
current version of the Ohm’s law relationship E=IZ.
I becomes a dipole current moment (amperes times
separation in centimeters). Z becomes a transfer impedance
which is dimensionally ohms per centimeter in
magnitude, but assumes the spatial attributes of a
vector. Voltage, of course, remains a scalar quantity,
which is consistent with its representation as the scalar
product of two vectors.

With Z vectorialized, one now has a function capable
of being distributed spatially throughout a tissue sys- 
tem; that is to say, the transfer function has been 
successfully distributed in a neural field space. This 
function must necessarily be a vector-point function— 
that is, a quantity having specifiable magnitude and 
direction at every point throughout the space involved. 
By evaluating this function, one can characterize the 
interaction of a source anywhere in the field space with 
a receptive region anywhere else in the field, and can 
quantitatively predict their mutual electrical coupling. 

It is at this point that Helmholtz’s reciprocity theorem
comes to bear. One must be rather careful in stating
the theorem for a distributed medium, but it can be put
in the following form. For any given system of lead-off
electrodes, there is a spatial vector-point function which
will relate the potential difference measured in this lead
to the current dipole moment introduced anywhere in
the medium in any orientation. The potential will be
$V_l = \mathbf{I}_m \cdot \mathbf{Z}_l$, where $V_l$ is the led-off voltage, $\mathbf{I}_m$ the dipole-source
moment, and $\mathbf{Z}_l$ the transfer-impedance function.
Upon applying an adaptation of the reciprocity relationship,
it is found that the electric field or potential
gradient $\mathbf{E}_l$ anywhere in the medium owing to introduction
of current $I$ into the former lead-off electrodes will
be exactly $\mathbf{Z}_l I$, so that the spatial transfer-impedance
function $\mathbf{Z}_l$ is identical in the two cases for any electrode
or medium configuration.
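
For a lumped network, the reciprocity property can be illustrated with a small calculation. The sketch below is an assumption-laden stand-in for a distributed medium: the five-node resistor network, its resistance values, and all function names are hypothetical. It solves the nodal equations twice, once with current applied at terminals (0, 1) and voltage read at (2, 3), and once with the roles exchanged.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Hypothetical passive network: (node, node, resistance in ohms).
RESISTORS = [(0, 1, 100.0), (1, 2, 220.0), (2, 3, 47.0), (3, 4, 330.0),
             (4, 0, 150.0), (0, 2, 68.0), (1, 3, 470.0), (1, 4, 82.0)]
N_NODES = 5


def node_potentials(source, sink, current):
    """Node potentials (last node is the reference) for current in at source, out at sink."""
    n = N_NODES - 1
    g = [[0.0] * n for _ in range(n)]
    for a, b, r in RESISTORS:
        for p, q in ((a, b), (b, a)):
            if p < n:
                g[p][p] += 1.0 / r
                if q < n:
                    g[p][q] -= 1.0 / r
    rhs = [0.0] * n
    if source < n:
        rhs[source] += current
    if sink < n:
        rhs[sink] -= current
    return solve(g, rhs) + [0.0]


def transfer_impedance(current_pair, voltage_pair, current=1e-3):
    """Voltage across voltage_pair divided by the current applied to current_pair."""
    v = node_potentials(current_pair[0], current_pair[1], current)
    return (v[voltage_pair[0]] - v[voltage_pair[1]]) / current


z_forward = transfer_impedance((0, 1), (2, 3))
z_reverse = transfer_impedance((2, 3), (0, 1))                # terminals exchanged
z_double = transfer_impedance((0, 1), (2, 3), current=2e-3)   # linearity check
```

The forward and reverse transfer impedances agree, and the ratio does not change when the applied current is doubled, which are the two properties the text relies on.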

What is the value of this $\mathbf{Z}_l$ function and its reciprocity
property? In the first place, it permits one to
predict, by applying currents to finite leads and meas- 
uring the resulting potential distribution with micro- 
electrodes, what potential will be set up as a field by 
microsources such as neurons. Model measurements and 
in vivo measurements of transfer impedances by the 
reciprocal method now become feasible and valuable. 
While these measurements have not yet been done to 
any great extent on brain tissues, it is possible to illus- 
trate the kind of results to be expected by comparison 
with results obtained in the electrocardiographic field. 

Figure 8 is an example of such an application where 
a set of orthogonal leads was developed experimentally 
by linear combination of partial derivatives of the 
transfer-impedance function. Knowing that transfer- 
impedance functions must obey the superposition theo- 
rem as well as the reciprocity rule, it was possible to 
develop leads sensitive, for practical purposes, only to 
sagittal, vertical, or horizontal sources of current irre- 
spective of position within the heart tissue. 
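
The use of superposition to build orthogonal leads can be sketched at a single source point. This is a simplified illustration under stated assumptions, not the experimental procedure behind Fig. 8: the three raw lead vectors are hypothetical numbers, the correction is made at one point rather than over a finite region, and the names are invented for the example.

```python
def inverse3(m):
    """Inverse of a 3x3 matrix by the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("lead vectors are not linearly independent")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


# Hypothetical raw lead vectors (ohms per centimeter); rows are leads,
# columns are X, Y, Z components. Each raw lead is contaminated by the others.
RAW_LEADS = [[40.0, 10.0, 5.0],
             [8.0, 35.0, -6.0],
             [-4.0, 12.0, 30.0]]

# Weights for combining the raw leads so the combined leads are orthogonal.
WEIGHTS = inverse3(RAW_LEADS)


def corrected_voltages(dipole):
    """Outputs of the weighted lead combinations for a given dipole moment."""
    raw = [sum(z * p for z, p in zip(row, dipole)) for row in RAW_LEADS]
    return [sum(w * v for w, v in zip(row, raw)) for row in WEIGHTS]


# Effective lead vectors of the combined leads: the identity matrix.
effective = matmul(WEIGHTS, RAW_LEADS)
outputs = corrected_voltages((0.0, 0.0, 2e-6))
```

A purely Z-directed source then appears only in the third combined lead, which is the behavior wanted of a sagittal lead with negligible X and Y contamination.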

Inversely, it was possible to show theoretically and 
experimentally that fields from neurons in certain con- 
figurations can add up so as to be quite strongly con- 
centrated at substantial distances from the sources, 
while in other cases they die out even more rapidly 
than the inverse cube law. Measurements of transfer- 
impedance fields via models give surprising insight into 
such complicated field patterns. The analysis indeed
resembles a kind of degenerate directional antenna design
procedure.

Fig. 8. Orthogonalization of transfer impedance fields. For any tissue system to which a specified lead system has been applied, there
is a characteristic associated transfer-impedance field which takes the form of a continuous spatial vector-point function. By examining
the principal partial derivatives of this vector function for several different lead systems applied to the same tissue, it is usually possible
to devise weighted combinations of leads which yield nearly orthogonal functions in a finite source region so that potentials picked up
from the combined leads report uniformly and orthogonally on sources in the corrected region, and stimuli applied through the leads
stimulate the source region uniformly and in a constant known direction. Alternatively, highly distorting leads can be devised which
will record and stimulate selectively. The patterns shown here as examples were developed to yield a uniform sagittal, or front to back,
lead for the human chest to be used in electrocardiography. Ideally, the transfer impedance should be large and constant in Z and should
have negligible X and Y contamination.

Spatial transfer-function analysis is very attractive 
in studying field stimulation of neurons, for it gives an 
intuitive insight into the remarkable time sensitivity of
stimulation patterns. If one remembers that the conduction
time at constant velocity is equivalent to distance,
and that the inverse equivalence is also valid,
then one has in a neural net a most interesting mechanism
for the spatial-temporal localization of a pattern
and hence an ideal medium for distributed probabilistic
field effects.

Fig. 9. System for automatic synthesis of equivalent concentrated dipole-current moment sources. A conductive model of the
tissue system is fitted with a lead system equivalent to that on the experimental animal or person, and a vector dipole source is
installed in the model at the chosen locus for which the equivalent vector source is to be calculated. A system of high gain amplifiers
is looped around the model lead system to the dipole, and the amplified signal from the biological subject is inserted as a forcing
function. In order to maintain all the amplifier inputs at near-zero potential simultaneously, the amplifier system must generate a
dipole source which just cancels the complex of lead potentials and thus one which duplicates, in a negative sense, the desired
source. The solution is, of course, unique only for a specified dipole source locus and electrode configuration as can be demonstrated
by rearranging the electrodes on model and subject or by moving the dipole source about in the model and noting the continually
altering resultant pattern of source dipole moment.

Another closely related extension of transfer-function 
analysis was introduced by Burger in Holland a few 
years ago. In this approach, an image space is developed 
as a complementary space to the real field space. In the 
image space, vectorial summation of superimposed field
contributions becomes beautifully simple, even though
working out the shape of the image space is a more
complicated process.

There is a very interesting feedback method for synthesizing
spatial excitation sources (Fig. 9). If one builds
a conductive three-dimensional scaled model of a tissue 
system and immerses in it an orthogonal, triple dipolar 
source at a place corresponding to a supposed bioelectric 
source, this artificial source will excite a transfer-im- 
pedance pattern similar to that of the original living 
system. If each component of the dipole is fed from a 
high-gain amplifier which is fed, in turn, from one of a
set of three reasonably orthogonal lead sets corresponding
to the dipole orientation, the system then becomes
an automatic synthesizer of excitation sources in terms 
of the actual picked up potential. Explained in be- 
havioristic terms, the feedback system tries constantly 
to null itself in all directions simultaneously and can do 
so only by creating a source equivalent to the real but 
inaccessible one. To what extent this general scheme
can be extended beyond three loops to synthesize more
complex sources, and to what extent noise can be used
as a tracer to establish source localization, are not yet
known, but this approach seems to be very promising.

Space does not permit amplification of many additional
aspects of the biological-transducer function
which are of great importance, but the transductive
aspects of “orientation” behavior must not be
totally omitted. Animal orientation includes especially
the navigational processes whereby an organism “steers” 
itself toward a desirable goal literally as well as figura- 
tively. It is well known that pigeons home, that fish 
almost unerringly find their native stream after months 
at sea, that birds migrate to very specific targets, and 
that bees communicate to their fellows the range and 
azimuth of food depots. Birds are even believed to 
utilize astronavigation, as suggested by recent experi- 
ments. All of these processes are biological-transducer 
problems at a higher level and ought to be investigated 
on this basis, as well as by the traditional types of 
experiment. If a moth knows which way to fly to find 
another moth a kilometer away, this is a transduction 
and coding problem, and one should seek not only the 
sensory transducers involved but also the transfer func- 
tions describing the reduction of massive sensory data 
to net decisive instructions for a flight plan. 

Another set of biophysical phenomena classed as 
animal orientation is the family of rhythmic or oscil- 
latory processes found in almost all biological organisms 
for which one has yet to discover a biophysical mecha- 
nism. It is well known that animals and plants exhibit 
strongly rhythmic properties with periods ranging from 
1 or 2 msec in the case of some nerve processes to periods 
of a month, a year, or even several years in other cases. 
These rhythmic processes often have very nearly, but 
not exactly, a one-day period. Again, the period may 
coincide with the succession of high and low tides in a 
particular place, with the lunar month, or with the 
terrestrial year. While these periodicities are related 
to those of the normal environment, many of them per- 
sist for many cycles in experiments where the organism 
is carefully shielded from all known environmental 
clues to these periods. 



Some of the simple biorhythms like the cricket chirp- 
ing rate have a reasonable chemical-process temperature 
coefficient but others have an almost exactly zero temperature
coefficient. Some of these temperature-independent
periodicities can be shifted in phase by a systemic
shock such as dunking in cold water. Others can
be “pulled,” as a crystal oscillator can be pulled slightly 
off frequency by coupling to another resonator of similar 
frequency. None of the basic oscillatory mechanisms 
has been identified nor has an adequate hypothetical 
master oscillator been devised which will conform with 
experimentally determined properties of these natural 
oscillations. 
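
The "pulling" of one oscillator by another can be made concrete with a simple phase model. The sketch below is only an illustration, not a proposed biological mechanism: the phase equation (an Adler-type model), the frequencies, the coupling strengths, and the function name are all assumptions introduced here. An oscillator with free-running frequency f_free is weakly coupled to a drive at f_drive, and its mean frequency is measured after the start-up transient.

```python
import math


def mean_frequency(f_free, f_drive, coupling, t_end=400.0, dt=0.002):
    """Mean frequency of a driven phase oscillator over the second half of the run.

    Integrates d(theta)/dt = 2*pi*f_free + coupling*sin(drive_phase - theta)
    by the forward Euler method.
    """
    steps = int(round(t_end / dt))
    half = steps // 2
    theta = 0.0
    theta_half = 0.0
    for n in range(steps):
        if n == half:
            theta_half = theta
        drive_phase = 2.0 * math.pi * f_drive * n * dt
        theta += dt * (2.0 * math.pi * f_free
                       + coupling * math.sin(drive_phase - theta))
    return (theta - theta_half) / (2.0 * math.pi * (steps - half) * dt)


# Hypothetical values, in arbitrary frequency units (e.g., cycles per day).
free = mean_frequency(1.00, 1.05, coupling=0.0)    # uncoupled: runs at 1.00
pulled = mean_frequency(1.00, 1.05, coupling=0.2)  # weak coupling: drawn toward 1.05
locked = mean_frequency(1.00, 1.05, coupling=0.5)  # strong coupling: locks to 1.05
```

With weak coupling the mean frequency is drawn part of the way toward the drive, and with sufficiently strong coupling it locks to the drive, which is the qualitative behavior described for rhythms that can be pulled slightly off their natural period.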

Crabs change color on a tidal-periodic basis; some 
bacteria luminesce periodically on the basis of when it 
should be day or night; some flies emerge from pupae 
on an integral diurnal but not fixed illumination-period 
basis; some poisons change toxicity on a diurnal basis; 
and cilia beat rhythmically. No one seems to have found 
the transducer mechanisms by which these rhythms are 
loosely but adequately coupled to the environment. 
Surely, there should be a search for mechanisms of 
periodicity hinted at by the three major classes of physi- 
cal oscillating systems. In any case, one should be em- 
barrassed to have failed completely in turning up even 
approximate explanations for these phenomena that are 
of undoubted prime biological importance at every level 
from the intracellular to that of the entire organism. 

Finally, there are two items referred to earlier:
bivalent codes and hierarchical re-entrant control systems.
It is evident that biological control systems are,
in general, organized in a series of semi-autonomous 
hierarchically arranged levels where each subordinate 
system left to itself will achieve a more primitive, less 
flexible, but still effectual, type of homeostatic control.
Single isolated cells get on somehow in tissue culture,
and a heart will beat in parts if it fails to receive co- 
ordinating excitations from its pacemaker. What are 
the mechanisms and the general control principles 
whereby several, often contradictory, courses of motor 
behavior are coordinated so as to remain unified and 
influenced by the several operator controls, yet not 
uselessly vacillatory or indecisive in action? Evidently, 
whole patterns of feedback and feedahead control are 
coordinatedly included or excluded from taking domi- 
nance in effective control. What does an animal do when 
exposed to unfamiliar, conflicting sensory information 
such as might originate from a minor malfunction of 
one of its transducer mechanisms? Generally, it chooses 
one set of the feasible interpretations of the available 
data and ignores incompatible parts of the others. 
Sometimes, as in the case of Von Holst’s fishes (where 
the fish, used to having 1g gravity downward and 
light from above, are experimentally exposed to light 
from the side and to a centrifugally increased apparent 
g), they seek a weighted compromise. Very seldom do 
they fail completely, as do our rockets, from some 
single small malfunction. 

In summary, one may conclude that there is at present 
no unified, formally manipulable system in the bio- 
physical sciences for examining and describing system- 
atically the functional characteristics of biological trans- 
ducers and for studying their aggregate behavior when 
combined into functional ensembles. There is ample 
mathematical experience in the engineering fields to 
permit a start in this direction, but existing techniques 
fall far short of adequacy, and it is certain that new, 
uniquely biological theory and techniques for transduc- 
tive-function manipulation and code specification will 
have to be developed.


Initiation of Nerve Impulses in Receptor and
Central Neurons


The objective of this paper is twofold: to point out
some of the problems which represent the present
state of physiological analysis of some sense organs
as detectors and transducers, and to point out a 
current view of the complex chain of events between 
transducing a stimulus and initiating a nerve impulse. 
It will be argued that this chain is similar in receptor 
and central neurons and that information about either 
one is relevant to the other. 

Biophysicists long have been interested in sensory 
receptors, since they present in such a conspicuous form 
the conversion of physical events of the inanimate world 
into physiological events in the organism.¹ Some of the
best-known achievements of biophysical science lie in
this realm. Indeed, a vast literature exists on sensory
capacities, analysis of the parameters of reception, and
quantitative description of the subvarieties of unit re- 
ceptors within modalities’. There continue to appear in 
the literature new kinds of receptors physiologically 
identified, and there remains a host of histologically 
known sense organs and diffuse receptors, differentiated 
visibly but not yet defined physiologically (e.g., pectines 
of scorpions, esthetes of chitons, osphradia of snails and 
clams, etc.). 

This situation, virtually unique among organ systems,
is not surprising if one thinks of behavior as
the principal attribute in which the great wealth of
animal types is differentiated, if one ascribes behavior
to differentiation in the central nervous system and
remembers the intimate and perhaps causal correlation
between differentiation in the central nervous system
and that in the sense organs. One has learned to expect
startling discoveries in each new batch of journals:
polarized-light detection, wind-speed indicators, gyroscopic-deflection
sensing, infrared directional devices,
olfactory separation of optical isomers, hydrostatic-pressure
detection in supposedly gas-free organisms,
ultrasound reception, not to speak of systems achieving
a complex degree of analysis of signals such as the
ultrasonic-pulse-reflection analysis in bats, the amplitude-modulation
frequency analysis of the ears of
crickets, the integration of proprioceptive information
in the strike of a praying mantis, and of we do not know
what information in many cases such as orientation of
fish to turbulence and to other unseen fish. This list
furthermore omits mention of eyes of various kinds and
their fantastic achievements, of which Hartline writes
(p. 515).

Since a systematic survey is patently impossible, I
offer here only a small selection of some recent instances
of progress in the identification and characterization of
receptors with respect to their capacities and properties.
A closer look is taken later at the parameters of impulse
initiation.


ELECTRORECEPTORS 


Recent work’®~® has opened up a new sensory modality 
in the sensitivity to normally occurring electric fields, 
found in many species of fish of at least three unrelated 
families, the Gymnotidae, the Gymnarchidae, and the 
Mormyridae (Fig. 1). These fish have electric organs 
which emit pulses of low voltage—of the order of 1 v— 
in some species in brief, low-frequency bursts, and in 
other species continuously, hour after hour, with charac- 
teristic frequencies and pulse form and duration, diag- 
nostic of the species (Fig. 2). The frequencies commonly 
lie between 60 and 400/sec, but in some cases exceed 
1000/sec. The pulse duration—from the whole animal, 
representing activity of many electroplates—ranges 
from 10 msec to less than 0.2 msec. What is significant 
for us is that the behavioral evidence clearly shows in 
some species the use of these signals in orientation with 
respect to objects, apertures, and other emitting fish 
in the immediate environment. Clearly, the fish detects 
alterations in the pattern of the electric field in the water 
surrounding it (Fig. 3). Marked signs of agitation are 
elicited by a wire brought within range or by the dis- 
charges of other fish. Conditioned-reflex experiments 
show the ability to detect the presence of a stationary 
magnet outside the aquarium and to discriminate between 
conductors and nonconductors in the aquarium. The fish 
respond to the movement of a small electrostatic charge 
such as that produced by combing one’s hair with a 
vulcanite comb. Sensitivity to imposed electric currents 
has been known, but what is of interest for us is the 
demonstration that a normally developed sensibility 
exists which is apparently used in nature to detect and 
to analyze in a complex way naturally occurring elec- 
tric fields. 

The sense organ is presumably the lateral line, hereto- 
fore regarded as a special form of mechanoreceptor. We 
will have to make room in our lists of modalities for a 
new category—electroreceptors. It is perhaps more 
curious that this modality is not more widespread. 
Muscle-action potentials can be recorded in the water at 
some distance from fish as well as from other animals, 
but so far the tests outlined in the foregoing, when applied
to fish of other families than these three, have
indicated no such sensitivity (with the possible exception
of the Siluridae). One may conjecture that it is
perhaps not so much the deficiency in peripheral sensibility
as the lack of central apparatus for making use
of the signals, which distinguishes the ordinary from
the electric fish. All electric fish share, among other
things, the enormous development of the valvulae
cerebelli, the so-called mormyro-cerebellum.

Fig. 1. Examples of continuously discharging, low-voltage electric fish. Representatives of the Mormyridae, Gymnotidae, and Gymnarchidae [from H. W. Lissman, J. Exptl. Biol. 35, 156 (1958)].

The sensitivity of the receptor has not been measured 
directly but Lissman and Machin’ have calculated the 
intensities of the electric fields which are effective in 
altering behavior, for example in conditioned response 
tests to electrostatic charges or magnets outside the 
aquarium. The field in the water around the fish must 
be changed by about 0.003 μv/mm at threshold, in
Gymnarchus niloticus. This corresponds to a current
through the fish of 2×10⁻⁵ μA/cm², some hundred
thousand times smaller than a still subthreshold stimulus
current density across the membrane of a squid axon
(2 mv across 10³ ohm·cm²). The problem of the possible
meaning of such high sensitivity to electrical events 
already has been raised by Terzuolo and Bullock.” Our 
evidence suggests that even in ordinary nerve cells the 
sensitivity, while not nearly as high as in these electro- 
receptors, is several orders of magnitude higher than 
the usually recognized threshold changes of 10 to 20 mv 
required to excite a silent nerve fiber. In the first place, 
this high sensitivity is manifested only as an alteration 
of frequency of an already discharging cell. In the next 
place, to be sensitive to minute changes in membrane
potential, both the critical potential for spike initiation
and the rate of rise of the prepotential must be extraordinarily
stable, and must be localized in a limited
part of the cell. The basic problem of high electrical
field sensitivity (of special interest because it may be
the one case where one does not need to search for a
transducing mechanism as we do in photoreceptors,
mechanoreceptors, etc.) is the matter of stability. What
are the requirements in terms of stability of the threshold
and of the prepotential such that a given channel
can provide a useful signal within a reasonable time
(e.g., 1 sec)? At another level is the important problem
of whether or not the high electric field sensitivity in the
modulation of ongoing discharge plays a role in normal
situations in the nervous system. If so, it permits effects
not only of changes in electric fields normally present
but also of changes in the chemical milieu which may
act through the same mechanism, by changing the
frequency of the spontaneous discharge for a given membrane
potential at the critical locus.

Fig. 2. Examples of types of electric pulses produced by four species of the Gymnotidae. Both pulse duration and pulse frequency are different and characteristic for each species. The fish are continuously discharging day and night. Time marker 50 cps. Panel (4): Eigenmannia virescens [from H. W. Lissman, J. Exptl. Biol. 35, 156 (1958)].

Fig. 3. The influence of objects upon the electric field around a fish. (a) An object of low conductivity, and (b) one of high conductivity. The fish detects the distortion of the field and reacts to both classes of objects [from H. W. Lissman and K. E. Machin, J. Exptl. Biol. 35, 451 (1958)].


MECHANORECEPTORS


Turning to the realm of mechanoreceptors, an attractive
problem exists in the wide variety of arthropod
hair-like exoskeletal projections (Fig. 4), each of which
receives a single distal process of a primary sensory
neuron and is specialized for signaling deflection of the
hair in a certain direction. In the lobster statocyst
(an equilibrium sense organ where this is especially
clear and for which recent analyses have been made),
there are several specific types of hairs, differentiated
for sensitivity to movement in one or another direction,
or for sensitivity to a maintained position in one or
another plane, or for sensitivity to vibration. Since the
stimulus seems likely to be limited to a very restricted
region, namely, the point of articulation of the hair
near its base, one is faced with the general problem of
transducing a mechanical event into a physiological one
in perhaps a more simple and discrete form than in the
more familiar mechanoreceptors such as stretch, touch,
and auditory organs. Although the sensitivity must be
high in terms of displacement, it cannot be estimated
accurately, owing to the fact that, at the base of the
long hair, where the relative movement that must act
as a stimulus occurs, the mechanical disadvantage is
maximal and the small threshold movements of the tip
of the hair must be reduced enormously in absolute
magnitude. Fundamental questions remain regarding
the ultrastructure and the local events in the region of the
mechanical deformation of the sensory nerve ending.

As an example of the specialization of which arthropod
mechanoreceptors are capable, Burkhardt and
Schneider have found that units in the Johnston organ
of the antenna of flies (Calliphora) are hardly less sensitive
than the human ear to sound frequencies between
150 and 250 cps, the range of the wing-beat frequency.
Even beyond this range the units follow the sound
frequency faithfully, giving the animal sensory information
corresponding to each wing beat with a delay of
only about 1 msec. The speed of flight is controlled by
this sense organ. For some other recent studies in this
area, see references 12-17.

Fig. 4. Diagram of arthropod sensory exoskeletal hair of the type with a single primary sensory neuron. The distal process ... the articulation of the hair [from M. J. Cohen and S. Dijkgraaf in The Physiology of ..., ... editor (Academic Press, Inc., New ...)].

Fig. 5. The infrared sensitive facial-pit organ of rattlesnakes, Crotalus. Here, a wedge has been removed to show the deep pit and the sensitive surface, the thin membrane at its back (shown crinkled but normally smooth), with an air chamber both in front and behind it. This membrane, 10 to 15 μ thick, is heavily innervated and vascular [from T. H. Bullock and F. P. J. Diecke, J. Physiol. 134, 47 (1956)].

Fig. 6. The sensory endings in the pit-membrane of a rattlesnake. This is a faithful drawing from a silver-stained whole mount of the 10 to 15 μ membrane. The nerve fibers end freely in palmate expansions with branching processes. 500 to 1500 such expanded endings occur per mm². Apparently all are of one functional type, so-called warm receptors. Width of picture, 150 μ [from T. H. Bullock and S. W. Fox, Quart. J. Microscop. Sci. 98, 219 (1957)].


TEMPERATURE RECEPTORS 


Referring briefly to temperature receptors, there is 
some information regarding the extraordinarily de- 
veloped long-infrared-radiation detectors found in rat- 
tlesnakes and other pit vipers in the thin membrane at 
the base of the facial pit (Figs. 5 and 6)!*8. This struc- 
ture, richly provided with a special form of free nerve 
ending and specialized in other ways—such as, for the 
directional estimation of radiant sources—responds to 
very small doses, of the order of 10~" small calories in 
zp sec on the area of the terminal ramification of one 
nerve fiber (2000 μ²). The evidence indicates that this
response is due to the change in temperature of the 
tissue, which both by indirect and direct methods is esti- 
mated to be of the order of 0.001°C, close to the value 
already found for the human. Expressed as Q10, the
frequency of nerve impulses in a single fiber increases 
with temperature with a Q10 of about 10%, a figure which 
offers considerable room for speculation about high amplification
with preservation of reasonable stability (Fig. 7).
Much more could be said about each of these cases with 
respect to physiological properties and characteristics of 
response. It is the purpose here only to call attention to 
the variety of opportunities and problems presented by 
a few recently studied receptors. 


CHEMORECEPTORS 


In the area of chemoreceptors, I choose only two 
recent reports. Hodgson and Roeder! have discovered 
labellar hairs in various flies in which two primary sen- 
sory neurons send distal processes to the chemoreceptive 
area at the tip of the hair. These two neurons have 
different modes of sensitivity: one responds only to 
sugars; the other to salts, acids, and alcohols. Not 
only is this preparation of interest because of the op- 
portunity for the physiological study of unit receptors of 
different chemical specificity, but also, because of its 
additional sensitivity to mechanical stimuli and to temperature
changes as small as 7p C, it presents in a clear
form the general problem of analyzing the world through
receptors which are not unambiguously detecting one
aspect of the environment only. Furthermore, these
labellar hairs are remarkable, as shown by Dethier, in
that complete behavioral rejection or acceptance responses
of the intact animal occur to stimulation by a
microdrop which can reach only one single sensory
neuron. Schneider has discovered an electrophysiological
response in the antennae of silk moths (Bombyx
mori) in the males only, which is highly specific apparently
to the naturally occurring odorous material
produced by the female. This odorous material attracts
the males from great distances. In microelectrode records
from the isolated antenna, picked up extracellularly
from many units, Schneider finds a slow wave of
several millivolts which is negative for some substances
and positive for others. Spikes are superimposed preferentially
on the negative phase of the electroantennogram.
As in the preceding case, there is here an apparently
peripheral filtering of the stimulus,
which filtering is achieved by a specific chemical
sensitivity of a receptor, and again one is faced with high
sensitivity. De Vries has calculated that at threshold
there are far fewer stimulating molecules impinging
upon the olfactory sensory epithelium in man than there are
olfactory receptor cells in the same area.

Fig. 7. Response of typical receptor unit of rattlesnake infrared sense organ, measured as nerve impulse frequency at different rates of increase of radiation and at different flux levels [from T. H. Bullock and F. P. J. Diecke, J. Physiol. 134, 47 (1956)].

ABSOLUTE RECEPTION 


Extending the remarks by Rosenblith (p. 485), it is 
highly desirable that further attention be given the 
problems raised by the sensory reception of absolute 
stimulus values as opposed to reception where com- 
parison can be made by the receptors with a status in 
the recent past. Besides body temperature, blood pres- 
sure, and the like in man, there are many other indica- 
tions of reception of values which may be called absolute, 
e.g., preferred temperatures in cold blooded animals, 
levels of light which day after day induce a given 
response, CO2 concentration reactions, and pitch recog-
nition. Even in many humans lacking phenomenal
ability, the cross-modality subjective-intensity match- 
ing and many psychophysical phenomena illustrate 
scales not dependent upon relative stimulation as meas- 
ured against a just-preceding level. One faces here the 
problem of stability and drift correction—not only the 
control loops, intra- or inter-cellular, but the standard of 
reference for detecting error. 


SEQUENCE OF LABILE COUPLINGS LEADING 
TO IMPULSE INITIATION 


Consider now what makes a neuron fire—namely, the 
specification of the distinguishable processes within the 
neuron leading to an explosive firing of a propagated 
all-or-none nerve impulse. The interesting conclusion 

Fig. 8. Systematization of cell membrane potentials. Labels recoverable from the diagram: Action (may be graded, decremental, or all-or-none, propagated); Transduced (from external events); Internal response (from antecedent activity within same cell); Sawtooth; Sinusoidal; Generator (= receptor potential); Synaptic; Local; Spike. Generator and synaptic potentials are each marked polarizing or depolarizing.


permitted by current evidence is that the determination 
of firing is not merely the build-up of an adequate stimulus
or its transduced and amplified resultant to a critical 
level, the threshold, but rather it is a sequence of labile 
couplings between graded (analog) events, each occur- 
ring in a limited fraction of the neuron. Normal firing is 
preceded by a series of steps, with alternative pathways. 
The several processes are separate, partly in space and 
partly in time, but causally are interconnected. Some 
of these processes are reflected in membrane-potential 
changes and some are not. 

The processes reflected in the membrane potential in- 
clude slow shifts of the average membrane potential, 
synaptic, generator, local, and pacemaker potentials. 
Figure 8 makes it clear that generator and synaptic 
potentials are parallel types of transduced potentials 
representing amplified response to external events 
whether from a presynaptic cell, from a non-nervous 
sense cell, or from the external environment. It also 
points out that both of these classes of response may 
occur either in polarizing or depolarizing directions. De- 
polarizing responses are commonly called excitatory 
because they are likely to increase the probability of 
firing. Polarizing, or as they are sometimes called, hy- 
perpolarizing, responses generally decrease the proba- 
bility of firing and are called inhibitory. 

First among the processes which determine the initia- 
tion of impulses in receptor and central neurons, we 
should list what is actually no doubt a whole class of 
different and complex processes by which the level of 
the so-called resting potential is under the influence of 
the milieu and nonspecific agents (nonspecific to a given 
cell, though possibly specific to a cell type). Hormones, 
inorganic and organic constituents of the medium, de- 
formation of the cell (even in neurons other than specific 
mechanoreceptors), and other factors are known to have 
some effects, although in most cases their importance in
normal physiology cannot be assessed yet. In addition
to these, we now can recognize as an important class the 
general shift in membrane-potential level accomplished 
through specific nervous pathways. For example, Otani 
and Bullock” found in certain cells of the 9-celled cardiac 
ganglion of lobsters that certain presynaptic fibers exert 
an influence, but without any discrete synaptic poten- 
tials. With repetitive stimulation they cause a slow, 
smooth shift of membrane potential (Fig. 9) as seen by 
an intracellular electrode in the soma. Terzuolo?® has 
shown that certain neurons in the spinal cord of the cat 
similarly respond to stimulation of certain parts of the 
cerebellum by a shift of membrane potential level. This 
absence of discrete synaptic potentials even with a small 
number of incoming pathways possibly means that the 
main factor in determining the slow, smooth shift of 
potential is the distance of the synapses from the soma.
If the distance is considerable, individual deflections 
would be smoothed out by the spread through leaky 
cables. But, if one believes that the membrane potential 
level at the base of the axon or some such limited locus is
crucial to the determination of cell firing, any such prop- 
erties of the dendrites and soma—such as the electro- 
tonic conduction of slow potential changes, the smooth- 
ing out of potential changes, or the discrimination 
against rapid deflections—would be of decisive impor- 
tance in spreading the influence of the decrementally 
propagating activity of the much-branched dendritic 
processes to the region in which the spike originates. 
This may be one reason for having long dendrites. 
Among sensory neurons, there are many whose distal 
process divides to supply a considerable number of re- 
ceptor regions of the periphery (Fig. 10, from the work 
of Meyer”) and in which the histology is suggestive 
that these loci are not alternative sources of full-fledged 
impulses but may, in some cases, make graded contribu- 
tions to the probability of an impulse arising at some 

Fig. 9. Some of the types of subthreshold potentials, to show the correspondence between synaptic and sensory types. Hand-drawn approximations of original records, intracellular (except when marked ext=external); time scale is not uniform; upward deflection is depolarization or, in external recording, negativity of closer electrode. Synaptic examples: cat, motoneuron, cerebello-spinal input (Terzuolo); lobster, cardiac ganglion cell, cardio-accelerator axon input (Bullock and Terzuolo); same, inhibitor axon input; same, input from pacemaker (Hagiwara and Bullock). Row labels in the synaptic column: not-maintained; depolarizing, facilitating; polarizing; depolarizing, defacilitating. Sensory examples: crayfish, muscle receptor organ, stretch stimulus (subthreshold) (Eyzaguirre and Kuffler); rock slater (Ligia; Isopoda) retina, two flashes of light 0.5 sec apart (ext) (Ruck and Jahn); moth, antennal chemoreceptors (ext), odor stimulus (not facilitating) (Schneider); cat, Pacinian corpuscle (ext), repetitive stretch, with defacilitation (Loewenstein and Altamirano-Orrego).


point, such as a confluence of main branches of the 
distal process. So much for the processes that act upon 
the spike probability through the so-called resting 
potential. 
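
The smoothing of distant synaptic deflections by spread through a leaky cable, described above, can be illustrated with a toy calculation. The sketch below is not a model of any preparation cited here: it assumes a uniform passive cable with steady-state decrement exp(-x/lambda) and lumps the discrimination against rapid deflections into a single first-order (RC) filter. The length constant, time constants, and pulse train are hypothetical values chosen only for illustration.

```python
import math

def attenuation(distance_um, length_constant_um):
    """Steady-state electrotonic decrement along a uniform passive cable (assumed form)."""
    return math.exp(-distance_um / length_constant_um)

def low_pass(signal, dt_ms, tau_ms):
    """First-order RC smoothing, standing in for the cable's rejection of rapid deflections."""
    alpha = dt_ms / (tau_ms + dt_ms)
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

def ripple(trace):
    """Peak-to-peak excursion over the second half of the trace (after settling)."""
    tail = trace[len(trace) // 2:]
    return max(tail) - min(tail)

dt = 0.1  # ms per sample
# Hypothetical input: 1-ms unit deflections repeated every 10 ms for 500 ms.
pulses = [1.0 if (i % 100) < 10 else 0.0 for i in range(5000)]

# "Near" synapse: little smoothing, no decrement.
near = low_pass(pulses, dt, tau_ms=1.0)
# "Far" synapse: heavy smoothing plus decrement over 500 um with lambda = 1000 um.
far = [v * attenuation(500.0, 1000.0) for v in low_pass(pulses, dt, tau_ms=50.0)]

print("ripple near soma:", round(ripple(near), 3))
print("ripple from distant synapse:", round(ripple(far), 3))
print("mean level from distant synapse:", round(sum(far[2500:]) / 2500, 3))
```

Under these assumptions the distant input arrives as a nearly smooth shift of level with very little ripple, while the nearby input keeps its discrete deflections.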

More typically, or at least more familiarly, the re-
sponse to an external event is the transient deflection, called
a synaptic potential or a generator potential according 
to whether the external event is presynaptic or whether 
it is a sensory stimulus (Fig. 9). These responses arise 
in restricted portions of the neurons and have their 
significance in the effects they produce in other regions 
after nonregenerative spread—with decrement and 
phase delay. They may be polarizing or depolarizing, 
and this alternative may be decided by the nature of 
the external event or, in certain circumstances, by the


Fig. 10. Sensory nerve cell from the leg of a centipede, Lithobius; methylene blue. A, lattice of fine terminal fibers; B, cell body. Distal direction indicated; scale 100 µ [from F. Meyer, Zool. Jahrb. Anat. 74, 381 (1955)].

level of the membrane potential obtaining at the mo-
ment. Some normally polarizing, hence inhibiting,
events are reversed readily to a depolarizing response 
which can be shown to increase the probability of firing 
and are thus at least in a measure excitatory. Such 
change in the response to a given input owing to differ- 
ent levels of membrane potential is not purely a labo- 
ratory phenomenon resulting from experimentally im- 
posed polarization, but occurs normally, for example, 
in the aforementioned situation where Terzuolo found 
the membrane level of spinal motoneurons shifted by 
cerebello-spinal pathways, thereby altering the response 
to dorsal-root inputs. (In some accounts, emphasis has 
been laid on the fact that an inhibitory input drives the 
membrane potential from either side toward a certain 
equilibrium value and does not cause any synaptic po- 
tential if the membrane is already at that level. Even 
under the latter condition inhibition can result, because 
the lowered impedance tends to clamp the membrane. 
But this can inhibit only in the near vicinity. Upon 
spike-initiating regions or synaptic regions at some dis- 
tance, the sign and magnitude of the potential change 
will determine the sign and magnitude of the effect.) 
In at least some cases, there intervenes between the 
synaptic or generator potential and the critical level at 
the spike-initiating locus (which is a very restricted part 
of the neuron at some distance from the principal syn- 
aptic and generator loci) an intermediate graded potential 
which is usually called the local potential (Fig. 11, from 
the work of Bullock and Terzuolo”’). This is presumed 
to arise in an electrically excitable membrane adjacent 

Fig. 11. Examples of true spontaneous activity as seen by an electrode inside a nerve cell. Cardiac ganglion cell of the lobster, Panulirus [(a) and (b)] or the crab, Cancer [(c), (d), (e)]. (a) and (b) show a large pacemaker potential, presumably arising nearby, and another prepotential, regarded as a local potential, before the spike. The local potential may fail to elicit a spike and then can alone cause repolarization. Note the failure of the local potential to arise following the third spike in (b) with instead an undulation leading to a new cycle. (c), (d), and (e) show different forms and permutations of pacemaker potential and repolarization. Scales: (a) and (b), 500 msec; (c), (d), and (e), 50 mv, 200 msec [from T. H. Bullock and C. A. Terzuolo, J. Physiol. 138, 341 (1957)].

to the synaptic or generator membrane, but distinct
from it, and as a consequence of the local circuits 
from antecedent activity of the same cell. In this re- 
spect, the local potential is like the spike potential, 
which also arises by electrical excitation from the local 
circuits in neighboring regions of the same neuron. In- 
deed, local potentials can be made to have graded ampli- 
tudes all the way to the full amplitude of the spike 
potentials, and they differ only in being nonregenerative 
and, therefore, in spreading decrementally for distances 
of the order of a millimeter only. In some sensory 
neurons, we may conjecture that such local potentials 
occur in different branches of the distal process and 
sum in the stem process to determine the initiation of a 
spike. The experiments of Katz™ on the summating sub- 
threshold deflections in sensory terminals of muscle 
spindles may be so interpreted. Situations where distal 
sensory processes branch over a considerable area are 
quite common. 

The nonregenerative property of some neuronal mem- 
branes is important in those cases where it results in the 
inability of spike potentials, once initiated, to invade those 
regions of the neuron. This preserves these regions from 
explosive depolarization and from subsequent strong 
repolarization, and thus enhances their integrative ca- 
pacity. For example, the soma of the large cardiac 
ganglion cells in the lobster and perhaps the dendrites 
on some other types of neurons never experience an 
explosive all-or-none event. This characteristic is very 
likely true of most neuronal membranes, that is to say,
of the vast forests of fine branching processes making 
up neuropiles. If this is so, one must think of all-or-none 
spike potentials as a peculiar development of a limited 
portion of the neuron whose function is the faithful 
propagation of signals over long distances rather than 
integration. 

The familiar local and spike potentials are depolariz- 
ing in direction, but there are a few instances indicating 
that membranes are capable of equivalent regenerative 
potentials in the opposite direction. For example, during 
the plateau of an essentially completely-depolarized 
potential in the Purkinje tissue of the mammalian heart, 
during a heart beat, a threshold stimulus in the polariz- 
ing direction sets up a regenerative repolarization which 
grows explosively and restores a resting potential 
[Weidmann® ]. Hisada*! has recorded action potentials 
of the same polarity—that is, increasing polarization 
across the cell membrane of the protozoan Noctiluca. 

Finally, we have spontaneous activity manifested in 
pacemaker potentials in some neurons including many 
sensory neurons. A background of continuous spike dis- 
charge in the absence of known stimulation or under 
steady-state stimulation may be called spontaneity. Of 
course, such activity depends upon permissive condi- 
tions of the metabolism and of the milieu of the cell. If 
these conditions are normal ones, however, steady-state 
firing is spontaneous in the sense that the origin of the 
intermittent activity lies within the neuron, rather than 
in the environmental mechanisms. Such activity has
been explored in several cases (Fig. 11), and the sig- 
nificant finding from the present point of view is the 
localization of the pacemaker process to a limited portion 
of the neuron different from that in which spikes arise, 
and perhaps also different from that in which synaptic 
or generator activity occurs primarily. Of course, these 
regions must be within shouting distance of each other, 
for their normal significance lies in the summing of their 
effects to produce a threshold change in the membrane 
potential at the spike-initiating region. In lobster cardiac 
ganglion cells, evidence of more than one pacemaker 
locus is seen in the same neuron. This is possible because 
these loci apparently occur in different major processes 
of the cell, each of which has its own spike-initiating 
region; two different rhythms of spikes arise and can be 
seen in the same soma without interfering one with 
another since they cannot invade the soma. What this 
says is that the absolute dimensions of neurons are 
not accidental but are fixed by engineering require- 
ments specified by the electrotonic parameters of the 
membrane. 

There are very likely at least two distinct types of 
spontaneous potential change. The rarer is approximately
sinusoidal. Spikes tend to occur during the phase of 
maximum depolarization. Several spikes may occur 
during this phase, but they are not necessary for and do 
not accelerate the repolarizing phase. In contrast, the 
more common type is similar to a relaxation oscillation, 
in that the pacemaker potential itself is a steadily de- 
polarizing potential which must be interrupted by some 
new process with a threshold whose net result is to re- 
polarize, thereby starting a new cycle. This threshold 
process is typically the spike potential, but it may be 
preceded by and indeed may be substituted by a local 
potential which is capable of repolarizing the membrane 
during its recovery phase, thus setting the stage for 
another pacemaker potential to begin. 
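
The relaxation-oscillation type of pacemaker described above can be sketched in a few lines. This is a minimal illustration, assuming a linearly depolarizing pacemaker potential and an instantaneous reset at a fixed threshold; the reset stands in for the repolarization produced by the spike (or by a local potential). The function name and all numerical values are hypothetical and are not taken from the records in Fig. 11.

```python
def sawtooth_pacemaker(rest_mv=-60.0, threshold_mv=-45.0,
                       slope_mv_per_ms=0.5, dt_ms=1.0, duration_ms=200.0):
    """Toy relaxation oscillator: steady depolarization interrupted by a threshold process."""
    v = rest_mv
    trace, event_times = [], []
    for i in range(int(duration_ms / dt_ms)):
        v += slope_mv_per_ms * dt_ms      # steadily depolarizing pacemaker potential
        if v >= threshold_mv:             # threshold process (spike or local potential)
            event_times.append(i * dt_ms)
            v = rest_mv                   # repolarization starts a new cycle
        trace.append(v)
    return trace, event_times

trace, event_times = sawtooth_pacemaker()
intervals = [b - a for a, b in zip(event_times, event_times[1:])]
print("threshold events at (ms):", event_times)
print("intervals (ms):", intervals)
```

In this sketch the period is fixed by the slope and by the distance from the reset level to threshold, so anything that shifts the average membrane potential or the rate of depolarization changes the firing frequency. The sinusoidal type, in which spikes are not needed for repolarization, is not represented here.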

There are other permutations and complexities in the 
processes of spontaneous impulse initiation as is suggested 
by the high rhythmicity of some receptors and the low 
rhythmicity of others. In fact, it may be characteristic 
that, in one and the same unit, rhythmicity is relatively
high at high frequencies of discharge and low at low
frequencies. Tokizane and Eldred”? have found two dis- 
tinct populations of stretch receptor fibers in the dorsal 
roots of cats. One population is consistently more 
rhythmic at any given average frequency than the 
other; that is, the standard deviation of intervals is 
smaller for a given average interval. They believe that 
the more-regular units come from flower-spray endings 
in the muscle spindles and that the less-regular units
correspond to annulo-spiral endings.
The various possibilities for understanding the origin
of nonrhythmicity or randomness-within-limits of succes-
sive intervals cannot be discussed here. Nor can one
take up the interesting question of the central problem
created by nonrhythmicity, the distinction of signals
from noise—that is, of weak true stimuli from random 
changes in the frequency of firing. These problems have 
been discussed elsewhere (see references 33-35). 
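
The measure of rhythmicity used above, the standard deviation of intervals for a given average interval, is easy to compute from a list of spike times. The sketch below uses two invented spike trains with the same mean interval; the data and the helper name `interval_stats` are hypothetical and only illustrate the comparison.

```python
import statistics

def interval_stats(spike_times_ms):
    """Return mean interval, standard deviation of intervals, and their ratio."""
    intervals = [b - a for a, b in zip(spike_times_ms, spike_times_ms[1:])]
    mean = statistics.mean(intervals)
    sd = statistics.stdev(intervals)
    return mean, sd, sd / mean

# Hypothetical spike times (ms); both trains have a mean interval of 50 ms.
more_regular = [0, 50, 101, 150, 199, 250, 300]
less_regular = [0, 30, 100, 130, 215, 240, 300]

for name, train in (("more regular", more_regular), ("less regular", less_regular)):
    mean, sd, cv = interval_stats(train)
    print(f"{name}: mean = {mean:.1f} ms, sd = {sd:.2f} ms, sd/mean = {cv:.3f}")
```

Comparing the standard deviation at matched mean interval, as here, separates regularity from discharge frequency, which matters because rhythmicity tends to change with frequency within a single unit.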

These separate processes reflected in membrane po- 
tential shifts, in synaptic, generator, local, pacemaker, 
and spike potentials are sequentially coupled in complex 
ways: partly because there are several alternative se- 
quences; partly because much depends on the particular 
microanatomy of a given neuron, the spatial relations of 
the separate loci, and the possibilities of spread of the 
respective potentials; and partly because of the labile 
character of the coupling constants. The constants or 
transfer functions well may be nonlinear. 

However, these do not exhaust the processes of which 
there is evidence that determine the initiation of im-
pulses within the cell. Turning to processes not mani-
fested by membrane potential changes (at least by
changes detectable with the same methods used for
measuring the just-discussed events), we are faced
mainly with indications of excitability changes. These 
are brought to attention only by responses to subsequent 
stimuli. One is called facilitation (Fig. 9). The cell can 
be initially at a certain membrane potential and respond 
to a given arriving impulse (in this case, a single pre- 
synaptic fiber) with a small synaptic potential. After 
repetition, the response to the same presynaptic impulse 
is much greater even when the membrane potential is 
allowed to return to the initial value between stimula- 
tions. Another type, which may be seen in the same 
neuron, may be called defacilitating; the response is less
to successive presynaptic impulses the closer together 
they come. Here also, at the same membrane potential 
level, the cell or presynaptic terminals are altered in 
response or excitability according to the recent history. 
This is not a laboratory artifact but is an essential part 
of the normal mechanism of burst formation in the 9- 
celled cardiac ganglion of the lobster. Here, the large 
follower neurons, driven by a burst of arriving presyn- 
aptic impulses from the pacemakers, characteristically 
give the form of synaptic potential response shown in 
Fig. 9. A large, initial synaptic potential is followed by 
a series of small ones during the high-frequency portion 
of the burst and then by growing amplitude responses 
as the frequency of the arriving presynaptic impulses 
declines. 

Here we have to deal with the time course of excita- 
bility of graded, subthreshold events. There must be a 
separate curve for each of the subthreshold events pre-
viously discussed and for each locus and type of synapse.
The last two instances, of facilitation and defacilitation, 
recorded in the same neuron were both excitatory syn- 
aptic potentials, from different presynaptic pathways 
(one from the pacemaker in the cardiac ganglion, and 
the other from the cardioacceleratory nerve from the 
central nervous system). The same neuron also has 
facilitating inhibitory synapses. Each of these synaptic
responses must have its own dependence of response
upon input and of this relation (response per input) 
upon time after preceding activity. The local potentials 
also may have a separate time dependence of excita- 
bility and so may the spike-initiating locus where the 
excitability is measured by a sharp threshold. We note 
that the classical spike threshold, so easily measured 
any place in the axon, is normally a significant parameter 
at a very limited region only in the whole neuron. 
Beyond this region, the safety factor is normally suffi- 
ciently high that the spike threshold is without great 
significance. 

Another degree of freedom not predicted by the mem- 
brane potential is the presence or absence of after-effects 
—either persistent response after the input ceases or of 
overshooting rebound (post-excitatory inhibition or 
post-inhibitory excitation). Cases of each kind are 
known for receptors and for central neurons. 

Still another line of evidence for processes determining 
spike initiation, which cannot be seen in the usual mem- 
brane potential measurements, is that of significant 
alteration of the frequency of already-active neurons by 
weak electric fields in the surrounding medium, already 
mentioned. In a preliminary analysis, Terzuolo and 
Bullock? estimated the intensity of the voltage gradient 
in the saline surrounding a neuron (stretch receptor of 
the crayfish) when an imposed polarization was of a 
magnitude just sufficient to cause a noticeable change 
in the maintained frequency of firing. The results, as 
already mentioned, were that very weak fields (of the 
order of 1 µv/µ in the external medium around the cell)
were sufficient. There will be a potential change in a 
given direction (for example, an excitatory direction) 
in certain parts only of the neuron, and since this will 
be graded in magnitude geographically, there must be 
a very limited region only wherein the imposed field 
actually exerts its effect. It can be concluded, therefore, 
that, even in the absence of average change in mem- 
brane potential, very localized regions may be critical 
in determining the firing frequency and may have an 
extremely high sensitivity to small voltage gradients. 
Thus, old ideas are confirmed in a certain sense that, 
among other things, the neuron is to a significant degree 
under the influence of differences of potential between 
one part and another of the surface of the neuron. 

The general picture then is quite different from that 
of the axon. Grundfest’ has given arguments for be- 
lieving that adjacent patches of neuronal membrane are, 
respectively, electrically excitable and electrically in- 
excitable in a certain meaning of these terms. It may be
emphasized that the evidence cited here is also strongly 
indicative of a neuronal membrane consisting of a patch- 
work of different kinds of subthreshold-response capaci- 
ties. The responsiveness or the excitability—these two 
are different, but ordinarily are not easy to distinguish 
experimentally—vary separately one from another; 
some are manifest by electrical events, others are not. 
One more degree of complexity must be added: the
separate processes do not simply sum algebraically to
determine the level of probability of impulse initiation,
but at least some of them apparently interact: the pres-
ence of an increased magnitude of one subthreshold
process may alter the magnitude of one of the others. 
For example, in the presence of a large generator po- 
tential, excitatory synaptic potentials may not simply 
add but will themselves be altered in size. The same may 
be true for inhibitory and pacemaker potentials. These 
interactions have been little studied, but there is no 
doubt that they will be found, in some cases, to con- 
tribute importantly to the integration of input to 
determine output (see, for example, the evidence of 
multiplicative neural events adduced by Hassenstein 
and Reichardt”). 


SUMMARY 


A selection of sense organs upon which recent prog- 
ress has been made is reviewed in order to point out 
problems and opportunities of biophysical interest. The 
electroreceptors in electric fish, mechanoreceptor hairs in 
arthropods, infrared receptors in pit vipers, and specific 
chemoreceptors in flies and moths are the examples 
presented. 

The general problem of absolute reception as opposed 
to reception of change is emphasized in terms of sta- 
bility, drift correction, and reference standard. 

The long series of subthreshold events which may 
intervene between transducing a stimulus and initiating 
a spike is reviewed. These include events which are 
reflected in the membrane potential and others which 
are not. Specific shifts of membrane potential, pace- 
maker, generator, synaptic, local and spike potentials, 
and some subvarieties are sequentially coupled in labile 
and perhaps nonlinear ways. They occur, in general, in 
restricted regions of the neuron which interact in com- 
plex ways not only because of the complex coupling 
constants between successive steps but also because of 
the profound influence of the anatomical distribution of 
the differently responding types of membranes. 

Events not reflected in the membrane potential are 
the excitability cycles and the presence or absence of 
aftereffects. In addition to the spike threshold, each 
graded process preceding the spike has a curve of re- 
sponse against input and of this relation against time 
after preceding activity. 

It is probably important that these several processes 
not only are sequentially related but that they interact 
—the amplitude of one may alter the responsiveness of 
another. Spike initiation is, therefore, potentially a 
highly derived and integrative result. 
The processing of sensory information begins in
the sense organs themselves. It is in them that
the first steps take place in the transformation of ex- 
ternal influences into the patterns of nervous action 
that regulate the activity of an animal in its complex 
environment. The fundamental nature of the receptors 
and the design of the accessory structures in which the 
receptors are deployed determine how external infor- 
mation flows into the organism. Thus, sense organ and 
receptor mechanisms determine the character of the 
neural activity that is passed on to higher neural centers. 
In addition, the first steps in neural integration take 
place within the sense organs, for in many of them the 
receptors interact with one another. As a result of both 
of these actions, patterns of sensory nerve fiber activity 
transmitted to the higher centers are more than mere 
replicas of the temporal and spatial patterns of external 
stimuli. Certain significant features of the stimulus 
patterns are accentuated at the expense of less im- 
portant fidelity of representation. This can be clearly 
illustrated in the analysis of the first steps of the visual 
process, with which this paper deals. 

One of the great contributions of biophysics in the 
last century was the precise description of the human 
eye as an optical instrument. The high degree of per- 
fection of our eyes enables us to exploit many of the 
peculiar advantages of luminous energy as a source of 
information; their shortcomings set limits to our visual 
performance. The vertebrate scheme of optical imagery 
by a lens system is not the only one that is used by 
animals; compound eyes also have been evolved, made
up of small optical units, each having a narrow entrance 
angle and each pointed in a different direction so that 
all cover the entire field of view. They too have both 
advantages and disadvantages, one of the advantages 
being that short wavelengths can penetrate to their 
receptors. In either case, retinal receptors arranged in 
a mosaic receive light in varying amounts from the 
various parts of the animal’s surroundings. The mecha- 
nism of the visual receptor units that compose the 
retinal mosaic determines many of the properties of 
vision. 

The photoreceptor offers certain advantages over 
many other receptors in the study of sensory mecha- 
nisms, for in it the very first step in the transducer 
mechanism for translating the stimulus into nervous 
action is beginning to be well understood. This is a 
consequence of the general principle that electromag-
netic radiation, to produce a permanent effect on a
material system, must yield some of its energy to the
system. Consequently, the action spectrum of the visual 
apparatus is simply the manifestation of the absorption 
spectrum, or a portion of it, of the primary photosensi- 
tive material in the visual receptors. It is the absorption 
spectrum of the primary visual pigment that sets the 
rather indistinct limits to the extent of the visible region 
within the electromagnetic spectrum and that deter- 
mines quantitatively the relative effectiveness of differ- 
ent wavelengths of visible light. There is now very good 
agreement between the measurements of the absorption 
spectrum of photolabile pigments extracted from the 
retina and the “action spectrum” of vision for several
animal forms, especially for man.

The fact that one can identify the photosensitive 
material of the visual receptor makes it possible to say 
whereabouts in the receptor cell the first act of the 
visual process takes place. In the vertebrate eye, the 
visual pigment “rhodopsin” is known to be concentrated 
entirely in the outer segments of the retinal rods. This 
identifies the outer segments as the locus of the initial 
step in the visual receptor process. Rhodopsin can be 
extracted from suspensions of the outer segments of 
retinal rods, and its absorption spectrum, after appro- 
priate correction, agrees well with the distribution of 
spectral sensitivity of rod vision. It is clearly the 
primary photosensitive substance of the rods. A number 
of visual pigments related to rhodopsin are now known. 
One of them, iodopsin, is the corresponding photosensi- 
tive substance of the retinal cones.* The biochemistry 
of the visual pigments constitutes an extensive and 
elegant chapter of modern biochemistry that cannot be 
discussed in detail in this paper (cf. Wald’). 

Rhodopsins are known to be conjugated proteins, the 
prosthetic group being a carotenoid called retinine. 
Retinine is an aldehyde, the corresponding alcohol being 
vitamin A, and is known in a number of isomeric forms. 
The first act of light apparently is to produce an iso- 
merization of the carotenoid group while it is still 
attached to the protein.® Retinine is then split off the 
protein molecule by subsequent reactions that are in- 
dependent of light, and may be converted reversibly 
into vitamin A. 

After photolysis, visual pigments can regenerate. 
Otherwise, one would have one look at the world and 
then be forever blind. The kinetics of photolysis and 
regeneration of visual pigments has been studied ex- 
tensively, both in vitro and recently in the living retinas 
of experimental animals and human subjects. Many 

Fig. 1. Electron micrographs of rhabdom of an ommatidium of an arthropod compound eye (Limulus), showing honeycomb-like
arrangement of osmium-staining membranes. Left: plane of section perpendicular to optical axis of ommatidium. Center: oblique
section. Right: section in an axial plane. Height of figure approx 2 µ. Courtesy W. H. Miller.


receptor properties, such as the loss of sensitivity in the 
light and its recovery during dark adaptation, can be 
explained qualitatively by these elementary biochemical 
processes in the visual receptors. Moreover, a simple 
model of the photochemical system of the receptor was 
used by Hecht to explain quantitatively many psycho- 
physical measurements of vision.’ His formulations re- 
main the most comprehensive and successful theoretical 
treatment of visual-receptor physiology, although his 
model is oversimplified and the theory needs reworking 
in light of recent developments in biochemistry and 
physiology.

The primary photosensitive pigment of the visual 
receptor is present in a structured system. Electron- 
microscope studies show a profusion of osmium-staining 
membranes in visual receptors. Sjöstrand? has shown 
that the outer segments of the receptors of the vertebrate
retina have a lamellar structure. The rod outer
segment is thus a stack of thin plates crowded with rhodopsin.
In the arthropods, instead of a lamellar system,
the part of the receptor cell (the rhabdom) that presumably
contains the visual pigment is composed of
microvilli densely packed to form a honeycomb-like
structure. Figure 1 is an electron micrograph
of the rhabdom of an ommatidium of an arthropod
compound eye.’ In the vertebrates, the outer segments
of the retinal rods and cones have been shown to be 
derivatives of cilia.™!! In the arthropods, where cilia 
are extremely rare, there is no evidence of a ciliary 
derivation. In some mollusks, however, there is a differ- 
ent system of membranes and again the structures are 
derived from cilia.? Exactly how the visual pigment is 
arranged within any of these membranous structures is 
not known, though there have been speculations on 
this point. 

Visual receptors have evolved into light detectors 
that are so sensitive that they work at the limit set by 
the quantum nature of light. A human observer is able 
to see a flash of light that contains only about 100 
quanta, measured at the cornea of the eye. After correc- 
tion for losses in transmission through the ocular media 
and failure of the visual purple to be present in sufficient 
amount in the retina to absorb all of the quanta that 
fall on it, this figure comes down to something of the 
order of 10 quanta. This aspect of visual physiology 
has been extensively studied and is well reviewed in a
recent article by Pirenne. Obviously, it is of great 
significance to visual performance, especially at low 

Fig. 2. Oscillograms of the electrical activity in a single optic-nerve fiber (eye of Limulus) in response to prolonged steady illumination
of the facet of the eye innervated by the fiber.18 Each spike is the “action potential” associated with the passage of a nerve impulse
in the fiber. For the top record, the intensity of stimulating light was 10' times that used for the bottom record. Signal of exposure
to light blackens out the white line above the time marks. Time marked in 1/5 sec.

illuminations. In the short “action time” of the retina,
so few quanta are needed that the retinal image, though 
visible, is too “grainy” to be seen with high resolution. 
Also, at threshold, seeing is uncertain. Indeed, the 
statistical uncertainty at threshold can be explained 
almost entirely by the fact that a very few quanta 
suffice to excite a response. Nevertheless, the visual 
threshold is sharp enough so that it is quite certain that 
a human observer cannot see just one quantum, al- 
though exactly how many are needed is still a matter of 
controversy. The small amount of light that is just 
visible can be seen if it is spread over a retinal area 
containing about five hundred rods. This must mean 
that near threshold there is almost no chance of any one 
rod receiving more than one quantum, and that the 
cooperative activity of several rods is necessary to reach 
the threshold of vision. Thus, a single quantum of light 
absorbed within the stack of plates comprising the outer 
segment of a rod is sufficient to excite that rod, causing 
it to transmit some kind of nervous influence that can 
sum with similar influences from several other rods to 
reach the threshold for a behavioral response. In this 
retinal summation, one has an example of the simplest 
kind of neural integrative action, exerted at the very 
threshold of vision. 
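
The arithmetic behind this argument can be sketched numerically. The following Python fragment is an illustration only: it assumes that the roughly 10 absorbed quanta are distributed independently and uniformly over about 500 rods, so that the number absorbed by any one rod follows a Poisson distribution. The function and variable names are hypothetical and do not come from the original work.

```python
import math


def prob_at_least(k: int, mean: float) -> float:
    """Probability that a Poisson variable with the given mean is >= k."""
    below = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - below


# Figures quoted in the text (orders of magnitude only).
absorbed_quanta = 10
rods_in_field = 500

# Assumption: quanta land independently and uniformly on the rods.
mean_per_rod = absorbed_quanta / rods_in_field

p_one_or_more = prob_at_least(1, mean_per_rod)
p_two_or_more = prob_at_least(2, mean_per_rod)
expected_multi_hit_rods = rods_in_field * p_two_or_more

print(f"mean quanta per rod:           {mean_per_rod:.3f}")
print(f"P(rod absorbs >= 1 quantum):   {p_one_or_more:.5f}")
print(f"P(rod absorbs >= 2 quanta):    {p_two_or_more:.6f}")
print(f"expected rods with >= 2 hits:  {expected_multi_hit_rods:.3f}")
```

Under these assumptions, fewer than one rod in the whole field is expected to absorb two or more quanta, which is the sense in which single-quantum excitation of individual rods, followed by summation, is required.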

The end result of receptor excitation is the generation 
of nervous influences in its attached nerve fiber. It has 
not yet been possible to record the neural activity of 
the receptors (rods and cones) of the vertebrate retina, 
but some invertebrate eyes afford an opportunity to 
record optic activity that appears to be very close to 
the action of the primary receptors. The eye of the 
common horseshoe crab, Limulus, is particularly favor- 
able for the study of the action of single receptor units. 
This eye is a coarsely facetted compound eye. Individual
receptor units corresponding to each facet (ommatidia)
can be separately illuminated, and the
electrical activity of the optic nerve fiber from such a
unit can be recorded.15

The neural activity recorded from one of the receptor 
units of the eye of Limulus consists of trains of uniform 
nerve impulses similar in all respects to the sensory 
discharges observed in nerve fibers in all of the higher 
animal forms (Fig. 2). As in all receptors, the higher 
the intensity of the stimulus, the higher the frequency 
of the impulses with which the receptor responds. In 
the visual receptor, it is noteworthy that frequency 
changes over a relatively small range for a large range 
of light intensity: the dynamic range of a single receptor
is five or six orders of magnitude. Roughly, the relation 
between frequency of discharge and intensity of light 
is a logarithmic one (Fechner’s Law). Thus, the trans- 
ducer mechanism of the visual receptor covers a large 
range and is adapted to signal the ratios of stimulus 
values. In our own visual experience, values of light 
and shade stay more or less fixed, no matter what the 
ambient level of illumination, over a large range. In 
such situations, stimulus ratios stay constant and the 
visual receptors yield approximately a fixed difference 
in the frequency for a given ratio of stimulus values, 
even though the absolute differences may vary widely. 
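
A short numerical sketch may make the ratio-signalling property concrete. It assumes an idealized logarithmic transducer; the gain, the threshold intensity, and the function name are hypothetical choices for illustration, not measured values.

```python
import math


def firing_rate(intensity: float, gain: float = 10.0, threshold: float = 1.0) -> float:
    """Hypothetical logarithmic transducer: impulses per second versus intensity.

    `gain` (impulses/sec per tenfold change) and `threshold` are illustrative only.
    """
    return max(0.0, gain * math.log10(intensity / threshold))


# The same 2:1 stimulus ratio presented at three very different ambient levels.
ratio = 2.0
ambient_levels = (10.0, 1.0e3, 1.0e5)

differences = [firing_rate(ratio * level) - firing_rate(level) for level in ambient_levels]

for level, diff in zip(ambient_levels, differences):
    absolute_step = ratio * level - level
    print(f"ambient {level:>9.0f}: intensity step {absolute_step:>9.0f}, "
          f"frequency difference {diff:.3f} impulses/sec")
```

The absolute intensity step varies by four orders of magnitude across the three cases, while the frequency difference stays the same, as expected of a logarithmic relation.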

Another important receptor property is illustrated in 
Fig. 2. The discharge of nerve impulses begins at a high 
frequency when the light is turned on, but the frequency 
of the discharge subsides in a fraction of a second to a 
considerably lower level, which is then maintained with 
only slight diminution as long as light continues to shine 
on the receptor. This sensory adaptation is manifested 
by all other receptors, some to a far greater extent than 
others. As a result of sensory adaptation, receptors pro- 
vide a somewhat distorted report of the stimulus events, 
such as to accentuate any sudden change.

Fig. 3. Electrical responses to illumination of a single receptor unit (ommatidium) in the eye of Limulus.15 Recorded (upper trace)
by a micropipette electrode (tip diam 1 μ) in the neuron of the ommatidium, and simultaneously (black edge just below upper trace)
by a pair of wire electrodes over which was slung the nerve bundle from the ommatidium containing the neuron’s axon. At the beginning
of the record, the microelectrode base line records the resting level of the electrical polarization of the cell membrane, at a potential
about 50 mv negative to the solution bathing the outside of the cell. When the ommatidium was illuminated (black band above the
time marks), there was a partial depolarization of the neuron (potential becoming less negative: rise in the base line) accompanied
by an increase in the frequency of the spike-like deflections, each one of which was synchronous with the discharge of an impulse in
the nerve bundle (small spikes on the black edge). Time marked in 1/5 sec.

Sensory transducers are not concerned so much with a faithful representation
of the world as with a useful one, and it is
especially useful to the organism to accentuate the 
changes that occur in external conditions. If the illumination
on a visual receptor unit is given a small increment,
the receptor response consists of a modulation of the
discharge of impulses in which there is an exaggeration of 
frequency changes at the onset and again when the 
increment is turned off. This permits the receptor to 
signal small changes, and still possess a large dynamic 
range. But what may be even more important, the 
suddenness of the changes enhances their stimulating 
effectiveness. Thus, the inherent properties of the sensory 
receptors determine how the patterns of neural activity 
they generate will represent the stimulus events. This is 
a first step in the processing of information for use by 
the organism. 

As yet not much is known about the nature of the 
excitatory processes following the initial photochemical 
reaction in visual receptors until one comes near the 
end of the receptor process. In the eye of Limulus, it 
has been possible to learn a little about the actual 
production of nerve impulses in the axon of the excited 
neuron in the receptor unit. By the use of a micropipette 
electrode penetrating the sensory structure of the 
ommatidium, changes in the electrical polarization of 
the cell membrane of the neuron in the ommatidium 
have been recorded.15—” These changes are associated 
with the trains of impulses initiated by this cell when 
the receptor unit is illuminated (Fig. 3). When light is
turned on, the cell membrane becomes somewhat
depolarized, and simultaneously there is a speeding up
of the discharge of impulses in its axon. Such depolarization
is referred to as a “generator potential,” in the
belief that the nerve impulses are generated by local
electric currents flowing as a result of the difference in
potential between the axon and the depolarized cell
body (or more probably, in the present case, the depolarized
process of the cell which penetrates
the rhabdom). The degree of depolarization
depends on the intensity of the stimulating
light and in turn determines the frequency of the re- 
laxation oscillations of the membrane of the initial 
segment of axon in the region where it leaves the cell 
body and from which the propagated impulses take off. 
The discharge of trains of impulses by depolarized 
neurons is a familiar process in neurophysiology. For
the photoreceptor, the question is how the initial photo- 
chemical reaction produces the depolarization and the 
ensuing “generator potential.” About this, almost 
nothing is known. 

As discussed in the foregoing, the receptor itself by 
its inherent properties does a certain amount of process- 
ing of the information from the outside world. It is 
concerned with the report only of certain aspects of the 
physical stimulus that acts on it, and it is not necessarily 
a high-fidelity recording device. Built as it is, it selects 
certain features of the stimulus pattern for accentua- 
tion. The next step in the processing of sensory infor- 
mation in the visual system concerns the distribution 
of light over the entire population of visual receptors. 
A retina, whether in a vertebrate or an arthropod, is 
more than a mosaic of independent detecting elements. 
In the vertebrates, it is well known that the retina is a 
highly organized nervous center. It is really a part of 
the brain closely applied to a mosaic of sensory recep- 
tors. The first step in the neural analysis of the pattern 
of the retinal image requires the intercomparison of 
what happens in the various differently stimulated 
receptors, and a modification of the pattern of neural 
activity to accentuate important features of the spatial 
distribution of light over the receptor mosaic. Evidently, 
it is profitable to do this close to the point where the 
information is being picked up. In the vertebrate retina, 
the early neurons in the visual pathway are spread out 
in correspondence with their associated receptors, and 
many of the processes in the first step of neural integration
apparently can be done most effectively in the
retina itself.

Fig. 4. The plexus of the compound eye of Limulus. (a) Light micrograph of a section cut through the eye in a plane perpendicular
to its external surface (cornea removed), showing on its upper border a row of the heavily pigmented ommatidia, from each of which
emerges a small bundle of nerve fibers (stained with silver by Samuel’s method) that contains, together with small fibers, the axon of
ommatidium neuron. Connecting these bundles are festoons of fibers, with clumps of neuropile that appear at this magnification as
condensations in the meshes of the plexus. Width of figure=2.2 mm. Photograph by W. H. Miller [from H. K. Hartline, H. G. Wagner,
and F. Ratliff, J. Gen. Physiol. 39, 651 (1956)]. (b) Electron micrograph of a portion of a clump of neuropile in the plexus, showing a
few outlines of the fibers composing the clump, within which are numerous small circular outlines interpreted as synaptic vesicles.
Width of figure=1.2 μ. Photograph by W. H. Miller.!®

This is not a simple process; patterns of
activity observed in the optic-nerve fibers in the vertebrate
retina are very complex, and their analysis is
difficult. In simpler visual systems, integrative processes 
can be more readily analyzed. The eye of Limulus again 
affords a good opportunity for such studies.

The neural structure of the compound eye of Limulus 
is much simpler than that of the vertebrate retina or 
the eyes of more highly developed arthropods, but it is 
nevertheless a retina: the units of the receptor mosaic 
are interconnected by a network of nerve fibers [Fig.
4(a)]. The nerve fibers from the ommatidia branch
profusely on their way out of the eye to form the optic
nerve. Festoons of these branches connect each receptor 
unit with its neighbors. There are no nerve-cell bodies 
in this plexus of interconnections, as in more complex 
retinas, but there are numerous knots composed of a 
felt-work of very fine branchlets closely intertwined. 
The fibers in these clumps of “neuropile” are packed
with “vesicles” typically present in synapses [ Fig. 4(b) ]. 
Evidently, the clumps of neuropile are synaptic regions, 
where influences are transmitted from one set of
branches to another.

Fig. 5. Oscillograms of nerve action potentials, showing inhibition of the impulses in a single optic-nerve fiber of Limulus. The ommatidium
of the eye from which the fiber arose was illuminated steadily at a fixed intensity, beginning 3 sec before the start of each
of the records; adjacent ommatidia were illuminated during the interval signalled by the blackening out of the white line above
the time marks, in the upper two records. In the top record, the intensity of the illumination of adjacent receptors was ten times that
used in the middle record. Bottom record is a control (no adjacent illumination). Time in 1/5 sec. Experimental arrangement as in Fig. 6(a).

Fig. 6. Schematic diagrams of experimental arrangements used. (a) Experiment of Fig. 5 (inhibition of test ommatidium by illumination
of nearby ommatidia). (b) Experiments of Figs. 7 and 8 (mutual inhibition of two ommatidia that were close to one another).
[Diagram labels: receptors, lateral plexus, optic nerve; leads to one amplifier in (a) and to amplifiers A and B in (b).]

Based on this structural organization
is a simple functional organization: each ommatidium
tends to inhibit the activity of its neighbors.
This influence is indeed exerted over the plexus of nerve 
fibers, for, by cutting the interconnecting branches to 
an ommatidium, the influence of its neighbors on it can 
be abolished. 

The inhibition that is exerted on an ommatidium by 
its neighbors is illustrated in Fig. 5. In the experiment 
from which these records were taken, the discharge of 
impulses was recorded in a single optic nerve fiber in 
response to illumination, by a small spot of light, of the 
facet of the ommatidium from which that fiber arose 
[ Fig. 6(a)]. During steady illumination of that one 
ommatidium alone, a steady discharge of impulses 
resulted (bottom record). When, during steady and 
continuous illumination of this “test” ommatidium,
light was caused to shine also on other ommatidia in 
neighboring regions of the eye (top and middle records 
of Fig. 5), the frequency at which the test ommatidium
discharged impulses was reduced.

Figure 5 also shows that strong illumination of the 
adjacent region produced a greater depression of fre- 
quency than weak illumination. It has also been shown 
that the magnitude of the inhibition exerted on an 
ommatidium is greater the larger the number of neighboring
ommatidia that are stimulated. Thus, the
inhibitory influences from many neighbors can combine
to increase the net effect they produce. Also, the inhibition
exerted on an ommatidium by its neighbors is
greater the closer they are to it. Ommatidia that are
separated by a distance exceeding 4 or 5 mm have no
effect on one another.

Inhibition in the eye of Limulus is exerted mutually
by the receptor units.” Each ommatidium, being a 
neighbor of its neighbors, inhibits them as well as being 
inhibited by them. This is shown in Fig. 7, obtained by 
recording activity simultaneously in the optic nerve 
fibers from two independently illuminated ommatidia, 
close to each other in the eye [Fig. 6(b)]. The frequency
of each receptor unit was lower when both were illumi- 
nated together than when each was illuminated by 
itself. When this experiment is performed using various 
intensities on the two receptors in various combinations, 
it has been shown that the amount by which the steady 
frequency of discharge of each receptor unit is lowered 
depends on the degree of concurrent activity in the 
other, and is indeed a linear function of the frequency 
of its discharge (Fig. 8). Thus, the response of one 
receptor unit is determined by the excitation furnished 
by the stimulating light shining on it, diminished by 
the inhibitory influence from the second receptor, which 
in turn depends on the resultant of the excitation 
furnished by its own stimulus and the inhibition exerted 
on it by the first. This mutual interdependence of any 
two neighboring receptor units may be described by a 
pair of simultaneous equations, linear in the frequencies 
of the discharges. 
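
The pair of equations can be solved in closed form. The sketch below assumes the form r1 = e1 - K12 (r2 - t12) and r2 = e2 - K21 (r1 - t21), with both units active above the thresholds t12 and t21; the numerical values and all names are hypothetical and chosen only for illustration.

```python
def mutual_inhibition_pair(e1, e2, k12, k21, t12=0.0, t21=0.0):
    """Solve the two linear equations for the steady responses r1 and r2.

    Assumed form (valid only while r2 >= t12 and r1 >= t21):
        r1 = e1 - k12 * (r2 - t12)
        r2 = e2 - k21 * (r1 - t21)
    """
    a1 = e1 + k12 * t12
    a2 = e2 + k21 * t21
    det = 1.0 - k12 * k21
    r1 = (a1 - k12 * a2) / det
    r2 = (a2 - k21 * a1) / det
    return r1, r2


# Illustrative values: frequencies in impulses/sec when each unit is lit alone.
e1, e2 = 30.0, 20.0
k12 = k21 = 0.15      # inhibitory coefficients (hypothetical)
t12 = t21 = 5.0       # thresholds of inhibitory action (hypothetical)

r1, r2 = mutual_inhibition_pair(e1, e2, k12, k21, t12, t21)
print(f"unit 1: alone {e1:.1f}, together {r1:.2f}, decrease {e1 - r1:.2f}")
print(f"unit 2: alone {e2:.1f}, together {r2:.2f}, decrease {e2 - r2:.2f}")
```

In this sketch the decrease in each unit's frequency is a linear function of the other unit's concurrent frequency, which is the relation described for Fig. 8.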

When more than two interacting receptors are illumi- 
nated simultaneously, each is subject to the combined 
inhibitory influences from all of the others. The law 
that determines how the inhibitory influences from 
several active receptor units combine in affecting the 
activity of a neighboring unit has been found by experi- 
ment: if the influences on a given unit are measured 
by the reduction they produce in its frequency of nerve 
impulse discharge, the combined effect of all of the 
other units is simply given by the sum of the influences
exerted by each.”!

Fig. 7. Mutual inhibition of two ommatidia close to one another in the eye of Limulus, steadily illuminated at fixed
intensity on each. Experimental arrangement as in Fig. 6(b). Time (black dots) in 1/5 sec.

The responses of a set of interacting
receptor units, measured by the frequencies of their 
optic nerve discharges, are therefore expressed by a set 
of n simultaneous linear equations, 

$$r_p = e_p - \sum_{j=1}^{n} K_{pj}\,\bigl(r_j - r_{pj}^{0}\bigr), \qquad p = 1, 2, \ldots, n.$$

In these equations, $r_p$ stands for the response of the
pth unit (measured by its steady frequency of impulse
discharge) when it is illuminated steadily together with
the other units. Its excitation, $e_p$, is measured by the
response it has when it is illuminated alone. Each constant
$K_{pj}$ is the coefficient of the inhibitory action of
the jth receptor on the pth (usually less than 0.2) and
each constant $r_{pj}^{0}$ is the threshold of that action. Terms
for which $j = p$ are usually omitted. The equations as
written apply only to those units and that range of
activity for which $r_j$ is not less than $r_{pj}^{0}$. As a rule, the
closer the interacting elements are to one another, the
larger the $K$’s and the smaller the $r^{0}$’s. Exceptions are
often found, however, and it is not yet possible to state 
the statistical law governing the effects of distance on 
the inhibitory interaction. 
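
A set of equations of this kind can be solved numerically by simple iteration. The following Python sketch is one possible approach, not the procedure used in the original experiments. It assumes that a unit firing below the threshold of a given inhibitory action contributes nothing to that action, and that responses cannot become negative; the coefficients, thresholds, and names are hypothetical.

```python
def steady_state(e, K, r0, iterations=200):
    """Iterate r_p = e_p - sum_{j != p} K[p][j] * (r_j - r0[p][j]) to a fixed point.

    Assumptions of this sketch: inhibition from unit j is zero while r_j is
    below the threshold r0[p][j], and no response is allowed to fall below zero.
    """
    n = len(e)
    r = list(e)
    for _ in range(iterations):
        r = [
            max(0.0, e[p] - sum(K[p][j] * max(0.0, r[j] - r0[p][j])
                                for j in range(n) if j != p))
            for p in range(n)
        ]
    return r


# Three units in a row, equally excited; nearer neighbours inhibit more strongly.
e = [30.0, 30.0, 30.0]
K = [
    [0.00, 0.10, 0.05],
    [0.10, 0.00, 0.10],
    [0.05, 0.10, 0.00],
]
r0 = [[2.0] * 3 for _ in range(3)]

r = steady_state(e, K, r0)
for p, value in enumerate(r, start=1):
    print(f"unit {p}: excitation {e[p - 1]:.1f}, response {value:.2f}")
```

With coefficients well below unity the iteration settles quickly; the middle unit, having two near neighbours, ends with the lowest response.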




If V small groups of receptors are considered, each 
group uniformly illuminated and assumed to consist of 
receptors with similar properties exerting equal actions, 
the foregoing set of equations may be reduced to V 
simultaneous equations with lumped coefficients repre- 
senting the group interactions. Applied to three inter- 
acting receptors or receptor groups, the theory outlined 
in the foregoing can account quantitatively for a number 
of effects that have been observed with various experi- 
mental configurations of retinal illumination. Thus, if a 
test receptor is located midway between two groups of 
receptors that are themselves too far apart to interact, 
the combined inhibitory action of these two on the test 
receptor is equal to the sum of the separate actions of 
each, unless the test receptor itself has an appreciable 
effect upon them. If the two groups are close together 
and both are near the test receptor, their combined 
inhibitory effect will be less than the sum of their 
separate effects since they inhibit one another mutually 
when both are active. If a group of receptors is too far 
from a test receptor to influence it directly, it may 
nevertheless affect its response by inhibiting a second 
group located close to the test receptor, thereby releasing
the test receptor from the inhibition exerted by the
second group.

Fig. 8. Mutual inhibition of two ommatidia close to one another
in the eye of Limulus, illuminated independently at various levels
of intensity in various combinations. Amount of inhibition (decrease
in frequency) of each ommatidium plotted as a function of
concurrent level of response (frequency) of the other. From an
experiment similar to that of Fig. 7, experimental arrangement
as in Fig. 6(b).

Such “disinhibition” illustrates how indirect
effects may be exerted beyond the limits of direct
influence and, in principle at least, extend over the 
entire mosaic of interdependent receptor units. 
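
Disinhibition can be illustrated with three receptor groups. In the sketch below, group A is the test receptor, group B lies close to A, and group C is close to B but too far from A to act on it directly. Thresholds are set to zero for simplicity, and all coefficients and names are hypothetical.

```python
def steady_state(e, K, iterations=200):
    """Fixed-point solution of r_p = e_p - sum_{j != p} K[p][j] * r_j.

    Simplifying assumptions: zero thresholds, responses clipped at zero.
    """
    n = len(e)
    r = list(e)
    for _ in range(iterations):
        r = [
            max(0.0, e[p] - sum(K[p][j] * r[j] for j in range(n) if j != p))
            for p in range(n)
        ]
    return r


# Order of groups: A (test), B (near A), C (near B, too far from A to act on it).
K = [
    [0.0, 0.2, 0.0],   # A is inhibited by B only
    [0.2, 0.0, 0.2],   # B is inhibited by A and by C
    [0.0, 0.2, 0.0],   # C is inhibited by B only
]

r_without_c = steady_state([30.0, 30.0, 0.0], K)    # C in darkness
r_with_c = steady_state([30.0, 30.0, 30.0], K)      # C illuminated

print(f"A with C dark:        {r_without_c[0]:.2f}")
print(f"A with C illuminated: {r_with_c[0]:.2f}")
```

Illuminating C lowers the activity of B, and the response of A rises accordingly, although C exerts no direct action on A.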

The inhibitory interaction just described may be 
considered a simple integrative mechanism that takes 
place at or close to the level of the receptors themselves. 
Because of it, patterns of optic nerve fiber activity yield 
a distorted representation of the patterns of incident 
illumination. This distortion, however, serves a useful 
function, for it is clear that inhibitory interaction must 
enhance contrast: brightly lighted elements in the 
receptor mosaic inhibit the dimly lighted ones more 
than the latter inhibit the former. If, as in the eye of
Limulus, mutual inhibition is greater between close
neighbors than distant ones, contrast will be greatest
near regions of steep intensity gradients, and borders
and edges in the retinal image will be “crispened.”
Phenomena of border contrast are illustrated in our 
own vision by the light and dark bands bordering a 
penumbra (Mach bands), and by the fluted appearance
of an optical step-wedge or of shadows cast by
multiple light sources. Inhibitory interaction is probably
one of the mechanisms in our own visual systems that
produce these phenomena.

Direct experimental demonstration of the “crispening”
of the contours by inhibitory interaction can be
made, using the eye of Limulus.” The discharge of 
impulses is recorded from a “test” receptor near the 
center of the eye as the eye is caused to scan slowly a 
pattern of illumination containing a gradient of in- 
tensity. When all of the receptors are masked except 
for the one from which activity is being recorded, a 
faithful representation is obtained of the distribution of 
intensity in the image viewed. But when the mask is 
removed, so that all of the receptors view the pattern, 
maxima and minima in the frequency of the test recep- 
tor discharge occur, corresponding to the regions border- 
ing the gradient. These resemble in form and location 
the “Mach bands” seen by a human observer viewing 
the same pattern. 
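
The appearance of maxima and minima at a border can be reproduced with a one-dimensional row of mutually inhibiting units. The sketch below uses an abrupt step of excitation rather than a graded pattern, zero thresholds, and inhibitory coefficients that fall off with distance; these choices, and all names and values, are assumptions made for illustration.

```python
def steady_state(e, kernel, iterations=300):
    """Fixed-point responses of a row of units with distance-dependent inhibition.

    `kernel` maps separation (in units) to an inhibitory coefficient.
    Simplifying assumptions: zero thresholds, responses clipped at zero.
    """
    n = len(e)
    r = list(e)
    for _ in range(iterations):
        new_r = []
        for p in range(n):
            inhibition = 0.0
            for distance, k in kernel.items():
                for j in (p - distance, p + distance):
                    if 0 <= j < n:
                        inhibition += k * r[j]
            new_r.append(max(0.0, e[p] - inhibition))
        r = new_r
    return r


# Dim half (units 0-19) and bright half (units 20-39) of the receptor row.
excitation = [10.0] * 20 + [30.0] * 20
kernel = {1: 0.10, 2: 0.05}   # nearer neighbours inhibit more strongly

response = steady_state(excitation, kernel)

dim_plateau, dim_border = response[10], response[19]
bright_border, bright_plateau = response[20], response[30]

print(f"dim side:    interior {dim_plateau:.2f}, at border {dim_border:.2f}")
print(f"bright side: at border {bright_border:.2f}, interior {bright_plateau:.2f}")
```

The unit on the dim side of the border responds less than the dim interior, and the unit on the bright side responds more than the bright interior, so the border is accentuated. Units at the two ends of the row have fewer neighbours and are therefore also less inhibited; that is an edge effect of the finite row and not part of the border phenomenon.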

Interaction is known to take place in other sense 
organs. In the ear, von Békésy** has suggested that 
inhibitory interaction may be important in increasing 
pitch discrimination. Indeed, Galambos and Davis” 
have demonstrated inhibition of the activity of single 
auditory nerve fibers by tones differing in frequency 
from those used to excite the fibers. Also, von Békésy?’ 
has demonstrated “contrast” effects in the tactile 
stimulation of the skin, suggesting inhibitory inter- 
action over considerable distances over the surface of 
the body. 

In the higher nervous centers, integrative processes of 
great complexity take place. In the retina of the verte- 
brate eye, which—even though located in the peripheral 
sense organ itself—is nevertheless a nervous center of 
high order, there is an intricate interplay of excitatory 
and inhibitory interactions.” As a result, diverse and 
labile patterns of optic nerve fiber activity are
generated.” In the vertebrate retina, to a much 
greater degree than in the primitive retina of Limulus, 
the patterns of afferent nervous activity are greatly 
modified to accentuate significant features of informa- 
tion about the environment. The process of neural 
integration is well begun by the time the afferent mes- 
sages are transmitted to still higher centers in the brain. 

INTRODUCTION


THE propagation of an impulse along a nerve or
muscle fiber is brought about by two coupled proc- 
esses: (i) cable transmission, which allows an electric po- 
tential change to spread along a short distance, but with 
rapid attenuation, and (ii) a boosting mechanism by 
which the full signal strength is regenerated at each 
point. If either of these processes is interfered with, the 
signal will be blocked and will fade out locally. The 
cable mechanism depends upon the continuity of the 
fiber structure, with a relatively low-resistance core 
and high-impedance surface layer. During the impulse, 
sufficient current must be able to flow forward along the 
inside of the axon and outward through the resting 
membrane to stimulate it. If one were to close the core 
with a high-resistance transverse membrane, or to place
a low-resistance shunt across the fiber surface, trans- 
mission would be impaired and probably would fail at 
that point. The very reason why thousands of axons 
packed together within one nerve bundle can conduct 
their messages independently, without mutual interference,
rests on the absence of structural continuity, and
so of an effective cable connection between them.

The purpose of this paper is to consider what happens 
at “synapses,” the points of contact between one nerve 
cell and the next, or between nerve and muscle fiber. 
There is no sign of cytoplasmic continuity between the 
different cell units. Electron-microscope evidence shows 
that the membranes of the synapsing cells are arranged 
in close proximity, though in general they do not seem 
to fuse or to come into intimate contact. The electron 
micrographs, however, do not reveal anything about 
the electrical properties of the contacting surfaces; and 
one has no means of guessing intuitively whether or 
not an effective cable linkage exists across the synapse. 


ELECTRICAL AND CHEMICAL TRANSMISSION 


There are, in principle, two basically different modes 
of synaptic transmission, electrical and chemical. Elec- 
tric transmission implies that in spite of apparent mor- 
phological complexities, an effective local-circuit con- 
nection exists which allows sufficient current to pass 
from one cell to the next to restimulate it. In other 
words, the transmission is one continuous process without
any essential change at the synapse. There must,
of course, be some difference; for all synapses have the
property of functioning only in one direction, unlike
axons, which can conduct impulses with equal
facility in either direction despite the fact that in normal
life, because of their terminal synaptic connections,
they are used for one-way traffic only.

Chemical transmission implies the intervention of an 
entirely different process specific for the synaptic area. 
It presupposes that the ordinary cable-connection is 
interrupted at the contact point between the cells, and 
replaced by the agency of a chemical mediator. A 
specific chemical stimulant which is synthetized and 
stored inside the nerve terminal is liberated by the 
nerve impulse. When the substance is released, it 
attaches itself to special receptor molecules in the surface
of the contacting postsynaptic (or effector) cell.
This chemical combination leads to a membrane change
which gives rise to a local depolarization of the effector
cell. When the depolarization exceeds the threshold
level, a new action potential is set up which then travels
along the cell in the manner previously described. Thus
one has, interposed between two separate waves of 
propagated electric activity, a secretion of a specific sub- 
stance from one cell, and a chemoreceptor reaction in 
the surface of the next cell. 

Now, it is not possible to predict without thorough 
experimental examination which of these two modes of 
transmission occurs at a particular type of synapse. 
Various attempts have been made to generalize by 
drawing analogies from the few cases which have been 
explored; and in recent years the view has become 
prevalent that transmission is likely to be chemical at
all synapses, but that a variety of substances are being
employed in different cases, and only a few of them,
like acetylcholine and noradrenaline, have so far been
identified. This view, however, seems a little too sweep- 
ing, and the discussion is begun, therefore, by quoting 
an example of electric transmission which has just been 
brought to light by the work of Furshpan and Potter.) 
This may well be an exception to the rule, but it pro- 
vides a definite warning against too much generaliza- 
tion in this field. 

Furshpan and Potter used a “giant” synapse in an 
abdominal ganglion of the crayfish cord. This is a con- 
tact point between a very large nerve fiber which runs 
through the central nervous system of the crayfish and 
a somewhat smaller motor axon which emerges from 
the cord to supply the flexor muscles of the “tail.” This 
synapse was chosen for two reasons: (a) because the 
large size of the two contacting cells made it possible 
for a pair of microelectrodes (one to pass current and 
one to measure membrane potential) to be introduced 
into each of them; (b) it seemed a priori that electric 
transmission might be feasible at this synapse, more so 
than at other types where minute nerve endings usually 
terminate in contact with a huge cell (which implies,
from the electrical point of view, a poor match, very
little current being available from the high-impedance 
terminals to discharge the large surface of the postsyn- 
aptic cell). Furshpan and Potter were able to show that 
the membrane contact of this special synapse acts as a 
good electric rectifier, allowing current to pass relatively 
easily from the presynaptic to the postsynaptic cell, but 
not in the reverse direction. In other words, at this par- 
ticular synapse, an adequate cable connection exists be- 
tween the interior of the two cells, in the normal direction 
of impulse travel alone. Provided the internal potential 
of the prefiber was higher than that of the postfiber, elec- 
tric current could flow across and influence the mem- 
brane potential of the adjoining cell. The result is that 
a depolarization such as occurs during the impulse can 
be transmitted in the normal orthodromic direction, but 
not the other way. And, conversely, a local hyperpolarization, 
produced experimentally by passing a current 
inward through the fiber membrane, was found to be 
transmitted only in the antidromic direction (from 
postfiber to prefiber) but not the other way. Thus, at 
the giant synapse of the crayfish, one has a case—so far, 
the only known example—of electric transmission, in 
which the action current generated by the arrival of an 
impulse in the presynaptic cell is passed on without 
finite delay and can directly depolarize and thereby 
excite the postfiber. One-way transmission is owing to 
the valve-like one-way resistance of the synaptic con- 
tact membranes. 
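
The valve-like behavior described above can be put into a small numerical sketch. The function name, the resting level of -70 mv, and the idealization of the contact as a perfect rectifier (finite conductance when the prefiber interior is positive relative to the postfiber, zero conductance otherwise) are illustrative assumptions, not measurements from the work quoted.

```python
def junction_current(v_pre_mv, v_post_mv, g_forward=1.0, g_reverse=0.0):
    """Current across the junction in arbitrary units.

    Positive values mean current flowing from prefiber to postfiber.
    The contact is idealized as a rectifier: it conducts only when the
    prefiber interior is more positive than the postfiber interior.
    """
    dv = v_pre_mv - v_post_mv
    g = g_forward if dv > 0 else g_reverse
    return g * dv


REST_MV = -70.0  # assumed resting level, for illustration only

cases = {
    "prefiber depolarized": (REST_MV + 50.0, REST_MV),
    "postfiber depolarized": (REST_MV, REST_MV + 50.0),
    "prefiber hyperpolarized": (REST_MV - 20.0, REST_MV),
    "postfiber hyperpolarized": (REST_MV, REST_MV - 20.0),
}

for label, (v_pre, v_post) in cases.items():
    print(label, junction_current(v_pre, v_post))
```

In this sketch current crosses the junction only in the first and last cases: a presynaptic depolarization reaches the postfiber, while a postsynaptic hyperpolarization draws current out of the prefiber and so spreads in the antidromic direction.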

There are only a few giant synapses available in 
nature allowing such a direct experimental approach 
to both sides of the junctional region. In most cases, 
the presynaptic nerve endings are too small to be 
tackled with intracellular electrodes and their electrical 
behavior has to be inferred from a more indirect ap- 
proach. It is of great interest, however, that at another 
giant synapse, in the stellate ganglion of the squid, 
Bullock and Tasaki and Hagiwara obtained evidence 
of a different kind; they observed a definite local delay 
in the propagation of the electrical change, indicating 
a stoppage of the local-circuit transmission at the junc- 
tion; there was no detectable transfer of subthreshold 
cable signals in either direction. These observations pro- 
vide another fair warning against attempts to generalize 
about synaptic mechanisms. 

If one now takes an entirely different case, namely, 
the skeletal neuromuscular junction, one finds here 
one of the few examples of a synapse where chemical 
transmission has been firmly established. It was shown 
by Dale and his colleagues that a specific cholinester, 
almost certainly identical with acetylcholine, is re- 
leased from the active motor-nerve endings. This sub- 
stance is a very potent local stimulant and, provided 
it is applied rapidly to the junctional end-plate region 
of a muscle fiber, causes a local depolarization of the 
cell membrane and sets up propagated impulses and 
contraction in the fiber. The chemical effect is localized 
to the synaptic area of the muscle surface; it is at these 
points exclusively that a number of chemical blocking 
agents, like curare, act (apparently by a competitive 
attachment to the acetylcholine receptors). By histo- 
chemical methods, a high local concentration of a 
specific enzyme, acetylcholinesterase, has been found 
at the same point; the apparent purpose is to hydrolyze 
the transmitter substance within a very short time 
after it has exerted its action. Electrical studies have 
shown that there is an irreducible delay of 0.5 to 1 msec 
between the arriving nerve impulse and the start of the 
so-called end-plate potential (which is the local de- 
polarization in the muscle fiber produced by the trans- 
mitter substance). There is no cable transfer of electric 
current, of either polarity, directly from the nerve axon 
to the muscle fiber. When potential changes are im- 
posed on the terminal portion of the motor nerve, these 
changes do not spread beyond the nerve terminal, but 
can be shown to increase the rate at which acetylcholine 
is being released from the nerve endings and so, indi- 
rectly, to influence the membrane potential of the 
muscle fiber. 

The most direct way of establishing chemical trans- 
mission by nerve impulses would be to show that a 
substance is released, on nerve stimulation, into the 
circulating fluid, and when applied to a remote effector 
cell produces the same, excitatory or inhibitory, action. 
This has been achieved in a few cases, notably the 
classical experiment of Otto Loewi in which he demon- 
strated the role of acetylcholine as the transmitter of 
nervous inhibition of the heart beat. Usually, such a 
direct demonstration is not feasible because of the 
enormous dilution of the transmitter agent, on the way 
from its primary point of release and action to the 
assaying tissue. Indeed, this discrepancy between (i) 
the amount of acetylcholine (ACh) which has to be 
applied artificially to stimulate a muscle, and (ii) the 
much smaller quantities which are released into the 
perfusion fluid from stimulated nerve endings, has been 
used as an argument against the validity of the chem- 
ical-transmitter theory. By a method of microscopic 
ionophoresis, much more effective applications have 
been made recently, and it has been possible to show 
that as little as 10~'* g equiv of ACh can give rise to an 
effective, superthreshold, depolarization of the muscle 
fiber. This is still a few hundred times more than the 
amount per impulse recovered in the earlier perfusion 
experiments, but it must be remembered that even 
with the best micropipettes one cannot reduce the 
average diffusion distance to that of the natural syn- 
aptic contact. The remaining quantitative discrepancy 
is entirely within the range to be expected and hardly 
can be used as a theoretical counterargument. The 
ionophoretic microtechnique has shown a number of 
other interesting results; it has confirmed the extremely 

Fig. 1. External and intracellular application of acetylcholine 
to a motor end plate [from J. del Castillo and B. Katz, J. Physiol. 
128, 157 (1955)]. Intracellular recording of membrane-potential 
changes from the junctional region of a frog muscle fiber. In A, 
an ACh-filled micropipette was placed on the outside of the end 
plate, and a quantity of ACh was released by passing a brief 
outward-directed current pulse through the pipette (registered 
in b). It produced the effect shown in trace a: a depolarization 
developing after a diffusion delay and culminating in two spikes. 
Between records A and B, the ACh-pipette entered the muscle 
fiber. An outward pulse produces now a small, direct potential 
change, owing to the passage of current through the fiber membrane. 
(If, for comparison, a KCl-filled micropipette is used, no potential 
change is produced by the pulse until the pipette has entered the 
fiber, when the effect is identical with that recorded in B, a.) 


the neuromuscular junction. Moving the tip of the 
pipette by several microns can substantially reduce the 
effect of a given dose. Furthermore, it has been possible 
(Fig. 1) to insert the tip into the interior of the muscle 
cell, and so apply the acetylcholine alternatively to the 
external and internal side of the postsynaptic end-plate 
membrane. The result showed that a depolarization 
was produced only by external, not by intracellular 
application; and this was observed with acetylcholine 
as well as carbamylcholine (a substance of similar 
action, but not destroyed by the local cholinesterase), 
and the same result was found for the blocking action 
of curarine. It seems that the first chemical attachment 
on to the receptor molecules must take place at the 
external surface of the end-plate membrane, which is, 
of course, the side facing the nerve endings from which 
the acetylcholine emerges under normal conditions. 
Before discussing other peculiarities of the neuro- 
muscular junction, the principal features and problems 
inherent in chemical transmission in general may be 
considered briefly. 

There are two main steps interposed between the 
arrival of a presynaptic and the departure of a postsynaptic 
impulse: (i) the process by which the arriving 
impulse releases the transmitter substance from its 
storage place inside the terminal into the narrow cleft 
between the contacting cells—this is a special case of 
what has been called “neuro-secretion”; (ii) there is the 
process by which the transmitter substance becomes 
attached to specific molecules in the postsynaptic cell 
surface and causes its electric membrane properties to 
change—this is a special example of chemoreceptor 
action, that is a process analogous to that occurring in 
our various chemical sense organs where a minute con- 
centration change of some specific substance is regis- 
tered in the form of sensory nerve impulses. As an 
intermediate step, one should consider also the mech- 
anism by which transmitter molecules are transported 
across the small synaptic gap; however, the path length 
is only a fraction of a micron, and the time taken up 
by simple diffusion over such a short distance is well 
within the range of the observed synaptic latency. 
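
The order of magnitude of this diffusion time can be checked with the one-dimensional relation ⟨x²⟩ = 2Dt. The sketch below is only illustrative: the diffusion coefficient of 5 × 10⁻⁶ cm²/sec is an assumed round figure of the size typical for small molecules in water, not a value given in the text, and the gap widths are simply a few "fractions of a micron."

```python
def diffusion_time_s(distance_um, diff_coeff_cm2_per_s=5e-6):
    """Time for the mean-square displacement to reach distance**2 (1-D).

    distance_um          -- path length in microns
    diff_coeff_cm2_per_s -- assumed diffusion coefficient (illustrative)
    """
    distance_cm = distance_um * 1e-4
    return distance_cm ** 2 / (2.0 * diff_coeff_cm2_per_s)


SYNAPTIC_LATENCY_S = 0.5e-3  # lower end of the 0.5 to 1 msec delay quoted earlier

for gap_um in (0.05, 0.1, 0.5):
    t = diffusion_time_s(gap_um)
    print(f"gap {gap_um} micron: {t * 1e6:.1f} microsec, "
          f"{t / SYNAPTIC_LATENCY_S:.3f} of the 0.5 msec latency")
```

Under these assumptions the transit time for a gap of 0.1 micron is some tens of microseconds at most, that is, a small part of the observed delay.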


EXCITATORY AND INHIBITORY CHEMO- 
RECEPTOR ACTION 


To begin with, consider the second step, that is, the 
chemoreceptive mechanism. How do transmitter sub- 
stances alter the membrane potential? Only a very in- 
complete answer to this question can be given. In 
general, the primary action leads to the opening of 
some ionic permeability channel in the membrane. 
Depending upon the size or specificity of this ionic 
channel, the membrane potential either tends to fall 
toward a low level, well beyond the firing threshold of 
an impulse, or it may become stabilized in the vicinity 
of the resting level or even tend to rise somewhat 
(hyperpolarize). In the first case, excitation ensues; in 
the last cases, one obtains an opposite, inhibitory, 
action (see Eccles). But it may be noted that, underlying 
all of these changes, there is a common primary 
effect, namely, an increase of some ionic conductance. 

For example, at the motor end plate there is evidence 
that acetylcholine causes the membrane permeability 
to increase, simultaneously, to several monovalent 
cations (e.g., sodium, potassium, ammonium) and pos- 
sibly opens up an indiscriminate aqueous channel to all 
small ions on either side of the membrane. The result 
is a depolarization which has a “null point” at about 10 
to 20 mv, negative inside, which corresponds to the 
level of a free-diffusion or liquid-junction potential be- 
tween cytoplasm and external fluid. The effect is to 
short-circuit and depolarize the surrounding muscle 
membrane beyond the level at which a new impulse 
arises which then travels rapidly along the whole length 
of the muscle fiber. The methods by which these conclusions 
were reached have been described elsewhere; 
briefly, they consisted in measuring the current/voltage 
relation across the end-plate membrane, and observing 
the particular level of the membrane potential at 
which the electromotive effects of ACh reversed, with 
normal as well as with altered composition of the ionic 
environment. 

While ACh has a depolarizing and excitatory effect 
at the end plate, it produces the opposite action, that is, 
hyperpolarization or stabilization of the resting poten- 
tial, in the regions of the heart muscle onto which it is 
released by impulses in the vagus nerve. Here also, the 
basic effect is an increase of ionic conductance, but the 
channel which is being opened is restricted to potassium, 
and does not include sodium. As a result the membrane 
tends to move towards, or to be held at, the potassium 
equilibrium potential which is usually somewhat greater 
than the existing resting potential (hence, a hyper- 
polarization). 

Somewhat similar changes appear to be associated 
with the excitatory and inhibitory synapses in the 
motor neurons of the spinal cord. At these junctions, 
the transmitter substances are unknown, but the “null 
points” of the potential changes which they produce 
have been determined and correlated with the existing 
ionic concentration gradients. Here also, an inhibitory 
hyperpolarization occurs which appears to be associated 
with an increase of membrane conductance to various 
small ions but excluding sodium; while excitation ap- 
parently arises from an indiscriminate “short circuit” 
in which sodium as well as the other small ions are 
allowed to pass. 
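
The common thread of these cases, an added ionic conductance pulling the membrane toward the "null point" of the channel that is opened, can be illustrated with the usual parallel-conductance picture, in which the steady potential is the conductance-weighted mean of the ionic equilibrium potentials. The equilibrium potentials and relative conductances below are assumed round numbers chosen for illustration, not values from the experiments described.

```python
# Assumed equilibrium potentials (mv, inside relative to outside).
E_MV = {"Na": 60.0, "K": -100.0}
# Assumed relative resting conductances (arbitrary units).
G_REST = {"Na": 0.05, "K": 1.0}


def membrane_potential(conductances, equilibrium_mv=E_MV):
    """Conductance-weighted mean of the equilibrium potentials."""
    total = sum(conductances.values())
    return sum(g * equilibrium_mv[ion] for ion, g in conductances.items()) / total


def with_transmitter(extra):
    """Resting conductances plus the conductance added by the transmitter."""
    return {ion: G_REST[ion] + extra.get(ion, 0.0) for ion in G_REST}


rest = membrane_potential(G_REST)
# Excitatory case: an indiscriminate channel, equal added Na and K conductance.
excited = membrane_potential(with_transmitter({"Na": 10.0, "K": 10.0}))
# Inhibitory case: the added channel is restricted to potassium.
inhibited = membrane_potential(with_transmitter({"K": 10.0}))
# Null point of the indiscriminate channel taken alone.
null_point = membrane_potential({"Na": 1.0, "K": 1.0})

print(f"rest {rest:.1f} mv, excited {excited:.1f} mv, "
      f"inhibited {inhibited:.1f} mv, null point {null_point:.1f} mv")
```

With these assumed figures the indiscriminate channel drives the membrane toward a low level near its null point, while the potassium-restricted channel holds it slightly beyond the resting level, in line with the excitatory and inhibitory actions described above.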

Although the transmitter effects are fairly well under- 
stood in terms of ionic conductance changes, and the 
subsequent steps leading to excitation or inhibition 
present no special problem, the molecular mechanism 
by which the chemical attachment, e.g., of acetylcho- 
line, to the receptive sites of the membrane alters its 
permeability is still far from understood. At one time it 
was thought possible that ACh⁺ ions might produce a 
local depolarization without permeability change, simply 
by being able to move very rapidly through specific 
channels into the interior of the muscle fiber. This idea 
had to be abandoned when it became clear that the 
transfer of Coulombs during the end-plate depolariza- 
tion exceeds the charge on the available ACh ions by 
several orders of magnitude. The observation shown in 
Fig. 1—namely, that a positive quantity of charge 
applied directly into the interior of the cell is much less 
effective in depolarizing the fiber membrane than an 
equivalent charge of ACh ions applied on the outside— 
illustrates this point rather clearly: most of the ex- 
ternally released ACh ions will diffuse away and have 
no chance of penetrating, or even colliding with, the 
end-plate surface. Yet under its influence much more 
positive charge enters the fiber than with the direct 
intracellular discharge. The conclusion is that the rela- 
tively few ACh ions released from the nerve terminal 
cause a vastly greater quantity of other, ambient, ions 
to flow through the end-plate membrane, and so achieve 
the great amplification of local current which is needed 
to transmit an impulse from the minute nerve endings 
to the much larger muscle cell. 

Regarding the molecular combination between ACh 
and receptors and the subsequent chain of events, all 
that can be said at present is that a study of various 
chemical inhibitors (e.g., del Castillo and Katz) suggests 
the presence of a 2- or 3-stage process whose 
kinetics resemble those of many enzyme-substrate reactions. 
Substances like tubo-curarine appear to act as 
competitive inhibitors, by interfering with the initial 
site of attachment, without themselves leading to the 
next phase which involves a change in the physical 
membrane properties. 

Fig. 2. Spontaneous miniature end-plate potentials. A: A recording 
microelectrode was placed inside a frog muscle fiber at 
the nerve-muscle junction. B: The electrode was placed 2 mm 
away into the same muscle fiber. The upper portions were recorded 
with slow speed and high amplification and show the 
occurrence of spontaneous small potential changes, restricted to 
the junctional region (calibrations: 3.6 mv and 47 msec). The 
lower portions show the response to a nerve impulse with fast-speed 
and low-gain recording (calibrations: 50 mv and 2 msec). 
The stimulus was applied to the nerve at the beginning of the 
trace; response A (at the end plate) shows the step-like initial 
end-plate potential which leads up to the propagating wave; 
response B shows only the propagated action potential, delayed 
by conduction over a distance of 2 mm [from P. Fatt and B. 
Katz, J. Physiol. 117, 109 (1952)]. 


Quantal Nature of Acetylcholine Release 


An interesting feature which has emerged during a 
detailed study of the vertebrate nerve-muscle junction 
is that the release of ACh from the motor nerve termi- 
nals occurs in discrete packets or quanta each containing 
a large number of molecules. Even in the absence of a 
nerve impulse, such packets are released “spontaneously” 
at infrequent random intervals (Fig. 2). The 
arrival of an impulse at a cell junction apparently 
causes a few hundred events to be synchronized within 
a fraction of a millisecond instead of going on at a 
leisurely average rate of about 1/sec. 
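
The size of this change in release rate follows from simple arithmetic. In the sketch below, "a few hundred" is taken as 300 quanta and "a fraction of a millisecond" as 0.5 msec; both are assumed round figures. The spontaneous discharge is modeled, again as an assumption, by exponentially distributed intervals, which is the simplest way of representing events that occur at random with a fixed mean rate.

```python
import random


def release_rate_per_s(quanta, window_s):
    """Mean release rate when `quanta` packets appear within `window_s` seconds."""
    return quanta / window_s


def spontaneous_intervals_s(mean_rate_per_s, count, seed=0):
    """Random intervals between spontaneous packets (exponential, assumed)."""
    rng = random.Random(seed)
    return [rng.expovariate(mean_rate_per_s) for _ in range(count)]


SPONTANEOUS_RATE_PER_S = 1.0  # "about 1/sec" in the text

evoked_rate = release_rate_per_s(300, 0.5e-3)  # assumed 300 quanta in 0.5 msec
factor = evoked_rate / SPONTANEOUS_RATE_PER_S

intervals = spontaneous_intervals_s(SPONTANEOUS_RATE_PER_S, 10_000)
mean_interval = sum(intervals) / len(intervals)

print(f"momentary rate during an impulse: {evoked_rate:.0f} per sec")
print(f"increase over the resting rate: factor of about {factor:.0f}")
print(f"mean simulated spontaneous interval: {mean_interval:.2f} sec")
```

On these assumptions the impulse raises the momentary rate of release by a factor of several hundred thousand over the resting discharge.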

The evidence for this state of affairs was obtained 
soon after it became possible to apply intracellular re- 
cording electrodes to the motor end plate. If a recording 
electrode is inserted into a resting muscle fiber well 
away from its junctional region, one observes a steady 
resting potential of about 90 mv, negative inside. But 
as one approaches the end plate with the recording 
probe, a characteristic form of spontaneous activity 
shows up which consists of an intermittent random 
discharge of minute potential changes of standard size 
and time course. Each deflection is a transient depolarization 
of the order of 0.5 mv, with a rapid (1 msec) 
rise and a slower decay, lasting altogether about 20 
msec. It resembles in many respects the end-plate potential 
(e.p.p.), that is, the immediate depolarization 
of the end plate produced after the arrival of a nerve 
impulse, but differs from it in its much smaller size 
(about 1%) and its spontaneous random occurrence. 
Fatt and I called it the miniature end-plate potential. 
It resembles the e.p.p. in its time course, its restricted 
localization to the innervated region of the muscle fiber, 
and its pharmacological reactions. It is reduced in size 
by curare, and its amplitude and duration increase 
when the local hydrolysis of ACh is prevented by an 
anti-esterase. In both respects, the miniature 
potentials behave exactly like the depolarizations produced 
by an applied dose of ACh, and we believe, 
therefore, that the spontaneous discharges arise from 
local random impacts of ACh on to the end-plate 
receptors. The source of these impacts is evidently a 
spontaneous release or leakage of ACh from the motor-nerve 
terminal where the substance is stored, for the 
miniature potentials vanish in the course of experimental 
nerve degeneration, at a time when neuromuscular 
transmission fails. 

We considered the possibility that random molecular 
diffusion of ACh from the motor-nerve ending might be 
responsible for the miniature e.p.p.’s. If this were true, 
then the same type of discharge ought to be elicited, at 
vastly increased frequency, by applying ACh to the 
fluid surrounding the end plate. This, however, is not 
the case: the depolarization which one then observes is 
continuously graded in size and time course, depending 
upon the dose and length of the diffusion path. It is 
clear that the effects of single molecular impacts of 
ACh must be far below the resolving power of our recording 
equipment, and conversely that a discrete 
miniature e.p.p., with its standard size and brief time 
course, must be due to a synchronous package of ACh 
molecules, maybe hundreds or thousands, discharged 
in the immediate vicinity of our end-plate receptors. 
That this packet is the basic coin in which the transmitter 
is normally delivered from the nerve endings 
has been shown in several kinds of experiments. The 
release of ACh by an impulse depends upon a number 
of “co-factors” which can be varied experimentally: 
among these perhaps the most important are the concentrations 
of calcium and magnesium in the surrounding 
fluid. Calcium is an essential adjuvant, magnesium 
an inhibitor of the release mechanism. By lowering the 
calcium and raising the magnesium concentrations, the 
quantity of ACh liberated during an impulse, and the 
size of the resulting e.p.p., can be progressively reduced 
towards zero. The point of interest is that, during such 
an experiment, the e.p.p. at each individual end plate 
is found to be diminished in discrete steps, which correspond 
to the dropping out of individual miniature 
potentials, one by one. With a suitable ratio of Ca/Mg 
concentrations the liberation can be reduced to a small 
number of ACh packets; in this condition, the size of 
the end-plate potential during successive nerve impulses 
has been found to fluctuate in a characteristic 
stepwise manner, corresponding to a Poisson-wise distribution 
of the number of units released in each 
instance. The unit-step is identical with the spontaneously 
occurring miniature potential. Examples are 
shown in Figs. 3-5, Fig. 6 showing the method whereby 
Fig. 5 is obtained; detailed studies of this effect on a 
variety of vertebrate nerve-muscle junctions have made 
it certain that transmission is brought about by a 
summation of many of these quantal units of activity. 

Fig. 3. Quantal units of end-plate response, recorded intracellularly 
from a muscle fiber in a calcium-deficient and magnesium-rich 
medium. The top portion shows a few spontaneous 
potentials. The lower part (below the 50-cps time signal) shows, 
in addition, the responses to single nerve impulses. Stimulus and 
response latency are indicated by a pair of dotted lines. There 
was a high proportion of failures, and only 5 single unit-responses 
to twenty-four impulses [from J. del Castillo and B. Katz, J. 
Physiol. 124, 560 (1954)]. 

Fig. 4. Statistical properties of the end-plate response. The 
nerve-muscle preparations were blocked by a high magnesium- 
and calcium-deficient medium. The nerve was stimulated at 100 
shocks per sec which produces a progressively increasing end-plate 
response. The upper record was obtained from the surface of a 
sartorius muscle showing the “smooth” average population response 
of a few hundred end plates. In the lower part, the response 
of a single end plate is recorded intracellularly, showing the 
quantal fluctuations of the response. Stimuli indicated by dots. 
Note spontaneous potentials on the superimposed “base lines” 
[from J. del Castillo and B. Katz, J. Physiol. 124, 574 (1954)]. 

Fig. 5. Histograms of e.p.p. and spontaneous potential amplitudes (inset), from a mammalian end plate blocked by magnesium. 
Peaks of e.p.p. amplitude distribution occur at 1, 2, 3, and 4 times the mean amplitude of the spontaneous miniature potentials. A 
Gaussian curve has been fitted to the latter and used to calculate the theoretical distribution of e.p.p. amplitudes (continuous curve). 
Arrows indicate expected number of failures (zero amplitude) [from I. A. Boyd and A. R. Martin, J. Physiol. 132, 74 (1956)]. 

Fig. 6. Method of obtaining the continuous theoretical curve in Fig. 5. A Poisson distribution was c 
m=mean amplitude of e.p.p. responses/mean amplitude of spontaneous potentials. Thi calcula n 
been distributed along Gaussian curves, corresponding to multiples of the spontan: 
of ordinates gives the continuous curve of Fig. 5 [from I. A. Boyd and A. R. Martin, 

Fig. 7. Electron micrograph of a reptilian neuromuscular junction. Diameter of the motor-nerve ending is approx 2.5 μ; it contains, 
in addition to mitochondria (large dark particles), many small vesicles of a few hundred Å units diameter [from J. D. Robertson, J. 
Biophys. Biochem. Cytol. 2, 381 (1956)]. 

A further point of interest is that the size of the unit 
parcel of ACh which is delivered from the nerve endings, 
spontaneously or in response to an impulse, is 
relatively constant at all cell junctions and apparently 
quite unaffected by the many changes which may be 
imposed on the system in the course of an experiment. 
On the other hand, the probability of release of any one 
parcel in a given time interval, that is, the frequency 
of the miniature potentials, can be altered by several 
orders of magnitude, for example, by electrically depolarizing 
the nerve ending with a steady current, or 
by changing the chemical composition of the environment. 
The action of the nerve impulse itself can be 
described as causing a momentary enormous increase 
in the frequency of the miniature potentials (by a 
factor of nearly 10^6, provided a high Ca/Mg ratio 
exists in the surrounding medium). 
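
The Poisson description of quantal release referred to above (Figs. 3, 5, and 6) can be sketched numerically. The function names are illustrative, and two assumptions are made that the text does not state: that the 24 impulses of Fig. 3 gave 19 failures (i.e., that all 5 responses were single units), and that the amplitude of a k-unit response is Gaussian with mean and variance equal to k times those of the spontaneous unit. This is a sketch of the general technique, not a reproduction of the published calculation.

```python
import math


def poisson_pmf(k, m):
    """Probability that exactly k units are released when the mean is m."""
    return math.exp(-m) * m ** k / math.factorial(k)


def quantal_content_from_failures(n_trials, n_failures):
    """Mean number of units per impulse, from the fraction of failures."""
    return math.log(n_trials / n_failures)


def expected_failures(n_trials, m):
    """Expected number of impulses that release no unit at all."""
    return n_trials * math.exp(-m)


def epp_amplitude_density(x, m, unit_mean, unit_sd, k_max=20):
    """Density of non-zero e.p.p. amplitudes: Poisson-weighted Gaussians.

    Each k-unit response is taken as Gaussian with mean k * unit_mean and
    standard deviation unit_sd * sqrt(k) (independent units, an assumption).
    """
    total = 0.0
    for k in range(1, k_max + 1):
        mean = k * unit_mean
        sd = unit_sd * math.sqrt(k)
        gauss = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += poisson_pmf(k, m) * gauss
    return total


# Fig. 3: 24 impulses, 5 single-unit responses, assumed 19 failures.
m_fig3 = quantal_content_from_failures(24, 19)
expected_singles = 24 * poisson_pmf(1, m_fig3)

print(f"mean quantal content m = {m_fig3:.3f}")
print(f"expected single-unit responses in 24 trials = {expected_singles:.2f}")
print(f"density at twice the unit amplitude (m = 2): "
      f"{epp_amplitude_density(2.0, 2.0, 1.0, 0.3):.3f}")
```

With the assumed 19 failures, the predicted number of single-unit responses comes out close to the 5 actually recorded, which is the kind of agreement on which the quantal interpretation rests.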
The basis of this multimolecular quantum of ACh 
release is not yet known; an attractive suggestion is 
that the transmitter substance is stored, within the 
nerve endings, in minute intracellular corpuscles from 
which it is discharged at the surface in an all-or-none 
manner. Electron micrographs have revealed a 
mass of fairly densely packed so-called vesicles inside 
the nerve terminals, and it is conceivable that these are 
the intracellular bags in which ACh is being stored 
prior to its release (Fig. 7). It is possible to imagine 
mechanisms by which a collision between such particles 
and certain critical spots in the nerve membrane could 
bring about sudden liberation of the vesicular contents 
straight into the synaptic cleft. But while it is easy to 
present speculations which are compatible with the 
existing evidence, to put them on a firm experimental 
basis will be a much more difficult task. 

In assessing the sensory performance of higher organisms 
(Rosenblith, p. 485), a certain number of 
characteristics emerge that suggest problems for a 
quantitative study of the electrical activity of the 
nervous system. In considering organisms engaged in 
communication tasks, it has been seen (a) that their 
performance is statistical in character; (b) that they 
need more time to handle more information; (c) that 
their capacity to discriminate and their speed of re- 
action depend on stimulus intensity; and (d) that their 
repertory of absolute identifications (“span of absolute 
judgment”) is relatively small and seemingly based 
upon the ability to make several rather crude discriminations 
simultaneously, i.e., to classify environmental 
sensory inflow into some rather gross categories. 
One is thus led to examine the electrical phenomena in 
the nervous system in their statistical aspects and to 
look for over-all, i.e., relatively gross patterns of elec- 
trical behavior such as those exhibited by populations 
of neurons at various levels of a sensory system, rather 
than to attach disproportionate significance to the be- 
havior of isolated members of such populations. By 
choosing this quasi-thermodynamic approach, it is not 
intended to deprecate the complementary view that 
derives its inspiration from statistical mechanics. But, 
it must be emphasized that we do not know how to 
sample a population of neural units adequately (it is far 
from easy to define a “typical” unit when experimenting 
with a single microelectrode), and that there is much to 
be learned of the laws of interaction between units that 
presumably belong to a population. 

It is from this viewpoint that data are presented in 
this article on (a) electrical responses evoked by sensory 
stimuli and on (b) so-called on-going activity; the latter 
may be thought of as reflecting an organism’s “state” 
as it reacts selectively to sensory stimulation. 

SCHEMATIC FLOW CHART OF A MODEL 
SENSORY SYSTEM 

ous system that become involved when a sensory stimulus 
is presented. It may, therefore, be expedient to start 
the discussion by presenting, in Fig. 1, a highly over- 
simplified view of a model sensory system. Pour fixer 
les idées, consider the auditory system, which is per- 
haps the one for which the greatest amount of quanti- 
tative data is available. 

A stimulus impinges upon the organism: The appropriate 
transducer (here located in the inner ear) is 
activated, and electrical activity is generated in the 
region of the hair-cell-neural junction. If the stimulus 
is a “transient,” certain subsequent electrical phenomena 
at the various higher levels of the nervous system are 
comparatively easy to follow, and both amplitude and 
temporal characteristics of responses at various loca- 
tions can be measured with accuracy. 

Once the electric signal has entered the organism’s 
nervous system, it is possible for it to ascend the classi- 
cal afferent pathway to the cortex. The route can be 
represented schematically by straight segments in the 
nature of nerve tracts; these alternate with circles that 
symbolize the main relay stations of the pathway. Along 
the axons of the longitudinal tracts, the transmission 
of signals is predominantly in the form of conducted 
nerve impulses (all-or-none activity or “spikes”). Transmission 
is one-to-one in the sense that a given axon 
exhibits identical patterns of “spikes” along its entire 
length. The relay stations exhibit activity that has both 
“discrete” and “continuously graded” aspects. It is 
in these regions that both excitatory and inhibitory 
synaptic junctions play an important integrative role 
by permitting interactions of the many-to-one and 
perhaps also one-to-many type. 

One of the more significant features of the flow chart 
of Fig. 1 is that it depicts, in parallel with the classical 
afferent pathway, another route that leads—via the 
brain-stem reticular formation—to the cortex. In his 
recent book,* Magoun characterizes the reticular system 
as follows: 

“From recent study of the brain stem . . . the con- 
cept has developed of a major non-specific or trans- 
actional mechanism in the brain, paralleling the specific 
sensory and motor systems of classical neurology and 
richly interconnected with them. This non-specific neural 
system is distributed through much of the central 
core of the brain stem and, as spokes radiate from the 
hub of a wheel to its peripheral working rim, so func- 
tional influences of this central system can be exerted 
in a number of directions: caudally upon spinal levels 
to influence both reflex and other spinal activity; ros- 
trally and ventrally upon hypothalamic and pituitary 
mechanisms through which influences can be exerted 
upon visceral and endocrine functions; cephalically 
upon the diencephalic and rhinencephalic brain, where 
affect and emotion now reign instead of in the heart; 
and, more cephalically and dorsally still, upon the neo- 
cortex of the cerebral hemispheres which, with its in- 
terconnected thalamic and basal ganglionic masses, 
subserves higher sensory motor and intellectual per- 
formance. The influences of this non-specific system in 
the brain stem are thus brought to bear upon most 
other portions and functions of the central nervous 
system, either to diminish or to raise the level of their 
activity, or to interrelate or integrate their several 
performances.” “It does this as a reflexion of its 
own internal excitability, in turn a consequence of both 
afferent and corticifugal neural influences, as well as 
of the titer of circulating humors and hormones which 
affect and modify reticular activity.” 

Relatively few of these properties are depicted in 
Fig. 1. Yet there can be seen collaterals that branch 
off from the afferent pathway to furnish the reticular 
system with information regarding the signals that are 
being processed along the more direct and specific route. 
There is also an indication that this system receives 
impulses from all sensory modalities, i.e., both from the 
milieu extérieur and the milieu intérieur. And there are 
indications that the flow of neural signals for even a 
single sensory modality is not just one-way in the reticu- 
lar system, i.e., directed toward the cortex; there are 
important descending pathways whose functional sig- 
nificance is just beginning to be understood. 

It is perhaps somewhat surprising to learn that neuro- 
anatomists have given descriptions of the reticular 
system before the beginning of the twentieth century, 
but that it was only after Bremer’s experiments of the 
midthirties, and in particular after the now classical 
Moruzzi-Magoun experiments® on the “Brain stem re- 
ticular formation and activation of the EEG” (1949), 
that this system began suddenly to loom large in the 
understanding of all integrated behavior. Previous to 
that period (and even in a large number of experiments 
performed since then, in the fifties), the use of centrally 
acting anesthetics has resulted in differential inter- 
ference with reticular activity. Under these circum- 
stances, most explanations of the role which the nervous 
system plays in perception or in the handling of sensory 
information are based on evidence from, neurophysio- 
logically and behaviorally speaking, badly distorted 
preparations. 

Among the more dramatic findings that have forced 
one to give up oversimplified views of the organization 

Schematic flow chart of the auditory system 


FIG. 1. For the sake of simplicity, this diagram presents only 
a gross outline of the neuroanatomy of the auditory system. Such 
important way stations as the cerebellum, the superior olive, the 
lateral lemniscus, and other regions of the brain from which re- 
sponses to acoustic stimuli have been recorded have been left out; 
the still incompletely known and complex organization of the 
brain-stem reticular formation (BSRF) has not even been sugges- 
ted. Furthermore, no attempt has been made to indicate connec- 
tions that correspond to acoustic reflexes of short latency: middle 
ear muscles contract about 10 msec after stimulus delivery, while 
eye blinks occur about 35 to 40 msec after delivery of a strong click; 
such reflexes represent simple forms of integrated behavior that 
do not necessarily involve the higher levels of the auditory system. 

In order to appreciate the number of neural elements present 
along the afferent pathway, Chow’s' numerical estimates of the 
auditory system of rhesus monkey are reproduced here. There are 
about 10⁵ cells in the cochlear nucleus (CN), about 4×10⁵ cells in 
the inferior colliculus (IC) and also in the medial geniculate (MG); 
the auditory cortex (AC) is estimated to contain about 10⁷ cells; 
these numbers, which represent orders of magnitude only, may be 
compared with the 3×10⁴ fibers in the auditory nerve (AN) and 
the 2.5×10⁴ hair cells in the inner ear, i.e., at the transducer-neural 
junction (TD-NJ). The foregoing data are the results of histo- 
logical investigations; hence, they do not permit any direct infer- 
ence concerning the number of neural units that will be either 
directly excited or inhibited in the presence of a stimulus of given 
intensity. 

The remaining symbols are to be identified as follows: OCB 
stands for the olivo-cochlear bundle (depicted in dashed lines 
like the other descending pathways) whose influence upon the ac- 
tivity of the auditory nerve has been investigated by Galambos,? 
CIPP stands for the central information-processing pool of neu- 
rones, whose existence, though neither neuroanatomically nor 
neurophysiologically established, is conceptually useful. The rela- 
tions of this hypothetical construct to the sensory-projection 
areas and association areas remain to be worked out to give a 
more realistic account of the role that the brain and its various 
components play in the organism’s handling of sensory informa- 
tion. CMA stands for the inferred control unit of the motor activ- 
ity that takes place in connection with the stimulus-related 
response. 


and functioning of sensory systems have been the 
following: (a) sectioning of the classical auditory ascend- 
ing pathway was shown to leave intact an animal’s 
ability to be aroused by sound; (b) animals whose audi- 
tory cortex had been ablated were shown to be capable 
of simple frequency and intensity discriminations, 
though they were not capable of distinguishing more- 
complex patterns such as simple melodies. Experiments 
of this sort should, however, not be interpreted to mean 
that the anatomical structures that can be removed or 
interfered with play no role in the normal performance 
of sensory tasks. There are certainly different ways of 
assessing impairment of performance, and the mere 
statement that an animal is still capable of carrying out 
a given task, after ablation, does not force the conclu- 

FIG. 2. (a) A typical response to a click of medium intensity recorded near the cat’s round window of the inner ear. M is a so-called 
microphonic potential which is non-neural in origin. N₁ is the earliest and most prominent neural component that can be recorded at 
this location. It represents the summated action potentials from the auditory nerve. S indicates the occurrence of the electrical pulse 
that generates the acoustic click. (b) An example of how the amplitude and latency of N₁ vary as the intensity of the click is varied. 
The latency values include a delay of approximately 0.5 msec that corresponds to the time that it takes for the sound to travel from 
the earphone diaphragm to the eardrum. (c) An average response to a click recorded from a location on the auditory cortex of an 
unanesthetized cat. The averaging operation was carried out by an electronic computing device." Surface positive deflections are 
plotted downward. (d) Intensity functions for the baseline-to-peak amplitude and the peak latency of the surface positive component 
of the cortical response to clicks in an anesthetized cat. 


sion that there is no sensory deficit. More generally, one 
is impressed with the fact that the nervous system of 
higher animals seems to be characterized by enough 
flexibility (which some people have metaphorically re- 
ferred to as a “safety factor” or “redundancy”) so as 
to permit the performance of certain rather basic dis- 
criminations in a variety of ways. 

For a long time students of the nervous system have 
wondered how its flexibility manages to deal with 
sensory stimuli, when they are in the focus of the or- 
ganism’s attention in contrast to situations in which 
they seem to be of no importance. Adrian formulated 
this question most incisively at the 1954 Symposium 
on Brain Mechanisms and Consciousness®: 

“The operations of the brain seem to be related to 
particular fields of sensory information which vary from 
moment to moment with the shifts of our attention. 
The signals from the sense organs must be treated 
differently when we attend to them and when we do 
not, and if we could decide how and where the diver- 
gence arises, we should be nearer to understanding how 
the level of consciousness is reached. The question is 
whether the afferent messages that evoke sensations are 
allowed at all times to reach the cerebral cortex or are 
sometimes blocked at a lower level. Clearly we can re- 
duce the inflow from the sense organs as we do by closing 
the eyes and relaxing the muscles when we wish to sleep 
[…] that the sensitivity of some of 
the sense organs can be directly influenced by the central 
nervous system. But even in deep sleep or coma there 
is no reason to believe that sensory messages no longer 
reach the central nervous system. At some stage there- 
fore on their passage to consciousness the messages meet 
with barriers that are sometimes open and sometimes 
closed. Where are these barriers, in the cortex, the brain- 
stem, or elsewhere?” 

The schematic flow diagram of Fig. 1 suggests, in 
agreement with the beautiful experiments that have 
been carried out since 1954, that the barriers that 
Adrian refers to are at no single place in the nervous 
system, but that they are multiple. This is clearly not 
the place to review in detail the experiments carried 
out by Hernández-Peón’ and many others.*–8 Suffice it 
to report that all of these experiments indicate that the 
descending pathways exercise powerful gating, screen- 
ing, and regulating influences. This central or centrifugal 
control of the afferent sensory inflow has been demon- 
strated to be effective across sense modalities (for 
example, the presence of a mouse or of fish odors reduces 
the response to a click at the level of the cat’s cochlear 
nucleus) or during “habituation” (which presents itself 
as a reduction in the amplitude of evoked responses for 
stimuli that have been repeated over and over again). 
This effect of habituation can, however, be suddenly 
abolished by pairing the “boring” stimulus judiciously 
with a few electric shocks. From the available evidence 
(these dramatic effects of attention and habituation are, 
for instance, not found in anesthetized animals), one is 
led to conclude that the selective processing of the 
sensory inflow is under the control of the brain stem or 
of descending pathways that are structurally or func- 
tionally closely related to the reticular system. Much 
remains to be learned about the precise ways in which 
the informational aspects of a sensory signal affect its 
progress in the nervous system, but it is already clear 
that sensory systems cannot be characterized by in- 
variant sensitivity and invariant “tuning.” 


EVOKED RESPONSES ALONG THE 
AFFERENT PATHWAY 


Having discussed the schematic flow chart and some 
of the phenomena that led to its formulation, it is now 
necessary to admit that most of the quantitative data 
that exist deal with electrical events along the classical 
afferent pathway. This admission should give one a 
more realistic appreciation of what the phenomena are 
for which there exist functional relations of the stimulus- 
response type. What follows should, therefore, be merely 
considered as an introduction to some problems of re- 
cording, data-analysis, and model-making in relation to 
the electrical activity found at various levels of the 
nervous system. 

The left side of Fig. 2 depicts samples of responses to 
clicks from populations of neural units at the level of 
the auditory nerve and at the level of the auditory 
cortex. As indicated, it is possible to make quantitative 
measurements along both the voltage and the time axes 
of these displays. Thus, one can measure amplitudes of 
deflections (either peak-to-peak or baseline-to-peak) 
and “latencies” (i.e., time intervals that have elapsed 
since the delivery of the stimulus) of extreme values or 
zero crossings. Naturally, the characteristics of the re- 
cording equipment must be chosen appropriately. It 
would be futile if one were to attempt to measure ac- 
curately either the latency or the amplitude of rather 
sharp deflections on the ink tracings taken by the cus- 
tomary EEG recording device. 

When such amplitudes or latencies are measured, one 
must ask whether one shall be satisfied with average 
values or whether one shall need to know entire dis- 
tributions in order to be able to give a rational account 
for the neuronal events. As has been shown by McGill," 
Macy,!® Frishkopf!’ and others, evoked responses are 
afflicted with variability, especially in regard to ampli- 
tude and particularly at the higher neural centers 
unless the animal is rather heavily anesthetized. It is 
important to realize that this variability of evoked re- 
sponses to identical stimuli is present in spite of the 
fact that the gross electrodes that record the activity 
of reasonably large neural populations are already per- 
forming an averaging process. Some studies of the varia- 
bility of evoked responses in anesthetized preparations 
have been undertaken!®*8 and have yielded results that 


FIG. 3. Responses of single units in tactile thalamic region to 
transient electrical stimulation of forepaw. Position of stimulation 
is unchanged for all records. Numbers in column at extreme left 
give a measure of stimulus strength. Numbers in second column 
indicate (in msec) mean latency to first spike and standard error 
of the mean for each stimulus intensity. The records in the right- 
hand column depict the modal value of the number of spikes that 
are recorded for each intensity. The short vertical line at the 
left of each response trace indicates the delivery of the electrical 
stimulus [from J. E. Rose and V. B. Mountcastle, Bull. Johns 
Hopkins Hosp. 94, 238 (1954) ]. 


are suggestive from the viewpoint of probabilistic 
models. 

On the basis of amplitude and latency measurements, 
one can plot intensity functions, i.e., functions that re- 
late the average amplitude of characteristic deflections 
and their average latency to the intensity of the stimu- 
lus. Figures 2(b) and 2(d) illustrate such functions. 
These and similar graphs permit one to state the follow- 
ing generalization: As the intensity of a transient sen- 
sory stimulus is increased, the average amplitude of 
evoked responses increases (or at least does not de- 
crease), and the average latency of either the onset or 
the peak of the deflection decreases. [Parallel general- 
izations can, of course, be formulated for single neural 
units (see Figs. 3 and 4, from Rose and Mountcastle”). ] 
The detailed shape of these intensity functions varies 
for different locations in the nervous system and for 
different stimuli, and it is, therefore, hardly appropriate 
to search for a unique representation of stimulus in- 
tensity in the nervous system. 
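
The construction of such intensity functions can be sketched in a few lines of Python. The sketch below is illustrative only: the intensity levels and the amplitude and latency values are made up for the purpose (they are not measurements from Fig. 2), and the function names are hypothetical. It averages amplitude and latency at each stimulus intensity and then checks the generalization stated above, namely that average amplitude does not decrease and average latency decreases as intensity is raised.

```python
from statistics import mean

# Made-up example data: intensity (db) -> list of (amplitude, latency in msec)
# pairs, one pair per evoked response. These are NOT values from the paper.
trials = {
    -80: [(12.0, 2.9), (9.0, 3.1), (15.0, 3.0)],
    -60: [(40.0, 2.2), (35.0, 2.4), (45.0, 2.3)],
    -40: [(90.0, 1.7), (85.0, 1.8), (95.0, 1.6)],
}

def intensity_functions(trials):
    """Return rows (intensity, mean amplitude, mean latency), sorted by intensity."""
    rows = []
    for level in sorted(trials):
        amplitudes, latencies = zip(*trials[level])
        rows.append((level, mean(amplitudes), mean(latencies)))
    return rows

def follows_generalization(rows):
    """True if mean amplitude never decreases and mean latency always decreases."""
    amps = [r[1] for r in rows]
    lats = [r[2] for r in rows]
    amp_ok = all(b >= a for a, b in zip(amps, amps[1:]))
    lat_ok = all(b < a for a, b in zip(lats, lats[1:]))
    return amp_ok and lat_ok

rows = intensity_functions(trials)
for level, amp, lat in rows:
    print(level, round(amp, 2), round(lat, 2))
print("consistent with generalization:", follows_generalization(rows))
```

Only averages are retained here; as noted earlier, whether averages suffice or entire distributions are needed remains an open question.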

It must be noted that the quantification process just 
described runs into difficulties if stimuli of appreciable 
duration are used. Thus far, it has not been possible 
to suggest, especially for the higher neural centers, 
meaningful measures of the evoked activity on the basis 
of visual inspection only. It is by no means certain that 
simple intensity functions will be obtained for all in- 
stants during which the stimuli are “on”; the inter- 
play of excitatory and inhibitory phenomena, as well 
as various adaptation phenomena, can be counted upon 
to complicate the picture. 

FIG. 4. Each pair of records comes from the same unit. All six 
units were activated by electrical stimulation of different locations 
on the skin. The stimulus artifact (ar) indicates the instant of 
stimulus delivery [from J. E. Rose and V. B. Mountcastle, Bull. 
Johns Hopkins Hosp. 94, 238 (1954)]. 

Before dealing with microelectrode data on the be- 
havior of single units at various way stations of the 
nervous system, mention should be made of the series 
of mathematical models developed by McGill,!® Frish- 
kopf,” Goldstein,” and Macy.!® These models, which 
are all probabilistic in character, relate certain aspects 
of gross-electrode responses (i.e., responses to which 
populations of neural units have contributed) to stimu- 
lus parameters. All of these models make certain, physio- 
logically not unreasonable, assumptions regarding the 
behavior of single units in the auditory nerve or at the 
auditory cortex. From these postulated properties and 
from certain assumptions regarding the interaction be- 
tween various members of a population of neural units, 
the behavior of population responses (i.e., evoked re- 
sponses recorded by gross electrodes) can be predicted 
and compared with experimentally available data. The 
response parameters that have been investigated include 
average amplitudes, variability in amplitude, wave form 
of a population response, and temporal recovery charac- 
teristics. The stimuli were clicks or bursts of noise; these 
were presented either singly or in pairs (or even triplets), 
either in the absence or in the presence of background 
noise. While it is obviously premature to forecast the 
ultimate success of such model-making, it is safe to 
assert that the models have demonstrated their useful- 
ness in summarizing sizable quantities of experimental 
data and in suggesting systematic experimentation in 
an area wherein workers seem only too often satisfied 
by demonstrating the existence of phenomena. These 
models force one, furthermore, to face an issue that 
may well be critical for the understanding of the organi- 
zation of the nervous system: What is the relationship 
between the behavior of the components that can be 
studied (single cells), the behavior of subsystems (made 
up of more or less homogeneous populations of single 
units), and the behavior of an entire sensory system? 
For it is hardly reasonable to expect that the knowledge 
of the functioning of the brain will be revealed in toto 
in the absence of a working knowledge of principles of 
organization of functional units of the brain. 


PATTERNS OF ELECTRICAL ACTIVITY IN SINGLE 
UNITS OF SENSORY SYSTEMS 


Over twenty years ago, first Blair and Erlanger”! and 
then Pecher” showed that, under carefully controlled 
conditions, a single nerve fiber will sometimes respond 
and sometimes not respond when the same stimulus is 
presented repeatedly. They furthermore showed that, 
at least to a first approximation, spike responses in 
fibers that belonged to the same nerve trunk were sta- 
tistically independent. In 1950,2: it was shown that 
the probability of response to a click of a single unit in 
the cochlear nucleus of the cat increases from “almost 
never” to “almost always” as the stimulus energy is 
multiplied by a factor of approximately 100 (or 20 db); 
at the same time, the latency of the spike response that 
is elicited decreases substantially. Since then, a con- 
siderable amount of such data has become available for 
different structures in the nervous system.***6 

FIG. 5. Microelectrode recording from a unit in the medial part of the pontine reticular formation of a cat: A, spontaneous activity; 
B, tapping the ipsilateral forelimb; C, rubbing the animal’s back; D, touching the cat’s whiskers; E, hand claps; F, single shocks de- 
livered to the animal’s sensorimotor cortex. Delivery of stimuli is indicated by white horizontal lines (B, C, D), dots (E), and artifacts 
that displace the record’s baseline (F). Time calibration at the bottom of figure indicates 10-msec intervals [from M. Palestini, G. F. 
Rossi, and A. Zanchetti, Arch. Ital. Biol. 95, 97 (1957)]. 
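
The arithmetic behind the 20-db figure, together with one possible shape for the probability-of-response curve, can be sketched as follows. The decibel conversion for an energy ratio is standard. The logistic form of the curve is purely an assumption made for illustration and is not taken from the data cited above; the function names, and the reading of “almost never” and “almost always” as probabilities of 0.01 and 0.99, are likewise hypothetical.

```python
import math

def energy_ratio_to_db(ratio):
    """Decibel value of an energy (power) ratio: 10 * log10(ratio)."""
    return 10.0 * math.log10(ratio)

def response_probability(level_db, midpoint_db, span_db=20.0):
    """Assumed logistic curve that rises from 0.01 to 0.99 across span_db,
    centered on midpoint_db. The logistic shape is an illustrative choice."""
    slope = 2.0 * math.log(99.0) / span_db
    return 1.0 / (1.0 + math.exp(-slope * (level_db - midpoint_db)))

# A hundredfold increase in stimulus energy corresponds to 20 db.
print(energy_ratio_to_db(100.0))

# Probability of a spike response at several levels around a midpoint of 0 db.
for level in (-10.0, -5.0, 0.0, 5.0, 10.0):
    print(level, round(response_probability(level, 0.0), 3))
```

Under these assumptions the unit's probability of firing passes through 0.5 at the midpoint level, and the full transition occupies the 20-db range mentioned in the text.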

Rose and Mountcastle" have given some of the most 
beautiful records that are available today. These records 
illustrate the finesse and the subtlety with which the 
nervous system can “code” stimulus intensity into re- 
sponse patterns of single cells. These workers stimulated 
the skin of cats electrically and recorded the activity of 
single neurons in the thalamic relay nucleus of the cat’s 
tactile system. Figure 3 shows how the number of spike 
responses increases and the latency of the first spike 
decreases as the strength of the transient peripheral 
stimulus is increased. Measurements of intervals be- 
tween spikes yield such interesting results as: (a) Units 
start to fire at a rate which is higher, the larger the 
number of spikes in a train of repetitive discharges. 
(b) The interspike interval increases throughout a train 
and firing ceases when the interval exceeds a value such 
as 2 msec. 

Figure 4 shows, in each horizontal subdivision, pairs 
of responses from the same unit to identical stimuli. 
These discharges are again all-or-none in character and, 
in spite of the fact that their height (amplitude) varies 
within a given train, there is overwhelming evidence 
that all spikes in a single train come from the same unit. 
This figure furnishes striking evidence for the view ac- 
cording to which the variability in response patterns of 
single units to identical stimuli renders their statistical 
description imperative. 

Figures 3 and 4 represent locations in the modality- 
specific classical afferent pathway. The next record, 
taken in Moruzzi’s laboratory (Fig. 5, from Palestini 
et al.27), illustrates, in contrast, the response pattern of a 
single unit in the brain stem.?: This unit was responsive 
to all test stimuli that were tried, although the pattern 
of spike activity was rather different for the different 
modalities. 

FIG. 6. Typical patterns of unitary discharges in the cat’s visual cortex in response to a short flash of 3000 lux; the bottom trace 
shows a typical electroretinogram that is observed at the same time. Arrow indicates delivery of flash; the B-, D-, and E-neurons 
reflect, to a certain degree, patterns of retinal excitation (on, off, and on-off). The pattern of activity of the B-neuron (i.e., an initial 
burst followed by a pause of about 0.1-sec duration) is found in other sensory systems (after O. J. Grüsser and A. Grützner*’). 

The next microelectrode record (Fig. 6) demonstrates 
that some of the response patterns that can be observed 
in sensory systems are discernible over time intervals 
whose length is comparable to behavioral reaction times. 
Jung and his group” distinguish four different types of 
neurons in the visual cortex on the basis of their ac- 
tivity patterns. Figure 6 describes schematically three 
of these typical patterns. The fourth pattern, which 
corresponds to the so-called A-neuron, is not repre- 
sented. Actually, more than one-third of the units in 
the visual cortex from which records have been obtained 
fail to yield visually detectable changes in their spon- 
taneous activity after delivery of an optical stimulus.‡ 
There is, however, some evidence that these units may 
exhibit some kind of responsiveness, provided a form 
of optical stimulation is used that attracts the animal’s 
attention. 

From the highly selected evidence that has been 
presented in Figs. 3 to 6, a certain number of conclusions 
impose themselves regarding the strategy of future 
model-oriented research in this area. New recording de- 
vices and new techniques of analysis are needed. One 
must be able to record simultaneously the firing patterns 
of not one or two cells but of significant assemblies and 
subassemblies of cells. Gross electrodes permit the re- 
cording of potentials that are very coarse spatio-tem- 
poral averages of both spike and continuously graded 
activity. The time has now come for solid-state record- 
ing devices that will permit the exploration of what 
might be called the “domain-structure” of sensory sys- 
tems. It would, however, be foolish to expect that the 
data from 10 or 50 inputs could be handled adequately 
without the help of digital computers of appreciable 
capacity. One cannot be satisfied with visual inspection 
of records if one wants to find out whether unit X re- 
sponds most of the time only if units Y and Z are 
inhibited or after units A, B, and C have fired in a certain 
order and separated by certain time intervals. 

FIG. 7. Four tracings (I, II, III, IV) of electrical activity recorded by a gross electrode from the auditory cortex of a cat. These 
records were taken during the presentation of clicks. The delivery of the stimuli is indicated on the accompanying lower 
traces. Depth of anesthesia increases from I to IV, as the amount of Dial administered is increased.” 

‡ Erulkar, Rose, and Davies” found, likewise, that about one-third of the units in the auditory cortex was unresponsive to […] 
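
As a rough indication of the kind of question that calls for computation rather than visual inspection, the following Python sketch estimates how often a unit X fires on those trials on which units Y and Z are both silent. It is a minimal illustration under stated assumptions: the records are made-up sequences of 1 (fired) and 0 (silent), one entry per trial, and the function name is hypothetical. Order and timing relations among units, which the text also mentions, are not treated here.

```python
def conditional_response_rate(x, y, z):
    """Fraction of trials on which unit X fired, restricted to trials
    on which units Y and Z were both silent (records are 0/1 per trial)."""
    selected = [xi for xi, yi, zi in zip(x, y, z) if yi == 0 and zi == 0]
    if not selected:
        return None  # the condition was never met in this sample
    return sum(selected) / len(selected)

# Made-up records for eight trials (1 = fired, 0 = silent).
unit_x = [1, 0, 1, 1, 0, 0, 1, 0]
unit_y = [0, 1, 0, 0, 1, 0, 0, 1]
unit_z = [0, 0, 0, 0, 0, 1, 0, 1]

rate_when_yz_silent = conditional_response_rate(unit_x, unit_y, unit_z)
overall_rate = sum(unit_x) / len(unit_x)
print("X rate when Y and Z silent:", rate_when_yz_silent)
print("X rate over all trials:", overall_rate)
```

A conditional rate well above the overall rate would be consistent with the hypothesis that X responds mainly when Y and Z are inhibited; with 10 or 50 simultaneously recorded units, the number of such comparisons grows quickly, which is the point made above about the need for digital computers.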


DETECTION OF EVOKED RESPONSES IMBEDDED 
IN BACKGROUND ACTIVITY 


In recent years, as neurophysiologists attempted to 
study cortical responses to sensory stimuli in animals 
that were, physiologically speaking, in a “more normal” 
state, two problems arose: First, it became more difficult 
to detect the evoked responses that were now imbedded 
in a much livelier background activity of the cortex 
than had been observed in the previously used, deeply 
anesthetized preparations. Second, it became important 
to find ways of characterizing this background activity 
in a more than impressionistic manner. 

[Figure 8 panels: (a) Clicks: 1/sec, −40 db (N = 250); (b) Spontaneous (N = 250)] 

FIG. 8. The upper records of (a) and (b) show ink traces obtained from the skull (vertex) of human subjects (a) in 
the presence of clicks and (b) in the absence of clicks. The pulse channel, recorded simultaneously, indicates in (a) the 
instants at which clicks were delivered and in (b) serves as a comparable time reference. The lower records are the 
average wave forms obtained, by a process of cross correlation of the signal with a series of brief pulses, from a sample 
of 250 intervals of which 5+ are shown in the ink records directly above. The average wave forms are the envelope 
of a series of closely spaced lines (10 msec).* 

FIG. 9. Averaged responses to flash from three stations in the 
visual pathway of a cat; simultaneous recordings under moderate 
pentobarbital anesthesia with reference lead on the back of the 
neck. The average response is the envelope of a series of pen de- 
flections separated by intervals of 1 msec. The delivery of the flash 
coincides with the first of the continuous series of pen deflections. 
The numbers above each average response are the latencies of 
characteristic points in the particular wave form [from M. A. B. 
Brazier, Acta Physiol. Pharmacol. Neerlandica 6, 692 (1957)]. 

Figure 7 illustrates the first problem by demonstrating 
how additional doses of anesthetic make it much easier 
to detect the “signal” (the evoked response) in the much 
diminished background activity that can, however, 
hardly be called “noise.” Raab and Kiang® were ac- 
tually looking for correlations between measures of the 
activity that preceded the delivery of a stimulus and 
the size of the evoked response. Their precise results 
are not of particular import here. It is, however, im- 
portant to note that the interaction between evoked 
and background activity of the nervous system is not 
a problem for which a simple or even a single solution 
can be assumed. 

Having a probabilistic bias in the interpretation of 
the electrical activity of the nervous system, members 
of our laboratory have attempted for several years to 
treat the obtained data in a manner that will yield 
average responses to sensory stimuli. The reasoning was, 
qualitatively, somewhat as follows: In the handling of 
sensory information, organisms behave as if they were 
[…] on the basis of estimates of activity averaged 
over substantial regions of the nervous system; such a 
weighted averaging process could be carried out by 
comparing the outputs of a large number of neural 
elements. Since there are at present no devices capable 
of performing this task (even if one knew exactly where 
and how to perform it), one chooses to average over an 
ensemble of responses to repeated identical stimuli in 
order to bring out certain typical aspects of the behavior 
of the nervous system. Once the “correlator system” 
(referred to below) was available, it became possible to 
use it with comparatively slight modifications for the 
detection of evoked responses." Figure 8 illustrates how 
this averaged display yields average results that would 
be hard to foretell on the basis of visual inspection either 
of ink tracings or of tracings of single responses photo- 
graphed from the screen of an oscilloscope. Electronic 
computation of the sort used in our laboratory neces- 
sitates the recording of data onto FM magnetic tapes. 
Such tapes not only contain, in general, the specific 
responses that are set off by delivery of the stimuli 
but also offer, especially when multichannel recording 
is used, a much more representative sample of the 
organism’s electrical activity. Appropriate programming 
permits the carrying out of different analyses of the 
recorded data as desired by replaying the tapes as often 
as is necessary. 
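
The principle of averaging over an ensemble of responses to repeated identical stimuli can be illustrated by a short simulation in Python. This is a sketch under simplifying assumptions and not a description of the correlator system or of any computer used in our laboratory: the evoked wave form is an arbitrary damped oscillation, and the background activity is modeled as independent Gaussian noise added to each sweep, which real background activity certainly is not. All function names are hypothetical.

```python
import math
import random

def template(t_msec):
    """Assumed evoked wave form: a damped oscillation in arbitrary units."""
    return math.exp(-t_msec / 20.0) * math.sin(2.0 * math.pi * t_msec / 25.0)

def simulate_sweep(n_samples, noise_sd, rng):
    """One stimulus-locked sweep: the template plus Gaussian background."""
    return [template(t) + rng.gauss(0.0, noise_sd) for t in range(n_samples)]

def average_response(sweeps):
    """Point-by-point mean across sweeps aligned on stimulus delivery."""
    n = len(sweeps)
    return [sum(column) / n for column in zip(*sweeps)]

def rms_error(average):
    """RMS deviation of an averaged trace from the noise-free template."""
    squared = [(a - template(t)) ** 2 for t, a in enumerate(average)]
    return math.sqrt(sum(squared) / len(squared))

rng = random.Random(0)  # fixed seed so that the run is repeatable
sweeps = [simulate_sweep(100, 2.0, rng) for _ in range(1600)]

# The residual background in the average shrinks as more sweeps are included.
for n in (1, 16, 256, 1600):
    print(n, round(rms_error(average_response(sweeps[:n])), 3))
```

Under the independence assumption, the residual background in the average falls roughly as the reciprocal of the square root of the number of sweeps, so that a response invisible in any single trace emerges once enough sweeps have been averaged.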

Figures 9 (from Brazier**) and 10 offer just two ex- 
amples of how the Evoked Response Detector (ERD) 
modification of the correlator system was used in the 
study of the visual* and auditory*® systems. Other prob- 
lems investigated include: cortical off-responses in the 
auditory system,*® the effect of anesthesia upon the 
wave form of evoked cortical responses," as well as 
responses to photic stimulation in man.** 


[Figure 10 panels: columns, Unanesthetized and Dial (15 mg/lb I.P.); rows, click rate (50/sec, 100/sec)] 

FIG. 10. Averaged cortical responses to repeated clicks (25-db 
threshold) indicate that the auditory cortex in nonanesthetized 
preparations is capable of “following” at higher rates. Note, also, 
the difference in wave form of response in anesthetized and non- 
anesthetized preparations for 10 clicks/sec. The evoked-response 
detector averaged the following number of responses at the differ- 
ent rates: 600 responses at 10/sec, 3000 at 50/sec, and 6000 at 
100/sec.35 

FIG. 11(a). Block diagram of average response computer (ARC-1). 


More recently, we have been able to use (thanks to 
the cooperation of some colleagues at the Lincoln Labo- 
ratory) a considerably more powerful special-purpose 
digital computer, the ARC-1 (Average Response Com- 
puter*) in these investigations. Perhaps the most sig- 
nificant feature of this transistorized computer is that 
it computes “in real time” (Fig. 11). Thus, a display 
can be obtained instantaneously of the average of all 
response traces including the one just recorded from 
the animal’s or subject’s head. This immediate access 
to knowledge of results offers advantages, especially 
when as complex a system as the nervous system is 
being investigated. Without entering into the realm of 
science fiction, one can foresee experiments in the near 
future in which an organism and a computer operate 
in a “closed loop,” i.e., in which the organism’s response 
behavior alters the programming of sensory stimuli that 
the organism is instructed to deal with. With the help 
of ARC-1, our laboratory has continued (see Fig. 8) a 
rather exciting study of evoked potentials in man. It 
has thus been possible not only to record extracranially 
what appear to be cortical responses to acoustic clicks 
in many experimental subjects,“ but also to demon- 
strate that these electrical responses appear at stimulus 
intensities that are quite comparable to the subject’s 
absolute psychophysical threshold for clicks (see Fig. 12). 
Furthermore, it has been noted that the responses dis- 
appear near the subject’s masked threshold. These find- 
ings suggest numerous sensory experiments in which 
convergent neurophysiological and psychophysical data 
can be obtained from human beings and are thus in 
tune with an era of experimentation in which the elec- 
trical activity of behaving animals will be recorded by 
implanted electrodes and broadcast by transistorized 
transmitters.” 

FIG. 11(b). Emergence of an average response from background 
activity as the number of averaged responses is increased.” 


ATTEMPTS AT QUANTIFICATION OF 
ONGOING ACTIVITY 


Up to this point, this paper has emphasized specific 
electrical responses to specific sensory stimuli, although 
it has been pointed out that these responses are, at all 
levels of the nervous system, in a nontrivial sense func- 
tionally related to the existing background activity.§ 

Trained electroencephalographers are clearly capable 
of arriving at diagnostic judgments on the basis of ink 
tracings of the EEG. Attempts have been made to 
“quantify” (here the term is used in its broadest mean-
ing) various aspects of brain-wave activity in experi-
mental situations. These attempts at quantification
have ranged from a classification of the recorded com-
plex visual patterns into broad classes (corresponding
either to characteristics of the EEG wave form or to 
such states as arousal or drowsiness, or to some patho- 
logical condition) to a running Fourier analysis of 
short-time (several seconds) samples of the EEG activity.
Given our lack of theoretical understanding of the
physiological mechanisms that underlie the EEG and the
multiplicity of purposes for which the EEG has been used,
it is not surprising that no ideal method of analysis has
been found. Several years ago Wiener, intent on testing
certain of his ideas regarding the presence of a “clock” in
the brain,** interested Brazier’s group at the Massachusetts
General Hospital and the Communications Biophysics group in
the Research Laboratory of Electronics, MIT, in applying
correlation analysis to the EEG. A “correlator system” was
built, and a great quantity of experimental data has by now
been processed.** There developed a considerable amount of
discussion regarding the mathematical nature of the EEG time
series, regarding the usefulness of correlation analysis of
the EEG (since it discards so much of the information
contained in the raw data), and even regarding the ease with
which a correlation function of the EEG can be interpreted
in a really quantitative manner. Whatever the merits of
these issues, any data-reducing scheme should prove of some
usefulness if the final display of the dependent variable
satisfies several requirements: (a) it should be stable for
a given organism when data are taken under comparable
conditions; (b) it should be sensitive to important changes
in the organism’s environment; (c) it should perhaps be
possible to classify a whole population on the basis of a
limited number of displays.

§ This background activity has been designated as spontaneous
or ongoing activity. The precise conditions under which it can
be observed in pure form are extremely hard to define.

Fig. 12. Average responses to monaural periodic clicks obtained
from scalp electrodes for different stimulus intensities (traces
labeled: no stimulus, -80, -70). Shown is the wave form of the
average response to individual presentations of identical click
stimuli delivered at a rate of 1.5/sec. The subject’s (W. P.,
10 September 1957; awake, eyes open) psychophysical threshold
for these clicks, as determined during the same experiment, lay
at approximately -85 db. Upward deflection indicates that
electrode A is positive with respect to electrode B [from C. D.
Geisler, L. S. Frishkopf, and W. A. Rosenblith, Science 128,
1210 (1958)].

Figures 13 to 15 illustrate some of the properties of the
autocorrelation function as such a display. Figure 13
demonstrates that the display is capable of reflecting a
threefold classification of the EEG ink tracings into
“high alpha,” “medium alpha,” and “no alpha.” It
remains to be seen to what extent a much larger popu-
lation could be accurately classified on the basis of a
certain number of such displays for various locations
on the skull.
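
As a rough illustration of why an autocorrelogram can separate records with and without a dominant rhythm, the sketch below computes a normalized autocorrelation function for two synthetic signals. The sampling rate, the 10-cycle/sec sinusoid standing in for alpha activity, and the function name `autocorrelogram` are assumptions made for this example; it does not reproduce the correlator system described in the text.

```python
import math
import random


def autocorrelogram(samples, max_lag):
    """Normalized autocorrelation r(k) for lags 0..max_lag, with r(0) == 1."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    power = sum(c * c for c in centered)
    return [
        sum(centered[i] * centered[i + k] for i in range(n - k)) / power
        for k in range(max_lag + 1)
    ]


rate = 100  # samples per second (assumed)
rng = random.Random(1)
# One-minute record with no rhythm at all (stand-in for "no alpha").
noise = [rng.gauss(0.0, 1.0) for _ in range(60 * rate)]
# Same background with a 10-cycle/sec rhythm added (stand-in for "high alpha").
rhythmic = [math.sin(2 * math.pi * 10 * i / rate) + 0.5 * e
            for i, e in enumerate(noise)]

r_rhythmic = autocorrelogram(rhythmic, 20)
r_noise = autocorrelogram(noise, 20)
```

In this toy case the rhythmic record gives a correlogram that oscillates with the period of the rhythm (10 samples here), while the correlogram of the unstructured record stays near zero for all nonzero lags.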

Figure 14 asks the question: How stable is such a
display for a given person over a prolonged period of
time? Although it is not possible at present to evaluate
the similarity of these two displays in precise numerical
terms, they appear to be quite comparable if the conditions
of recording the EEG are reasonably well controlled and the
time sample from which the correlation function is computed
is sufficiently long. Figure 15 demonstrates that the
autocorrelational display is sensitive to photic stimulation.

Fig. 14. Autocorrelograms of the EEG (1-min sample) of a nor-
mal human subject recorded four months apart [from J. S. Barlow,
M. A. B. Brazier, and W. A. Rosenblith, “The application of
autocorrelation analysis to electroencephalography,” in Proceed-
ings of the National Biophysics Conference, 1957 (Yale University
Press, New Haven, to be published)].

Fig. 15. Autocorrelograms computed for same subject in ab-
sence of controlled sensory stimulation (“spontaneous”) and in
presence of different rates of periodic photic stimulation [from J.
S. Barlow, M. A. B. Brazier, and W. A. Rosenblith, “The applica-
tion of autocorrelation analysis to electroencephalography,” in
Proceedings of the National Biophysics Conference, 1957 (Yale
University Press, New Haven, to be published)].

Fig. 16. Brain waves were recorded on four consecutive days
(at same time of the day) from an awake subject sitting with
closed eyes in an anechoic chamber. The displays show the four
sets of results in fine lines, with the mean of the four runs as a
heavy solid line. Each analyzed run lasted 3 min. For details of
computer program and definitions of such terms as amplitude
threshold, total activity, and number of bursts, see Farley et al.
The shaded region is bounded by the estimated confidence limits
for the variation of the displays.

It is not possible to examine here the extent to which 
the findings obtained with the aid of correlation (both 
autocorrelation and crosscorrelation) analysis support 
a particular model of brain function. One must, of 
course, admit that any method of data analysis implies 
a model of the process that is being investigated. The 
research worker has a burden of proof in the sense that
he must show that the implied model fits the process
well enough for his purpose. The fact remains that, in
the absence of systematic computational analysis, most 
questions regarding the mathematical nature of the 
signal cannot even be raised in a meaningful manner. 

Fig. 17. Functions obtained for four different subjects (RMB, MM,
RB, and MB) on the basis of single 3-min runs: (a) total activity;
(b) number of bursts. Note that subjects RMB and MM do
not differ much in their total activity profile, but they are clearly
distinguished in the display depicting the number of bursts.

[...] benefits that derive from the use of a display that is in
itself “anschaulich.”

Displays of correlation functions of ongoing electrical 
activity are just one way of trying to describe an organ-
ism’s “state” in terms of pseudo-state-variables. A re-
port entitled “Computer Techniques for the Study of
Patterns in the Electroencephalogram” represents per-
haps the most far-reaching effort in this direction. With
the help of a rather flexible general-purpose digital com- 
puter equipped with a large memory, the attempt was 
made to “recognize the pattern of rhythmic bursts” in a 
given individual’s EEG. Once a program for this pur- 
pose had been written, the attempt was made to dis- 
tinguish different subjects from each other, as well as 
different “states” in the same subject. 

Figures 16 to 18 show some of the results that have
been [...]. Figure 16 illustrates that it is possible
to obtain quite reproducible displays for the same sub-
ject [...] taken to obtain the data under
fairly [...]

Fig. 18. Four runs for a subject (JB), two while he was
awake and two in two different states of sleep. Panels: number
of bursts; total activity.

[...] estimate confidence limits for the variability of the display
for this subject and to compare displays from other 
subjects with these limits in order to assess significant 
differences. The initial results by Farley et al.** indicate 
that even a limited number of displays is capable of 
separating subjects into statistically distinguishable 
classes. Increasing the number of displays by increasing 
the number of pseudo-state-variables should make it 
possible to remove existing “degeneracies.” 

Figure 18 shows how these displays in a given subject 
are affected by his going to sleep. The influence of drugs 
or of sensory stimulation could be studied in this manner. 
Some evidence has already been obtained that these 
displays are sensitive to even short periods of sensory 
deprivation. 

It is, however, appropriate to utter here a caveat 
regarding this and other methods that aim at quantify- 
ing data on the electrical activity of the nervous system. 
The multivariate nature of the nervous system prevents 
one from taking too much stock in any given number 
or even in a computed curve. Somehow one must learn
[...] neural activity in a fashion that is perhaps somewhat
analogous to the dimensionless parameters of engineer-
ing systems. To make this point more concrete: Farley
et al. have shown that there are individuals whose
total activity profile when they are awake bears some 
resemblance to the profile of another individual while 
the latter is asleep. However, whenever a person falls 
asleep there are changes in his activity profile. In other 
words, one must learn to assess the significance of 
change in relation to certain base lines and must refrain 
from attaching too much value to isolated pieces of 
numerical information. Unfortunately, economical math-
ematical descriptions of patterns and sequences are not
yet available. These considerations can perhaps be 
carried somewhat further by noting that Fig. 13 illus- 
trated how widely the autocorrelogram from apparently 
normal individuals may vary. It would thus seem unwise 
to draw too far-reaching conclusions regarding, for in- 
stance, a person’s psychological make-up on the basis 
of a single display or even a set of summary displays 
computed from the electroencephalogram. 
A CONDITIONAL probability computer, appropriately
connected with input and output devices,
can “recognize” patterns and “learn” by trial and error. 
The principle of such a system developed at the National 
Physical Laboratory is described briefly in this paper, 
together with some comments on the role of engineering 
in the study of biological problems. 

An engineer generally starts with a specification, a set 
of rules of behavior for a machine, and then tries to 
design a machine that possesses that behavior. He de- 
signs an airplane, for example, to fly at certain speeds, 
to go certain distances, and to carry certain loads; from 
these requirements he deduces the required structure. 

Nature is the engineer that constructs the difficult 
bridge between structure and behavior of living systems. 
In psychology, one discusses the behavior of an organ- 
ism and wonders what kind of structure could cause it. 
In physiology and anatomy, on the other hand, one 
usually studies the structure and wonders what it is 
doing, what function it is supposed to serve. 

We believe, in all humility, that the engineer can 
perhaps help to bridge the gap between structure and 
behavioral function in biological systems. Using his 
known methods, the engineer can most fruitfully start 
with observed behavior and work toward a possible elu- 
cidation of structure. Some engineers have tried to work 
in the opposite direction: to think up some properties of 
units, give them random connections, and find out what 
they will do. We do not think that that approach has 
been fruitful. We have considered, specifically, the prob- 
lems of pattern recognition and learning, and have tried 
to design computers to do these things in the hope that 
the structure of the computers might be suggestive of 
the organization of the nervous system.¹

This approach differs in one respect from that of 
Shannon and many others, who have also studied the 
design of machines to do certain things which animals 
and men can do. Their approach has been to program a
universal computer. Such an approach should lead to a
mathematical understanding of behavior, but it is
unlikely to lead to a specific structure which will give
[...] physiologist, because a universal machine [...]

[...] from another, corresponding to sets of signals in nerve
fibers. The signals might be continuously variable, or 
they might be discrete and binary; the latter type is 
much easier to handle, and is used in this study. The 
machine must have a component unit for each signal- 
input channel to indicate if that channel is active; and 
a unit for each possible pair of channels to indicate, by 
means of a coincidence circuit, whether or not that pair 
of channels is active. In general, one needs a unit for 
each possible set of channels. Such an arrangement is 
called a classification system; an example is shown in
Fig. 1.²
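
The requirement of one unit for each possible set of channels can be made concrete with a short enumeration. The sketch below is only an illustration under stated assumptions: the channel names and the functions `classification_units` and `active_units` are hypothetical, and a unit is assumed to respond whenever all of its own channels are active.

```python
from itertools import combinations


def classification_units(channels):
    """One unit per nonempty set of input channels."""
    return [
        frozenset(combo)
        for size in range(1, len(channels) + 1)
        for combo in combinations(channels, size)
    ]


def active_units(units, active_channels):
    """Units whose channels are all active (each unit acts as a coincidence circuit)."""
    active = set(active_channels)
    return [unit for unit in units if unit <= active]


units = classification_units(["A", "B", "C"])
responding = active_units(units, ["A", "C"])
```

For n channels the enumeration yields 2^n - 1 units, so three inputs require seven units, and the number grows rapidly with the number of channels.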

Next, the system must be made to recognize patterns 
in time; this requires the introduction of delay circuits. 
Suppose that there are only two input channels, and 
that each is put through a fixed delay. If these two 
inputs and their delayed versions are used as four inputs 
of a classification system, the system will recognize all 
12 “tunes” which can be played on a two-note piano if 
one allows only two temporal ideas: “before” and 
“after.” The connections and recognition possibilities 
of such a system are shown in Fig. 2. 
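
The count of 12 “tunes” can be checked by enumeration. The sketch below rests on an interpretation that is an assumption here, although it agrees with the caption of Fig. 2(b): a unit fed only by delayed inputs recognizes the same pattern as the corresponding undelayed unit, merely later, and so counts as a duplicate. The primed names for the delayed channels are hypothetical notation.

```python
from itertools import combinations

# a and b are the two input channels; a' and b' are their delayed copies.
inputs = ["a", "b", "a'", "b'"]

# A classification system over the four inputs: one unit per nonempty set.
units = [frozenset(c) for r in range(1, 5) for c in combinations(inputs, r)]


def tune(unit):
    """Canonical label for the time pattern a unit responds to."""
    if all(name.endswith("'") for name in unit):
        # Only delayed inputs: the same pattern as the undelayed unit, seen later.
        return frozenset(name.rstrip("'") for name in unit)
    return unit


tunes = {tune(unit) for unit in units}
```

Under this reading the 15 units collapse to 12 distinct tunes, because the units for a', b', and a'b' duplicate those for a, b, and ab.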

A classification system can be built with quite random 
connections. The reason is that all possible arrange- 
ments of connections between units and channels are 
required. If there are sufficient units to accommodate 
the desired inputs, and if they are connected entirely 
at random to the inputs, a classification system will 
arise automatically. 

Fig. 1. A classification system for three inputs has seven units.

Next, the system must be given plastic behavior, in
which past events can influence future responses. This
property is introduced through the computation of
conditional probability. Suppose that there are two
events, A and B, which occurred as follows on nine
occasions:

(I = occurrence, O = nonoccurrence)



One now asks: “If the future is like the past—if one 
accepts the inductive principle—and if A occurs, what 
is the chance of B?” In the past, A occurred on five 
occasions, and on three of these B also occurred. The 
corresponding probability of B, given A, is three out of 
five. Conversely, the probability of A, given B, is three 
out of seven. A machine to make these calculations must 
be able to count the five, in an A unit; the seven, in a 
B unit; and the three, in a coincidence-counting AB 
unit. Thus, to possess plastic behavior, the machine 
needs an additional feature: each unit must be able to 
count, even if in a very nonlinear way. The system will 
then be capable of inductive inference.
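
The counting argument can be checked with a few lines of Python. The nine-occasion history used below is hypothetical: it was chosen only so that A occurs five times, B seven times, and both together three times, as stated in the text, and it should not be read as the authors' original table.

```python
from fractions import Fraction

# Hypothetical history (1 = occurrence, 0 = nonoccurrence), chosen only to
# match the counts quoted in the text.
A = [1, 1, 1, 1, 1, 0, 0, 0, 0]
B = [1, 1, 0, 0, 1, 1, 1, 1, 1]


def conditional_probability(target, given):
    """P(target | given) as coincidence count divided by the count of `given`."""
    n_given = sum(given)  # contents of the single-event unit
    n_both = sum(t and g for t, g in zip(target, given))  # coincidence unit
    return Fraction(n_both, n_given)


p_b_given_a = conditional_probability(B, A)
p_a_given_b = conditional_probability(A, B)
```

Only three stored counts are needed (the A unit, the B unit, and the AB coincidence unit); both conditional probabilities are ratios of these counts.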

Another important matter which comes easily out of 
this approach is the “context” in which events occur. 
Suppose that, in the foregoing example, an event C 
occurred on the last five occasions: 


A: J TOTOO 
B: (OORT ie me ae ae 
This history now has two different contexts (C occurs 
and C does not occur), and one can ask: “In the context 
C, what would be the probability of A, given B?” 
In this context, B occurred five times and A once, so 
the three-out-of-seven chance has changed to one-out- 
of-five. Similarly, the probability of B, given A, in the 
context C, is unity. The system can be extended in- 
definitely to infer one thing from another in all sorts of 
different contexts.
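
Inference within a context fits naturally into the classification system: if every set of events has its own counting unit, a probability in context C is the ratio of the counts in two units that both include C. The sketch below illustrates this with a hypothetical history that satisfies the counts given in the text (C on the last five occasions, B on all five of these, A on one); the event sequences and the function name `probability` are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical history consistent with the counts quoted in the text.
history = {
    "A": [1, 1, 1, 1, 1, 0, 0, 0, 0],
    "B": [1, 1, 0, 0, 1, 1, 1, 1, 1],
    "C": [0, 0, 0, 0, 1, 1, 1, 1, 1],
}

# One counting unit per nonempty set of events (a classification system).
counts = {}
names = sorted(history)
for size in range(1, len(names) + 1):
    for unit in combinations(names, size):
        counts[frozenset(unit)] = sum(
            all(history[name][t] for name in unit) for t in range(9)
        )


def probability(target, given):
    """P(target | given); `given` may include a context such as C."""
    given = frozenset(given)
    return Fraction(counts[given | {target}], counts[given])
```

With these counts, `probability("A", {"B"})` is 3/7, whereas `probability("A", {"B", "C"})` is 1/5 and `probability("B", {"A", "C"})` is 1, in agreement with the figures in the text.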

There is a practical difficulty in the design of counters: 
after a time they get full. The engineer has to put a leak 
into the counter to avoid this dilemma—and a very 
interesting property is thereby produced. The leaky 
counter gives a time weighting in which past events 
matter less than recent ones. The computer can then 
adapt fairly quickly to new conditions. 
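
A leaky counter of this kind can be sketched as a stored value that loses a fixed fraction at every time step. The retention factor and the class name below are assumptions chosen for illustration and do not describe the actual circuit.

```python
class LeakyCounter:
    """Counter whose stored value decays by a fixed factor on every time step."""

    def __init__(self, retention=0.9):
        self.retention = retention  # fraction of the state kept per step (assumed)
        self.state = 0.0

    def step(self, event):
        """Leak a little, then add one unit if the event occurred."""
        self.state = self.state * self.retention + (1.0 if event else 0.0)
        return self.state


def final_state(events, retention=0.9):
    """Run a fresh counter over a sequence of 0/1 events."""
    counter = LeakyCounter(retention)
    for event in events:
        counter.step(event)
    return counter.state


old = final_state([1] * 5 + [0] * 20)     # five events long ago
recent = final_state([0] * 20 + [1] * 5)  # five events just now
full = final_state([1] * 1000)            # continuous activity
```

The same five events leave a much larger trace when they are recent than when they are old, and even continuous activity cannot drive the state beyond 1 / (1 - retention), so the counter never fills up.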

Fig. 2(a). A classification system for 2 inputs which distinguishes
“before” and “after.” Four of the 15 units are shown.

Fig. 2(b). The 12 tunes which can be distinguished by the
system of Fig. 2(a). For the first 3 tunes (a, b, and ab) there are
duplicate units.

Finally, the system must be able to calculate ratios of
numbers. Division is done most easily by using loga-
rithms. It turns out that a leaky counter produces
something very close to a logarithmic count, so the 
required ratios can be obtained simply by taking the 
differences between the states of two units. (The “state” 
is just the stored voltage in the counter.) Clearly, the 
effect of an active counting unit on one to which it is 
connected must depend on their relative state. 
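
The step from logarithmic states to ratios is the identity log(x/y) = log x - log y. The sketch below uses an exact logarithm as a stand-in for the approximately logarithmic response of the leaky counter; the function names are hypothetical.

```python
import math


def unit_state(count):
    """Idealized state of a unit: the logarithm of its stored count."""
    return math.log(count)


def ratio_from_states(numerator_state, denominator_state):
    """Recover a ratio of counts from the difference of two unit states."""
    return math.exp(numerator_state - denominator_state)


# Probability of B given A from the counts in the text: 3 coincidences, 5 occurrences of A.
p_b_given_a = ratio_from_states(unit_state(3), unit_state(5))
```

No divider is needed: the comparison of two states, taken as a simple difference, already carries the conditional probability.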

This relationship suggests a very important feature 
of a system that is capable of learning. The old idea 
that “facilitation” of a path depends on the number of 
times that it is used is not adequate; it is as illogical as 
the story of the Irishman and his boots—he couldn’t get 
them on until he had worn them a few times. It is well 
known that a novel stimulus may, initially, produce no 
reaction from an animal, but eventually produce a re- 
sponse owing to trial-and-error learning. If initially a 
path is not made at all, how could it possibly be made 
on the principle that the more often it is used, the 
better it is? 

The difficulty is resolved by a comparison of the states 
of two units. In neural terms, if a probability is to be 
computed, the numerator must be stored on one side of 
a synapse and the denominator on the other, so that a 
comparison can be made. 

In summary, a machine to simulate pattern recogni- 
tion and learning must be able to: 


(a) distinguish sets of signals, through the use of 
coincidence counting ; 

(b) recognize time patterns, by use of delay circuits; 

(c) count occurrences of signals and sets of signals; 

(d) store information regarding past occurrences, at 
least for a certain length of time; and 

(e) calculate ratios to determine relative states of 
units with stored counts of occurrences.


These properties are incorporated in the machine shown 
in Fig. 3. 


CONCLUDING REMARKS 


Mathematical models can be very useful for studying 
complex systems and functions, including those of living 
organisms. The utility of a model, and especially the 
confidence with which one can draw inferences from its 
behavior, can be greatly enhanced by building a physical 
analog of the model, a machine. In the process of build-
ing the machine and making it work, one may find
shortcomings or omissions in the theoretical model.
More important, one may find that the machine is
capable of doing more than was anticipated. The system
reported here, for example, was designed to do pattern
discrimination, and it turned out to be capable of
trial-and-error learning as well.

Fig. 3. A conditional probability computer, built at the National Physical Laboratory, capable of pattern recognition and trial-and-
error learning. On the left is a model “retina” of photo cells to receive T-shaped shadow patterns. On the right is an electrically driven
“vehicle” that learns, with the help of its computer “brain” which is shown in the center of the picture, to move efficiently along the
black-white border of its “highway.”

Another important reason for making models is that 
people in different specialties use different languages— 
as is amply evident in a conference such as this Study 
Program. People try to talk about the same thing and 
hardly understand one another. A model is a sort of 
universal language. No matter what your field is, if you 
look at a physical model and see it working—see its 
parts working—you can get something out of it. In 
particular, the physiologist and anatomist can look at 
an engineer’s model of a psychologist’s observation and 
consider whether the structure of the model bears any 
resemblance to the structure of the living organism.

One final comment: engineers use a very limited
number of types of vacuum tubes to do a very large
number of different things. The nervous system appears
to be similar. There must be an enormous number of
things neurons can do with just a few basic properties.
The computation of conditional probabilities is, perhaps, 
just one of them. 
THE importance of specificity in biology has been
mentioned or implied in a great many contri-
butions presented in this volume, and many examples
have been discussed. It is worthwhile, however, to 
examine the concept of specificity by itself, and it is 
advisable to review some of the more important physical 
and chemical factors that are currently believed to be 
involved in the phenomenon. 

In this paper are considered some typical examples 
of biological specificity involving more or less well- 
defined chemical systems as one of the specific inter-
acting agents. The molecular basis of specificity is
discussed with special emphasis on the concept of 
complementarity. Finally, some words are devoted to a 
review of the important and old-fashioned idea of 
specificity as a form of organization. 

Serological specificity was defined by Landsteiner¹
as “the disproportional action of a number of similar
agents on a variety of related substrata,” and this
definition is readily extended to other fields in which
biological specificity manifests itself. Boyd² points out
that this definition “would also include many chemical 
reactions, as might be expected if serological specificity 
were fundamentally chemical in nature, which we now 
believe,” and again there is no need to hesitate in 
applying the remark to biological specificity in general. 
Indeed, specificity is an important aspect of chemistry ; 
qualitative and quantitative analysis in organic and 
inorganic chemistry depends in a large degree upon 
“disproportionate actions” of similar agents on “related 
substrates” (e.g., the precipitation of silver halides in 
water and the separation of silver chloride from the 
bromide and iodide by treatment with ammonia, H₂O
and NH₃ being regarded as similar agents and the
silver halides being related substrates). 

It is important to emphasize at the outset, too, that 
biological specificity is very variable in degree. Certain 
systems show a wide range of specificity whereas other 
systems show extremely narrow selectivity. What is 
likely to impress the chemist about biological specificity 
is the sharpness with which closely similar substances 
can be distinguished in some instances, but this sharp- 
ness is not at all universal. 

It is worth mentioning that the concept of specificity 
is involved in the actions of inhibitors as well as in those 
of normal substrates, since many closely related 
biological systems are poisoned to decidedly different 
degrees by substances having similar chemical com- 
positions. The drug industry thrives on this fact—as 
does mankind in general. 


EXAMPLES OF CHEMICAL SPECIFICITY IN 
BIOLOGICAL SYSTEMS 


1. Antibody-Antigen Interactions 


Because of the pioneering work of Landsteiner,¹ the
field of immunochemistry provides one with probably 
the most easily understood examples of biological 
specificity, and it has made possible a fruitful method 
of attack on the relationship between specificity and 
chemical structure. When certain foreign substances 
(antigens) are introduced into an organism, the organ- 
ism reacts by producing proteins (antibodies) which are 
capable of reacting specifically with the antigens that 
gave rise to them. Antigens are invariably substances 
of rather high molecular weight, so that it is difficult 
to alter their chemical composition in a systematic 
fashion. Landsteiner was able to show, however, that 
new antigens may be prepared by combining small 
chemical groups (such as substituted aromatic rings) 
with proteins. These new antigens give rise to antibodies 
whose specificity depends on the nature of the conju- 
gated groups. The small chemical group is called a 
hapten. Let w, x, y, z represent a series of chemically
related haptens, let Aw, Ax, Ay, Az represent antigens
produced from protein A by conjugation with these 
haptens, and let Bw, Bx, By, Bz represent the anti- 
bodies arising from injection of Aw, Ax, Ay, Az into 
a rabbit. It is clear that a great deal might be learned 
about the chemical factors involved in immunological 
specificity by comparing the interactions of the different 
antibodies with the different antigens. Furthermore, it 
is found that a low molecular-weight compound, aw, 
containing the hapten group w, is able to inhibit the 
interaction between Aw and Bw. Evidently, Bw con- 
tains sites which are capable of reacting with the hapten 
groups on Aw, and the haptens in aw can compete for 
these sites. The study of this competition using a series 
of related haptens makes possible a more quantitative 
study of antibody-antigen specificity. 

Some typical results of this general approach are 
reported in a paper by Landsteiner and Lampl.! 
Metaaminobenzoic acid and metasulfanilic acid were 
coupled by diazotization of the amino group to the 
proteins of horse serum, and antibodies to these 
modified proteins were obtained by injection into 
rabbits. The resulting rabbit antisera then were mixed 
with samples of chicken sera which had been diazotized 
to aniline or to various aniline derivatives. Evidence for 
interaction consisted in the formation of a precipitate.
(The use of horse serum in the production of antibodies
and of chicken serum in the antibody-antigen test
eliminates any contribution of the nonhapten portion
of the antigen to the antibody-antigen interaction.) The
results are shown in Table I. It is seen that:


1. No interaction will occur if an acidic group is not
present on the hapten of the chicken-serum antigen.

2. No interaction will occur unless the acidic group
is meta to the amino group (i.e., CO₂⁻ and SO₃⁻ must
be meta to the point of coupling of the hapten to the
protein).

3. Replacement of a hydrogen atom on the benzene
ring by a chlorine atom or a methyl group has little
effect on the interaction.

4. The antibodies are readily able to distinguish
between a carboxyl group and a sulfonic-acid group
on the antigen.


TABLE I. Effects of structural changes in the hapten on antibody-antigen interaction.ᵃ
Rows: hapten present on horse-serum antigen used in preparing antisera.
Columns: hapten present on chicken-serum antigens used to test interaction with antisera.
ᵃ 0 = no precipitate; +, +++, ++++ = increasing amount of precipitate.


On the basis of many experiments of this type, it 
could be concluded that the nature of the charged 
groups of the hapten is of decisive importance in 
determining the specificity. Haptens containing sulfonic-
acid groups do not tend to compete effectively for sites
on antibodies which are generated by carboxyl-contain-
ing haptens, and vice versa. The introduction of an
arsenic-acid group into an antigen makes for a particu-
larly [...] interaction between the modified antigen
and the antibodies which it induces. Introduction of an
[...] a methyl, halogen, methoxyl, or nitro group
into a hapten has a much smaller effect on the speci-
ficity; strong cross reactions are observed between
antigens containing a given hapten and antibodies in- 
duced from antigens containing haptens that differ from 
the given hapten merely through the presence of one of 
these uncharged groups. On the other hand, antibodies 
are found to be able to distinguish very well between 
two haptens which are optical isomers or which are 
(cis-) (trans-) isomers. 

The important inference to be drawn from these 
studies is that the specificity of the antibody-antigen 
reaction must depend upon a complementarity between 
the surfaces of the antibody and antigen molecules. 
This complementarity presumably involves an ap- 
position of positive and negative electric charges, if a 
charge is present in the antigen. It also involves a 
rather close matching of the contours of portions of 
the surfaces of the antibody and antigen molecules—a 
lock and key relationship that had been visualized 
already by Ehrlich early in the development of im- 
munology. Small deviations in the perfection of this 
complementarity are tolerated (e.g., replacement of a 
methyl group by a chlorine atom, or even of a hydrogen 
atom by a methyl group). Forces of all types are 
presumably in action across the surface of contact of 
the antibody and antigen, but electrostatic forces seem 
to be particularly important. One can visualize these 
relationships by means of semischematic drawings 
such as Fig. 1, where a portion of the antibody molecule 
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is supposed to envelop the hapten, with electric 
charges suitably located on the antibody surface so as 
to interact favorably with the opposite charge on the 
hapten. 

A paper by Pressman, Siegel, and Hall⁴ illustrates
the possibility of obtaining quantitative information 
about complementarity by making use of the hapten- 
inhibition phenomenon. Samples of ovalbumin were 
diazotized to ortho-, meta-, and para-aminobenzoic 
acid. Three antigens were produced in this way, 
consisting of ovalbumin molecules bristling with 
azobenzoate groups having the three isomeric configura- 
tions shown in Fig. 2. The three antigens are referred 
to as “Xo-ovalbumin,” “Xm-ovalbumin,” and “Xp-
ovalbumin.” The three corresponding antibodies were
prepared by injecting these antigens into rabbits,
bleeding the rabbits, and separating the globulin
fraction (which contains the antibodies) by ammonium-
sulfate precipitation. The addition of suitable propor-
tions of Xo-ovalbumin to anti-Xo-globulin gave a
visible precipitate, the amount of which could be 
measured. When low molecular-weight haptens [such 
as benzoate, o-chlorobenzoate, m-chlorobenzoate, p- 
chlorobenzoate, o- (p’-hydroxyphenylazo) benzoate, and 
various chlorinated derivatives of o- (p’-hydroxyphenyl- 
azo) benzoate ] were added, the amount of the precipitate 
was decreased because of the competition of the small 
haptens for the complementary site on the antibodies. 
By varying the concentration of low molecular-weight 
hapten and observing the change in the amount of 
precipitate produced, it was possible to compare the 
affinities of the different hapten groups for the sites on 
the anti-X,-globulin.* These relative affinities could 


Surface of 
( antibody 


Protein 
of antigen 


Fic. 1. Diagrammatic representation of complimentarity 
of the hapten-antibody relationship. 


* The detailed mathematical analysis by Pauling, Pressman, 
and Grossberg® of the dependence of the amount of precipitate 
on the concentration of the low molecular-weight hapten showed 
that there must be a considerable variability in the affinities of 
the hapten for different sites on the antibody molecules. That 
is, the antibody sites at which the haptens are bound are not all 
alike. This heterogeneity can be described adequately by means 
of a Gaussian distribution of the free energies of binding about a 




TABLE II. Relative affinities of various haptens for
anti-Xo-, anti-Xm-, and anti-Xp-globulins.

                                            ΔF_rel,
Hapten                                      cal/mole of hapten

For anti-Xo-globulin:
  Benzoate                                     (0)
  o-chlorobenzoate                            -580
  m-chlorobenzoate                               0
  p-chlorobenzoate                             430
  o-(p’-hydroxyphenylazo)benzoate            -1700
  3-Cl, 2-(p’-hydroxyphenylazo)benzoate      -1100
  4-Cl, 2-(p’-hydroxyphenylazo)benzoate      -1100
  5-Cl, 2-(p’-hydroxyphenylazo)benzoate      -1300
  6-Cl, 2-(p’-hydroxyphenylazo)benzoate      -1500

For anti-Xm-globulin:
  Benzoate                                     (0)
  o-chlorobenzoate                             680
  m-chlorobenzoate                            -300
  p-chlorobenzoate                             480
  m-(p’-hydroxyphenylazo)benzoate            -1600
  2-Cl, 3-(p’-hydroxyphenylazo)benzoate       -560
  4-Cl, 3-(p’-hydroxyphenylazo)benzoate       -350
  5-Cl, 3-(p’-hydroxyphenylazo)benzoate       -980
  6-Cl, 3-(p’-hydroxyphenylazo)benzoate       -500

For anti-Xp-globulin:
  Benzoate                                     (0)
  o-chlorobenzoate                             960
  m-chlorobenzoate                            -300
  p-chlorobenzoate                            -560
  p-(p’-hydroxyphenylazo)benzoate            -1700
  2-Cl, 4-(p’-hydroxyphenylazo)benzoate       -940


be expressed in terms of the differences between the 
mean free energies of combination of the haptens for 
the sites on the anti-X.-globulin. 

Similar experiments were performed on the Xm-
ovalbumin-anti-Xm-globulin system and on the
Xp-ovalbumin-anti-Xp-globulin system. As a result,
it was possible to observe the effects that different 
groups in different positions in the hapten have on the 
ability of the hapten to combine with a given antibody. 
Some results for a series of chlorobenzoates and chloro- 
p-hydroxyphenylazobenzoate haptens are shown in 
Table II. The numerical values of ΔF_rel listed in this
table are the differences between the average free 
energy for the combination of the hapten in question 
with the antibody sites and the average free energy of 
combination of the unsubstituted benzoate ion with 
the same sites. It is seen that, with each of the three 
globulins, the p-hydroxyphenylazo group greatly in- 
creases the affinity as compared with the benzoate ion 
(makes ΔF_rel more negative), presumably because it is
able to take advantage of some of the binding forces 
associated with the portion of the antibody site that the 
organism intended to be occupied by the azo group and 
by the side chain of the protein to which the azo group 
is attached (see Fig. 3). This simple picture also shows 
mean value which is characteristic of each hapten, the spread in
the free energies on either side of the mean value amounting to
1000 to 2000 cal/mole of hapten. The complication introduced by
heterogeneity can be avoided by talking in terms of the mean free
energy of binding for a given hapten at the sites on the anti-Xo-
globulin.
Fig. 2. Configuration of haptens used in studying antibody-antigen complementarity.
(Panels: orthobenzoate derivative = “Xo-ovalbumin”; metabenzoate derivative = “Xm-ovalbumin”;
parabenzoate derivative = “Xp-ovalbumin.”)


why the introduction of a chlorine atom in place of a 
hydrogen atom at various positions in the benzene 
ring in the simple benzoate ion tends to be unfavorable 
(makes ΔF_rel more positive by several hundred calories),
if the slightly bulky chlorine atom is not able to go into 
the position intended for the azo group of the original 
antigen. When the much larger p-hydroxyphenylazo 
group is present in the hapten, it occupies the position 
intended for the protein of the original antigen; the 
introduction of a chlorine atom into such a hapten then 
invariably decreases its affinity for the antibody site. 
The amount of the decrease in affinity is different,
depending upon the location of the chlorine atom on the
benzene ring. This reveals something about the nature
of different portions of the surface of the hapten-
binding site on the antibody. It is clear that experi- 
ments of this kind allow one to probe the various 
regions of the antibody-combining sites and to obtain 
quite detailed information on the factors that determine 
antibody-antigen specificity. 
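
As a rough illustration of what the free-energy differences in Table II imply, the standard thermodynamic relation K_rel = exp(-ΔF_rel/RT) converts each ΔF_rel into a binding constant relative to that of the unsubstituted benzoate ion. The sketch below applies it to a few of the anti-Xo-globulin entries. The temperature (about 25°C) is an assumption made here for illustration, since the text does not state the experimental temperature, and the function and variable names are hypothetical.

```python
import math

R_CAL_PER_MOL_K = 1.987   # gas constant, cal/(mol*K)
ASSUMED_TEMP_K = 298.15   # assumption: the text does not give the temperature

# ΔF_rel values (cal/mole of hapten) for anti-Xo-globulin, copied from Table II.
ANTI_XO_DELTA_F = {
    "benzoate": 0,
    "o-chlorobenzoate": -580,
    "m-chlorobenzoate": 0,
    "p-chlorobenzoate": 430,
    "o-(p'-hydroxyphenylazo)benzoate": -1700,
}


def relative_binding_constant(delta_f_rel, temp_k=ASSUMED_TEMP_K):
    """Return K(hapten)/K(benzoate) = exp(-ΔF_rel / RT)."""
    return math.exp(-delta_f_rel / (R_CAL_PER_MOL_K * temp_k))


for hapten, delta_f in ANTI_XO_DELTA_F.items():
    k_rel = relative_binding_constant(delta_f)
    print(f"{hapten:35s} {delta_f:6d} cal/mole   K_rel = {k_rel:6.2f}")
```

On this assumption, a ΔF_rel of -1700 cal/mole corresponds to binding between 17 and 18 times stronger than that of benzoate, whereas a value of +430 cal/mole roughly halves the binding constant.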


2. Enzyme-Substrate Interactions 


Enzymes are well known to be rather particular 
about the substrates whose reactions they choose to 




catalyze. The degree of this specificity varies widely, 
however. The enzyme urease evidently catalyzes only 
the hydrolysis of urea and appears to have no noticeable 
effect on any urea derivative or other substance that 
has yet been tested. Fumarase adds water to the double 
bond of the fumarate ion to produce L-malate ion and 
no other double-bonded organic molecule has been 
found which is attacked by this enzyme. On the other 
hand, certain lipases will hydrolyze ester bonds con- 
taining widely different organic groups attached to 
either side of the bond. 

It is generally agreed that enzymes act by adsorbing 
the substrate molecule or molecules at some “active 
site” on the enzyme molecule in such a way that the 
appropriate bonds in the substrate show an increased 
reactivity. It is difficult to avoid thinking in terms of 
the hypothesis that the enzyme-substrate interaction, 
like the antibody-antigen interaction, depends upon a 
complementarity in structure between the two sub- 
stances, the substrate molecule fitting into a charac- 
teristically shaped impression in the surface of the 
enzyme molecule. The increased reactivity may arise 
because in the adsorbed state the bonds involved in the 


Fig. 3. Hapten inhibition of anti-Xp-globulin. (Panels: interaction of Xp-ovalbumin,
of benzoate ion, and of p-hydroxyphenylazobenzoate with anti-Xp-globulin.)
reaction are placed under strain or because they are
brought into a favorable position for reaction. This 
simple picture seems to be consistent in a general way 
with the wide range of specificity found among enzymes. 
Highly specific enzymes presumably contain a highly 
characteristic impression of the substrate molecule 
into which it is not possible to fit even closely related 
molecules capable of undergoing the same reaction as 
the substrate. Enzymes showing a wide range of specific- 
ity, on the other hand, must interact with only the 
region of the substrates in the immediate vicinity of the 
bonds that are involved in the reaction. This picture 
also explains why substances closely related to a given 
substrate, but incapable of undergoing the reaction 
that is catalyzed by the enzyme, often act as powerful 
inhibitors (e.g., malonic acid and malic acid strongly 
inhibit the conversion of succinic acid to fumaric acid 
by succinic dehydrogenase). These inhibitors must fit 
into the site intended for the substrate but, like the dog 
in the manger, they neither take advantage of the 
catalytic capabilities of the site nor permit access to 
the true substrate. 

The hypothesis of structural complementarity also 
explains the high degree of stereospecificity shown by 
practically all enzyme reactions in which the potential 
substrates are capable of existing as stereoisomers. 
(E.g., the dehydration of malic acid in the presence of 
fumarase to form fumaric acid is absolutely specific for 
L-malic acid, D-malic acid being unaffected by fumarase 
to any detectable degree; similarly, maleic acid, the 
cis isomer of fumaric acid, is not hydrated by fumarase 
to any detectable degree. Most peptidases will not 
hydrolyze peptide bonds involving D-amino acids.) 
Suppose that the active center of an enzyme is sur- 
rounded by a mold into which all of the chemical groups 
in the vicinity of a double bond of the trans isomer of a
compound will fit. It is easy to understand why this 
mold will not be able to accommodate the cis isomer 
of the compound, which has an entirely different ex- 
ternal shape. The specificity to optical isomers is no 
less easy to understand in terms reflecting the common 
experience that a left hand will not fit into a glove made 
to cover its mirror image, the right hand. 

It is interesting to examine some typical enzyme 
systems in order to observe the types of behavior that 
can be accommodated to the complementarity concept. 
One finds that many systems can be interpreted very 
easily in these terms, but one finds also that there are 
certain instances in which the concept does not present 
a very convincing picture of what must be going on. 


(a) Peptidases 


Bergmann studied the action of various peptidases 
on simple peptides and was able to show that there is a 
high degree of specificity among these enzymes which 
depends on the nature of the groups located on either 
side of the peptide bond that is split. He was able to 
TABLE III. Bergmann’s specificity rules for the hydrolysis
of the peptide bond in R2-NH-CHR1-CO-NH-R3.

Enzyme         R1                        R2            R3
Pepsin         glutamic or aspartic      not H         aromatic (tyrosine or
                                                       phenylalanine)
Trypsin        cationic (lysine or       not H         nonspecific;
               arginine)                               may be H
Chymotrypsin   aromatic (tyrosine or     nonspecific   nonspecific
               phenylalanine)

formulate certain general specificity requirements for 
the action of pepsin, trypsin, and chymotrypsin, which 
are summarized in Table III. Further work has shown,
however, that many of these requirements are not met 
by all substrates of these enzymes, and the specificity 
picture has become somewhat more complex (see, for 
example, the discussion in Dixon and Webb® and 
Neurath and Schwert’). To illustrate these complexities, 
the chief points of attack on ribonuclease by various 
peptidases are shown in Fig. 4. Inspection of this figure 
and Table III will show that the specificity discovered 
by Bergmann for trypsin is invariably maintained: the 
points of attack by trypsin are always at the peptide 
bond next to either arginine or lysine, with the carbonyl 
end of the bond that is hydrolyzed belonging to the 
cationic amino acid. It does appear, however, that the 
bond adjacent to one of the cationic amino-acid 
residues of ribonuclease (the lysine at position 44 from 
the lysine end of the chain) may not be attacked by 
trypsin. The points of attack on ribonuclease by 
chymotrypsin are frequently adjacent to aromatic side 
chains (usually at the carbonyl side), as demanded by 
the Bergmann rules. Unexpected exceptions are found, 
however, in the ready opening of the leucine-threonine 
linkage between positions 35 and 36, and of the alanine- 
lysine linkage between positions 103 and 104. A weak 
ability of chymotrypsin to split peptide bonds involving 
methionine had been noted in simple peptides (chymo- 
trypsin also hydrolyzes a cystine-serine bond in insulin). 
These exceptions are somewhat puzzling from the point 
of view of the complementarity concept, because the 
groups that are involved can hardly be said to resemble 
in general shape the aromatic groups that the Bergmann 
rule demands for chymotrypsin. More complex factors 
seem to operate than the mere binding by the enzyme 
of the amino-acid residue next to the peptide bond that 
is being hydrolyzed. The slight ability of methionine 
and cystine to replace an aromatic ring in some sub- 
strates for chymotrypsin is especially puzzling and 
might suggest that an electronic mode of interaction 
may be involved rather than one involving steric
complementarity.

The specificity rule for pepsin proposed by Bergmann
on the basis of its action on simple peptides turns out
to be violated frequently when pepsin acts on long
Fig. 4. Points of hydrolysis of ribonuclease by chymotrypsin (C), pepsin (P), trypsin (T), and subtilisin (S).
Dotted lines represent points of slower hydrolysis. 


polypeptide chains. This is evident from Fig. 4, where 
the attack on an alanine-alanine bond may be noted. 
Pepsin also attacks an alanine-serine bond in ACTH 
and a leucine-valine linkage in insulin. In general, 
however, pepsin does show a tendency to attack either 
bonds involving glutamic or aspartic acids by them- 
selves, or bonds involving aromatic amino acids by 
themselves—in partial agreement with Bergmann’s 
rules. It appears that the true range of specificity of 
pepsin does not reveal itself until one allows it to act 
on relatively long polypeptides. 

It is interesting that trypsin and chymotrypsin are 
able to hydrolyze esters and suitably activated carbon- 
carbon bonds as well as peptides. The specificity 
requirements on the amino acids adjacent to the ester 
or carbon-carbon bond are similar to those observed 
for peptides with the same two enzymes. Here one has a 
particularly striking example of the importance of the 
environment of a reacting bond, rather than the bond 
itself, in manifesting the specificity of the enzyme.


(b) Cholinesterase and Acetylcholinesterase 


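The reaction

$$
\underbrace{\mathrm{(CH_3)_3N^{+}{-}CH_2{-}CH_2{-}O{-}CO{-}CH_3}}_{\text{acetylcholine}} + \mathrm{H_2O}
\;\longrightarrow\;
\underbrace{\mathrm{(CH_3)_3N^{+}{-}CH_2{-}CH_2{-}OH}}_{\text{choline}} + \mathrm{CH_3{-}CO{-}OH}
$$

is catalyzed by two distinct groups of enzymes which
have different specificity requirements. One of
these groups of enzymes, the acetylcholinesterases,
is inhibited by the substrate acetylcholine at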

higher substrate concentrations, so that its activity vs 
substrate-concentration curves pass through a maxi- 
mum. The other group of enzymes, known as cholin- 
esterases, is not so inhibited, and the activity vs sub- 
strate-concentration curves follow the normal Michaelis-
Menten behavior. Both kinds of enzymes can act on
many substrates related (in some instances rather dis- 
tantly) to acetylcholine, but mention is made only of the 
interesting and opposite effects of replacing the acetyl 
group with other acyl groups containing longer hydro- 
carbon chains. In mammalian brain acetylcholinesterase, 
the replacement of acetyl with butyryl greatly lowers 
the activity and does not remove the inhibition at high 
substrate concentrations. In cholinesterase, on the other 
hand, the same substitution greatly increases the ac- 
tivity of the enzyme. Here one has an example of two 
enzymes which catalyze the same reaction but whose 
specificity patterns are markedly different, showing that 
considerably different types of complementarities must 
be possible for a given substrate. 
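
The contrast between the two activity curves can be sketched numerically. The block below compares the ordinary Michaelis-Menten rate law with one common textbook form for inhibition by excess substrate, v = V·S/(K_m + S + S²/K_i). That second rate law is an assumption introduced here for illustration and is not taken from the text; the parameter values are arbitrary and the function names are hypothetical.

```python
import math


def michaelis_menten_rate(s, v_max, k_m):
    """Normal Michaelis-Menten behavior: rises monotonically toward v_max."""
    return v_max * s / (k_m + s)


def substrate_inhibited_rate(s, v_max, k_m, k_i):
    """Assumed rate law with inhibition by excess substrate (extra S^2/K_i term)."""
    return v_max * s / (k_m + s + s * s / k_i)


# Arbitrary illustrative parameters (not measured values).
V_MAX, K_M, K_I = 1.0, 1.0, 100.0

# For the assumed rate law the activity maximum lies at S = sqrt(K_m * K_i).
s_peak = math.sqrt(K_M * K_I)

for s in (0.1, 1.0, s_peak, 100.0, 1000.0):
    normal = michaelis_menten_rate(s, V_MAX, K_M)
    inhibited = substrate_inhibited_rate(s, V_MAX, K_M, K_I)
    print(f"S = {s:7.1f}   normal = {normal:.3f}   substrate-inhibited = {inhibited:.3f}")
```

With these parameters the substrate-inhibited curve passes through a maximum near S = 10 and then falls, while the normal curve continues to rise toward its limiting value.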


(c) Glycosidases 


The hydrolysis of α- or β-alkyl- or aryl-glycosides to
form free sugars plus alcohols or phenols is catalyzed
by a group of enzymes that show interesting specific-
ities. In general, a given glycosidase is highly specific
to the sugar moiety, and to whether the glycoside has
the α- or β-configuration, but it is relatively unconcerned
about the nature of the aglucone residue. For instance,
α-glucosidase will hydrolyze α-methylglucoside, α-
phenylglucoside, and many other α-glucosides, but it
will not cause the hydrolysis of any β-glucoside or any
α-glycoside of a sugar other than glucose. Here one has
examples of very high specificity toward a group that lies
on one side of the chemical bond that is attacked, and 
a very low specificity to the other side. 


(d) Stereospecificity in the Transfer of Hydrogen
to DPN in Dehydrogenase Reactions


Vennesland® and Westheimer have shown by means 
of ingenious experiments with deuterium that, in the 
reaction 


$$
\mathrm{DPN^{+}} + \mathrm{CH_3{-}CH_2OH} \;\rightleftharpoons\; \mathrm{DPNH} + \mathrm{CH_3{-}CHO} + \mathrm{H^{+}}
$$

(in which the pyridine ring of DPN⁺ and DPNH carries the CONH₂ group and the substituent R),
both the addition of the hydrogen atom to the pyridine
ring of DPN⁺ in the forward direction and the addition
of hydrogen to acetaldehyde in the reverse reaction are
stereospecific. Thus, if the equilibrium

$$
\mathrm{CH_3{-}CH_2{-}OD} + \mathrm{DPN^{+}} \;\rightleftharpoons\; \mathrm{CH_3{-}CHO} + \mathrm{DPNH} + \mathrm{D^{+}}
$$

is carried out in D₂O, it is found that no additional
deuterium atoms are introduced into either DPN⁺,
DPNH, alcohol, or aldehyde. Similarly, no normal
hydrogen is introduced when the reaction

$$
\mathrm{CH_3{-}CD_2{-}OH} + \mathrm{DPN^{+}} \;\rightleftharpoons\; \mathrm{CH_3{-}CDO} + \mathrm{DPND} + \mathrm{H^{+}}
$$

is equilibrated in H₂O. This proves that the hydrogen
atom is transferred directly from the carbon atom of
the alcohol to the DPN⁺ (and vice versa) and does not
go through the solvent. Furthermore, the hydrogen
atom introduced into the DPNH must lie on one
definite side of the pyridine ring. The ring must lie,
therefore, in a definite orientation on the enzyme
surface, and relative to the alcohol molecule, in the
complex of the enzyme with its two substrates. That
the orientations of the ethanol and aldehyde molecules
also must be fixed rigidly relative to the pyridine ring
is shown by the fact that the reaction

$$
\mathrm{H^{+}} + \mathrm{DPND} + \mathrm{CH_3{-}CHO} \;\longrightarrow\; \mathrm{DPN^{+}} + \mathrm{CH_3{-}CHD{-}OH}
$$

produces a single optical isomer of ethanol-1-d. Here is
an especially clear example of the detailed stereo-
chemical specification of the structure of an enzyme- 
substrate complex. It must take advantage of a large 
fraction of the possible complementarity that can exist 
between the substances concerned in the reaction. 


(e) The Koshland-Stein Rule 


Koshland and Stein? have proposed the general 
principle that “if the enzyme requirements for a 
substrate having an oxygen bridge, i.e., R-O-Q,
show high specificity for R and low specificity for Q,
then R-O cleavage occurs in the enzyme reaction.”
This rule has been found to be valid for invertase, 
alkaline phosphatase, acid phosphatase, trypsin, and 
chymotrypsin. The simplest interpretation of the rule in 
terms of complementarity would be that the group 
with the highest specificity would be presumed to have 
the closest contact with the enzyme; the enzyme then 
would be expected to have the greatest activating 
effect on the bond adjacent to this group. 


(f) Existence of Identical Amino-Acid Sequences in the 
Active Centers of Enzymes Having Diverse Specificities 


It has been found recently that the active centers of 
thrombin, chymotrypsin, trypsin, and phosphoglu- 
comutase all contain the sequence aspartic acid-serine- 
glycine. These enzymes have markedly different specific- 
ities, the first three catalyzing the hydrolysis of 
peptide bonds, whereas phosphoglucomutase catalyzes 
the transfer of a phosphate group from one glucose 
ring to another. The specificity of these enzymes must 
clearly reside in regions quite distinct from the center 
that activates the chemical bonds which are disrupted. 


3. Binding of Small Molecules to Proteins (Other 
Than Hapten-Antibody Interactions) 


Serum albumin is remarkable for its ability to adsorb 
a wide variety of small molecules, particularly (though 
not at all exclusively) those bearing negative charges 
(acetate ion, halide ions, methyl orange, and many 
other anionic dyes, detergents, etc.). The most striking 
feature of this adsorption, however, is the lack of any 
reasonably marked specificity. Changes in the structures 
of the adsorbed molecules do have some effects on the 
strength of the binding. These changes in affinity are 
not, however, as large as one would expect to find if the
surface of the serum albumin molecule were covered
with a mosaic of rigid binding sites having a reasonably
high degree of complementarity to the necessarily limited
number of types of substances which such a mosaic
might potentially bind. (For example, enantiomorphic
isomers of a dye are adsorbed to almost the same
extent.) In order to account for this, Karush!®
proposed the concept of “configurational adaptability,”
according to which the serum-albumin molecule is
considered to possess a degree of internal flexibility.
As a result, the surface conformation can be changed
so as to establish a measure of complementarity with 
the surface of almost any molecule that approaches it, 
and there is no need to deal with a limited number of 
fixed types of potential binding sites. A strong point 
in support of this concept is the fact that serum albumin 
does possess a high degree of internal flexibility as 
compared with other proteins, which also show a far 
smaller affinity for small molecules.!' In a sense, the 
serum-albumin molecule behaves like a lump of putty 
onto which one can stick an unlimited number of 
different shapes. 

The concept of configurational adaptability is 
important because it shows that complementarity need 
not imply specificity. Furthermore, it raises the possi-
bility that, in some instances, the specific complemen-
tarity structure may be developed only when the sub-
strate is present. 


SPECIFICITY AND ORGANIZATION 


Green has set forth reasons for believing that, in 
certain biochemical processes involving a sequence of 
steps, each catalyzed by a different enzyme, the enzyme 
molecules must occupy fixed positions in space which 
are located so as to facilitate the successive reactions. 
He speaks of “an organized mosaic of enzymes in which 
each of the large number of component enzymes was 
uniquely located to permit efficient implementation of 
consecutive reaction sequences.” A primary example of 
this mosaic is the cyclophorase system, which includes 
all of the enzymes for the citric-acid cycle, fatty-acid 
oxidation, oxidative phosphorylation, and terminal 
electron transport. This system of enzymes has been 
shown to be located exclusively in the mitochondria. 
In view of the striking mitochondrial structures which 
have been revealed by electron microscopy [see the 
papers by Sjöstrand (p. 301) and Fernández-Morán
(p. 319) in this volume], there is a certain temptation 
to accept this theory. 

Dixon and Webb! have disputed the necessity for 
assuming organization in this spatial sense. As an 
alternative, they go back to the point of view presented 
twenty-five years ago by Hopkins" in these words: 
“The organizing potentialities inherent in highly 
specific catalysts have not, I believe, been adequately 

appraised in chemical thought. Highly specific catalysts 
determine just what particular materials, rather than 
any others, shall undergo reaction. They select from 
their environment. The specific catalyst determines 
which among possible paths the course of change 
shall follow. It has directive powers . . . .” Dixon'*
speaks of “organization by specificity” as an alternative
to Green’s spatial organization of enzyme systems.
The concentration of the cyclophorase enzymes into
the mitochondria is ascribed to the need to reduce the
distance between the enzymes involved in consecutive
reactions. The cell is able in this way to reduce the
“transit times” required for the products of one 
enzymatic step to diffuse to the enzyme molecules 
involved in the next step. According to this view, there 
is no need to assume any regular arrangement of the 
different enzymes. It is necessary only to have an 
arrangement (a random one would do) with a suffi- 
ciently small mean distance of separation between the 
enzymes involved in the different enzymatic steps. 

It is interesting in this connection to estimate the 
order of magnitude of the transit time for a typical 
metabolite to move a given distance from one enzyme 
to another by a diffusion mechanism. It is well known
that if a² is the mean square distance moved in time t
by a particle whose macroscopic diffusion constant is
D, then

$$a^2 = 2Dt.$$

Typical metabolites have diffusion constants of the
order of 1 cm²/day, or 10⁻⁵ cm²/sec. Thus, it is found
that, if the enzymes required for two successive steps
of a reaction sequence are 1000 Å apart, the transit
time is 10 μsec. If the separation is 100 Å, the transit
time is 0.1 μsec. It would appear, therefore, that the
arrangement of enzymes of a reaction sequence in
layers whose repeat distance is a few hundred Ångström
units, without any regard to the serial positions in space 
of the enzymes involved in successive reaction steps, 
would give a more than adequately short transit time 
for most biochemical requirements. Specificity alone 
should be adequate to cope with problems of organiza- 
tion at this level. This is not to say, of course, that the 
living cell actually is organized in this manner. For 
reasons unknown, it may have chosen to solve its transit 
time problems in a less straightforward manner. 
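
The order-of-magnitude estimate above can be checked directly from the relation a² = 2Dt. The following is a minimal sketch, assuming the round value D = 10⁻⁵ cm²/sec quoted in the text; the function and constant names are hypothetical.

```python
ANGSTROM_IN_CM = 1.0e-8          # 1 Å expressed in centimeters
D_CM2_PER_SEC = 1.0e-5           # diffusion constant quoted in the text


def transit_time_sec(separation_angstrom, diffusion_cm2_per_sec=D_CM2_PER_SEC):
    """Solve a^2 = 2 D t for t, with the separation a given in Ångström units."""
    a_cm = separation_angstrom * ANGSTROM_IN_CM
    return a_cm ** 2 / (2.0 * diffusion_cm2_per_sec)


for separation in (1000.0, 100.0):
    t_usec = transit_time_sec(separation) * 1.0e6
    print(f"separation {separation:6.0f} Å  ->  transit time about {t_usec:.3g} μsec")
```

Taken literally, the formula gives about 5 μsec for 1000 Å and about 0.05 μsec for 100 Å, which agrees in order of magnitude with the figures of 10 μsec and 0.1 μsec quoted above. Because the time scales as the square of the distance, a tenfold reduction in separation shortens the transit time a hundredfold.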
REVIEWS OF MODERN PHYSICS          VOLUME 31, NUMBER 2          APRIL, 1959

60
Blood Coagulation—A Study in Homeostasis

David F. Waugh
Department of Biology, Massachusetts Institute of Technology, Cambridge 39, Massachusetts

In this discussion, a brief account is to be made of
one of the mechanisms involved in homeostasis, a
word which means that the chemical and physical
processes occurring within the fluid matrix or internal
environment of an organism are integrated so that a
stable physiological level of each is maintained. Thus,
if one process deviates from its stable level, other
mechanisms are shifted slightly or considerably, as the
occasion demands, in an attempt to bring this process
and the entire system back to its normal state. Of course,
everyone appreciates the fact that the closely related
set of activities occurring in one physiological system
is influenced by the sets of activities occurring in other
systems. For example, the system discussed herein, the
blood coagulation system, is influenced markedly by
stress (as, of course, are the other physiological sys-
tems). The mechanisms in one physiological system,
then, are not at liberty to establish levels of activity
which are optimal for that system as an isolated system:
the levels are a compromise derived from the activities
of the organism as a whole.

It should be made clear at the start that the infor-
mation presented here is also the result of a compromise:
it represents the activities in a chosen individual only
in a general way and is a simplification most parts of
which can be challenged. This situation stems from the
fact that, while there is general agreement about many
of the coagulation mechanisms, others have been recog-
nized as possibilities only recently and an understanding
of details is incomplete. Extensive use has been made
of certain references in preparing this discussion: the
books by Wintrobe! and Biggs and MacFarlane,’ the
review articles by Brinkhous, Langdell, and Wagner,’
and Seegers,’ and the symposium edited by Brinkhous.®

Applied to the blood vascular system the word
homeostasis would include first, the hemostatic mecha-
nisms (that is, the mechanisms which are involved in
sealing a wound which has opened the usually closed
circulatory system) and second, the more mysterious
set of mechanisms which are involved in maintaining
the thin fragile walls of the capillaries in such a state
that the proteins and formed elements (red and white
cells) of the blood do not escape from the circulation
and pass into the tissue spaces. The first set of mecha-
nisms, the hemostatic, involves the closing of small
vessels through vascular constriction, plugs of platelets,
and coagulation in which the fluid plasma is transformed
into a gel. We can be concerned only with coagulation,
where it is found that an extensive set of interlocking
processes exists in what is quite evidently a poised
equilibrium such that a displacement of the equilibrium
in the proper direction leads to a rapid autocatalytic 
development of the clotting enzyme thrombin from its 
inactive precursor prothrombin. A general scheme of 
the features of the coagulation system to be discussed 
here is shown in Fig. 1. The explosive development of 
thrombin swamps out, locally, all of the anticlotting 
mechanisms and a clot develops, the latter being the 
result of a change in the structure and in the state of 
aggregation of the protein fibrinogen. The clot is con- 
fined locally by a set of slower reactions, some of which 
are autocatalytic and controlled by thrombin develop- 
ment and most of which are backed up by large capaci- 
ties. These reactions nullify the clot-developing system 
at several points. 

The trigger for the mechanisms which displace the 
poised equilibrium is apparently disruption of a bound- 
ary which isolates key substances, either within small 
blood elements, the platelets, or within the tissue cells. 
On wounding, the intermixing of blood and tissue juices 
and the presence of rough surfaces at the site of injury, 
which cause platelet decomposition, are sufficient to 
give the rapid development of thrombin described in 
the following. 

One might venture that if hemostasis were the only 
objective of the coagulation system it could be much 
simpler than it is. Many of the extraordinary complica- 
tions are probably associated with the fact that coagu- 
lation must occur continuously at low levels in order 
to preserve the normal structural and permeability 
characteristics of the capillaries. The physiological 
necessity for slow continuous clotting is referred to 
later in an attempt to justify this statement. It is just 
mentioned now, however, to focus attention at the start 
on the possibility that one is dealing with a system 
which, on the one hand, must operate at a low but 
precisely controlled level of activity but which, on the 
Fig. 2. Small artery, A, and capillaries, C, from the mesentery
of a rabbit; M, muscle cells in the media; adv, adventitia; X, 
origin of a capillary from the artery; y, pericyte; z, perivascular 
histiocyte; end, v, endothelial nuclei [from A. Maximow and 
W. Bloom, A Textbook of Histology (Saunders Company, Phila- 
delphia, 1942), fourth edition]. 


other hand, can be triggered into explosive activity 
which is not controlled but must be confined. 
Blood flows within a set of tubes of varying size, the 
larger tubes (arteries, arterioles, veins, venules) being 
covered by substantial layers of muscle and connective 
tissue. The smallest vessels, the capillaries, are our chief 
concern. These fragile vessels (Fig. 2) have a wall which 
is constructed of a single layer of flat endothelial cells 
cemented at their boundaries and invested by a thin 
network of connective-tissue fibers. The caliber of the 
capillaries is close to the diameter of a single red cell, 
and is thus about 8 μ in man. It is believed that water
and solutes generally pass through the cytoplasm of the 
endothelial cells. White cells may migrate through the 
substance of the endothelial cells or through the 
boundaries between cells. 

Blood! itself is a fluid 40% of whose volume is made
up of formed elements (red cells, white cells, etc.) and
60% of a straw-colored fluid plasma containing seem-
ingly an endless variety of compounds, many in the
process of being transported from one organ to another.
The most important of the formed elements in this dis-
cussion is the platelet, a disk-shaped body about 2 μ in
diameter. The platelets can become sticky and adhere
[...], forming a mass which itself is of impor[...]nents.
The platelet is fragile. If it comes in contact
with a surface foreign to the circulation (e.g., glass) it
adheres, spreads out, and bursts, liberating its contents.

Within the plasma are found a series of proteins and 
lipoproteins all involved directly or indirectly in clotting. 
These proteins are described as they come up for dis-
cussion; the current series includes prothrombin, throm-
bin, and fibrinogen. Prothrombin, in the purified prepa- 
ration of Seegers et al., appears as a molecule of 
M~66 000 and as a prolate ellipsoid may be described 
as having a length of 119 A and a diameter of 34 A.’ 
This molecule can dissociate on dilution into halves. It 
is quite stable in the pure state and stable in plasma 
when prothrombin activators are absent. Prothrombin, 
as its name implies, has no action whatsoever on 
fibrinogen. 

Thrombin is one of the derivatives of prothrombin. 
It can be obtained from prothrombin by a variety of 
techniques, the important one in this instance being the 
normal biological-activation process involving a variety 
of substances. In this activation process, the molecular 
weight of prothrombin does not appear to change 
radically. In the concentrated solutions required for 
physical studies, however, thrombin may be associated. 
Our current studies suggest that the minimum molecular 
weight of thrombin is about 22 000. One unit of throm- 
bin is defined as that amount which clots plasma or a 
solution of purified fibrinogen corresponding to plasma 
in 16 sec. Since preparations of thrombin containing 
2000 units/mg have been made, it is clear that a single 
unit is represented by about 5 × 10⁻⁴ mg. There is
sufficient prothrombin in one milliliter of plasma to 
give 300 units of thrombin, corresponding to the small 
amount of about 0.15 mg of prothrombin per milliliter 
of plasma. The total possible number of thrombin units 
is, however, staggering. Immediate activation of a few 
percent of the prothrombin would coagulate blood in 
seconds. In the coagulation process, thrombin acts as 
an enzyme which splits an acidic peptide from fibrino- 
gen (Fig. 1). 
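
The unit arithmetic in the preceding paragraph can be checked with a short Python sketch. It assumes, as the text implies but does not state outright, that the 2000 units/mg preparation is essentially pure thrombin and that the mass of prothrombin consumed roughly equals the mass of thrombin formed. The function and constant names are illustrative only.

```python
# Figures quoted in the text.
THROMBIN_UNITS_PER_MG = 2000.0        # activity of the best preparations
THROMBIN_UNITS_PER_ML_PLASMA = 300.0  # thrombin obtainable from 1 ml of plasma


def mg_per_unit(units_per_mg: float) -> float:
    """Mass of thrombin (mg) represented by a single clotting unit."""
    return 1.0 / units_per_mg


def prothrombin_mg_per_ml(units_per_ml: float, units_per_mg: float) -> float:
    """Prothrombin per ml of plasma, assuming mass is conserved on activation."""
    return units_per_ml / units_per_mg


if __name__ == "__main__":
    unit_mass = mg_per_unit(THROMBIN_UNITS_PER_MG)
    prothrombin = prothrombin_mg_per_ml(
        THROMBIN_UNITS_PER_ML_PLASMA, THROMBIN_UNITS_PER_MG
    )
    print(f"one unit is about {unit_mass:.1e} mg of thrombin")
    print(f"prothrombin is about {prothrombin:.2f} mg per ml of plasma")
```

Under these assumptions the two quoted figures (about 5 × 10⁻⁴ mg per unit and about 0.15 mg of prothrombin per milliliter) follow from the same specific activity.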

Fibrinogen has a molecular weight of 340 000 and is 
an asymmetric protein of axial ratio about 10.⁸ It has an
interesting substructure⁹ which cannot be discussed here
for lack of space. In pure form, it is a fairly stable pro- 
tein which survives rough handling. After thrombin has 
split off an acidic peptide, the reactivity of the original 
protein is altered, the fibrin monomer produced by this 
enzymatic alteration having now a marked capacity to 
interact in a relatively specific fashion with other fibrin 
monomers. At the same time, a small amount of stabi- 
lizing plasma globulin is incorporated. The interaction 
leads predominantly to end-wise association which pro- 
duces fibrin fibrils or fibers having asymmetries orders 
of magnitude greater than that of the original fibrinogen 
molecule. During growth of the set of fibrin fibrils there 
occurs a brief interval of time during which the total 
volume changes from being greater to being less than 
the sum of the volumes dominated by each fibril. The 
fibrils start to overlap and to crosslink where they come
into contact. In this way a space-filling network of
strands develops." At first, the strand network is tenu- 
ous and easily ruptured, but as old strands become 
thicker and as new strands are incorporated into the 
network, the clot progressively develops rigidity. As is 
seen later, this is not the end of the story, for, in the 
presence of platelets, a spontaneous squeezing out of 
fluid may occur (clot retraction). This is an important 
activity of the clot in forming a compact closure over a 
wound. If retraction and drying of the clot do not occur, 
an enzyme plasmin may be activated from its pre- 
cursor.’ Apparently, the enzyme plasmin has been de- 
vised specifically to dissolve fibrin. 

The first portion of this general scheme to consider 
is the transformation of prothrombin into thrombin. 
This phase of the over-all problem has received a great 
deal of attention, the natural result of the fact that 
discoveries in clinics have revealed an ever-increasing 
series of patients whose plasmas have lacked one or 
another of the activators necessary for the prothrombin- 
thrombin conversion, and of the fact that prothrombin 
has been prepared in relatively pure form, first by 
Seegers and more recently by others, and its activation 
has been studied under a variety of conditions. 

The scheme which is shown in Fig. 3 is a simplified 
version of experience in the field.¹⁻⁵ The central theme
is the conversion of prothrombin into thrombin. The 
top half of the figure depicts clotting events which can 
occur when blood is drawn and every effort is made to 
keep the blood free of contamination with tissue com- 
ponents. Before proceeding, attention is directed to the 
central role assigned here to thrombin not only as the 
clotting enzyme but also as an agent which feeds back 
and promotes the formation of more of itself. As can 
be seen, this feedback enters at several levels into the 
set of interactions and, in order to promote the explosive 
liberation of thrombin, what is necessary, so far as this
portion is concerned, is platelet decomposition and a
nudge in the direction of thrombin production.

FIG. 3. Illustrative scheme for mechanisms

To start with, on being drawn, the blood comes into 
contact with a foreign surface and certain of the plate- 
lets respond by rupturing and liberating a series of 
compounds, the one singled out being platelet factor 3, 
a lipoprotein. This factor combines with or activates 
an inactive form of a substance which is probably a pro- 
tein called plasma thromboplastin component (PTC). 
The inactive forms of components have been placed in
parentheses; the Roman numerals indicate the accepted
number in the “factor” series. Plasma thromboplastin
component now interacts with antihemophilic globulin 
(AHG) and calcium ion to give plasma thromboplastin. 
This is not sufficient in itself to produce rapid pro- 
thrombin conversion, at least one additional step being 
necessary—namely, an interaction with the active form 
of accelerator globulin (AcG, VI). A rapid conversion 
of prothrombin now takes place. Note how thrombin 
is instrumental in converting inactive accelerator globu- 
lin to an active form, that it is involved in platelet 
alterations and may be involved directly in the conver- 
sion of prothrombin. All of the elements in this scheme 
are present in plasma. 

The lower part of Fig. 3 shows a sequence of
events involving tissue components (tissue thrombo-
plastin). Tissue thromboplastin short-circuits platelet
contributions at the start but requires calcium ion, the 
active form of accelerator globulin, and a new protein 
factor, convertin, which is produced from a precursor 
by the action of thrombin (cf. Goldstein and Alexander 
in reference 5, p. 93). It is generally held that the tissue 
thromboplastin-convertin cycle gives an immediate 
thrombin production which then operates on itself and 
on the upper mechanisms of Fig. 3 to yield a fast 
production of thrombin. 

During the rapid development of thrombin (upper 
part of Fig. 3), all of the prothrombin may not be con- 
verted to thrombin. In the scheme presented, this is 
most readily understood by the fact that, during pro- 
thrombin activation, plasma thromboplastin, and thus 
at least some of the components which give rise to 
plasma thromboplastin, are either inactivated or con- 
sumed, as is true also of tissue factors. This is related 
to the confinement mechanisms which are outlined in 
the following. 

Another aspect of the thrombin-production feedback 
mechanism which should be mentioned is that the 
platelets are thought to liberate a substance which is 
an antiheparin, and since, as is seen, heparin is inti- 
mately involved in the destruction of thrombin and 
thromboplastin, removal of heparin constitutes an ad- 
ditional positive factor in thrombin production. 

The scheme of Fig. 3 has presented one possible se-
quence of events. The situation is probably more com-
plicated than this, as is evident from a consideration of
the following. First, at least two possible factors have
been omitted from the scheme. These are the Hageman
factor and PTA or plasma thromboplastin antecedent.
Second, the number and sequences of reactions leading 
to the development of plasma thromboplastin are not 
clear. Not only is it possible that the details of factor 
interactions as given are incorrect, but also there is 
known to be opportunity for substitution under the 
proper conditions. For example, the platelets liberate a 
material which can substitute for the active form of 
accelerator globulin. 

It is in this series of reactions that one finds the 
proven genetically controlled clotting defects. An ab- 
normality in antihemophilic globulin is the defect of 
classical hemophilia, which was described over 150 years 
ago. It is a sex-linked recessive and manifests itself in 
varying deficiency. Seegers et al. have evidence to
show that this abnormality may be owing to inhibitors 
which block AHG. Lack of accelerator globulin (V), 
although it may be acquired through liver damage, may 
also be of hereditary origin; however, the inheritance is 
uncertain. Lack of PTC (hemophilia B, Christmas 
disease, etc.) is an hereditary disease which is a sex- 
linked recessive, similar to AHG deficiency. 

There are two important series of reactions which are 
involved in confining coagulation to a locality and in 
removing unwanted clots. In approaching the mecha- 
nisms which confine clot development, it is necessary 
to realize that in wounds the hemostatic mechanisms 
are operating in the relatively stagnant or slowly flowing 
blood which occurs near the constricted ends of cut 
vessels. Whatever blood pressure is operating on these 
cut vessels will tend to work the coagulating system to 
the exterior. Thus, the problem is one of handling back- 
diffusion and any back-flow of active clotting agents. 
Of course, if such agents are carried into the general 
circulation, there occurs a relatively extensive dilution. 

Confinement of clot development and the shutting 
off of coagulation reactions involve a series of antieffects 
which operate on the basis of adsorption, stoichiometric 
combination, and enzyme action. Some mechanisms are 
not well established, although a high probability can 
be assigned to their existence. 

The first rather interesting reaction which is known 
to have importance in limiting clot development is the 
adsorption of thrombin to fibrinogen and fibrin.*11?.8 

In a static blood volume, such an effect allows complete
local clotting to be accomplished before thrombin can
diffuse out of the locality.
There then follows a series of reactions which have
been grouped in Table I according to whether or not
they are thrombin dependent. The second group is
thrombin dependent; the equations mean that thrombin,
alone or with other materials, will destroy three
procoagulants: accelerator globulin, prothrombin,
and antihemophilic globulin. The three equations listed in the
[...] to effect its inactivation. When the mechanisms of the
first three groups of Table I are coupled with the many 
observations that materials such as AHG, PTC, AcG, 
platelet factor 3, etc., appear to be consumed in acti- 
vating prothrombin, it is apparent that the explosive 
development of thrombin can be sharply terminated in 
the region of the clot. 

The presence of sizable amounts of free thrombin 
always, however, constitutes a dangerous situation and 
two mechanisms, one rapid and the other slow, have 
apparently been devised to cope with the situation. 
The first of these is indicated in the first equation under 
group 4 and is the rapid reaction. The experiments 
designed to prove the existence of this inactivating 
factor are assailable, but essentially the suggestion is 
that, when normal prothrombin is activated to thrombin 
in plasma, there arises at the same time an ephemeral 
factor which can inactivate the resulting thrombin. 

The greatest capacity for inactivating thrombin is 
distributed throughout the plasma itself. This activity 
is indicated in the second reaction under group 4, the 
responsible substance or substances being referred to as 
normal antithrombin. There is present in plasma suffi- 
cient normal antithrombin to inactivate 1.5 times the 
thrombin which could appear if all of the prothrombin 
were activated to thrombin. This places the capacity 
of antithrombin at about 450 units/ml. Under ordinary 
conditions, normal antithrombin acts relatively slowly, 
as is evident from the time disappearance of thrombin 
added to plasma which has been diluted fifty-fold with 
saline (Fig. 4, curve 1; see Waugh and Fitzgerald¹⁵).
It is certain that antithrombin is itself inactivated in 
the process of destroying thrombin; the reaction is thus 
stoichiometric. A reversal of the thrombin-antithrombin 
complex has so far not been accomplished. It should be 
emphasized once more that the capacity of plasma 
antithrombin is about 450 units/ml. This large plasma- 
antithrombin capacity to a certain extent compensates 
for the small rate constant of the second-order thrombin- 
antithrombin reaction, the result being that when one 
unit of thrombin is present per milliliter plasma, inactivation
will take place at the rate of 0.05 units
thrombin per second. This rate of inactivation will
increase the clotting time under conditions of fast
activation but will not act with sufficient rapidity to
prevent clotting.


TABLE I. Confinement and shut-off mechanisms.

Group 1.  Thrombin + fibrin = thrombin-fibrin

Group 2.  AcG, VI --T--> inactive AcG, VI
          Prothrombin --T, factors--> inactive prothrombin
          AHG --T--> inactive AHG

Group 3.  Antithromboplastin + thromboplastin --> inactive product
          Antiplatelet factors + platelet factors --> inactive products
          Anti AcG VI + AcG VI --> inactive product

Group 4.  Thrombin + inactivating factor --> inactive thrombin
          Thrombin + antithrombin --> inactive (thrombin antithrombin)


FIG. 4. Inactivation of thrombin by antithrombin. Curve 1
refers to normal antithrombin and curve 2 to the effects which
are produced by traces of heparin.
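
The antithrombin figures quoted in the text can be tied together in a short Python sketch. It assumes a simple second-order rate law, rate = k × [thrombin] × [antithrombin], with both concentrations in units/ml. The rate constant k is not given in the text and is inferred here from the quoted rate. The half-life function further assumes that antithrombin is in large excess, so that its concentration stays effectively constant. All names are illustrative.

```python
import math

# Figures quoted in the text.
THROMBIN_POTENTIAL = 300.0  # units of thrombin obtainable per ml of plasma
CAPACITY_RATIO = 1.5        # antithrombin capacity relative to that potential
QUOTED_RATE = 0.05          # units thrombin inactivated per ml per second
                            # when 1 unit/ml of thrombin is present


def antithrombin_capacity(potential: float, ratio: float) -> float:
    """Antithrombin capacity of plasma in units/ml."""
    return ratio * potential


def rate_constant(rate: float, thrombin: float, antithrombin: float) -> float:
    """Second-order constant k inferred from rate = k * [T] * [A]."""
    return rate / (thrombin * antithrombin)


def inactivation_rate(k: float, thrombin: float, antithrombin: float) -> float:
    """Rate of thrombin inactivation (units/ml/sec) under the assumed law."""
    return k * thrombin * antithrombin


def half_life_excess_antithrombin(k: float, antithrombin: float) -> float:
    """Thrombin half-life (sec) when antithrombin is in large excess."""
    return math.log(2) / (k * antithrombin)


if __name__ == "__main__":
    capacity = antithrombin_capacity(THROMBIN_POTENTIAL, CAPACITY_RATIO)
    k = rate_constant(QUOTED_RATE, 1.0, capacity)
    print(f"capacity = {capacity:.0f} units/ml")
    print(f"k = {k:.2e} ml per unit per sec")
    print(f"half-life = {half_life_excess_antithrombin(k, capacity):.1f} sec")
```

Under these assumptions a small rate constant multiplied by a large antithrombin capacity gives the modest but finite inactivation rate described in the text.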

The use of heparin, the highly acidic (sulfated)
mucopolysaccharide, as an anticoagulant is well known.
At this time, attention is directed only to the fact that
heparin, at the small concentrations at which it exerts
a strong anticoagulant action, acts through the normal
antithrombin mechanism.¹⁵ Figure 4 indicates the important
points of this action, namely, that heparin
speeds up the rate of normal antithrombin action by a 
factor of over 50 and at the same time decreases the 
efficiency of inactivation so that the antithrombin 
capacity drops by a factor of 1.5 to 2.0. Heparin, a 
product of the liver, is present in most tissues of the 
body and can be liberated and appear in the blood 
stream during anaphylactic shock. The normal concen- 
tration of free heparin in the blood, if it is present at 
all, is so low as to be difficult to measure. From the 
standpoint of hemostatic feedback mechanisms, the 
possibility that traces of heparin determine the rate at 
which normal antithrombin acts is most interesting. 
Heparin has been mentioned earlier as an important 
factor in the destruction of plasma thromboplastin. 
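
The two effects of heparin described in this paragraph can be put in numbers with a small Python sketch. It takes as baseline the figures quoted earlier in the text for normal antithrombin (0.05 units of thrombin inactivated per milliliter per second at 1 unit/ml of thrombin, and a capacity of 450 units/ml). The speed-up factor of 50 is used as a lower bound, since the text says "over 50". The function name and the returned keys are illustrative only.

```python
def with_heparin(rate, capacity, rate_factor=50.0, capacity_divisors=(1.5, 2.0)):
    """Apply the quoted heparin effects to a baseline antithrombin system.

    rate              -- baseline inactivation rate (units/ml/sec)
    capacity          -- baseline antithrombin capacity (units/ml)
    rate_factor       -- speed-up of antithrombin action (text: "over 50")
    capacity_divisors -- range by which the capacity drops (text: 1.5 to 2.0)
    """
    smallest_drop, largest_drop = capacity_divisors
    return {
        "rate": rate * rate_factor,                # lower bound on the new rate
        "capacity_max": capacity / smallest_drop,  # capacity after a 1.5-fold drop
        "capacity_min": capacity / largest_drop,   # capacity after a 2.0-fold drop
    }


if __name__ == "__main__":
    result = with_heparin(rate=0.05, capacity=450.0)
    for name, value in result.items():
        print(f"{name}: {value:g}")
```

On these figures the reduced capacity falls to somewhere between 225 and 300 units/ml, the upper end being the quantity of thrombin that the text says a milliliter of plasma can yield.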

It is now clear that the shut-off mechanisms attack 
any active components which leave the clot area. Under 
more favorable circumstances of time it would be neces- 
sary at this juncture to examine some of the concepts 
concerning the necessity of having threshold levels of 
procoagulants before the activating mechanisms are 
effective. There is evidence that such is the case and, if 
so, threshold levels must be intimately associated with 
the shut-off mechanism. That is, they would prevent 
an extensive spread of the explosive development of 
thrombin under conditions where activators would 
escape and be carried into the blood stream. 

Consider now the clot structure itself. The developing 
clot has platelets adhering to its fibers, particularly at
the fiber junctions. These platelets may be the loci at
which such junctions were formed in the first place, it 
being understood that purified fibrinogens will form firm 
clots without platelets being present, thus that platelets 
are not necessary for fibrin development. In any event, 
the presence of the platelets gives rise in the normal 
course of events to an expulsion of the fluid portions of 
the clot, thus to clot retraction. As pointed out at the 
start, retraction gives rise to a relatively fine-meshed 
protective wound cover. It is now quite certain that 
platelets or platelet derivatives (e.g., serotonin) are 
necessary in the retraction process. The opinion has been 
expressed that the fibers of the clot are “contracting” 
in this process of clot retraction. Another alternative, 
which is the more likely, is that a process of zippering 
together is initiated at the fiber junctions and that this 
zippering process extends gradually throughout the clot. 

Another important action takes place during clotting. 
This is the adsorption of an inactive enzyme called 
profibrinolysin” on to the strands of the clot. On activa- 
tion, profibrinolysin gives rise to the enzyme fibrinolysin 
whose specificity is directed against fibrin, and to an 
extent fibrinogen. Fibrin is attacked via the splitting of
peptide bonds, the fragments produced being unable to 
reassociate to form fibrin or a clot structure under the 
conditions existing in plasma. Not much is known about 
the normal mechanism for the activation of profibrinol- 
ysin, current methods for activation involving such 
techniques as the shaking of plasma with chloroform 
or the addition of chemical compounds, for example, 
streptokinase which is obtained from the corresponding 
cocci. The focus of attention in this discussion is the 
fact that fibrinolysin activity is known to be generated 
within the liquor sanguinis itself: after accidental death, 
by fear, stress, and severe exercise, after epinephrine 
injections and at times in “normal” individuals. With 
an appreciation of the importance of the coagulation 
mechanisms in everyday physiology, it is not surprising 
to find an interest in the possibility that a chemical or 
physical treatment can be found which activates only 
the fibrinolysin which is adsorbed to the fibrin clot. 
Another statement, which should come as no surprise, 
is that there occurs in the blood a potent antifibrinolysin 
activity which so far has prevented the therapeutic use 
of fibrinolysin preparations. 

At the onset, reference was made to the possibility 
that a slow, continuous coagulation was necessary in 
the maintenance of capillary structure. Immediately 
thereafter, the factors involved in fast clotting and the
control of the fast-clotting process were outlined. Return
now to the problem of slow clotting, realizing that many
of the complications of clotting may occur as a necessity
for maintaining and controlling a system acting slowly
and continuously, but poised for rapid action.

No absolute proof can be offered in which coagulation
factors or the products of coagulation, for example
monomer fibrin, can be directly implicated in capillary
properties such as fragility and permeability. The
evidence is indirect and can be summarized briefly by
pointing out that interferences in the normal clotting
mechanisms often lead to internal hemorrhage, a situa-
tion in which blood cells and plasma leak out through
the capillaries into the tissue spaces. A few of the coagu-
lation defects which lead to hemorrhage are shown in
Table II (see also Wintrobe,¹ p. 285).


TABLE II. Hemorrhage owing to clotting defects.

I. Platelet deficiency: from unknown causes or symptomatic

II. Defects in activating components
(a) Antihemophilic globulin (AHG)
(b) Plasma-thromboplastin component (PTC)
(c) Plasma-thromboplastin antecedent (PTA)
(d) Accelerator globulin (AcG)

III. Prothrombin deficiency: lack of vitamin K, caused by drugs,
by disease or is congenital

IV. Fibrinogen deficiency: congenital or through liver damage

V. Anticoagulants: heparin, enzymes (shock)

First, it is seen that a deficiency in platelets may lead 
to hemorrhage. This deficiency can arise either through 
an inherent platelet deficiency or through an external 
agent, disease for example, which reduces the numbers 
of platelets. 

Second, deficiencies in procoagulation components 
have the same effect. Here one finds included anti- 
hemophilic globulin, plasma-thromboplastin component 
(mentioned earlier), possibly plasma-thromboplastin 
antecedent, and accelerator globulin. 

Third, a deficiency of prothrombin, whether it be 
induced by vitamin-K deficiency, by drugs, by disease, 
or is congenital. 

Fourth, a deficiency in fibrinogen: congenital or as a 
result of liver damage. 

Fifth, drugs which increase antithrombin action such 
as heparin and circulating anticoagulant enzymes which 
may appear in shock. 

The common denominator for this impressive list is 
active coagulation. In considering the physiological 
necessity for slow coagulation, the author has been impressed
by the difficulty with which the normal anticoagulation
mechanisms remove or inhibit the last
traces of thrombin. This is a situation so frequently 
encountered that one is tempted to think in terms of 
traces of highly resistant thrombin. Such a conclusion 
would be premature since the alternative that the co- 
agulation system also has a mechanism which preserves 
slight traces of thrombin is attractive: such a system 
would not only provide for slow continuous clotting 
but would supply the traces of thrombin which might 
spark the initial stage of autocatalytic activation. I 
feel that the slow-clotting mechanism will emerge as a 
mechanism warranting close attention in the future. 


BIBLIOGRAPHY 


1 M. M. Wintrobe, Clinical Hematology (Lea and Febiger,
Philadelphia, 1956), fourth edition.

2 R. Biggs and R. G. MacFarlane, Human Blood Coagulation
and its Disorders (Charles C. Thomas, Springfield, Illinois, 1957),
second edition.

3 K. M. Brinkhous, R. D. Langdell, and R. H. Wagner, Ann.
Rev. Med. 9, 159 (1958).

4 W. H. Seegers, Advances in Enzymol. 16, 23 (1955).

5 K. M. Brinkhous, editor, International Symposium on Hemo-
philia and Hemophiloid Diseases (University of North Carolina
Press, Chapel Hill, North Carolina, 1957).

6 W. H. Seegers, R. I. McClaughry, and J. L. Fahey, Blood 5,
421 (1950).

7 F. Lamy and D. F. Waugh, Physiol. Rev. 34, 722 (1954).

8 S. J. Shulman, J. Am. Chem. Soc. 74, 5706 (1952).

9 J. E. Fitzgerald, N. S. Schneider, and D. F. Waugh, J. Am.
Chem. Soc. 79, 601 (1957).

10 L. Lorand and A. Jacobsen, J. Biol. Chem. 230, 421 (1958).

11 H. A. Scheraga and M. Laskowski, Jr., Advances in Protein
Chem. 12, 1 (1958).

12 S. Shulman, N. Alkjaersig, and S. Sherry, J. Biol. Chem. 233,
91 (1958).

13 W. H. Seegers, B. H. Landaburu, R. R. Holburn, and L. M.
Tocantins, Proc. Soc. Exptl. Biol. Med. 95, 583 (1957).

14 D. F. Waugh and B. J. Livingstone, J. Phys. & Colloid Chem.
55, 1206 (1951).

15 D. F. Waugh and M. A. Fitzgerald, Am. J. Physiol. 14, 627
(1956).




REVIEWS OF MODERN PHYSICS 


VOLUME 31, NUMBER 2 


61 
Hormone Regulation 


DEWITT STETTEN, JR. 


National Institute of Arthritis and Metabolic Diseases, National Institutes of Health, Public Health Service, 
U. S. Department of Health, Education, and Welfare, Bethesda 14, Maryland 


In higher organisms, whether animal or vegetable,
there exist channels through which fluids are trans- 
mitted from one tissue to another. Into these fluids or 
humors are delivered many compounds elaborated in 
one or another tissue. It is to be expected that, in many 
cases, these compounds will exert recognizable physio- 
logical effects in other tissues to which they may be 
hydraulically transmitted. Among the host of com- 
pounds which meet these criteria, a limited number 
have been set apart under the special category of hor- 
mone or “chemical messenger.” It is the purpose here 
to review briefly what these hormones are, whence they 
arise, whither they go, and how they accomplish their 
missions. Although hormonal mechanisms are well de- 
scribed in invertebrates and indeed in plants, these re- 
marks are restricted to mammalian endocrinology. 

To speak of the mission of a compound is to lapse 
into teleological jargon. Teleology, however, has been 
defined as a woman with whom no scientist cares to be 
seen in public, but with whom all scientists consort in 
private. It will simplify my task if you will forgive my 
teleological indiscretions in this paper which permits 
me to dodge the otherwise necessary cumbersome 
circumlocutions. 

What then is a hormone? It is a compound devoid of 
intelligence but containing nonetheless in its structure 
certain information. While there are no Koch’s postu- 
lates by which one may judge whether a compound is 
or is not a hormone, there are certain accepted standards 
to which most respectable hormones conform. 

1. In many but not all cases, hormones arise in tissues 
which appear to be specialized for their production. In 
the mammal, these organs include the neurohypophysis, 
the adenohypophysis, the thyroid, the parathyroids, the 
islets of Langerhans of the pancreas, the adrenal medul- 
lae, the adrenal cortices, the testes, the ovarian follicles, 
and the ovarian corpora lutea. Other organs, such as 
the pineal body, the hypothalamus, the thymus, the 
carotid bodies, etc., are not included in the present list 
of so-called endocrine organs in view of their relatively 
uncertain status and lack of specific and convincing 
information. 

It should not be supposed, however, that all known 
hormones arise in tissues anatomically specialized for 
their production. The word “hormone” was coined 
actually to describe an as yet unidentified principle 
termed secretin, discovered by Bayliss and Starling, 
which appears to arise in the intestinal mucosa, to enter 
the blood stream, and to stimulate, upon its arrival 
there, the secretion of juices by the pancreas. A second 
but distinct material, arising from the same source,
cholecystokinin, stimulates the contracture of the gall 
bladder, and the injection of bile into the duodenal 
lumen. Similarly, there is partial evidence for the ex- 
istence of renal hormones affecting vascular tone, he- 
patic hormones affecting uptake of blood glucose by 
muscle, and indeed a hormone liberated by working 
skeletal muscle which affects the metabolism of resting 
muscle in the same animal. 

Whereas a number of the hormones which arise in the 
specialized endocrine glands, e.g., the pituitary, the 
adrenal, etc., have been purified and characterized 
structurally, this has not been accomplished for hor- 
mones arising from such tissues as liver or intestinal 
mucosa. Consequently, the structural information pre- 
sented is confined to products of the so-called glands 
of internal secretion. 

2. A second characteristic of hormones is that they 
normally circulate and exert their regulatory roles at 
low molar concentrations. On the basis of published 
estimates of the normal concentration of insulin in 
blood, it would appear that this agent is effective at 
concentrations of about 107" to 10-” molar, and epi- 
nephrine, glucagon, and thyroxin all appear to exert 
recognizable effects at similar levels of concentration. 

3. Some, but not all, hormones exert their effects 
upon highly specific target organs. Thus, adrenocorti- 
cotropic hormone of the pituitary operates essentially 
uniquely upon the adrenal cortex, while the activity of 
thyrotropic hormone is restricted largely to the thyroid. 
The sole convincingly demonstrated action of glucagon 
is upon the liver. On the other hand, epinephrine exerts 
actions upon liver and muscle, insulin has been shown 
to operate in many tissues including skeletal muscle, 
adipose tissue, and mammary gland, whereas the effects 
of adrenocortical steroids and thyroxin may be demon- 
strated in a wide variety of tissues. 

4. If a hormone is to exert a regulatory effect, clearly 
the rate of its release into the blood stream must be 
variable and should be determined by some function of 
the mechanism which it regulates. Such variation in 
rate of discharge has been demonstrated for certain but 
not for all hormones. For example, one may cite insulin, 
which regulates the level of glucose in the blood and the 
release of which by the islet cells of the pancreas is 
determined directly by the concentration of glucose in 
the pancreatic blood supply. Analogously, ACTH, re- 
leased by the pituitary, specifically stimulates the 
adrenal cortex to discharge steroid hormones, which, if 
injected, cause a prompt suppression of ACTH production.
Many other such examples of negative feedback,
more or less well-documented, might be cited.

FIG. 1. Structures of certain phenolic hormones.

It may be profitable to consider briefly some of the 
structural types which are represented among the 
hormones. These include a variety of substances, among 
the simplest of which are certain phenols (Fig. 1). 
Under the regulation of thyrotropic hormone derived
from the anterior pituitary gland, a protein, thyroglobulin,
is elaborated in the thyroid gland, which contains
the unique amino acid thyroxin, 3,5,3′,5′-tetraiodothyronine.
This amino acid, accompanied by triiodothyronine,
is discharged into the bloodstream where
it circulates, largely protein-bound, in the plasma.
Epinephrine and norepinephrine derived from adrenal
medulla are related closely both metabolically and structurally.
There is a considerable amount of information
at hand defining the biosynthetic origins of these compounds.
Tyrosine contributes the phenolic portions in
each case. With these, as well as with certain other
hormones, secondary sites of origin exist. The synthesis of
thyroxin-like materials in the totally thyroidectomized
animal¹ is perhaps not surprising in view of the fact that
tyrosine-containing proteins, after incubation with
iodine in alkali, in the absence of enzymes, exhibit
thyroxin activity.

FIG. 2. Structures of certain C₁₈ steroid hormones (N. B. Equilin
and equilenin represent successive states of oxidation of B ring
of estrone). [Structural formula of estrone.]

Compound             Substituent at C-17   Substituent at C-16
α- or β-Estradiol    -OH                   -H
Estriol              -OH                   -OH

Steroidal hormones comprise a large and much- 
studied group. They often are classified structurally 
according to the number of carbon atoms contained. 
Thus, the C₁₈ steroids include the estrogens (Fig. 2),
estrone, α- and β-estradiol, estriol, as well as the less
familiar equilin and equilenin. All of these compounds
have phenolic, i.e., acidic, properties. The methyl
group borne on carbon atom 10 of most other steroids is
lacking in these compounds, as a consequence of the
unsaturation. Arising predominantly in the Graafian
follicle of the ovary, under regulation by pituitary
gonadotropin, estrogens are generated also by testis and 
possibly also by adrenal cortex. They are detoxified 
normally in the liver by conjugation with glucuronic or 
sulfuric acids, and the loss of this function in hepatic 
failure results in feminizing symptoms and signs. 

The C19 steroids include (Fig. 3) the androgens, testosterone
and its congeners. There are a number of 
steroids with androgenic activity and these may arise 
in the adrenal cortex as well as in the testis. The C21 
steroids (Fig. 4) include not only the several adreno- 
cortical steroids but also progesterone, the hormone of 
the ovarian corpus luteum. The generation of these 
hormones is under direct control of specific anterior 
pituitary hormones. Testicular production of androgens 
is regulated by pituitary gonadotropin, adrenocortical 
activity is determined by pituitary adrenocorticotropin, 
and progesterone production depends upon a luteinizing 
hormone of the pituitary gland. 
The structural similarities of the several steroid 
hormones are explained by the fact that they are all 
biogenetically closely related. Cholesterol appears to be 
a parent compound to all of them, and many of the 
individual enzyme-catalyzed steps whereby they arise 
have been elucidated. It is, of course, of great interest 
that specific and important biological activities of a 
wide variety are associated with steroids. In addition 
to the steroid hormones, other biologically important 
steroids are vitamin D, the cardiac aglycones of the 
digitalis group of drugs, the toad poisons, and certain 
carcinogens. What it is in the steroid structure which 
imparts to the members of this class these various ac- 
tivities is not known. 

The remaining groups of hormones about which there 
is structural information are polypeptides. Of these, 
insulin (Fig. 5) is probably the best-studied. Beef in- 
sulin, whose chemical structure was given by Sanger,” 
is made up of two polypeptides, an A- or acidic chain 
comprising 21 amino acids and a B- or basic chain con- 
taining 29 amino acids. The intramolecular disulfide 
bridges have been described previously by Oncley 
(p. 30), and their integrity is essential to the physio- 
logical activity. The C-terminal alanine in the B-chain 
is apparently not essential, however, and indeed the 
eight C-terminal amino acids of this chain may be 
eliminated without complete inactivation. The question 
of what constitutes the essential center of physiological 
activity of insulin, if indeed there is such, remains un- 
answered. When this structure first was published, 
attention was attracted to the pentapeptide closed ring 
in the A-chain, particularly since rings of similar size 
were found by du Vigneaud?‘ to be present in both of the 
hormones of the posterior pituitary gland, vasopressin 
and oxytocin. The curious run of three adjacent aro- 
matic amino acids in the B-chain also was pointed out. 
FIG. 5. Structures of bovine insulin and the hormones of the posterior pituitary. 

FIG. 4. Structures of certain C21 steroid hormones (N.B. Progesterone differs from deoxycorticosterone only in that -CH2OH(21) is reduced to -CH3). 


No evidence has been presented implicating either of 
these sites specifically in insulin function. 

The posterior pituitary hormones present striking 
similarities and, as might be expected, some degree of 
cross-reactivity physiologically. There is some reason 
to believe that these compounds may be parts of a larger 
polypeptide arising in the hypothalamus and trans- 
mitted down the pituitary stalk, to be stored in and 
released from the posterior lobe of the pituitary gland. 

Attention is directed now to amino acids 8, 9, and 10 
in the A-chain of insulin. Insulins of several species have 
been prepared; all are very similar in physical properties 
and are essentially identical in physiological assay, 
about 27 units per mg. Chemical differences do exist, 
however, but thus far the sole differences that have been 
detected reside within these 3 amino acids (Fig. 6). 
Although the sample studied is confessedly too small 
to permit firm conclusions to be drawn, a few suggestive 
findings have turned up. In each position, 8, 9, or 10, 
only one of 2 amino acids has been found. Thus, in 
position 8, there is either alanine or threonine; in 
position 9, serine or glycine; in position 10, valine or 
isoleucine. If this binary replacement proves general, and 
no other alterations are found, only 8 insulins (2 × 2 × 2) 
are possible. There may be, however, an additional restriction. 
On comparison of positions 8 and 10, it appears that, 
whenever the former is alanine, the latter is valine; 
whenever the former is threonine, the latter is isoleucine. 
Should this represent a firm link in the coding to these 
2 positions, only 4 insulins would occur, and indeed all 
four types now are known. It is not surprising, then, that 
pig and whale insulins prove to be identical (Fig. 6). 

FIG. 6. Species variations in insulin structure. 
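
The counting argument for insulin variants can be made concrete with a short enumeration. The following is only an illustrative sketch in Python: the residue alternatives are those named in the text, while the function names and the three-letter abbreviations are introduced here for convenience.

```python
from itertools import product

# Residues reported at A-chain positions 8, 9, and 10 (taken from the text).
POSITION_CHOICES = {
    8: ("Ala", "Thr"),
    9: ("Ser", "Gly"),
    10: ("Val", "Ile"),
}


def all_variants():
    """Every combination if the three positions vary independently."""
    return list(product(*(POSITION_CHOICES[p] for p in (8, 9, 10))))


def linked_variants():
    """Combinations obeying the apparent 8-10 link: Ala with Val, Thr with Ile."""
    allowed_pairs = {("Ala", "Val"), ("Thr", "Ile")}
    return [v for v in all_variants() if (v[0], v[2]) in allowed_pairs]


if __name__ == "__main__":
    print(len(all_variants()))     # independent binary choices: 2 * 2 * 2
    print(len(linked_variants()))  # with positions 8 and 10 linked: 2 * 2
    for variant in linked_variants():
        print("-".join(variant))
```

Linking positions 8 and 10 removes one of the three independent binary choices, which is why the count falls from eight to four.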

Insulin is generated in the β cells of the islets of 
Langerhans. Its function, as is discussed later, is to 
favor the utilization of blood glucose by muscle and to 
restrict glucose formation by liver. It is noteworthy, 
therefore, that the release of insulin by β cells is favored 
by a high concentration of blood glucose. 

Intimately associated anatomically with the β cells 
are the α cells of the islets which are the source of glucagon. 
This polypeptide (Fig. 7) was for many years an 
unsuspected contaminant of the best available insulin 
from American commercial sources although, because of 
differences in manufacturing procedure, it was lacking 
in Danish insulin. Chemically and physiologically it is 
unrelated to insulin.® The sole sequential similarity between 
insulin and glucagon that has been pointed out 
is the series 22-24 in glucagon (Phe.Val.Glu-NH2) 
which resembles Phe.Val.Asp-NH2 (1-3 chain B) of 
insulin. Whereas both glucagon and insulin affect the 
level of blood glucose, they do so by influencing quite 
different processes, and appear to be antagonists only 
on very superficial inspection. 

The last group of hormones about which there is 
useful structural information are the polypeptides of 
the anterior pituitary gland. There are at least six of 
these, thyrotropin, adrenocorticotropin, somatotropin 
(growth hormone), follicle stimulating, luteinizing, and 
melanophore stimulating hormones. Other activities 
have been suspected in the past and may be revealed 
in the future. Complete structural information is avail- 
able, for certain species, in regard to corticotropin and 
melanotropins. The formulas for certain of these have 
been given by Oncley (p. 30). Here, attention is 
directed only to certain salient points of similarity and 
dissimilarity (Fig. 8). Amino acids 4 through 10 of all 
corticotropins studied are identical with amino acids 7 
through 13 of all known melanotropins. This identity 
of a series of seven amino acids is sufficiently startling 
to attract attention. Li® has pointed out that the identi- 
cal sequences would have been eleven rather than seven, 
but for a curious confusion about positions 3 and 11 of 
corticotropins, or 6 and 14 in melanotropins. An inter- 
change between the serine and lysine in the appropriate 
positions of either of these peptides would prolong the 
identical sequences from seven to eleven. 

This observation, as well as the previously mentioned 
species variations in insulin, suggests the occurrence of 
certain blunders in the transmission of the coded in- 
formation pertaining to amino-acid sequences. Such 
blunders, if indeed so they be, may themselves be in- 
formative. Thus, the present figure suggests that the 
code is such as to permit interchange of information 
about 2 amino acids 8 positions remote in a polypeptide. 

To date, accumulating knowledge of hormone struc- 
ture has, for the most part, revealed little about how 
these compounds function. Only within the past few 
years have reasonable hormone effects been observed in 
biological preparations disorganized below the cellular 
level. Several of these now may be mentioned. 

Thyroxin has been suspected for some time of exerting 
an effect similar to that of 2,4-dinitrophenol, a suspicion 
dating back to the early clinical observations of the 
toxicology of the latter drug during World War I. When 
it was ascertained that dinitrophenol effected an un- 
coupling of oxidative phosphorylation, permitting respi- 
ration to proceed without obligatory and coincident 
phosphorylation, a similar effect was attributed to, and 
soon found with, thyroxin. This effect has been demon- 
strated with isolated mitochondria by Lehninger.” Such 
uncoupling, and consequent escape of glycolysis from 
the restraining influence of the Pasteur effect, would 
explain many of the clinical manifestations of thyrotoxicosis. 

Estrogens have been shown by Villee® to activate the 
isocitric-acid dehydrogenase of placenta and this effect 
has been demonstrated in homogenized preparations 
(cell-free). More recently, Talalay® has indicated that 
the enzyme of placenta directly affected is a transhydro- 
genase which regenerates the TPNH required by the 
dehydrogenase in question. It is possible that, in this 
transhydrogenase reaction, the estrogen serves as a 
self-regenerating co-substrate. 

Two hormones about which there is partial mecha- 
nistic information are epinephrine and glucagon. It is 
perhaps remarkable that these two substances, wholly 
unrelated structurally and biogenetically, should both 
exert, at equimolar concentrations, virtually identical 
effects. These effects, elucidated by Sutherland,” relate 
to the activation of hepatic phosphorylase. This enzyme, 
required for glycogen breakdown and probably also for 
its synthesis, is itself enzymatically inactivated with the 
loss of inorganic phosphate. Its reactivation also may be 
accomplished enzymatically, but this requires a novel 
cofactor, an anhydride of AMP. The synthesis of this 
cofactor, by yet another enzyme, proceeds only in the 
presence of epinephrine and/or glucagon. The two 
hormones, though acting identically in liver, are quite 
different in their effects upon muscle glycogen. Here, 
whereas epinephrine is highly active, glucagon is with- 
out effect. 

Insulin, which has received the attention of many 
excellent investigators since its isolation, is in a more 
confused state than some of the other hormones which 
have been considered. The dramatic capacity to lower 
blood sugar apparently is the result of effects upon two 
processes. To evaluate these two effects, one should 
consider briefly some of the sources and fates of blood 
glucose (Fig. 9). Here are represented in abbreviated 
form some of the reactions of concern. In the absence 
of insulin, as in the severely diabetic human or as in the 
pancreatectomized animal, the conversion of extracellu- 
lar glucose into intracellular glucose 6-phosphate, an 
essential first step in glucose utilization, is impaired. 
This impairment is especially demonstrable in muscle 
tissue and characteristically is seen in the isolated dia- 
phragm. The basis for the impairment is still somewhat 
obscure but the bulk of current evidence favors the 
view, first propounded by Levine," that insulin favors 
the entry of glucose into the intracellular compartment 
of muscle. Insulin demonstrably performs this function 
in relation to a number of other sugars. Possibly sig- 
nificant is the fact that the sugars most responsive to 
insulin in this regard are configurationally identical 
with glucose about carbon atoms 1, 2, and 3. 

A second pertinent insulin effect is defined best by 
the fact that the activity of glucose 6-phosphatase is 
greatly enhanced in the liver of the diabetic animal, an 
effect reversible by administration of insulin. The im- 
portance of this enzyme is revealed by the fact that it 
catalyzes the reaction responsible for the majority of 
glucose contributed to the blood by animal tissues. 

Consider the dilemma of the diabetic animal in the 
light of these two defects. With the conversion of glucose 
to glucose 6-phosphate restricted, and with the hydroly- 
sis of this ester exaggerated, there results a curtailment 
of glucose phosphate needed for a variety of important 
processes, including glycolysis, oxidation, generation of 
reduced coenzymes, preparation of essential intermedi- 
ates. Many, but certainly not all, of the late conse- 
quences of untreated diabetes are referable directly to 
this combination of biochemical defects. The question 
of which of the two defects is most significant is in a 
state of flux at the present time. Data from Hastings” 
and others suggest that the increase in utilization of 
glucose by muscle occurs much more promptly in re- 
sponse to insulin than does a decrease in glucose 6- 
phosphatase activity. On the other hand, Weinhouse 
and his collaborators'* have data indicating that the 
reverse is true. 

In relation to this figure, one can consider also the 
purported “antagonism” between insulin and glucagon. 
Glucagon, as mentioned previously, affects favorably 
the activity of hepatic phosphorylase, promotes break- 
down of liver glycogen, and produces a resulting rise in 
blood glucose. Insulin, by its actions, produces a fall in 
blood glucose. If one’s vision is restricted to the glucose 
of the blood, these two hormones act antagonistically. 
The blood glucose, however, is really relatively unimportant, 
except insofar as it nourishes the several 
tissues of the body, and the major consuming tissue is 
skeletal muscle. With respect to the nutrition of skeletal 
muscle, glucagon and insulin, far from being antagonists, 
are synergists. Glucagon supplies to the blood the glu- 
cose which muscle requires, and insulin facilitates the 
entry of this glucose into the cells so that it may be 
utilized. 

It was mentioned at the outset that many endocrines, 
insulin among these, are physiologically active at re- 
markably low concentrations. The finding of Stadie” 
that several tissues responding to insulin in vitro are 
endowed with the capacity to bind insulin to their 
surfaces, and that the response to insulin is, under ap- 
propriate circumstances, proportional to the amount of 
insulin bound, is perhaps relevant. If the target organ 
has the capacity to bind the hormone to itself, local 
concentrations of hormone may be built up far in excess 
of the concentrations occurring in the blood. The dose- 
response relationship for several hormones is compatible 
with this idea in that, whereas the response is approxi- 
mately proportional to dosage at low dosage levels, for 
many hormones a maximal response is attained asymp- 
totically at high dosage levels. This suggests that some- 
thing, possibly the acceptor site on the target organ, is 
being saturated with hormone. This situation may be 
treated arithmetically’® by assuming a reversible associ- 
ation between hormone (H) and target-organ acceptor 
site (T). If K is the dissociation constant and Q is the 
total concentration of acceptor sites (Fig. 10), the 
problem may be treated as Lineweaver and Burk treated 
the Michaelis-Menten problem, to yield the final ex- 
pression on the figure. If now it be assumed tentatively 
that the concentration of hormone H is proportional to 
dosage and that the response is proportional to the 
concentration of bound hormone HT, the reciprocals of 
these two quantities should be related linearly. In a 
limited number of cases where this relationship has been 
tested, the expected linear relationship indeed has been 
found. Values for K and Q can be estimated from the 
slopes and intercepts of such plots. The usefulness of 
the relationship diminishes insofar as some quantity 
other than the concentration of bound hormone limits 
the observed response. 
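
The double-reciprocal treatment described above can be sketched numerically. This is a minimal illustration, not the expression of Fig. 10 itself: it assumes the simplest reversible association H + T ⇌ HT with K = [H][T]/[HT] and Q = [T] + [HT], which gives [HT] = Q[H]/(K + [H]) and hence 1/[HT] = (K/Q)(1/[H]) + 1/Q. It further assumes that free hormone equals dose and that response equals [HT] in suitable units. All function names are hypothetical.

```python
def bound_hormone(h, K, Q):
    """[HT] for free hormone concentration h, from K = [H][T]/[HT], Q = [T] + [HT]."""
    return Q * h / (K + h)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_K_Q(doses, responses):
    """Estimate K and Q from the reciprocal plot: slope = K/Q, intercept = 1/Q."""
    slope, intercept = fit_line([1.0 / d for d in doses],
                                [1.0 / r for r in responses])
    Q = 1.0 / intercept
    K = slope * Q
    return K, Q


if __name__ == "__main__":
    # Synthetic, noise-free data generated from assumed values K = 2, Q = 10.
    doses = [0.5, 1.0, 2.0, 4.0, 8.0]
    responses = [bound_hormone(d, K=2.0, Q=10.0) for d in doses]
    print(estimate_K_Q(doses, responses))
```

With real data the proportionality constants between dose and [H], and between response and [HT], are unknown, so the fitted values are K and Q only up to those scale factors.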

It is perhaps unfortunate to terminate this distin- 
guished series upon such an unsatisfactory note. In 
earlier papers on macromolecular structure and inter- 
action, biological fine structure, information storage and 
retrieval, etc., one was left with the idea that there is a 
narrowing gap between the biological description and 
the physical model, or even explanation. In the matters 
presently under consideration, the gap is so wide that I, 
for one, am unable to see across it. Of such a gap, one 
may hope, but one can not know, that it is being 
narrowed. Certainly many experiments will have to be 
performed before the hormones, their origins, their 
modes of action, and their interrelationships can be 
defined in physical terms. 
I. INTRODUCTION 

RESEARCH on line broadening, though often 
regarded as pedestrian and unlikely to lead to 
new fundamental insights, is nonetheless inspired by a 
vision. One dreams that in a distant star or in some 
other inaccessible region filled with matter a few atoms, 
perhaps hydrogen atoms, emit lines whose structure can 
be analyzed in terrestrial laboratories. These lines carry 
in many cases information we have already learned to 
understand: red shifts revealing masses, Doppler shifts 
revealing motions, and sometimes Doppler widths 
revealing temperatures. It is clear that in principle the 
physical properties of the medium containing the 
radiating atoms are somehow reflected in the line 
structure, since they affect the forces between the 
radiating atom and its neighbors, the distances over 
which these forces are exerted, and the times during 
which they act. One hopes, therefore, that when the 
language of the spectral lines has been fully learned, a 
radiating atom in a distant material environment can 
serve as a noninterfering probe conveying si[...] 
data regarding pressure, temperature, or, more generally, 
the distribution of molecular speeds, a[...] 
[...] now far from fulfillment, and this [...] 
a modest contr[...] 

This survey is limited to a small part of the line-width 
problem, a problem whose scope and complexity is not 
widely appreciated and which has suffered perhaps 
from an excessive optimism of authors who, on pro- 
posing a partially successful simple model yielding to 
mathematical treatment, have served enthusiastic 
notices that the whole problem is solved. When, as 
often happens, experimental observations fit the special 
model, the matter is easily regarded as closed despite 
logical difficulties, and it sometimes takes severely 
critical appraisals to set the problem on the path of 
progress again. 
The part treated here is plasma broadening, a group 
of effects peculiar to lines emitted in a medium that 
is strongly ionized but has no net charge. Hence we do 
not consider the results of impacts between neutral 
molecules, nor any of the theories suitable principally 
for their description. Other review¹ articles fill this 
need. In a plasma, the heavy positive charges and the 
light electrons often require different treatment, as the 
following discussion shows. Attention is restricted to 
emission because in ionized media absorption is not 
often directly observable and the problems connected 
with phenomena of reabsorption necessitate considera- 
tions unrelated to the broadening agencies here re- 
viewed. Nor are microwaves specifically included in our 
work; optical lines are our chief concern. 
Applications of the theory are found in four widely 
separated fields of investigation: in astrophysics, gas 
discharges, strong shocks, and flames or explosions. 
Some practical urgency attaches to line-width studies 
because of the importance which all of these researches 
currently enjoy. 
This introductory section surveys two very general 
methods that have found application to plasma line- 
broadening problems. One is called statistical theory; 
the other is variously called impact theory, velocity 
broadening or phase-shift broadening theory. They refer 
to different models that cannot easily be compared. 
Because of the intuitive difference of the models the 
effects they generate are sometimes regarded as separate 
or even additive, and oversimplifications have appeared 
as a result of this seductive fallacy. Statistical and 
impact theories are certain limiting instances of a more 
general theory, each having its own range of application, 
and each losing its validity in a large domain of physical 
conditions of interest to the experimenter. There are 
also situations in which both theories apply and give 
identical answers. But it happens that in plasma 
broadening the integrity of the two methods is rather 
better preserved than under almost any other circumstances, 
the reason being that the heavy ions exert 
long-range and slowly varying forces which satisfy the 
statistical theory, whereas the light and swift electrons 
are very often tractable by impact methods. 

1 (a) [...]; (b) [...]; (c) [...], Vierteljahresschr. astron. Ges. 78, 213 (1943); (d) S. [...], Revs. Modern Phys. 29, 20 (1957); (e) R. G. [...], Revs. Modern Phys. 29, 94 (1957); (f) I. I. Sobel'man, [...] (U.S.S.R.) 54, 551 (1954). (This paper contains [...] between the Doppler effect and [...], a problem not considered here.) 

FIG. 1. Dependence of energies on interparticle distance. 


A. Statistical Theory’ 


A spectral line arising from a transition between an 
upper level of energy $E_2$ and a lower of energy $E_1$ has 
its normal (radian) frequency $(E_2 - E_1)/\hbar$. But $E_2$ and 
$E_1$ are not constants; they vary as a result of perturbations 
caused by other molecules, the "perturbers." 
In the presence of only one perturber the dependence 
of $E_2$ and $E_1$ on the distance, $r$, between the centers of 
the radiating and perturbing particle is given by the 
well-known potential energy curves, of which a typical 
set is drawn in Fig. 1. A transition occurring at a large 
distance $r_3$ has a normal $\omega$; at $r_2$, $\omega$ is smaller and at $r_1$ 
larger than normal. Thus to every $r$ there corresponds 
an $\omega(r) = [E_2(r) - E_1(r)]/\hbar$ and some $\omega$'s are more likely 
than others. In a statistical sense, the probability $P(\omega)$ 
equals the fraction of configuration space in which 
$E_2 - E_1 = \hbar\omega$ or, more precisely, that fraction times the 
Boltzmann factor, $\exp\{-[E_2(r) - E_2(\infty)]/kT\}$. It has 
been customary to neglect the Boltzmann factor on the 
supposition that it is practically 1 where $P(\omega)$ is large. 
This may sometimes lead to serious errors; yet we 
assume it here. 

In the presence of $N$ perturbers, the frequency distribution 
$I(\omega)$ is still given by $P(\omega)$, but configuration 
space now has $3N$ dimensions, and 

$$I(\omega)\,d\omega = P(\omega)\,d\omega = \left(\frac{4\pi}{V}\right)^{N} \int\cdots\int r_1^2 r_2^2 \cdots r_N^2 \, dr_1 \cdots dr_N. \tag{1.1}$$

The integration extends over the restricted domain in 
which 

$$\omega - d\omega < \omega(r_1, r_2, \ldots, r_N) < \omega + d\omega.$$

Usually, in application of statistical theory, additivity 
is assumed for the contributions to $\omega$: 

$$\omega(r_1, \ldots, r_N) = \sum_i \omega(r_i).$$

This is often true for large $r$, but never for small $r$, 
where the main contribution to the interaction comes 
from the exchange forces. In a plasma, $\omega(r)$ for large $r$ 
is the difference in atomic energy resulting from the 
Coulomb forces which are exerted by the passing ions 
and electrons. Mathematical techniques for evaluating 
the integrals in (1.1) involve the use of $\delta$ functions 
(Dirichlet's method) now so common as not to require 
review (see Margenau² and Chandrasekhar⁵). 

2 (a) H. Margenau, Phys. Rev. 40, 387 (1932); (b) 43, 129 (1933); (c) 44, 931 (1933); (d) 48, 755 (1935); (e) 82, 156 (1951). 
3 M. Kulp, Z. Physik 79, 495 (1932); 87, 245 (1933). 
4 (a) H. Kuhn, Phil. Mag. 18, 987 (1934); (b) H. Kuhn and F. London, Phil. Mag. 18, 983 (1934). 


B. Impact Theory 


An infinitely sharp line implies a radiating process of 
infinite duration. Natural line width arises from the 
fact that the upper state of the atom has a finite lifetime, 
and can be obtained by carrying out a Fourier 
analysis of the sinusoidal vibration for a finite time. 
According to the reasoning of Lorentz, impacts of 
perturbers shorten the lifetime below the natural duration 
and thereby increase the width of the emitted line. 
Mathematical representation of this process is very 
simple: Suppose an atom is allowed to emit radiation 
of frequency $\omega'$ between times 0 and $T$. Then the distribution 
of Fourier amplitudes is given by 

$$J(\omega,T) \propto \int_0^T e^{i(\omega-\omega')t}\,dt = \frac{e^{i(\omega-\omega')T} - 1}{i(\omega-\omega')}.$$

If this were the only radiative act observed, the frequency 
distribution would be $|J(\omega,T)|^2$. The observed 
line, however, is a composite of radiations from many 
atoms radiating for different lengths of time, but with 
a mean time equal to $\tau$, the reciprocal of the collision 
frequency $\nu_c$. The probability that a given atom radiate 
for $T$ seconds is $\tau^{-1}e^{-T/\tau}$. Hence the intensity at $\omega$ is 

$$I(\omega) = \tau^{-1}\int_0^\infty |J(\omega,T)|^2 e^{-T/\tau}\,dT \propto \frac{1}{(\omega-\omega')^2 + \nu_c^2}. \tag{1.2}$$

This is the famous Lorentz "dispersion" curve whose 
full width at half-maximum is $2\nu_c$ on the radian 
frequency scale. In this article the "half-width" $w_{1/2}$ 
is defined as full width of the line at a height equal to 
½ its maximum intensity. Formula (1.2), despite its 
simplicity, meets with singular success in predicting 
that a "pressure-broadened" line has a width proportional 
to the collision frequency, and since 

$$\nu_c = nqv \tag{1.3}$$

in terms of the number density of perturbers $n$, the 
collision cross section $q$, and their velocity $v$, this width 
turns out to be proportional to the pressure at a 
constant temperature. Hence the name, pressure 
broadening. 

5 S. Chandrasekhar, Revs. Modern Phys. 15, 1 (1943). 
6 H. Lorentz, Proc. Amsterdam Acad. Sci. 8, 591 (1906). 
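
The averaging step that leads from the single wave train to Eq. (1.2) can be checked numerically. The sketch below is an illustration only: it uses $|J|^2 = 2[1-\cos(\Delta T)]/\Delta^2$ with $\Delta = \omega - \omega'$, integrates against the weight $\tau^{-1}e^{-T/\tau}$ by a plain trapezoid rule, and compares the result with the closed form $2/(\Delta^2 + \nu_c^2)$. The function names, step count, and cutoff are choices made here, not part of the original treatment.

```python
import math


def wave_train_power(delta, T):
    """|J(omega, T)|^2 for a wave train of duration T, with delta = omega - omega'."""
    if abs(delta) < 1e-12:
        return T * T  # limiting value as delta -> 0
    return 2.0 * (1.0 - math.cos(delta * T)) / (delta * delta)


def averaged_intensity(delta, nu_c, n_steps=20000, t_max_in_tau=40.0):
    """Trapezoid-rule average of |J|^2 over durations T with weight exp(-T/tau)/tau."""
    tau = 1.0 / nu_c
    dt = t_max_in_tau * tau / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        T = k * dt
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * wave_train_power(delta, T) * math.exp(-T / tau) / tau
    return total * dt


def lorentz_profile(delta, nu_c):
    """Closed form of the same average: 2 / (delta^2 + nu_c^2)."""
    return 2.0 / (delta * delta + nu_c * nu_c)


if __name__ == "__main__":
    nu_c = 2.0  # arbitrary collision frequency in radian-frequency units
    for delta in (0.0, 1.0, 2.0, 6.0):
        print(delta, averaged_intensity(delta, nu_c), lorentz_profile(delta, nu_c))
```

At $\Delta = \nu_c$ the profile falls to half its peak value, so the full width at half-maximum is $2\nu_c$, as stated in the text.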

The meaning of an impact is clear only if the mole- 
cules are rigid spheres, an hypothesis of the worst sort 
for ions. Even for neutral structures, Eq. (1.2) is not 
wholly satisfactory, for if used in conjunction with (1.3) 
q must take on values quite different from gas-kinetic 
cross sections, usually larger values. To hide the dif- 
ficulty a new phrase was coined; theorists began to 
speak of optical cross sections, to the delight of experi- 
menters who now had a brand new quantity to measure. 
The unpleasant fact is, however, that Lorentz’ theory 
in its simple form breaks down when the perturbers 
are surrounded by force fields, i.e., when impacts are 
“soft”. 

Qualitatively speaking, collisions affect the radiative 
process in two ways: 


(1) Some actually terminate it, quenching the upper 
state by transferring the energy of excitation pre- 
maturely to another place. Precisely under what con- 
ditions an impact will quench is not easy to say because 
this process requires a delicate balance of energies 
within perturber and radiator. But it is known that 
quenching occurs, because it manifests itself in an 
added line width and a reduction in the total line 
intensity. 

(2) The other collision effect is a phase change in the 
emitted radiation accompanying the detuning of its 
frequency by a passing particle. If the phase change 
connected with a collision is large enough the passage 
will effectively divide the wave train into two incoherent 
ones, and this is tantamount to termination so far as 
line width is concerned. In this case no change in 
intensity accompanies this interruption because the 
atom goes on radiating a similar line during each suc- 
cessive free period. Lorentz did not distinguish between 
quenching and phase-altering impacts. 


Lenz and Weisskopf⁷ incorporated the idea of phase 
changes in the impact theory and achieved thereby a 
measure of success in giving meaning to an optical 
cross section, or an "optical impact radius." In the 
absence of perturbations, Weisskopf reasoned, $\omega$ = constant 
= $\omega'$ and the phase is a linear function of $t$, $\omega' t$. 
This corresponds to a sharp line. If an appreciable 
change $\Delta\varphi$ is added to this linear trend during a perturbation, 
that perturbation acts like Lorentz's impact. 
Now 

$$\Delta\varphi = \int (\omega - \omega')\,dt = \frac{1}{\hbar}\int (\Delta E_2 - \Delta E_1)\,dt, \tag{1.4}$$

where $\Delta E(r)$ is the departure of the $E$ curves in Fig. 1 
from their final values. Writing $\epsilon(r)$ for the second difference 
$\Delta(E_2 - E_1)$ and assuming $\Delta E$ to be so small 
that the perturber is deflected but very little from its 
straight-line path, then, if it flies by at a closest 
distance $\rho$ from the radiator and has a speed $v$, 

$$\Delta\varphi = \frac{1}{\hbar}\int_{-\infty}^{\infty} \epsilon\!\left[(\rho^2 + v^2 t^2)^{1/2}\right] dt = \Delta\varphi(\rho, v). \tag{1.5}$$

The added phase depends on $\rho$ and $v$. An effective 
impact, i.e., one which kills coherence, is an impact for 
which $\Delta\varphi > \varphi_0$, a quantity presumably of order of 
magnitude 1. Thus the relation $\Delta\varphi = \varphi_0$ divides all 
impacts into two ideal classes, those which broaden 
and those which do not. Since a smaller $\rho$ leads in 
general to a larger $\Delta\varphi$, the relation $\Delta\varphi = \varphi_0$ defines 
a $\rho_c(v)$ below which all impacts broaden the line. 
Therefore $\pi\rho_c^2(v)$ plays the role of Lorentz' $q$ and may 
be introduced in (1.3) and (1.2). Here then is a qualitative 
reason why optical cross sections may differ from 
gas kinetic ones, and why they depend on $v$. 
We shall often encounter the equation $\Delta\varphi = \varphi_0$ as a 
limiting condition separating the ranges in which different 
kinds of treatment are necessary. The precise 
value of $\varphi_0$ never matters, but we assume it to be of 
order of magnitude 1 (or $\pi/2$, or even $\pi$) and call it the 
critical phase. If $\Delta\varphi$ arises from single impacts, the 
critical phase defines an optical radius or critical impact 
distance $\rho_c$, which depends on $v$ and on the form of $\epsilon$, 
i.e., the law of interaction. Likewise, there results an 
optical cross section $\pi\rho_c^2$. 

7 (a) W. Lenz, Z. Physik 25, 299 (1924); (b) V. Weisskopf, Z. Physik 75, 287 (1932). 
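
As an illustration of how Eq. (1.5) fixes an optical radius, consider the special case, assumed here purely for the sake of example, of a power-law interaction $\epsilon(r)/\hbar = C/r^n$. The integral then evaluates to $\Delta\varphi = a_n C/(v\rho^{n-1})$ with $a_n = \sqrt{\pi}\,\Gamma[(n-1)/2]/\Gamma(n/2)$, and setting $\Delta\varphi = \varphi_0$ gives $\rho_c = [a_n C/(v\varphi_0)]^{1/(n-1)}$. The Python sketch below checks the closed form against a direct quadrature, using the substitution $vt/\rho = \tan\theta$. The constant $C$, the exponent $n$, and all function names are hypothetical.

```python
import math


def a_n(n):
    """Integral of (1 + x^2)^(-n/2) over the whole real line."""
    return math.sqrt(math.pi) * math.gamma((n - 1) / 2.0) / math.gamma(n / 2.0)


def phase_shift(rho, v, C, n):
    """Closed-form phase change for a straight-line fly-by with epsilon/hbar = C / r^n."""
    return C * a_n(n) / (v * rho ** (n - 1))


def phase_shift_numeric(rho, v, C, n, steps=20000):
    """Same integral by the midpoint rule after substituting v*t/rho = tan(theta)."""
    dtheta = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2.0 + (k + 0.5) * dtheta
        total += math.cos(theta) ** (n - 2)
    return C * total * dtheta / (v * rho ** (n - 1))


def critical_radius(v, C, n, phi0=1.0):
    """Impact distance rho_c at which the phase change equals the critical phase phi0."""
    return (C * a_n(n) / (v * phi0)) ** (1.0 / (n - 1))


def optical_cross_section(v, C, n, phi0=1.0):
    """pi * rho_c^2, the quantity that replaces Lorentz' q."""
    return math.pi * critical_radius(v, C, n, phi0) ** 2


if __name__ == "__main__":
    C, n = 3.0, 4  # arbitrary illustrative values
    for v in (0.5, 1.0, 2.0):
        rho_c = critical_radius(v, C, n)
        print(v, rho_c, optical_cross_section(v, C, n), phase_shift(rho_c, v, C, n))
```

In this special case the cross section scales as $v^{-2/(n-1)}$, which makes explicit the velocity dependence noted in the text.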

Numerous refinements of this theory have been 
proposed.*”-® One of them is treated later in some detail 
because of its success in applications even under circumstances 
in which its a priori validity is not evident (see 
Sec. IV). 


C. Relation Between Statistical and 
Impact Theories 


The two physical pictures underlying the treatments 
in parts A and B of this section are so completely 
unrelated that a decision as to their domains of validity 
cannot be made on simple intuitive grounds. Attempts 
to combine the results in phenomenological ways, by 
merely saying that both are present independently and 
thus “folding” one distribution into the other?44> have 
had some success, but lack fundamental justification. 
It is necessary to fall back upon a mathematical 
formalism more general than either treatment A or B, 
and to see under what conditions it reduces to these 
species of description. j pa 
Such a formalism is available in the Fourier integral 
for the line width, suitably generalized.®° A classical 


8 (a) Z. Physik 80, 423 (1933); (b) C. Reinsberg, Z. Physik (1938);
(c) E. Lindholm, Arkiv Mat. Astron. Fysik 28B, No. 3 (1941);
(d) H. M. Foley, Phys. Rev. 69, 616 (1946).

‡ This process is not the same as adding the half-widths.

9 P. W. Anderson, Phys. Rev. 76, 647 (1949).

10 H. Margenau, Phys. Rev. 90, 791 (1953).

observable Q, which is a function of momenta and coordinates
of some physical system, is represented in quantum mechanics
by a matrix with elements $Q_{ij}$.
The classical intensity distribution within a line,

$$I(\omega)=\frac{2\omega^4}{3\pi c^3}\left|\int_{-\infty}^{\infty}\mu(t)\,e^{-i\omega t}\,dt\right|^2, \tag{1.6}$$

is such an observable involving momenta and velocities
of the radiating atom through the varying dipole
moment $\mu$ and the atomic frequency $\omega$. The observed
value of Q for the state of the system which is repre-
sented by the statistical matrix S is

$$\bar{Q}=\mathrm{trace}(SQ).$$


Thus the quantum mechanical intensity distribution
(we omit the bar) is, because of (1.6),

$$I(\omega)\propto\mathrm{trace}\left[S\int\mu(t)\,e^{-i\omega t}\,dt\int\mu(t')\,e^{i\omega t'}\,dt'\right], \tag{1.7}$$

where S, $\mu(t)$, and $\mu(t')$ are matrices, the latter two in
the so-called moving system or in “the Heisenberg
form” (see the following). From (1.7), statistical and
impact theories can both be derived as certain limiting
cases.⁸ᵈ,¹¹

The Hamiltonian H for the emitting atom is a
function of t because of the perturbations that broaden
the line. Introducing a time-development matrix U
which satisfies the Schrödinger equation,

$$i\hbar\dot{U}=HU, \tag{1.8}$$

we have

$$\mu(t)=U^{\dagger}\mu U,$$

$\mu$ being the time-constant (Schrödinger) matrix
$\mu_{ij}=\int\psi_i^{*}\left(\sum_n e\mathbf{r}_n\right)\psi_j\,d\tau$. Now suppose the perturbers are
fixed in space. Then H does not depend on t and (1.8)
has the solutions $U_{lm}=\exp[(-i/\hbar)E_l t]\,\delta_{lm}$, where $E_l$
is a stationary eigenvalue of H. This approximation is
too drastic and yields no line width; however, almost
as simple and quite significant is the answer obtained
by making the adiabatic hypothesis.

The adiabatic hypothesis in its quantum mechanical 
form says that, under conditions specified forthwith, 
Eq. (1.8) has a solution of the form 


$$U_{lm}=\exp\left[-\frac{i}{\hbar}\int_0^t E_l(t')\,dt'\right]\delta_{lm}, \tag{1.9}$$

11 (a) The procedure here follows H. Margenau and R. Meyerott,
Astrophys. J. 121, 194 (1955). See also: (b) H. Margenau, Z.
Physik 86, 523 (1933); (c) T. Holstein, Phys. Rev. 79, 744 (1950);
(d) M. Mizushima, Phys. Rev. 83, 94 (1951); 84, 363 (1951);
(e) P. W. Anderson, Phys. Rev. 86, 809 (1952).

‡ The word “adiabatic” has come to be almost meaningless
through unprincipled use in modern physics. This becomes evident
in Sec. IIB, where various specific senses of “adiabatic” and its
opposite, “diabatic” (or its atrocious, pleonastic synonym,
nonadiabatic), are discussed. In the present context, the meaning
should be clear from the discussion.

where the $E_l(t)$ are solutions of

$$H\psi_l(t)=E_l(t)\,\psi_l(t), \tag{1.10}$$

an equation sometimes called the adiabatic Schrödinger
equation. The hypothesis is valid if, during the per-
turbation interval t, and for all states m,

$$\left|\int_0^t dt'\,\frac{(dH/dt')_{lm}}{E_l(t')-E_m(t')}\exp\left\{-\frac{i}{\hbar}\int_0^{t'}\left[E_l(t'')-E_m(t'')\right]dt''\right\}\right|\ll 1. \tag{1.11}$$


Here l is the atomic state initially present, m some
other state. With (1.9) and (1.10) $\mu(t)$ takes a simple
form:

$$\mu_{lm}(t)=\mu_{lm}\exp[i\varphi_{lm}(t)],\qquad \varphi_{lm}(t)=\hbar^{-1}\int_0^t\left[E_l(t')-E_m(t')\right]dt'. \tag{1.12}$$


We now introduce this into (1.7) and assume that S
is diagonal, a form which can always be brought about
by proper choice of state functions. Then, after a few
steps (among them the replacement of the variable
$t'-t$ by $\tau$) there results

$$I(\omega)\propto\int_{-\infty}^{\infty}d\tau\,e^{i\omega\tau}\,C(\tau), \tag{1.13a}$$

$$C(\tau)=\sum_{i,f}S_i\,|\mu_{if}|^2\int\exp\{-i[\varphi_{if}(t+\tau)-\varphi_{if}(t)]\}\,dt. \tag{1.13b}$$


The function $C(\tau)$, whose Fourier transform is the
intensity distribution, is called the correlation function;
its general features form the substance of an elegant
mathematical discipline called the statistics of time
series; hence, this manner of writing $I(\omega)$ is finding
increasing favor with theorists. Qualitatively, the
meaning of C is clear: $\varphi_{if}(t+\tau)$ and $\varphi_{if}(t)$ are certainly
equal when $\tau=0$; hence, C(0) is large. If $\varphi$ has a short
“memory” so that it quickly loses correlation with its
previous values, $C(\tau)$ will fall off rapidly with $\tau$. It is
then something like a $\delta$ function and $I(\omega)$ is broad. If,
on the other hand, $\varphi(t)$ remains correlated with itself
(in the absence of interaction there is a linear corre-
lation for all time: $\varphi=\mathrm{const}\cdot t$) during an appreciable
period, $C(\tau)$ will differ from 0 for that period, and $I(\omega)$
will be correspondingly narrow. One may indeed show
that the half-width of $I(\omega)$ is approximately the recip-
rocal of the half-width of $C(\tau)$.
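The reciprocal relation between the widths of the correlation function and of the line can be checked numerically. The sketch below is only an illustration: it assumes a correlation function of the form C(τ) = exp(−|τ|/τ_c), which is not taken from the text, and the names `spectrum`, `half_width`, and `tau_c` are hypothetical. It evaluates the transform (1.13a) by the trapezoidal rule and locates the half-width of the resulting profile by bisection.

```python
import math

def spectrum(omega, tau_c, n_steps=4000, cutoff=40.0):
    """I(omega) for the ASSUMED correlation function C(tau) = exp(-|tau|/tau_c).

    C is even, so the transform reduces to 2 * integral_0^inf cos(omega*tau) C(tau) dtau,
    evaluated by the trapezoidal rule up to tau = cutoff * tau_c.
    """
    h = cutoff * tau_c / n_steps
    total = 0.5  # half weight of the tau = 0 endpoint; the far endpoint is negligible
    for k in range(1, n_steps):
        tau = k * h
        total += math.cos(omega * tau) * math.exp(-tau / tau_c)
    return 2.0 * h * total

def half_width(tau_c):
    """Frequency at which I(omega) falls to half of I(0), found by bisection."""
    target = 0.5 * spectrum(0.0, tau_c)
    lo, hi = 0.0, 20.0 / tau_c
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if spectrum(mid, tau_c) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for tau_c in (0.5, 1.0, 4.0):
        print(tau_c, half_width(tau_c), 1.0 / tau_c)
```

For this particular choice of C(τ) the profile is a Lorentz curve, so the half-width found numerically should agree with 1/τ_c: a short memory gives a broad line and a long memory a narrow one.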

In arriving at (1.13), which is suggestive of the 
starting point of the Lorentz theory, we have used only 
the adiabatic hypothesis. Its correctness is tied to 
inequality (1.11). 

An analysis of the conditions under which that
inequality is true is given in the next section. We
observe here primarily that (1.11) contains both the
nondiagonal element $(dH/dt)_{lm}$ and the energy differ-
ence $E_l(t)-E_m(t)$; its validity depends on the relation
between them. The remainder of this section takes
(1.11) for granted and concerns itself with the manner
in which $E_l-E_m$ changes in time, for the applicability
of statistical and impact theories depends on the
behavior of that difference alone.

The statistical theory results if the difference changes
slowly. The quantity $\varphi$ may then be expanded in a
Taylor series

$$\varphi(t+\tau)=\varphi(t)+\varphi'(t)\,\tau+\cdots$$

and all terms in higher powers of $\tau$ may be neglected.
The convergence here is generally good if only
$\varphi''(t)(\tau^2/2)\ll 1$ for all values of t and those values of $\tau$
which matter in the integration of (1.13a), i.e., for
which $C(\tau)\neq 0$. In (1.13b) we assume for simplicity
that only one atomic state i=1 was initially present
($S_i=\delta_{i1}$) and that only one $\mu_{1f}$, namely $\mu_{12}$, is of
appreciable magnitude. Then


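$$C(\tau)\propto|\mu_{12}|^2\int\exp\{-i[E_1(t)-E_2(t)]\tau/\hbar\}\,dt;$$

hence,

$$I(\omega)\propto\int d\tau\,e^{i\omega\tau}\int\exp\{-i[E_1(t)-E_2(t)]\tau/\hbar\}\,dt.$$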


The integral over $\tau$ now leads to a $\delta$ function which is
different from zero only at those times t when $\omega$, the line
frequency under consideration, equals $[E_1(t)-E_2(t)]/\hbar$,
our former $\omega(r_1\cdots r_N)$. By the rules of statistical
mechanics (ergodic theorem) these times have prob-
abilities equal to the fractional volume of configuration
space in which $\omega(r_1\cdots r_N)=\omega$. The exact formal analysis
here involves ranges $d\omega$, near equalities and probability
densities, which we suppress, placing thereby a heavy
burden of infinities and zeros upon the proportionality
signs. With a little care the reader can supply these
purely mathematical details. In sum

$$I(\omega)\propto\int_{\omega}\cdots\int dr_1\cdots dr_N, \tag{1.14}$$

the integration extending over the region of configuration
space in which $\omega(r_1\cdots r_N)=\omega$,
and this is the $P(\omega)$ of Eq. (1.1).

An extension of the reasoning here outlined permits 
proof of two propositions of interest in the use of the 
statistical method.!£8.11 

1. Far in the wings of a line, i.e., as $|\omega-\omega'|\to\infty$
($\omega'$ is again the normal frequency), the statistical theory
is always applicable. The value of $|\omega-\omega'|$ beyond
which this is true depends on the form of the interaction
and cannot be generally specified.

Later we have occasion to return to this proposition
(which we call for short the wing theorem) and therefore
give its more detailed proof, which closely follows
Sobel’man. Substituting Eq. (1.13b) into (1.13a) and
dropping the subscripts and the sum we have

$$I(\omega)\propto\int d\tau\,e^{i\omega\tau}\int e^{-i[\varphi(t+\tau)-\varphi(t)]}\,dt,$$

with the understanding, however, that this quantity
must be averaged over different collisions.

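Let us redefine $\varphi$ (without bothering to change notation)
to mean only the variable part of the phase which is con-
tributed by the perturbation. We then have for the second
integral

$$\int e^{-i[\omega'\tau+\varphi(t+\tau)-\varphi(t)]}\,dt.$$

On interchanging the order of integrations, we have

$$I(\omega)\propto\int e^{i\varphi(t)}\,dt\int d\tau\,e^{i[\Delta\omega\,\tau-\varphi(t+\tau)]},$$

where $\Delta\omega=\omega-\omega'$. We now set $t+\tau=s$ in the second
integral and obtain

$$I(\omega)\propto\int e^{i[\varphi(t)-\Delta\omega\,t]}\,dt\int e^{-i[\varphi(s)-\Delta\omega\,s]}\,ds.$$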


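In the wings of the line, i.e., for sufficiently large $\Delta\omega$,
the contributions to

$$\int e^{i[\varphi(t)-\Delta\omega\,t]}\,dt$$

come from intervals $\Delta t_k$ around the points $t_k$ defined by
$\varphi'(t_k)=\Delta\omega$, since in other regions the rapid oscillations
of the integrand tend to cancel. We can then write

$$\int e^{i[\varphi(t)-\Delta\omega\,t]}\,dt\approx\sum_k\int_{\Delta t_k}e^{i[\varphi(t)-\Delta\omega\,t]}\,dt.$$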


We next expand the exponent in a Taylor series about
$t_k$, obtaining

$$\sum_k e^{i[\varphi(t_k)-\Delta\omega\,t_k]}\int_{\Delta t_k}\exp\left\{i\left[\frac{\varphi''(t_k)}{2!}(t-t_k)^2+\frac{\varphi'''(t_k)}{3!}(t-t_k)^3+\cdots\right]\right\}dt.$$

The important range of the integration, $\Delta t_k$, is of the
order $[\varphi''(t_k)]^{-1/2}$. If, in this range, the term proportional
to $(t-t_k)^3$ is small, the integration can be extended to
infinity. This requires

$$\varphi'''(t_k)\,[\varphi''(t_k)]^{-3/2}\ll 1. \tag{1.15}$$

We then obtain for the integral,

$$\sum_k e^{i[\varphi(t_k)-\Delta\omega\,t_k]}\,[\varphi''(t_k)]^{-1/2},$$

and, for $I(\omega)$,

$$I(\omega)\propto\sum_{k,l}e^{i[\varphi(t_k)-\varphi(t_l)-\Delta\omega(t_k-t_l)]}\,[\varphi''(t_k)\,\varphi''(t_l)]^{-1/2}.$$

The average over all collisions causes the terms for
which $k\neq l$ to cancel, and we find

$$I(\omega)\propto\sum_k[\varphi''(t_k)]^{-1}.$$


The sum here can be interpreted as proportional to the
length of time during which

$$\Delta\omega=\varphi'(t_k)$$

or

$$\omega=\omega'+\varphi'=\hbar^{-1}[E_i(t_k)-E_f(t_k)].$$

Thus $I(\omega)$ is proportional to the time that the atom
spends in the perturbed condition with the energy
separation between levels given by $\omega$. This leads again
to Eq. (1.14) by arguments already presented; hence
the wing theorem.

2. If a spectral line is broadened by single impacts of
perturbers (low density) and $\Delta t$ is the time during
which the passing particle moves a distance equal to
the impact parameter $\rho$, the statistical theory is applic-
able provided

$$\omega_{1/2}\,\Delta t\gg 1. \tag{1.16}$$

Here $\omega_{1/2}$ is the actual, measured half-width. Proof of
theorem 2 rests upon the assumption that for an
individual impact the perturbation can be written in
the form

$$E_i-E_f=\mathrm{const}\cdot r^{-\sigma},$$

where r is the distance between radiator and perturber
and $\sigma$ a small positive integer, so that $\varphi(t)$ can be
constructed for any impact. If one then employs ine-
quality (1.15), noticing that, by virtue of (1.13a), the
half-width of $C(\tau)$ is the reciprocal of the half-width
of $I(\omega)$, a little algebra leads to formula (1.16).

This formula has been proved for individual impacts
only, and $\omega_{1/2}$ is the actual half-width of the line; in the
wing theorem, the value of $\omega$ beyond which the statis-
tical theory holds bears no general relation to the half-
width and differs in different situations.

Next, we demonstrate that (1.13) leads to the impact
results when the phase changes $\Delta\varphi$ occur suddenly and
are therefore separated in time. We use (1.13) later on
to obtain impact formulas more refined than those of
Lorentz or Weisskopf, following the work of Foley,
Lindholm, and others. At present we retrace the step
that led to (1.13) and write

$$I(\omega)\propto\left|\int e^{i[\omega t-\varphi(t)]}\,dt\right|^2, \tag{1.17}$$

assuming again that the atom has only two levels. If
sudden impacts occur at times $t=T_j$, and if these
impacts add amounts $\alpha_j$ to the phase of the atom,

$$\hbar^{-1}(E_1-E_2)=\omega'+\sum_j\alpha_j\,\delta(t-T_j),$$

so that

$$\varphi(t)=\omega' t+A_j\quad\text{for }T_j<t<T_{j+1},$$

where $T_1<T_2<T_3\cdots$ and

$$A_j=\sum_{i=1}^{j}\alpha_i.$$

Then the integral in (1.17) is

$$\sum_j e^{-iA_j}\int_{T_j}^{T_{j+1}}e^{i(\omega-\omega')t}\,dt.$$


When the square of the absolute value of the sum is
taken, there will appear the terms

$$\sum_j\left|\int_{T_j}^{T_{j+1}}e^{i(\omega-\omega')t}\,dt\right|^2 \tag{1.18}$$

and, in addition, cross terms with factors $e^{i(A_j-A_k)}$. If
the $\alpha_j$ are unrelated, the sum of these will effectively
vanish. The remaining summation in (1.18) extends
over different free times $T_{j+1}-T_j=T$ and is equivalent
to the integration over T in Eq. (1.2). Under these
assumptions the Lorentz formula is therefore estab-
lished.
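The averaging over free times can be illustrated numerically. The sketch below is a hedged example, not the derivation of the text: it assumes that the free times T follow the exponential distribution exp(−T/τ)/τ with mean free time τ (Eq. (1.2) is not reproduced in this section, so this distribution is an assumption), and the function names are hypothetical. Each free-flight segment contributes |∫₀ᵀ e^{iΔω t} dt|² = 2(1 − cos ΔωT)/Δω², and the average of this quantity is compared with the Lorentz shape 2/(Δω² + 1/τ²).

```python
import math

def segment_intensity(delta_omega, free_time):
    """|integral_0^T exp(i*delta_omega*t) dt|^2 for one undisturbed segment."""
    if delta_omega == 0.0:
        return free_time ** 2
    return 2.0 * (1.0 - math.cos(delta_omega * free_time)) / delta_omega ** 2

def averaged_profile(delta_omega, tau, n_steps=40000, cutoff=40.0):
    """Average of the segment intensity over the ASSUMED free-time law
    P(T) = exp(-T/tau)/tau, by the trapezoidal rule on [0, cutoff*tau]."""
    h = cutoff * tau / n_steps
    total = 0.0  # the integrand vanishes at T = 0 and is negligible at the cutoff
    for k in range(1, n_steps):
        t = k * h
        total += segment_intensity(delta_omega, t) * math.exp(-t / tau) / tau
    return h * total

def lorentz_profile(delta_omega, tau):
    """Closed form of the same average: a Lorentz curve of half-width 1/tau."""
    return 2.0 / (delta_omega ** 2 + 1.0 / tau ** 2)

if __name__ == "__main__":
    tau = 1.5
    for dw in (0.0, 0.5, 2.0):
        print(dw, averaged_profile(dw, tau), lorentz_profile(dw, tau))
```

Under the stated assumption the two columns agree, and the half-width of the profile is the reciprocal of the mean free time.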

The wings of a line rarely follow the Lorentz formula, 
whereas the central core often conforms to it. The 
reason for this is not hard to see. 

We spoke of sudden phase changes $\Delta\varphi$. In quanti-
tative terms, “sudden” means that the change in $\varphi(t)$
occurring in the integrand of Eq. (1.17) as the result
of a collision is limited to a time interval $\Delta t$ small
compared to the period of the other exponential factor
of the integrand, i.e.,

$$\Delta t\ll 2\pi/\omega. \tag{1.19}$$

Unless the collisions overlap completely (long-range
forces, high density of perturbers) this inequality is
satisfied for sufficiently small $\omega$, that is to say, near the
center of the line. This result might be called the “core
theorem” in contradistinction to the wing theorem. The
former says that the core of a line “results from
impacts,” while the latter claims that the wings “are
statistical.”

These considerations allow these qualitative in- 
ferences: impact theories describe lines at high tem- 
peratures and for sudden perturbations, i.e., at low 
densities; statistical theories meet success at low tem- 
peratures and for heavy perturbers, especially at high 
densities where many interactions coincide and produce 
small net fluctuations of energy at the radiating atom. 

To account for the contour of a given spectral line
when neither impact nor statistical theory can be fully
trusted, the center of the line can usually be interpreted 
as resulting from impacts, the wings as statistically 
broadened. More is said about this in Sec. IV. 

We should comment on the consequences of Eq. 
(1.7) when the adiabatic hypothesis cannot be made, 
that is, when condition (1.11) is violated. This con- 
tingency is most likely to arise when the spacings, 
$E_l-E_m$, are small, hence in the microwave region, and
under other more special circumstances. The story is 
then somewhat involved: the best account of it is to 
be found in Anderson’s article. 

In the plasma situation the statistical theory promises 
a satisfactory description of the ion effects, and it 
would seem that impact theories might be suited to 
treat the swift electrons. But in this latter task, unfor- 
tunately, a fundamentally new problem arises. All 
theories here surveyed require the use of a potential 
energy of interaction e, which is a function of the per- 
turber position. Basically, then, they assume that the 
perturber has a classical path. This is not true in 
general for electrons, whose position is quite diffuse 
when the speed is definite because of the uncertainty 
principle. To operate with an e(r) may therefore be 
meaningless, and a new approach is needed. In the 
next section, we offer qualitative arguments designed 
to permit some discrimination of conditions under which 
classical path theories are useful, and related con- 
siderations. 


II. CLASSICAL-PATH HYPOTHESIS AND 
CRITERIA FOR ADIABATICITY 


A. Classical Path 


Essential to the foregoing analysis of spectral line 
broadening are two assumptions: first, that the per- 
turber has a classical path, i.e., is a point particle with 
spatial coordinates that are functions of the time; and, 
secondly, that adiabaticity prevails, so that the per- 
turber disturbs (without mixing) only the two states 
between which the system is radiating. The lack of a 
completely general workable approach to the broaden- 
ing problem confers a good deal of importance upon a 
knowledge of the conditions under which such assump- 
tions hold. As instances of treatments where these 
assumptions are not invoked we note the quantum 
theories of Jablonski¹² and of Kivel, Bloom, and
Margenau¹³; another example is the theory of dielectric
relaxation of Debye,¹⁴ which treats the broadening
induced by reorientation transitions in the radiator 
during collisions. The broadening mechanisms pertain- 
ing to these latter theories are quite different from 
those discussed in Sec. I. The two classes of results arise 
from opposite extremes in the physical model for the 

12 A. Jablonski, Phys. Rev. 68, 78 (1945).

13 Kivel, Bloom, and Margenau, Phys. Rev. 98, 495 (1955).

14 (a) P. Debye, Polar Molecules (Chemical Catalog Company,
Inc., New York, 1929), Chap. V; (b) J. H. Van Vleck and
V. F. Weisskopf, Revs. Modern Phys. 17, 227 (1945).

interaction. We now formulate some simple criteria for
the validity of the description of Sec. I. Special con- 
sideration is given to the interaction of a radiating 
hydrogen atom with electrons and protons. 

Quantum mechanics yields an accurate account of 
atomic phenomena, and, through the uncertainty 
principle, also defines the limits of classical mechanics. 
This principle focuses attention on our inability to 
describe certain details of an event, such as the collision 
between two systems, without introducing finite dis- 
turbances in the parameters necessary for such de- 
scription. When these disturbances compete in mag- 
nitude with the events one wishes to investigate, the 
event imagined cannot be observed in the degree of 
detail demanded, for then a classical picturization is 
without basis, and we must use quantum mechanics. 

Classical-path theories assume that the distance of
separation r between the perturber and the radiator is
expressible as

$$r(t)=(\rho^2+v^2t^2)^{1/2}, \tag{2.1}$$

which requires exact specification of both the position
and the velocity of the perturber at each instant of time.
For this to be meaningful the uncertainties in these
quantities, as derived from quantum considerations, must
be small in comparison with their actual magnitudes.

Numerous authors base their use of (2.1) on the
magnitude of the angular momentum involved in a col-
lision,¹⁰,⁸ᵈ,¹⁵ relying on the correspondence principle to
validate classical analysis for large orbits. If L is the
angular momentum of the perturber with respect to a
radiator at rest, then $L\approx l\hbar$. For large values of the
integer l we may identify L with $m\mathbf{r}\times\mathbf{v}$, which during
the collision may be taken to be $\sim mrv$.

The relation above in the form

$$mrv=l\hbar$$

and the uncertainty principle,

$$m\,\Delta r\,\Delta v\approx\hbar,$$

yield on division

$$(\Delta r/r)\cdot(\Delta v/v)\approx 1/l, \tag{2.2}$$

and this is much smaller than 1 if l is large.

Equation (2.2) implies that for sufficiently large l any
specified relative accuracy in r and v can be accom-
plished without violating the uncertainty principle, i.e.,
without need for abandoning the assumption of a
classical path. For example, if $l\sim 100$ we may choose
the orbit of the perturber such that $\Delta r/r\approx\Delta v/v\approx 1/10$,
which ratios represent an acceptable percentage ac-
curacy in the specification of these parameters.

The preceding argument leaves many detailed ques-
tions unanswered. Let us consider the problem with a
measure of skepticism which, fortunately, will later
prove excessive. For the process of collision broadening

15 (a) L. Spitzer, Phys. Rev. 55, 699 (1939); (b) E. Lindholm,
Arkiv Mat. Astron. Fysik 32A, No. 17 (1945).

to be tractable by means of a classical path we must
require that the inequality 


$$(\Delta r/r)\cdot(\Delta v/v)\ll 1 \tag{2.3}$$


holds throughout the time of any collision and for all 
impact radii of importance. When the classical path is 
derived properly as the limiting form of a quantum 
mechanical description these requirements lead to rela- 
tions more specific than (2.3), relations which depend 
upon the mass of the perturbers, the density and tem- 
perature of the gas, and the type of interaction which 
pertains to the problem. 

Collisions with large angular momenta, as shown by 
Eq. (2.2), can be described classically; this means that 
the perturbing particles correspond to wave packets, 
well localized both in coordinates and momentum, the 
limits of this localization being Ar and Av. We now 
imagine a wave packet with a mean spatial extension a 
and a mean momentum mv, this latter quantity being 
equal to the actual momentum of the classical particle. 
In addition, we let r be the distance between the radi- 
ator and the center of the packet; it therefore corre- 
sponds to the collision distance previously denoted by 
the same symbol. For the validity of the classical 
description we demand that 


$$a\ll r, \tag{2.4}$$

the ultimate limit of the classical description being $a\sim r$.

In all applications of the theory the interaction is 
taken to be vanishingly small outside of a finite region 
of space, the linear dimension of which is designated by 
d. Physically, this distance represents the extent of 
penetration of the field of the radiator into its sur- 
roundings. For the case of a neutral plasma this distance 
is the Debye length¹⁶

$$d=6.90\,(T/n)^{1/2}, \tag{2.5}$$

where d is given in centimeters, T in degrees absolute,
and n in cm⁻³.
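As a quick numerical illustration of Eq. (2.5), the sketch below evaluates the Debye length and the corresponding transit time d/v. The function names are hypothetical, the speed is taken as the rms thermal value v = (3kT/M)^{1/2}, and the physical constants are rounded cgs values supplied here as assumptions.

```python
import math

K_BOLTZMANN = 1.3807e-16   # erg/K (rounded cgs value, assumed here)
M_ELECTRON = 9.109e-28     # g
M_PROTON = 1.6726e-24      # g

def debye_length_cm(temperature_k, density_cm3):
    """Debye length d = 6.90 (T/n)^(1/2), d in cm, T in K, n in cm^-3."""
    return 6.90 * math.sqrt(temperature_k / density_cm3)

def transit_time_s(temperature_k, density_cm3, mass_g):
    """Order-of-magnitude perturbation time d/v with v the rms thermal speed."""
    v_rms = math.sqrt(3.0 * K_BOLTZMANN * temperature_k / mass_g)
    return debye_length_cm(temperature_k, density_cm3) / v_rms

if __name__ == "__main__":
    T, n = 1.0e4, 1.0e16
    print("d (cm):", debye_length_cm(T, n))
    print("transit time, electrons (s):", transit_time_s(T, n, M_ELECTRON))
    print("transit time, protons (s):", transit_time_s(T, n, M_PROTON))
```

At equal temperature the proton transit time exceeds the electron transit time by the factor (M_p/M_e)^{1/2}, since d does not depend on the perturber mass.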

We consider the interaction region to be a cube of
dimensions d, oriented for convenience with an axis
parallel to the perturber’s path. The time during which
the perturbation is in effect, $\tau_p$, is then of the order of
d/v, a constant for all values of r up to approximately d,
beyond which distance it vanishes.

In a classical description of the passage of the per-
turber through the interaction region, the collision time
must be shorter than the time during which the per-
turber diffuses quantum mechanically through the
distance r. This diffusion results from the initial
localization of the packet and proceeds with a velocity
of the order of $\hbar/ma$. It is therefore necessary that

$$r/(\hbar/ma)\gg d/v. \tag{2.6}$$

Because of (2.4) this inequality must hold for $a\sim r$;

16 L. Spitzer, The Physics of Fully Ionized Gases (Interscience
Publishers, Inc., New York, 1956).

hence, we get from (2.6)

$$a\gg(\bar{\lambda}d)^{1/2}, \tag{2.7}$$

where $\bar{\lambda}$ is the mean de Broglie wavelength of the per-
turber, $\hbar/mv$.

Applying this basic inequality to the most distant
collisions with least momentum uncertainty for which
$r\sim d\sim a$, we obtain $a\gg\bar{\lambda}$, which (since $a\sim r$) is nothing
more than the condition that $l\gg 1$. Thus, the criterion
(2.7) reduces to the condition that the angular mo-
mentum be large when applied to the most distant
collisions; in general, (2.7) is more restrictive than $l\gg 1$.

In addition to the condition that the passage of the
perturber through the interaction region retain its
meaning in view of the diffusion of wave packets, we
might also require that the interaction itself be classical
in terms of precise momentum exchanges.

The interaction energy between the perturber and
the radiator is V(r) in the classical description. This
can usually be expressed in the form

$$V(r)=C_\sigma/r^{\sigma},$$


$C_\sigma$ being some constant. The momentum exchange for
the collision is then

$$\Delta p=-\int\nabla V\,dt.$$

Another quantity of interest in this connection is the
phase change per collision

$$\Delta\varphi=\hbar^{-1}\int V(t)\,dt.$$

In both these integrals we replace dt by dx/v and V(t)
by $C_\sigma(\rho^2+x^2)^{-\sigma/2}$. The integrations should then extend
from $-d/2$ to $d/2$, but a sufficient approximation
results when they are performed from $-\infty$ to $+\infty$.
Curvature of the path is neglected in this procedure.
Elementary calculation yields

$$\Delta p=\frac{\pi^{1/2}\sigma C_\sigma}{v\rho^{\sigma}}\,\frac{\Gamma[(\sigma+1)/2]}{\Gamma[(\sigma+2)/2]}\sim\frac{V(\rho)}{v}$$

and

$$\Delta\varphi=\frac{\pi^{1/2}C_\sigma}{\hbar v\rho^{\sigma-1}}\,\frac{\Gamma[(\sigma-1)/2]}{\Gamma(\sigma/2)}.$$

The critical impact distance $\rho_c$ is that value of $\rho$ for
which $\Delta\varphi=1$ (see the end of Sec. IB). Solving this
equation for $C_\sigma$ in terms of $\rho_c$ and inserting in $\Delta p$, we
find

$$\Delta\varphi=(\rho_c/\rho)^{\sigma-1},\qquad \Delta p=(\rho_c/\rho)^{\sigma-1}\,[(\sigma-1)/\rho]\,\hbar. \tag{2.8}$$

To regard the perturber wave packet as a classical particle
is to neglect its momentum uncertainty $\hbar/a$. The same
uncertainty is thus introduced also in the states of the
radiator system during and after the collision. The
validity of the classical interaction is preserved
if we require that all momentum (and the
corresponding energy) exchanges between the perturber
and the radiator exceed this neglected uncertainty. The
classical picture therefore holds only when

$$\Delta p\sim V(\rho)/v\gg\hbar/a. \tag{2.9a}$$

This places an upper limit on the magnitude of $\rho$ for
which the interaction may be treated classically.
When the opposite inequality

$$\Delta p\sim V(\rho)/v\ll\hbar/a \tag{2.9b}$$

holds, the momentum uncertainty in the system
exceeds the momentum exchanges of the classical
description, and the spread in momenta must be incor-
porated into the description of the perturber (or, what
is equivalent, the spatial spread of the packet must be
taken into account). Then the classical picture of the
collision breaks down.

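From (2.9a), using the more accurate expression for
$\Delta p$ from Eq. (2.8), we have

$$\rho\ll a_\sigma\,\rho_c\,(a/\rho)^{1/(\sigma-1)}, \tag{2.10}$$

where $a_\sigma=(\sigma-1)^{1/(\sigma-1)}$ is of the order of unity, as shown
in the correlation of $\sigma$ with $a_\sigma$ below:

σ   =  2     3     4     6
a_σ =  1.00  1.41  1.44  1.38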

Since $a\ll r$ for all r during the collision, and r can take
on the value $\rho$, (2.10) seems to imply

$$\rho\ll a_\sigma\,\rho_c. \tag{2.11}$$

This is essentially the condition $\Delta\varphi\gg 1$. We conclude
that only collisions within the optical radius are
amenable to a classical-path treatment and all those
which occur outside of $\rho_c$ must be treated by means of
quantum mechanics. We return to this somewhat sur-
prising conclusion at the end of this section.

Letting $a\sim\rho$ as the least stringent condition for the
classical path we obtain in place of (2.11)

$$a\lesssim a_\sigma\,\rho_c. \tag{2.12}$$

The validity of the classical description for collisions
within $\rho_c$ is also contingent on the enforcement of (2.7).
On combining (2.7) and (2.12), there results on elimi-
nating a,

$$(\bar{\lambda}d)^{1/2}\ll\rho_c. \tag{2.13}$$

This condition for the validity of f] si 
effectively independent of the OS arge 
packet describing the perturber i of & 
actually computed the dimens)® ire jitt 
packet [see Eq. (2.23)] and mi way ogee ae 
the size of the radiator. In 4 aa by 

the width of the Lyman-« i ok } 
computed classically at T= for PE 
mass >10 g, which is true 98;

TABLE I. Broadening of Hα by first-order Stark effect. Tem-
peratures for protons and electrons below which the classical-path
treatment holds, for various values of the density.

n (cm⁻³)    T (°K), protons    T (°K), electrons
10¹²        4×10⁶              50
10¹⁴        4×10⁷              500
10¹⁶        4×10⁸              5000
10¹⁸        4×10⁹              50 000

electrons. All this is in keeping with the present result, 
which is derived in a more general way. 

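After substituting (2.5) along with the definitions of
$\bar{\lambda}$ and $\rho_c$ into (2.13) one finds

$$\frac{T^{1/(\sigma-1)}}{n^{1/2}\,M^{(\sigma+1)/2(\sigma-1)}}\ll\frac{(\gamma_\sigma C_\sigma/\hbar)^{2/(\sigma-1)}}{6.90\,\hbar\,(3k)^{(3-\sigma)/2(\sigma-1)}}, \tag{2.14}$$

where v, the perturbers’ velocity, has been taken to be
its rms value

$$v=(3kT/M)^{1/2},$$

M is the mass of the perturber, either a proton or an
electron, and the $\gamma_\sigma$ are numbers of order unity (e.g.,
$\gamma_2=\pi$).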
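The scaling contained in inequality (2.14) can be made explicit with a short calculation. The sketch below is an illustration only: it assumes (2.14) in the form T^{1/(σ−1)} ≪ n^{1/2} M^{(σ+1)/2(σ−1)} (γ_σ C_σ/ħ)^{2/(σ−1)} / [6.90 ħ (3k)^{(3−σ)/2(σ−1)}] with γ_σ = π^{1/2} Γ[(σ−1)/2]/Γ(σ/2), the function names are hypothetical, the constants are rounded cgs values, and the interaction strength C_σ/ħ is left as an input rather than taken from the text.

```python
import math

HBAR = 1.0546e-27          # erg s (rounded cgs value, assumed here)
K_BOLTZMANN = 1.3807e-16   # erg/K
M_ELECTRON = 9.109e-28     # g
M_PROTON = 1.6726e-24      # g

def gamma_sigma(sigma):
    """Phase factor pi^(1/2) * Gamma[(sigma-1)/2] / Gamma(sigma/2)."""
    return math.sqrt(math.pi) * math.gamma((sigma - 1) / 2) / math.gamma(sigma / 2)

def threshold_temperature(sigma, c_over_hbar, density_cm3, mass_g):
    """Temperature at which the two sides of the ASSUMED form of (2.14) are equal.

    c_over_hbar is C_sigma / hbar in cgs units; it is a free input of this sketch.
    """
    s = sigma
    rhs = (math.sqrt(density_cm3)
           * mass_g ** ((s + 1) / (2 * (s - 1)))
           * (gamma_sigma(s) * c_over_hbar) ** (2 / (s - 1))
           / (6.90 * HBAR * (3 * K_BOLTZMANN) ** ((3 - s) / (2 * (s - 1)))))
    return rhs ** (s - 1)

if __name__ == "__main__":
    c2 = 1.0  # placeholder interaction strength, purely illustrative
    for n in (1e12, 1e14, 1e16, 1e18):
        print(n,
              threshold_temperature(2, c2, n, M_PROTON),
              threshold_temperature(2, c2, n, M_ELECTRON))
```

For σ = 2 the threshold grows as n^{1/2} and M^{3/2}, and for σ = 4 as n^{3/2} and M^{5/2}; these are the ratios between successive rows and between the proton and electron columns of a table built from (2.14), whatever value of C_σ is inserted.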

This result has the advantage of containing no
reference to $\rho$, which has no experimental meaning.
However, inequality (2.14) is a sufficient condition
which is in some instances too severe. For hydrogen,
the first-order Stark effect ($\sigma=2$) involves¹⁷


Cox 3 (t?/m)n' (n’—1), 


where $n'$ is the principal quantum number of the
excited state of the radiating atom and m is the electron
mass. This is a sufficient approximation. Accurately,
$n'(n'-1)$ should be replaced by $N_{n'}$ such that

$$N_3=4.5,\quad N_4=9.9,\quad N_5=23.6,\quad N_6=31.8.$$

Hence (2.14) becomes

$$\frac{T}{n^{1/2}M^{3/2}}\ll\frac{(\pi C_2/\hbar)^2}{6.90\,\hbar\,(3k)^{1/2}}.$$

For the first Balmer line (Hα, $n'=3$), the inequality
(2.14) gives Table I. For the higher Balmer lines the
temperatures appearing here are multiplied by the

For a pure hydrogen plasma, thermal ioni-
zation begins at about 7000°K so that appreciable


ji 


(Springer-Verlag,

TABLE II. Broadening by second-order Stark effect. Tem-
peratures for protons and electrons below which the classical-path
treatment holds, for various values of the electron (ion) density.

n (cm⁻³)    T (°K), protons    T (°K), electrons
10¹²        10⁷                0.1
10¹⁴        10¹⁰               100
10¹⁶        10¹³               10⁵
10¹⁸        10¹⁶               10⁸


broadening by electrons and protons does not occur at 
temperatures below this value. Thus much of Table I 
is purely academic. 

For a typical second-order Stark effect in the optical 
region (c=4) Csh =~2X 107 esu. Here we obtain from 
(2.14) $T^{1/3}\ll n^{1/2}M^{5/6}(\gamma_4C_4/\hbar)^{2/3}(3k)^{1/6}/6.9\hbar$ and $\gamma_4=\pi/2$.
This leads to the values of n and T shown in Table II. 
Table II shows that electrons may be treated classically 
for n> 10! cm™ in the second-order Stark effect, while 
protons and heavier ions always permit such treatment. 


B. Criteria for Adiabaticity 


We now transfer attention to the radiating system.
Perturbation of its states can be treated adiabatically
when the inequality (1.11) holds. The state of the
system is then maintained throughout a collision except
for its alteration through the radiation process. There
are two cases of interest: (1) the states of the unper-
turbed radiator may be classified into groups of de-
generate states and (2) the states of the unperturbed
radiator are all nondegenerate. In case 1, two questions
can be asked: (1a) Does the perturbation cause transi-
tions between two states contained within the same
group of degenerate levels? (1b) Does the perturbation
cause transitions between a degenerate state in one
group and a degenerate state in another group separated
from the first by a finite energy? The perturbation may
well be adiabatic in the sense of 1b, diabatic in the
sense of 1a. In this section, we treat only case 1a,
leaving 1b, which is more difficult, for the most part
until Sec. V. Only a brief qualitative remark concerning
1b is presented at the end of this part of Sec. II.
Existence is assumed of a finite perturbation time $\tau_p$,
the rate of change of the perturbation dV/dt being dif-
ferent from zero only during that time which, as we
have seen, is of the order d/v. In addition, we here
treat single impacts only.

For case 1a the energy differences of the instan-
taneous states appearing in the condition for adiaba-
ticity are due solely to the perturbation itself, since all
other states of the radiator may be ignored because of
their much larger energy separations.

As the simplest example, we consider a radiator whose
excited initial state has a two-fold degeneracy, and the
energy splitting is found in the first order of the per-
turbation theory. The adiabatic condition then becomes

$$\left|\int_{-\infty}^{t}\frac{(d/dt')|V_{12}(t')|}{2|V_{12}(t')|}\exp\left[-\frac{i}{\hbar}\int_{-\infty}^{t'}2|V_{12}(t'')|\,dt''\right]dt'\right|\ll 1, \tag{2.15}$$

where $V_{12}(t)=(1|V(t)|2)$, the indices 1 and 2 desig-
nating the two states in question.
The phase shift between the states is given by

$$\Delta\varphi(t)=-\frac{2}{\hbar}\int_{-\infty}^{t}|V_{12}(t')|\,dt'.$$


When $|\Delta\varphi(t)|\gg 1$, there have been many oscillations
of the exponential part of the integrand in (2.15) at
time t. These oscillations cause the integral to be quite
small provided that

(α) $(d/dt)|V_{12}(t)|/|V_{12}(t)|$ is a slowly varying
function of the time during an average period of the
oscillation $\hbar/2\langle|V_{12}|\rangle$ (where $\langle|V_{12}|\rangle$ is a time average
of $|V_{12}|$ for the whole perturbation,

$$\langle|V_{12}|\rangle=\frac{1}{\tau_p}\int_{-\tau_p/2}^{\tau_p/2}|V_{12}(t)|\,dt),$$

and

(β) the perturbation time is much greater than
$\hbar/2\langle|V_{12}|\rangle$.

The first requirement α reduces to the second β when
the classical-path procedure is used. In that case

$$V_{12}(t)=\begin{cases}C_{12}/(\rho^2+v^2t^2)^{\sigma/2}, & |t|\le\tau_p/2\\ 0, & |t|>\tau_p/2\end{cases}$$

and $(d/dt)|V_{12}|/|V_{12}|=(-\sigma v^2t)/(\rho^2+v^2t^2)$. We wish
this to vary slowly within an interval of the order of
$\hbar/\langle|V_{12}|\rangle$. This can occur only when $|t|\gg\hbar/\langle|V_{12}|\rangle$.
To maintain condition α, therefore, the contribution of
$(d/dt)\ln|V_{12}|$ coming from small $|t|$ must be inap-
preciable, and this requires that $\tau_p\gg\hbar/\langle|V_{12}|\rangle$, which
is condition β. The two conditions are not independent.
When the degree of degeneracy is greater than 2,
$E_l-E_m$ is no longer to be identified with $2|V_{lm}|$, but
it is still of the same order of magnitude. The result
just obtained is therefore quite general in degenerate
situations, and we must depend entirely on long, slow
collision processes for the validity of the adiabatic
assumption. Our criterion for adiabaticity in case 1a,
when written in a more general way, is, therefore,

$$\hbar^{-1}\langle|E_l-E_m|\rangle\,\tau_p\gg 1. \tag{2.16}$$

But $\hbar^{-1}\langle(E_l-E_m)\rangle$ is the average frequency shift of the
initial state during the collision. Neglecting the fre-
quency shift in the lower state of the radiative transi-
tion, condition (2.16) takes the form

$$\Delta\varphi\gg 1. \tag{2.17}$$

Hence, in the presence of degeneracy, collisions which
fall within the optical radius are adiabatic; all others are
diabatic. This result has been derived by Spitzer¹⁸ in
more detailed calculations for the Lα line.

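In the absence of degeneracy, case 2, the energy dif-
ferences $(E_l-E_m)$ are much larger than those which
were considered in case 1. We may thus take $E_l-E_m
\approx E_l^0-E_m^0$, where $E_l^0$ and $E_m^0$ are the energies of the
unperturbed states of the radiator. With this understanding
inequality (1.11) reads

$$\frac{1}{|E_l^0-E_m^0|}\left|\int\frac{dV_{lm}}{dt'}\exp\left[-\frac{i}{\hbar}(E_l^0-E_m^0)\,t'\right]dt'\right|\ll 1. \tag{2.18}$$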

Here it is convenient to distinguish between slow and
fast collisions with respect to the oscillatory period
$\hbar/|E_l^0-E_m^0|$. For slow collisions, $\tau_p\gg\hbar/|E_l^0-E_m^0|$.

Integrating (2.18) by parts and making use of the
fact that $V(\pm\infty)=0$ we have

$$\frac{1}{\hbar}\left|\int V_{lm}(t')\exp\left[-\frac{i}{\hbar}(E_l^0-E_m^0)\,t'\right]dt'\right|\ll 1. \tag{2.19}$$
Therefore, when V;(¢) varies insignificantly over the 
period %/|£°—£,,°| the inequality (2.19) will certainly 
be true provided 7,>%/|E°—E,°|, which is our 
premise. This assumption about Vim holds for times 
such that |¢|>%/|£2—£,°| and that the adiabatic 
criterion reduces merely to tp>>ft/|E°—E,o|. Hence, 
all slow collisions are adiabatic. 

For fast collisions, $\tau_p \ll \hbar/|E_l^0 - E_m^0|$, and we may
take the exponential appearing in (2.18) equal to
unity. This yields, upon integration of (2.18),

$$|V_{lm}(\infty) - V_{lm}(-\infty)|\,/\,|E_l^0 - E_m^0| \ll 1,$$

and the quantity on the left is always zero. The adiabatic
criterion is therefore satisfied also when the
perturbation time is short, provided the radiator has
large energy separations.

Thus both very slow and very fast collisions are
adiabatic when degeneracy (or near degeneracy) is
absent in the levels concerned. In the case of degeneracy
only the slow collisions are adiabatic. In this latter
instance the condition for adiabaticity is similar to one
of the conditions for the classical-path approximation,
namely $\Delta\phi > 1$.
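The nondegenerate criterion can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes a Gaussian pulse $V_{lm}(t) = V_0 \exp[-(t/\tau)^2]$ (the pulse shape, the function names, and the two dimensionless parameters are choices made here, not taken from the text) and evaluates the left side of (2.19) by trapezoidal integration, comparing it with the closed form for this particular pulse.

```python
import cmath
import math


def criterion_lhs(omega_tau, v_ratio, steps=4001, span=8.0):
    """Left side of (2.19) for the assumed pulse V(t) = V0 * exp(-(t/tau)**2).

    omega_tau : |E_l0 - E_m0| * tau / hbar (collision time in units of the period)
    v_ratio   : V0 / |E_l0 - E_m0| (perturbation strength over level spacing)
    """
    h = 2.0 * span / (steps - 1)
    total = 0j
    for k in range(steps):
        s = -span + k * h  # dimensionless time t / tau
        weight = 0.5 if k in (0, steps - 1) else 1.0
        total += weight * math.exp(-s * s) * cmath.exp(-1j * omega_tau * s)
    return v_ratio * omega_tau * abs(total) * h


def criterion_lhs_closed_form(omega_tau, v_ratio):
    """Analytic value of the same quantity for the Gaussian pulse."""
    return v_ratio * omega_tau * math.sqrt(math.pi) * math.exp(-omega_tau ** 2 / 4.0)


if __name__ == "__main__":
    for x in (0.01, 1.4, 10.0):
        print(x, criterion_lhs(x, 0.1), criterion_lhs_closed_form(x, 0.1))
```

For a perturbation weaker than the level spacing (here `v_ratio = 0.1`), the quantity is small both for very short and for very long pulses and stays well below unity in between, in line with the statement that slow and fast collisions are both adiabatic when the energy separations are large.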

Under the subject of degeneracy, attention must be given
to spatial degeneracy, that is, to the fact that different
orientations of the angular momentum vector correspond
to the same energy of the radiating system. The meaning
of the term adiabatic is a little confusing in this case,
as the following consideration shows.

L. Spitzer, Phys. Rev. 58, 348 (1940).

Suppose the atom is radiating in a *P, state, and 
assume first that an external magnetic field is present. 
The orientations in space for which m=—1, 0, 1 will 
then have slightly different energies. The passage of a 
perturber past the atom is said to be adiabatic if the 
interaction with the perturber does not transfer the 
radiator from one m value to another or, classically 
speaking, if the radiator’s orientation remains fixed in 
space during a perturber passage. If we retain this 
meaning of the word adiabatic even in the absence of 
an external field, then an adiabatic perturbation is one 
which does not succeed in turning the radiator about, 
or in altering the value of m. 

The interaction between the radiator and the per- 
turber depends on the value of a quantum number m. 
However, this m is not identical with the foregoing, 
which refers to some fixed direction: the interaction 
depends on space quantization with respect to the line 
joining radiator and perturber, and this line moves in 
space as the latter flies by, turning through an angle 
of 180° during a complete passage. We call the space 
quantum number relative to this moving line, m′.
Adiabaticity with respect to m′ is clearly different from
the former version because it implies rotation of the
radiating atom. Spitzer, Lindholm, and Unsöld¹⁸ use
the term in the latter sense.

For the present problem this choice is more appropriate
because in the derivation of our criterion we have
taken $|E_1 - E_2|$ to be fixed by $|V_{12}|$, and this implies that the
radiator is in a state for which the linear combination
of m eigenfunctions changes in time while m′ remains
fixed. With this understanding, inequalities (2.15) and
(2.16) tell us under what conditions m′ is unaltered by
an impact.

But if m′ is not changed, so that the impact is
adiabatic in the second sense, then, since the radiation
has to be analyzed in a fixed coordinate system (the
spectrograph does not revolve about the radiating
atom), an adiabatic impact corresponds to a phase
change of magnitude π. This seems to imply that simple
impact theories, like those of Lorentz and Weisskopf,
which ignore phase changes resulting from this rotation
but consider phase changes of magnitude π produced by
dynamic effects as singularly important, contain grave
errors. Distant impacts, for example, are not counted
in these impact theories; yet they should produce a
phase change π if they are adiabatic in the second sense.
Physical intuition argues that this cannot be. The
resolution of this apparent dilemma is simple: criterion
(2.16) shows clearly that distant impacts are not adiabatic
in the second sense. This means they do not
succeed in swinging the radiator around. They are
adiabatic in the first sense.

This problem, which is practically of minor importance
but presents points of fundamental interest,
is considered carefully by Spitzer. He shows how the
impact theories manage to obtain a reasonable result
by compensating errors, the errors being an unwarranted
claim of adiabaticity and a neglect of the
rotation effect. Further light is shed on the “rotation
problem” by an analysis²¹ having an entirely different
aim, but which indicates nonetheless how and why the
Lorentz theory takes correct account of the change in
orientation of Debye dipoles induced by collisions.

Before concluding the analysis of adiabatic conditions 
we take a brief look at case 1b and inquire how one 
should deal with transitions between one set of degen- 
erate states and another set, separated from the first 
by a finite energy interval. Suppose that for transitions 
between any two states of the initial set and also the 
final set, 

$$\Delta\phi < 1.$$


Then the perturbation is diabatic with respect to the
(perturbed) degenerate states. In that case a consideration
of Unsöld¹⁸ makes it appear plausible that an
optical transition between the two separated level sets
can be handled as though each were nondegenerate.
The reason is that within the limit of the uncertainty
principle the degenerate levels do form a single level
even after they have been split by the perturbation.

To see this we remember that the adiabatic hypothesis
for transitions between the originally degenerate
states of one energy level breaks down when the reverse
of (2.16) is true, viz.

$$\hbar^{-1}\langle |E_l - E_m| \rangle\,\tau_p < 1.$$

This implies that the energy uncertainty in the
system, of the order of $\hbar/\tau_p$, exceeds the discrete energy
shifts $|E_l - E_m|$ due to the instantaneous perturbation.
The states which are split by the field are therefore not
revealed individually through radiation processes involving
them since they overlap in energy by large
amounts and form what amounts to a single diffuse
energy band.

The use of the impact theory when coupled with the
suggestion of Unsöld is roughly equivalent to a procedure
which takes account of the diffuseness in energy
of these states by solving the time-dependent perturbation
equations for the probabilities of states
formed through quantum transitions that are caused
by the collisions. Foley and Kolb²² have done this
using the classical-path assumption, while Kivel, Bloom,
and Margenau, and also Landwehr, have carried
through the calculation without this assumption.

21 J. H. Van Vleck and H. Margenau, Phys. Rev. 76, 1211 (1949).
22 A. Kolb, Thesis, University of Michigan, 1957.
... University, 1956 (unpublished).

In conclusion we call attention to an early calculation
by D. Blochinzew, who investigated the transition from
diabatic to adiabatic behavior of an atom perturbed by
a harmonic electric field, $E_0 \cos\omega t$.


C. Construction of Wave Packets 


Since the classical-path assumption (Sec. IIA) is not
always easy to justify for electron-broadening appli- 
cations, we discuss this problem in quantum terms. 

How can the state function of the perturbing electron 
be constructed? Inequality (2.9b) is the necessary con- 
dition for success of the Born approximation, which is 
so useful in scattering theory. But (2.9b) is normally 
satisfied for the plasma electrons, and as (2.9a) leads 
to (2.10), so (2.9b) leads to its contrary which can 
roughly be written $\rho \gg \rho_c$. Therefore, when densities are
very low and encounters as close as the critical radius 
are rare, conditions under which one ordinarily employs 
a plane wave in describing the perturbing electron seem 
to be satisfied. Nevertheless, the plane wave description, 
which has received extensive use in problems where 
collimated beams of particles are directed at scattering 
centers, is inappropriate to our problem for several 
reasons. 

First, we are here concerned with a single electron 
encountering the radiator, or, as the density is increased 
sufficiently, with a number of electrons statistically 
distributed in momenta and acting simultaneously on 
the radiator. The single electron hardly acts like a plane 
wave because the latter is stationary throughout the 
radiation time, and where in reality the electron has a 
spectrum of momenta the plane wave possesses a single 
value. Furthermore, the plane wave does not yield any 
momentary asymmetry in the charge distribution about 
the radiator, a situation which practically eliminates 
the possibility of a Stark effect.'* As a description of 
the many-electron perturbation the plane wave has 
similar disadvantages for the treatment of broadening, 
except that in this case the assumption of stationarity 
of the perturbation is more acceptable. At large den- 
sities one would expect the fluctuations in the per- 
turbation to be small and the plane wave picture to 
become more accurate. 

In stellar atmospheres where n= 10!/cc, the critical 
radius is only about 10‘ times the Debye cutoff at 
temperatures between 5X10* and 10! °K. Thus we 
might expect something like single impacts by the 
electrons, and strong objections to the use of a plane 
wave for the electron must arise. For such cases 
Margenau and Kivel!” employed a wave packet state 
function. For simplicity, they construct wave packets 
in one dimension and determine the width of the packet 
from the condition of thermal equilibrium. 

The quantum-statistical density matrix for the system
in a volume V and for states labelled by wave numbers
κ and κ′ is taken to be

$$\rho_{\kappa\kappa'} = P_\kappa\,\delta_{\kappa\kappa'}, \qquad P_\kappa = \frac{h^3}{V(2\pi mkT)^{3/2}}\exp\left[-\frac{\hbar^2\kappa^2}{2mkT}\right], \qquad (2.20)$$

and the wave packet is defined by

$$\psi(x,t) = \sum_\kappa C(\kappa,t)\,e^{i\kappa x}. \qquad (2.21)$$

D. Blochinzew, Physik. Z. Sowjetunion 4, 501 (1933).


Correspondence is established between the description
(2.20), which is a stationary mixed case, and (2.21),
which is a nonstationary pure case, by equating expectation
values for the momenta:

$$P_\kappa = |C(\kappa,t)|^2. \qquad (2.22)$$

This procedure yields the amplitudes but not the
phases of the coefficients $C(\kappa,t)$. These are chosen on
the basis of a physical argument. If the collision occurs
at the time t=0, then because at this instant the electron
involved in collision has a maximum energy uncertainty,
the packet should have its maximum concentration
about the radiator at that time. But maximum
concentration means equality of all phases, i.e.,
$P_\kappa = C(\kappa,0)^2$. From (2.22) and (2.21) one then finds (upon
carrying out the summation over κ as an integration)

$$\psi(x,0) \propto \exp[-mkTx^2/\hbar^2], \qquad (2.23)$$


which represents a Gaussian probability packet whose
width is equal to $\hbar/(2mkT)^{1/2}$. This is precisely $\hbar$ divided
by the characteristic thermal momentum $(2mkT)^{1/2}$ of the
ensemble described by (2.20), and
the diffusion velocity of this packet is, satisfyingly enough,
the root-mean-square velocity of the ensemble. The action
of such an electron is to squeeze itself about the atom
at the instant of collision and then to diffuse away
again after the impact time. More general wave packets
of this type, centered about points distant from the
radiator and having finite mean momenta, so that they
move bodily and diffuse, are even better representations
of the electron which may be of assistance in broadening
calculations. A paper by the present authors shows
how the classical formula for Stark broadening arises
out of a wave packet treatment as the wave packets
become more and more concentrated.
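The passage from (2.21) and (2.22) to (2.23) can be checked numerically. The following is a minimal sketch, assuming equal phases at t=0 and replacing the sum over κ by a trapezoidal integral; the length `lam`, defined here by lam² = ħ²/(mkT), and the function name are introduced for illustration only.

```python
import math


def packet_amplitude(x, lam=1.0, steps=4001, span=12.0):
    """psi(x, 0) from (2.21) with all phases equal.

    C(kappa) = exp(-lam**2 * kappa**2 / 4), i.e. the square root of the
    Boltzmann weight in (2.20), with lam**2 = hbar**2 / (m k T).
    The imaginary part cancels by symmetry, so only the cosine is summed.
    """
    k_max = span / lam
    h = 2.0 * k_max / (steps - 1)
    total = 0.0
    for j in range(steps):
        kappa = -k_max + j * h
        weight = 0.5 if j in (0, steps - 1) else 1.0
        total += weight * math.exp(-(lam * kappa) ** 2 / 4.0) * math.cos(kappa * x)
    return total * h


if __name__ == "__main__":
    psi0 = packet_amplitude(0.0)
    for x in (0.5, 1.0, 2.0):
        # Numerical ratio next to the Gaussian exp(-m k T x^2 / hbar^2) of (2.23)
        print(x, packet_amplitude(x) / psi0, math.exp(-x * x))
```

With `lam = 1` the probability density |ψ|² falls to 1/e at x = 1/√2, which in ordinary units is the width ħ/(2mkT)^{1/2} quoted in the text.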


D. Summary and Appraisal 


Section II has dealt with rather basic and intricate 
matters, and has led to some slightly surprising con- 
clusions which, if true, restrict the validity of classical 
approaches more severely than might have been anticipated.
It may also seem amazing that inequalities of
the form $\Delta\phi > 1$ and $\Delta\phi < 1$ decide so many different
issues. The first defines the range of validity of the
classical-path description as well as the domain of
adiabaticity (in the case of degeneracy). We also meet


235 H. Margenau and M. B. Lewis, Phys. Rev. 106, 244 (1957). 
it again in Sec. IV, as a criterion for the correctness of
statistical theories. 

The typical structure of the arguments used in this 
chapter is the following. We carefully specified conditions
under which certain conclusions (e.g., “The classical
path is a valid assumption”) are true. Then we said, 
quite without logical justification, that when the con- 
ditions are violated the conclusion will not be true. All 
we should have said is that we did not prove it to be 
true. We did not investigate the conclusions of more 
general conditions, and have, therefore, no assurance 
that more general premises will not lead to the inter- 
dicted conclusions. Hence all inequalities here en- 
countered are permissive, are sufficient conditions, and 
mean to say that, under their terms, the type of de- 
scription in question is certainly adequate. They do 
not strictly rule out the possibility that the description 
may be appropriate even when the inequality fails. The 
logic outlined is not peculiar to our treatment but 
characterizes most of the arguments found in the 
literature. They provide guides, but no final solutions. 

Fortunately, the criterion derived for classical-path 
description is at times too limited. But there is no way 
of discovering this, or of assuring it, short of proceeding 
with more general premises. This means actually 
starting with quantum theory in the initial phases of 
the calculation to show that such treatment does, in 
fact, yield the same results as the classical-path de- 
scription. 

III. HOLTSMARK THEORY


A. Review of Conventional Treatment 


The oldest and, within its limits, most successful 
theory of line broadening by the particles of a plasma 
is Holtsmark’s; it treats the diffusion of intensities 
within a line as if each part of the line arose from a 
Stark effect caused by the electric field associated with 
a temporary configuration of the moving ions. Most of 
the older review articles and newer texts deal with this 
theory in considerable detail; we, therefore, merely 
sketch its features and develop as much as is needed for 
present purposes. 

A simple approximation gives the frequency distribution
at large distances from the line center extremely
well. This takes into consideration only close binary
encounters between the radiating atom and one ion,
leaving out of account the numerous (and therefore
highly probable) collaborative but weaker perturbations
of many ions further away. Thus it falsifies the
center but describes adequately the wings of the line.
We call this approximation the binary form of the
Holtsmark theory and present it as follows.

Selecting the radiating atom as center, we describe a
sphere of radius r about it. Denote by P(r) the probability
that there is at least one ion within it. To obtain
it we focus attention on its complement $P_-(r)$,
namely, the probability that there shall be no particles
within the sphere. By the law of combination of probabilities,

$$P_-(r+dr) = P_-(r)\cdot p_-, \qquad (3.1)$$

where $p_-$ is the probability that there be no ion between
r and r+dr. But we know $p_-$: it is 1 less the probability
for the presence of particles, one or more, within the
shell dr. Hence, if n is the number density of ions,

$$p_- = 1 - 4\pi nr^2dr - (4\pi nr^2dr)^2 - \cdots,$$

and the powers of dr beyond the first, which represent
the probabilities for the occurrence of 2, 3, ... particles
in dr, can be omitted for sufficiently small dr. Thus Eq.
(3.1) takes the form

$$P_-(r) + P_-'(r)\,dr = P_-(r)[1 - 4\pi nr^2dr],$$

whence on integration $P_-(r) = C\exp[-(4\pi/3)nr^3]$. The
constant C must be adjusted so that $P_-(0) = 1$ and is
therefore 1. Hence $P(r) = 1 - \exp[-(4\pi/3)nr^3]$ and
the probability that the shell contain at least one ion is

$$dP(r) = 4\pi nr^2\exp[-(4\pi/3)nr^3]\,dr. \qquad (3.2)$$

It is also apparent that this is the nearest ion, for the
factor $4\pi nr^2dr$ expresses the indiscriminate likelihood of
the presence of an ion between r and r+dr, while the
factor $\exp[-(4\pi/3)nr^3]$ conjoins it with the condition
that there be no ion within r. The latter factor is often
insignificant numerically and is sometimes omitted;
however, it is necessary if P is to be correctly normalized.
Equation (3.2) leads to $\int_0^R dP(r) = 1 - e^{-N}$
if R is the radius of the volume containing $N = n(4\pi/3)R^3$ particles,
whereas $\int_0^R 4\pi nr^2dr = N$.
Introducing the abbreviation

$$(4\pi/3)\,n = r_0^{-3}, \qquad (3.3)$$

Equation (3.2) may be written

$$dP(r) = \exp[-(r/r_0)^3]\,d(r/r_0)^3. \qquad (3.4)$$

The distance $r_0$ thus defined is the radius of a sphere
whose volume equals the mean volume per ion; it is a
little smaller than the mean distance between ions, $n^{-1/3}$.
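The normalization argument for the nearest-ion distribution is easy to verify numerically. The sketch below is illustrative only; the function names and the Simpson integrator are introduced here and are not part of the original treatment. It integrates (3.2) out to a radius R and compares the result with 1 - exp(-N), and it evaluates r_0 from (3.3).

```python
import math


def mean_ion_radius(n):
    """r_0 from (3.3): radius of the sphere whose volume is the volume per ion."""
    return (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0)


def prob_at_least_one(r, n):
    """P(r): probability of at least one ion inside the sphere of radius r."""
    return 1.0 - math.exp(-(4.0 * math.pi / 3.0) * n * r ** 3)


def nearest_ion_density(r, n):
    """dP/dr from (3.2): distribution of the distance to the nearest ion."""
    return 4.0 * math.pi * n * r * r * math.exp(-(4.0 * math.pi / 3.0) * n * r ** 3)


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


if __name__ == "__main__":
    n, big_r = 2.0, 0.8
    print(simpson(lambda r: nearest_ion_density(r, n), 0.0, big_r))
    print(prob_at_least_one(big_r, n))
    print(mean_ion_radius(1.0))  # compare with n**(-1/3) = 1
```

For unit density the last line gives about 0.62, the factor by which r_0 falls short of the mean spacing.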

Equation (3.4) is the probability distribution in r.
Since a given r defines an electric field, F, and an electric
field defines a frequency displacement via the Stark
effect, that equation also represents the distribution of
the last two quantities: F is given by $F = Ze/r^2$; so substitution
of $r^2 = Ze/F$ in (3.4) converts it to a probability
distribution in F:

$$dP(F) = -\exp[-(F_0/F)^{3/2}]\,d(F_0/F)^{3/2} = (3/2F)(F_0/F)^{3/2}\exp[-(F_0/F)^{3/2}]\,dF. \qquad (3.5)$$

Here $F_0 = Ze/r_0^2$, and the minus sign was inserted in
order that the normalization be correct relative to F,
not $F_0/F$. That is to say, the normalization is such that

$$\int_{F=0}^{\infty} dP(F) = 1.$$
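In the linear Stark effect,

$$\Delta\omega = sF,$$

s being a constant. Writing $\Delta\omega_0 = sF_0$, Eq. (3.5) leads to

$$dP(\Delta\omega) = \frac{3}{2}\left(\frac{\Delta\omega_0}{\Delta\omega}\right)^{3/2}\exp\left[-\left(\frac{\Delta\omega_0}{\Delta\omega}\right)^{3/2}\right]\frac{d(\Delta\omega)}{\Delta\omega}. \qquad (3.6)$$

By the fundamental assumption of the statistical theory
$dP(\Delta\omega)$ is identical with the intensity distribution:

$$dP(\Delta\omega) = I(\Delta\omega)\,d(\Delta\omega). \qquad (3.7)$$

Hence,

$$I(\Delta\omega) = \frac{3}{2\Delta\omega}\left(\frac{\Delta\omega_0}{\Delta\omega}\right)^{3/2}\exp\left[-\left(\frac{\Delta\omega_0}{\Delta\omega}\right)^{3/2}\right]. \qquad (3.8)$$

In the quadratic Stark effect,

$$\Delta\omega = tF^2, \qquad \Delta\omega_0 = tF_0^2,$$

t being another constant. When this is substituted in
(3.5) we obtain, again with the use of (3.7),

$$I(\Delta\omega) = \frac{3}{4\Delta\omega}\left(\frac{\Delta\omega_0}{\Delta\omega}\right)^{3/4}\exp\left[-\left(\frac{\Delta\omega_0}{\Delta\omega}\right)^{3/4}\right]. \qquad (3.9)$$

The “normal” frequency $\Delta\omega_0$, though denoted by the
same symbol, has different values in the linear and the
quadratic Stark effect.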
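As a consistency check on the binary profiles, the sketch below evaluates the linear and quadratic forms, (3.8) and (3.9), and integrates each over all Δω. It is only an illustration: the function names and the logarithmic integration variable are assumptions made here for numerical convenience.

```python
import math


def intensity_linear(dw, dw0):
    """Binary-approximation profile (3.8), linear Stark effect."""
    u = (dw0 / dw) ** 1.5
    return 1.5 / dw * u * math.exp(-u)


def intensity_quadratic(dw, dw0):
    """Binary-approximation profile (3.9), quadratic Stark effect."""
    u = (dw0 / dw) ** 0.75
    return 0.75 / dw * u * math.exp(-u)


def total_intensity(profile, dw0, t_min=-10.0, t_max=60.0, steps=20000):
    """Simpson integral of the profile over dw, using t = ln(dw / dw0)."""
    h = (t_max - t_min) / steps
    total = 0.0
    for k in range(steps + 1):
        dw = dw0 * math.exp(t_min + k * h)
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * profile(dw, dw0) * dw  # d(dw) = dw dt
    return total * h / 3.0


if __name__ == "__main__":
    print(total_intensity(intensity_linear, 1.0))
    print(total_intensity(intensity_quadratic, 1.0))
    # Far wing of the linear profile: close to 1.5 * dw0**1.5 * dw**-2.5
    print(intensity_linear(1.0e4, 1.0), 1.5 * 1.0e4 ** -2.5)
```

Both profiles integrate to unity, as the normalization of (3.5) requires, and the linear profile falls off in the wing as the -5/2 power of Δω.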

The foregoing elementary considerations, which yield
the binary approximations to the intensity distribution,
Eqs. (3.8) and (3.9), have been superseded by the work
of Holtsmark,²⁶ Verwey,²⁷ Schmaljohann,²⁸ and others,
who included the cooperative effect of many ions. Their
results are most easily derived by a method introduced
by Markoff, a good presentation of which is available
in a review by Chandrasekhar,⁵ while its application to
other line-width problems is to be found in reference 2e.
The treatment under review ignores the Boltzmann
factor which attaches to the configuration probability
of the moving ions. Effects of this simplification are
considered later.

26 J. Holtsmark, Ann. Physik 58, 577 (1919); Physik. Z. 20, 162 (1919); 25, 73 (1924).
27 S. Verwey, Dissertation, Amsterdam, 1936.
28 P. Schmaljohann, Staatsexamens-Arbeit, Kiel, 1936. For

FIG. 2. Graph of W(β) vs β. See Eq. (3.11b).

The field F, which appears in Eq. (3.5), must here
be written as a vector sum when it is compounded from
the fields of many ions in different places:

$$F = \Big|\sum_i \mathbf{F}_i\Big|. \qquad (3.10)$$

The analysis leading to (3.5) must be carried out in a
configuration space of 3N dimensions. In place of (3.5),
it yields

$$dP(F) = W(\beta)\,d\beta; \qquad \beta = F/F_0, \qquad (3.11a)$$

$$W(\beta) = \begin{cases} (4/3\pi)\,\beta^2\,(1 - 0.463\beta^2 + 0.1227\beta^4 - \cdots) \\ 1.496\,\beta^{-5/2}\,(1 + 5.107\beta^{-3/2} + 14.93\beta^{-3} + \cdots). \end{cases} \qquad (3.11b)$$

The two alternate expressions are useful for small and
large values of β, respectively. Near β=3, both series
converge slowly. According to Schmaljohann and
Unsöld W(3)=0.175; W(3.5)=0.122. In Fig. 2, the
function W(β) is plotted, showing that

$$\lim_{\beta\to\infty} W = 1.496\,\beta^{-5/2} = 1.496\,(F_0/F)^{5/2}.$$

This is very nearly the same as the limit of Eq. (3.5),

$$\frac{dP(F)}{dF}\,\frac{dF}{d\beta} = \frac{3}{2}\left(\frac{F_0}{F}\right)^{5/2},$$

as asserted earlier.
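The two series in (3.11b) are simple to evaluate. The following sketch, with function names introduced here for illustration, truncates each series after the terms quoted in the text and compares the results with the Holtsmark values of (2/πβ)I_0(β) listed in Table III. The truncations are not expected to be reliable near β = 3, where the text notes that both series converge slowly.

```python
import math


def holtsmark_small(beta):
    """Small-beta series of (3.11b), truncated after the beta**4 term."""
    return (4.0 / (3.0 * math.pi)) * beta ** 2 * (
        1.0 - 0.463 * beta ** 2 + 0.1227 * beta ** 4
    )


def holtsmark_large(beta):
    """Large-beta asymptotic series of (3.11b), truncated after beta**-3."""
    return 1.496 * beta ** -2.5 * (
        1.0 + 5.107 * beta ** -1.5 + 14.93 * beta ** -3
    )


def binary_wing(beta):
    """Large-field limit of the binary distribution (3.5) per unit beta."""
    return 1.5 * beta ** -2.5


if __name__ == "__main__":
    print(holtsmark_small(0.1))   # Table III lists 0.004225
    print(holtsmark_large(8.0))   # Table III lists 0.01038
    print(holtsmark_large(10.0))  # Table III lists 0.00556
    print(holtsmark_large(1.0e4) / binary_wing(1.0e4))
```

The last ratio approaches 1.496/1.5, which is the sense in which the many-ion result is "very nearly the same" as the binary limit in the far wing.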


These statements are not quite correct. In the binary
theory $F_0$ was defined as

$$Ze/r_0^2 = Ze\,[(4\pi/3)n]^{2/3} = 2.60\,Ze\,n^{2/3}.$$

The many-ion calculation involves a slightly different
parameter,

$$F_0 = 2.61\,Ze\,n^{2/3},$$

but the change is so small that its importance is quite
academic, so we do not trouble to alter notation because
of it.

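From Eqs. (3.11) we pass to the frequency distribution
in the same way as before. In the linear Stark
effect $\Delta\omega/\Delta\omega_0 = F/F_0 = \beta$, and since in general

$$I(\Delta\omega)\,d(\Delta\omega) = W(\beta)\,d\beta,$$

we find

$$I(\Delta\omega) = W(\Delta\omega/\Delta\omega_0)\,(1/\Delta\omega_0).$$

In the quadratic Stark effect $\beta = (\Delta\omega/\Delta\omega_0)^{1/2}$,
and Eq. (3.5) gives

$$I(\Delta\omega) = W[(\Delta\omega/\Delta\omega_0)^{1/2}]\cdot\tfrac{1}{2}(\Delta\omega\cdot\Delta\omega_0)^{-1/2}.$$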


The connection between the preceding considerations 
and what was described in Sec. I as a statistical theory 
is not altogether clear. For here, we calculate the prob- 
ability of fields and make the passage to an intensity 
distribution via the Stark effect, whereas the statistical 
quantum theories fix attention upon the probability of
a given energy perturbation ε, a scalar quantity, and
identify this probability with the line intensity at $\hbar\,\Delta\omega$.
Equivalence of the two procedures is intuitively
expected. Margenau and Meyerott, who made a
quantum mechanical calculation using the binary approximation
for the Lα line of hydrogen, indicate how
the equivalence comes about mathematically. Among
other things ε is not the sum, $\sum_i \epsilon_i$, for individual
perturbers but is composed very much like F from the
vectors $\mathbf{F}_i$; hence the Holtsmark theory does not
depart from the statistical pattern.

This treatment includes the effect of the nonuni- 
formity of the Stark field produced by ions. In the 
ordinary theory of the Stark effect, F is treated as a 
constant, whereas the ions produce a field which 
depends on r, the dependence being greater the smaller
r. In general the nonuniformity effect is small, increasing 
in magnitude with ion density, W(8) as given by 
(3.11b) no longer has the asymptotic form 1.5 6-3, but 
1.5 (6-$+-26-a'/ro), a’ being the radius of the lower 
atomic state (first Bohr radius for La). The nonuni- 
formity also causes shifts and widths in Stark com- 
ponents which do not show a first-order effect, the 
“forbidden” half-widths being of the order a’/ro times 
the regular Holtsmark widths. 


(3.14) 


B. Effect of Perturber Interactions 


In part (A) of this section the probability of a given
ion configuration was taken to be proportional to the
volume of configuration space assignable to it, the
Boltzmann factor $\exp[-V(\mathbf{r}_1, \mathbf{r}_2, \cdots, \mathbf{r}_N)/kT]$ being
neglected. This factor confers smaller weight upon configurations
in which several ions are close together and
therefore reduces the probability that many ions shall
be situated near the radiating atom. Hence the
Holtsmark theory overestimates the likelihood of strong
perturbations and therefore the intensity in the wings
of a line.

An accurate calculation with inclusion of the Boltzmann
factor has not been made. Broyles²⁹ considers the
problem most carefully, but his results are relevant
primarily for the case where the radiator, too, is ionized
and repels the perturbing ions by a Coulomb field of
its own. His results are examined in the next section.
Here we summarize a contribution by Ecker,³⁰ which
deals directly with the problem. He relies upon the
Debye-Hückel screening mechanism to provide a semiquantitative
solution, which introduces the electrons as
well as the ions into the picture.

In the presence of a cloud of ions and electrons the
Coulomb field produced by an ion is given by

$$\mathbf{F} = (Ze\,\mathbf{r}/r^2)\,[(1/r) + (1/D)]\,e^{-r/D},$$

as may be seen by differentiating the expression for V,
Eq. (5.4). According to (5.5),

$$D = [kT/4\pi ne^2(1+Z^2)]^{1/2}.$$


Analysis of the line-broadening problem even with
this simplified potential remains formidable; Ecker
chooses as an approximation to F the form§

$$\mathbf{F} = \begin{cases} Ze\,\mathbf{r}/r^3 & \text{if } r < D \\ 0 & \text{if } r > D. \end{cases}$$

Machine computation by the method of part (A) leads
to a dependence of the line shape on the parameter

$$\delta = (4\pi D^3/3)\,n,$$

whose physical meaning is the number of ions within
the Debye radius D. Clearly, if $D \to \infty$, the result must
agree with Holtsmark’s. Figure 3 shows the results of
Ecker’s calculations for three values of δ. The expected
effect, reduction of the intensity in the wing, is clearly
in evidence.

In principle, a further correction must be made in 
the Holtsmark distribution. The Debye-Hückel cutoff
modifies the effective field at large r. For small r, the 
Coulomb field breaks down because of atomic screen- 
ing, the potential going to a finite value as r — 0. The 
theory of this effect is easily developed* and predicts a 
flattening of the Holtsmark curve far from the center 
of the line. In the region affected, the frequency is pro- = 


29 A. A. Broyles, Phys. Rev. 100, 1181 (1955).
30 (a) G. Ecker, Z. Physik 148, 593 (1957); (b) 149, 254 (1957);
(c) Z. Naturforsch. 12, 346, 517 (1957).
§ A calculation of the shielding correction which avoids this 
cutoff was made by H. Hoffman and O. Theimer, Astrophys. J. 
127, 477 (1958). Ecker (private communication) has performed 
a careful machine calculation based on the correct potential and 
obtains results in substantial agreement with his former approxi- 

mation and differing from those of Hoffman and Theimer. 
H. Margenau, Progress Report AF-18 (603)-15, January 
portional approximately to 6, not to 6-3, But this
region lies so far outside the measured range of line 
widths (hundreds of angstroms for the Balmer lines, 
as a simple estimate, based on the assumption that the 
breakdown radius is of atomic dimensions, shows) that 
this correction is at this time academic. 


C. Treatment for Radiating Ions 


The analysis leading to Eq. (3.5) is further defective 
in assuming no potential energy between the moving 
ions, and none between the radiating atom and the 
ions. The former is always present but has ordinarily 
no very profound effect on the frequency distribution. 
The second potential energy is present when the radi- 
ator itself is ionized; in that case its effect is more 
important. Attention has been called to these short- 
comings by Mayer and Broyles,” whose interest is 
primarily in highly ionized exploding plasmas where the 
ions themselves emit the lines. A completely adequate 
theory of these effects is not at hand. We develop here 
first the binary approximation to the many-ion analysis 
and compare the results with Eq. (3.4). (Our develop- 
ment completes somewhat the treatment of the afore- 
named authors who omitted the condition that there 
be no ion within r and obtained an unnormalized distribution.)

The perturbing ion of charge $Z_1e$ will repel the
radiating ion of charge $Z_2e$ with the Coulomb energy
$Z_1Z_2e^2/r$. The former $p_-$ will then be $1 - 4\pi ne^{-a/r}r^2dr$,
provided

$$a = (Z_1Z_2e^2)/(kT). \qquad (3.15)$$

In place of (3.2) we obtain

$$dP(a,r) = 4\pi n\exp[(-a/r) - 4\pi nA(a,r)]\,r^2dr, \qquad (3.16)$$

where

$$A(a,r) = \int_0^r e^{-a/x}x^2\,dx. \qquad (3.17)$$

Hence, the ratio of (3.16) to the Holtsmark distribution
(3.4) is

$$S = dP(a,r)/dP(0,r) = \exp\{-(a/r) + 4\pi n[A(0,r) - A(a,r)]\}. \qquad (3.18)$$

The function A(a,r) is easy to calculate; it may be
written

$$A(a,r) = \tfrac{1}{3}a^3\,\mathcal{F}(r/a), \qquad \mathcal{F}(x) = (x^3 - \tfrac{1}{2}x^2 + \tfrac{1}{2}x)\,e^{-1/x} + \tfrac{1}{2}\,\mathrm{Ei}(-1/x); \qquad (3.18a)$$

the function $\mathcal{F}$ is plotted in Fig. 4. Clearly,
$A(0,r) = r^3/3$.

For the binary approximation to be valid, the ratio
$(r_0/r)^2 = \beta$ must be greater than about 7. On the other
hand, $r_0 = 0.62\,n^{-1/3}$. The last results, then, begin to be
useful at values of r in the neighborhood of $\tfrac{1}{4}n^{-1/3}$ and
remain so for smaller radii. At this critical distance,
Eq. (3.18) becomes

$$S = \exp\left\{-4an^{1/3} + \frac{\pi}{48} - \frac{4\pi}{3}\,na^3\,\mathcal{F}\left(\frac{1}{4an^{1/3}}\right)\right\}. \qquad (3.19)$$

The terms in the exponent become important when
$an^{1/3} \sim 1$. This is the condition for equality of mean
kinetic energy and Coulomb energy at the mean distance
of separation of the ions.

Mayer, Los Alamos Scientific Laboratory Report LA-647, 1947.

If $4an^{1/3} = 0.1$, $S = e^{-0.09}$, the value of $\mathcal{F}(10)$ being 845.
For $4an^{1/3} = 1$, $S = e^{-0.95}$ (since $\mathcal{F}(1) = 0.258$), and for
larger values of $4an^{1/3}$ only the term $\exp(-4an^{1/3})$ of S
remains, the others being negligible.
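The ratio S of (3.18) can be evaluated directly from the integral (3.17). The sketch below is a minimal illustration: the function names are introduced here, the auxiliary function is defined as three times the scaled integral (so that A(a,r) is one third of a³ times it, consistent with the value 0.258 quoted at unit argument), and a Simpson rule stands in for whatever tabulation the authors used.

```python
import math


def script_f(x, steps=2000):
    """3 * integral from 0 to x of u**2 * exp(-1/u) du, by Simpson's rule."""
    def integrand(u):
        return 0.0 if u <= 0.0 else u * u * math.exp(-1.0 / u)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return 3.0 * total * h / 3.0


def ratio_to_holtsmark(a, r, n):
    """S of (3.18): binary distribution with Coulomb repulsion over Holtsmark's."""
    big_a_zero = r ** 3 / 3.0                    # A(0, r)
    big_a = a ** 3 * script_f(r / a) / 3.0       # A(a, r)
    return math.exp(-a / r + 4.0 * math.pi * n * (big_a_zero - big_a))


if __name__ == "__main__":
    n = 1.0
    r_crit = 0.25 * n ** (-1.0 / 3.0)  # critical distance used for (3.19)
    print(script_f(1.0))
    for q in (1.0, 5.0):  # q = 4 a n^(1/3)
        a = q * r_crit
        print(q, math.log(ratio_to_holtsmark(a, r_crit, n)))
```

At 4an^{1/3} = 1 the exponent comes out close to -0.95, and for larger values it is dominated by the term -4an^{1/3}, as stated above.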

In the light of these numerical results we consider
two cases, first the perturbation of singly ionized helium
atoms by protons in a star or a discharge plasma. Here
$Z_1 = Z_2 = 1$. The condition that the binary approximation
be significant, namely $r < 0.62\,n^{-1/3}$, is easily compatible
with the requirement that r be greater than a
helium ion, $r > 5\times10^{-9}$ cm, since these inequalities
merely imply $n < 10^{24}$ cm$^{-3}$. But to have S appreciably
smaller than 1, $an^{1/3}$ must be near 1. This means that
$T = (e^2/k)\,n^{1/3} = 1.7\times10^{-3}\,n^{1/3}$ °K, which can occur in a
shock front or in an arc.
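The temperature at which the Coulomb correction becomes appreciable follows from setting an^{1/3} = 1 in (3.15). The sketch below works in Gaussian (cgs) units; the rounded constants and the function names are assumptions introduced here for illustration.

```python
E_ESU = 4.803e-10     # elementary charge in esu (rounded)
K_BOLTZ = 1.381e-16   # Boltzmann constant in erg/K (rounded)


def coulomb_length(z1, z2, temperature):
    """a of (3.15) in cm: distance at which the Coulomb energy equals kT."""
    return z1 * z2 * E_ESU ** 2 / (K_BOLTZ * temperature)


def temperature_for_unit_coupling(n, z1=1, z2=1):
    """Temperature in K at which a * n**(1/3) = 1 for ion density n in cm^-3."""
    return z1 * z2 * E_ESU ** 2 / K_BOLTZ * n ** (1.0 / 3.0)


if __name__ == "__main__":
    print(temperature_for_unit_coupling(1.0))     # coefficient of n**(1/3)
    print(temperature_for_unit_coupling(1.0e18))  # example density
```

For singly charged particles the coefficient is about 1.7 x 10^-3 °K cm, so a density of 10^18 cm^-3 corresponds to a temperature of order 10^3 °K.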

The situation is quite different for atomic explosions. 
At amosphere of iron ions, each with Z=23, and with 
T= 107 °K implies a=8.4X 10-8 cm. At normal density 
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n= 10% cm™ and the condition upon r is r<10-§ cm. considered only between the radiating ion and the 
Because of the small size of the iron core this leaves a plasma ions; interaction between plasma ions them- 
considerable range in which the binary approximation selves is ignored. This method avoids some uncertainties 
is correct. Indeed at r=1/4n-, for which (3.19) was inherent in the approximations using the Bohm-Pines 
computed S~+10-, indicating that the Holtsmark approach (see the following), for the approximations 


Ae I TEED s <7 


i i formula is enormously in error. are clear in their physical meaning and the analysis is 
i We now treat the many-ion problem using the otherwise exact. 
fy following model: the free electrons form a uniformly With this model, the probability P(F) for an electric 


Sp 


smeared out negative charge, and interactions are field F is given by 


= 


ERI AE 


f exp(—a Xion ri )8(F—D isn Qor,/r#)dry: + -dry 
<<<. (3.20) 


fo fea dere an: ary 


where a is defined by Eq. (3.15). Equation (3.20) can where 


re ag f ae 
EAEE AORE -~ 
me z 


Ai my i r a: 


be written in a more convenient form, best obtained 
through use of Chandrasekhar’s method.5 As before, T,(8)= f |- C ) } (i Si singdx. (3.24) 
we denote by W (8) the probability of finding a field of 
strength B=F/Fy [see Eq. (3.11a)] It is seen that (2/rß)Io(8) is the Holtsmark dis- 
Fo=2.61Zeen!. tribution, to which Eq. (3.24) reduces in the limit, 
T— «~(8—0). 


Fo is the quantity introduced previously and repre- The integrals in Eq. (3.24) can be expanded in a 
sents essentially the field produced on one ion by an- series and evaluated for 8X3 and £ > 6. The range from 
other ion at the average spacing between ions, and n 3 to 6 is difficult because of the slow convergence of the 


is the number density of ions. Then series. The coefficients of the first two terms in Eq. 
; (3.24) are listed in Table III. 
a Case 2: a>1, ba. 
Ba) = e| — G) n|: sinde, (eA) For this case, n is expanded in an asymptotic series 
iny: 


=—(- )) f dza "l?(z— sinz) =>( ) (- ah ar H ya m= e) (3.25) 


Bvt 
xep] ~asi(~) | (3.22) 
$ x If only the first term is retained, one finds 
ie [Zre(Z2e)(kT) JF o}. 


the limit of large 8, the leading term in Eq. (3.21) w ve exp(— 6a (21)#/5) 
_ is the same as that of the binary theory given before 2m} (5/8) (2/m)* T 
[see Eq. (3.16)]. W(8) can be computed directly from 
aoe) @ 21) and (3.22). However, for certain ranges of ; a}6? exp[—0.5a8?] 
a (a1 ‘1 and a>>1) the form of W (6) can be simplified. =—_______—._ (3.26) 
Case 1: 0K, ap? K1. 1:25 
In this case 7 can be expanded in the ratio y=2/6: Equation (3.26) is identical with a formula proposed 


4 ae ge eae a -, (3.23) by Mayer”? on the basis of a different, simpler physical 


TABLE III. Coefficients of the first two terms of Eq. (3.24). Values of (2/πβ)I0(β) from reference 5.

β       (2/πβ)I0(β)    (2/πβ)c1I1(β)
0.1     0.004225       0.00745
0.6     0.129598       0.21264
1.0     0.271322       0.3860
2.0     0.33918        0.1791
3.0     0.176          -0.08707
6.0     0.02417        -0.0507
8.0     0.01038        -0.0273
10.0    0.00556        -0.0168
model. Why the two models lead to the same result is
not clear. 

Broyles has proposed another way of computing
the probability P(F) of finding an electric field of magnitude
F at a radiating ion in a plasma. He uses the
method of Pines and Bohm to separate the potential
energy and the electric field into a short-range and a
long-range component.

The work is based on a model in which the free
electrons are represented as a uniformly smeared out
negative charge. The following units are used: unit
length is $r_0$, the radius of a sphere whose volume is the
volume per ion; unit field strength is $F_0$; unit
energy is that of two ions separated by unit distance;
and the temperature $\theta$ is expressed in the above energy
units.

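The probability density for occurrence of a field $\boldsymbol{\beta}$ [see Eq. (3.11a) for definition of $\beta$] is, in analogy to Eq. (3.20),

$$
P(\boldsymbol{\beta}) = \frac{\displaystyle\int\!\cdots\!\int \exp(-V/\theta)\,\delta\Big(\boldsymbol{\beta}-\sum_i \mathbf{r}_i/r_i^{3}\Big)\,d\mathbf{r}_1\cdots d\mathbf{r}_N}{\displaystyle\int\!\cdots\!\int \exp(-V/\theta)\,d\mathbf{r}_1\cdots d\mathbf{r}_N}, \tag{3.27}
$$

provided $V$ is the potential energy, and $\delta$ the Dirac "function." On using the Dirichlet representation for $\delta$ one obtains

$$
P(\boldsymbol{\beta}) = (2\pi)^{-3}\int T(\mathbf{l})\exp(i\boldsymbol{\beta}\cdot\mathbf{l})\,d\mathbf{l}, \tag{3.28}
$$

with

$$
T(\mathbf{l}) = \frac{\displaystyle\int\!\cdots\!\int \exp(-V/\theta)\exp\Big(-i\,\mathbf{l}\cdot\sum_i \mathbf{r}_i/r_i^{3}\Big)\,d\mathbf{r}_1\cdots d\mathbf{r}_N}{\displaystyle\int\!\cdots\!\int \exp(-V/\theta)\,d\mathbf{r}_1\cdots d\mathbf{r}_N}.
$$

Since $P(\boldsymbol{\beta})$ depends only on the magnitude of $\boldsymbol{\beta}$ we have

$$
W(\beta) = 4\pi\beta^{2}P(\beta). \tag{3.29}
$$

We now employ the Pines-Bohm method to split the potential energy and the electric field into short and long-range components. The ion charge density $\rho_i$ is written as

$$
\rho_i = \sum_j \delta(\mathbf{r}-\mathbf{r}_j).
$$

But the total charge density $\rho$ can be expanded into a Fourier series (unit volume per particle) to give

$$
\rho = {\sum_{\mathbf{k}}}' \rho(\mathbf{k})\exp(i\mathbf{k}\cdot\mathbf{r}), \tag{3.30}
$$

with

$$
\rho(\mathbf{k}) = \sum_j \exp(-i\mathbf{k}\cdot\mathbf{r}_j), \tag{3.31}
$$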

where the prime indicates the omission of k=0 from
the sum in order that the net charge be zero. The
potential and the electric field when expressed as
functions of k can each be written as a sum of two
terms, one for high and one for low values of k. The
high-k terms are functions of the individual $\mathbf{r}_j$ and are
referred to as the particle or short-range part; the
low-k terms are expressible as functions of the $\rho(\mathbf{k})$ and are
called the wave or long-range part of the respective
quantity. With this separation, $T(\mathbf{l})$ can likewise be
approximated in consistent fashion; it becomes a
product of $T_w$ (wave part) and $T_p$ (particle part). The
wave part can easily be evaluated; the particle part is
approximated by Broyles in two ways:

(1) Short-range central interactions (SRCI). In this 
approximation only the interactions between the radi- 
ating ion and the perturbing ions are retained, the 
interactions among the perturbing ions being neglected. 
This approximation is similar to the one in the previous 
section, except that it was there applied to the full 
Coulomb interactions of the ions whereas it is here 
restricted to the short range components which arise in 
the Bohm-Pines method. 

(2) Short-range nearest neighbor (SRNN) interac- 
tions. Interactions are considered only between the 
radiating ion and its nearest neighbor. 

Broyles has employed his method for iron ions at
normal densities, with a charge of 23 electron charges
at a temperature of 1 keV, a case already considered
earlier in this section. This corresponds to $\theta = 0.186$.
Under these conditions, $a$ as defined in the previous
part of this section [Eq. (3.22)] has a value of 5.5.
It is therefore possible to use Eq. (3.26) for $\beta \lesssim 5.5$. The
error attending the use of Eq. (3.26) is less than 10%
for $\beta = 1$ and decreases for decreasing $\beta$. Table IV
compares Eq. (3.26) with Broyles' SRNN for $0.3 \le \beta \le 1.5$.
A more detailed comparison is given in Fig. 2 of
reference 29. The curve marked "simple harmonic
oscillator" is the result of Mayer's calculation and is
identical with Eq. (3.26).

A separate publication by Broyles deals extensively
with the approximations involved in using the Pines-Bohm
method for this problem and devises an improved
procedure which is applicable for $\theta \ge 0.6$. It is concluded
that $P(\beta)$ is rather well determined by the
SRNN approximation for $\theta > 0.6$.


TABLE IV.

β      Eq. (3.26), a = 5.5    Broyles' SRNN, θ = 0.186
0.3    0.69                   0.63
0.5    1.25                   1.13
1      0.64                   0.66
1.5    0.04                   0.147
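The first column of Table IV can be approximately reproduced with a few lines of Python. The sketch below assumes that Eq. (3.26) has the Gaussian form W(β) = a^(3/2) β² exp(−aβ²/2)/1.25, which is how the legible fragments of that equation read; the function and variable names are illustrative and not part of the original article. With a = 5.5 the computed values agree with the printed column to within about 0.05, so the printed numbers may have been obtained with a slightly different a or rounding.

```python
import math

def w_gaussian(beta: float, a: float) -> float:
    """Assumed form of Eq. (3.26): a^(3/2) * beta^2 * exp(-a*beta^2/2) / 1.25."""
    return a ** 1.5 * beta ** 2 * math.exp(-0.5 * a * beta ** 2) / 1.25

# Table IV entries: beta -> (Eq. (3.26) column, Broyles' SRNN column)
TABLE_IV = {0.3: (0.69, 0.63), 0.5: (1.25, 1.13), 1.0: (0.64, 0.66), 1.5: (0.04, 0.147)}

def compare(a: float = 5.5):
    """Return rows (beta, computed value, printed Eq. (3.26) value, SRNN value)."""
    return [(beta, w_gaussian(beta, a), printed, srnn)
            for beta, (printed, srnn) in TABLE_IV.items()]

if __name__ == "__main__":
    for beta, calc, printed, srnn in compare():
        print(f"beta={beta:3.1f}  formula={calc:6.3f}  printed={printed:5.2f}  SRNN={srnn:5.3f}")
```

The constant 1.25 is close to sqrt(π/2) ≈ 1.2533, the value that would normalize this Gaussian form exactly.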


8 A. A. Broyles, Phys. Rev. 105, 347 (1957).
3 A. A. Broyles, Atomic Energy Commission Report RM—
IV. REFINED IMPACT THEORY. FUSION OF
IMPACT AND STATISTICAL THEORIES 


A. Impact Theory 


A consistent and ingenious development of the 
impact method was carried out by E. Lindholm’*1% 
who was preceded in several important respects by 
W. Lenz®* and G. Burkhardt.*” In deriving Lindholm’s 
results we use recent stochastic methods. 

Let us return to Eq. (1.13), which represents the true
line form, provided only the adiabatic hypothesis holds
[see formula (1.11)]. The immediate problem is to
calculate the correlation function C. To strip it of
irrelevancies, we continue to suppose that the radiator
has only two states, and that only one of these is
occupied: $S_1 = 1$, $S_2 = 0$. We also omit subscripts. Then,


$$
C(\tau) = \int_{-\infty}^{\infty} e^{-i[\varphi(t+\tau)-\varphi(t)]}\,dt. \tag{4.1}
$$

The phase $\varphi$ is defined by (1.12). In that expression,
$E_n$ and $E_m$ contain constant parts, $E_n^0$ and $E_m^0$, plus the
perturbations. The constant parts contribute to $\varphi$ the
amount $[(E_n^0 - E_m^0)/\hbar]t = \omega' t$, $\omega'$ being the normal
frequency of the line. In this section, we redefine $\varphi$ (without
changing notation) to mean only the variable part which
is contributed by the perturbations. As a result, we change
Eq. (1.13a) to read

$$
I(\omega) \propto \int_{-\infty}^{\infty} d\tau\, e^{i(\omega-\omega')\tau}\, C(\tau). \tag{4.2}
$$

Positive $\varphi$ corresponds to a positive difference $\Delta E$
(upper state) $-$ $\Delta E$ (lower state). There is some confusion
on this point in the cited literature.

The quantity $\varphi(t+\tau)-\varphi(t)$ represents the perturbational
phase change which has occurred in the
interval $(t, t+\tau)$. The integrand in Eq. (4.1) is a time
series in which this quantity changes from instant to
instant $t$, and we are integrating over $t$. As is customary
(ergodic hypothesis) the one time series is replaced by
an ensemble of time series and the time integral is
identified with an ensemble average (except for a constant
factor). Thus

$$
C(\tau) \propto \langle e^{-i\varphi}\rangle_{\rm av} = \int e^{-i\varphi} P_\tau(\varphi)\,d\varphi, \tag{4.3}
$$

where $P_\tau(\varphi)$ is the probability of a phase change $\varphi$ in
the interval $\tau$. It is convenient to make the assumption
characteristic of the impact approach, namely that the
collisions are sudden and occur singly. The value of $\varphi$
is then the sum of a whole number of individual phase
changes $\varphi_j$, each associated with one collision. By the
laws of probability, $P_\tau(\varphi)$ is related to the probability
$P_\tau(k)$ that $k$ collisions have taken place in $\tau$, by


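$$
P_\tau(\varphi) = \sum_{k=0}^{\infty} P_\tau(k) \sum_{1,2,\ldots,k}\Big[\prod_{j=1}^{k} P(\varphi_j)\Big]\,\delta\Big(\varphi-\sum_{j=1}^{k}\varphi_j\Big) \tag{4.4}
$$

and

$$
P_\tau(k) = (\nu_c\tau)^{k}\, e^{-\nu_c\tau}/k!, \tag{4.5}
$$

which is Poisson's law. As before, $\nu_c$ is the collision
frequency, $P(\varphi_j)$ the probability for the occurrence of
a phase change of magnitude $\varphi_j$, and $\delta$ is the well-known
Dirac "function" which may be replaced by its
Dirichlet representation

$$
\delta\Big(\varphi-\sum_{j=1}^{k}\varphi_j\Big) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dy\,\exp\Big[iy\Big(\varphi-\sum_{j=1}^{k}\varphi_j\Big)\Big]. \tag{4.6}
$$

The symbol $\sum_{1,2,\ldots,k}$ is meant as a summation over all
$\varphi_1, \varphi_2, \cdots, \varphi_k$ in every factor that follows it. With the
aid of (4.5) and (4.6), one may write Eq. (4.4) in the
form

$$
P_\tau(\varphi) = \frac{e^{-\nu_c\tau}}{2\pi}\int_{-\infty}^{\infty} dy\, e^{iy\varphi}\sum_{k=0}^{\infty}\frac{(\nu_c\tau)^{k}}{k!}\Big[\sum_j P(\varphi_j)\,e^{-iy\varphi_j}\Big]^{k},
$$

and this may be expressed as

$$
P_\tau(\varphi) = \frac{e^{-\nu_c\tau}}{2\pi}\int_{-\infty}^{\infty} dy\, e^{iy\varphi+\nu_c\tau\alpha(y)}, \tag{4.7}
$$

provided

$$
\alpha(y) = \sum_j P(\varphi_j)\,e^{-iy\varphi_j}. \tag{4.8}
$$

On substituting (4.7) in (4.3) we have

$$
C(\tau) \propto \frac{e^{-\nu_c\tau}}{2\pi}\int_{-\infty}^{\infty} e^{-i\varphi}\,d\varphi\int_{-\infty}^{\infty} dy\, e^{iy\varphi+\nu_c\tau\alpha(y)}
= e^{-\nu_c\tau}\int_{-\infty}^{\infty} dy\, e^{\nu_c\tau\alpha(y)}\Big\{\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i(y-1)\varphi}\,d\varphi\Big\}.
$$

The factor $\{\ \}$ is $\delta(y-1)$. Hence,

$$
C(\tau) \propto e^{\nu_c\tau[\alpha(1)-1]}. \tag{4.9}
$$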
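Equation (4.9) is the characteristic function of a compound Poisson process, and it can be checked by direct simulation. The following is a minimal sketch, assuming the sign convention C(τ) ∝ ⟨exp(−iφ)⟩ with α(1) = Σ_j P(φ_j) exp(−iφ_j); the function names, the two-valued phase distribution, and the parameter values are illustrative only.

```python
import cmath
import random

def simulate_correlation(nu_c, tau, phases, probs, trials=20000, seed=1):
    """Monte Carlo estimate of <exp(-i*phi)> for sudden, independent collisions."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(trials):
        # Collision times form a Poisson process: exponential waiting times with rate nu_c.
        t, phi = rng.expovariate(nu_c), 0.0
        while t < tau:
            phi += rng.choices(phases, weights=probs)[0]  # one phase jump per collision
            t += rng.expovariate(nu_c)
        total += cmath.exp(-1j * phi)
    return total / trials

def closed_form(nu_c, tau, phases, probs):
    """Right-hand side of Eq. (4.9): exp{nu_c * tau * [alpha(1) - 1]}."""
    alpha1 = sum(p * cmath.exp(-1j * ph) for ph, p in zip(phases, probs))
    return cmath.exp(nu_c * tau * (alpha1 - 1))

if __name__ == "__main__":
    args = (2.0, 1.0, [0.3, 1.2], [0.7, 0.3])
    print(simulate_correlation(*args), closed_form(*args))
```

The simulated and closed-form values should agree to within the statistical error of the sample, roughly 1/sqrt(trials).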


This result is sufficient for many purposes and
underlies most of the work on impact broadening. A
generalization which allows the collision probability to
depend on velocity, and which also gets rid of the usual
assumption that the collisions are sudden and occur
singly, is contained in Kolb's report (reference 38) and is given
below, following the lines of the preceding account. The
result [Eq. (4.16)] is used in Sec. V.

Let us divide the time interval $(-\infty, \infty)$ into elements
$\Delta t_i$. The phase change produced by one perturber
in the time interval $\tau$ depends on the time of closest
approach, the velocity $v$, and in general on some impact
parameter $l$. Let us call this phase change $\varphi_{ivl}(\tau)$,
where $i$ refers to the time interval $\Delta t_i$ which contains
the time of closest approach. If the perturbations are
scalarly additive, the total phase change is

$$
\varphi(\tau) = \sum_{i,v}\varphi_{iv}(\tau),
$$

where $\varphi_{iv}(\tau)$ is the phase change produced in the time
interval $\tau$ by all perturbers described by $(i,v)$. Clearly,

$$
\varphi_{iv}(\tau) = \sum_{j=1}^{N(i,v)} \varphi_{ivl_j}(\tau),
$$

if $N(i,v)$ is the number of perturbers characterized by
$(i,v)$ in some effective volume. We first compute $P(\varphi_{iv})$,
the probability of a phase change $\varphi_{iv}$ due to all perturbers
in the class $(i,v)$.

38 A. C. Kolb, AFOSR-TN-57-8, Astia Document No. AD115-

Assume that the effective volume of our system is a
sphere of radius $R$ and write $n(v)$ for the number per
cubic centimeter of perturbers with velocity $v$ (assumed
at first to take on discrete values). We designate by
$W(l)$ and $W[N(i,v)]$ the probability distributions for $l$
and $N(i,v)$. The latter is given again by Poisson's distribution,

$$
W[N(i,v)] = \frac{(\nu_v\Delta t_i)^{N(i,v)}\,e^{-\nu_v\Delta t_i}}{N(i,v)!}, \tag{4.10}
$$

where $\nu_v$ is the collision frequency for particles with
velocity $v$ and is given by

$$
\nu_v = \pi R^{2}\, v\, n(v). \tag{4.11}
$$


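We then have

$$
P(\varphi_{iv}) = \sum_{N(i,v)=0}^{\infty} W[N(i,v)] \int\!\cdots\!\int \Big[\prod_{j=1}^{N(i,v)} W(l_j)\,dl_j\Big]\,\delta\Big(\varphi_{iv}-\sum_{j=1}^{N(i,v)}\varphi_{ivl_j}\Big). \tag{4.12}
$$

Again we replace the $\delta$ function by its Dirichlet
representation

$$
\delta(\varphi-\psi) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dy\, e^{iy(\varphi-\psi)}, \tag{4.13}
$$

substitute (4.10) into (4.12) and obtain

$$
P(\varphi_{iv}) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dy\,\exp\{iy\varphi_{iv}+\nu_v\Delta t_i[\alpha_{i,v}(y)-1]\}, \tag{4.14}
$$

provided

$$
\alpha_{i,v}(y) = \int W(l)\,e^{-iy\varphi_{ivl}}\,dl.
$$

Now $P(\varphi)$ can be written

$$
P(\varphi) = \int\!\cdots\!\int \prod_{i,v} P(\varphi_{iv})\,d\varphi_{iv}\;\delta\Big(\varphi-\sum_{i,v}\varphi_{iv}\Big).
$$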
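On using Eqs. (4.13) and (4.14) this becomes, after
rearrangement,

$$
P(\varphi) = \frac{1}{2\pi}\int_{-\infty}^{\infty} ds\, e^{is\varphi}\prod_{i,v}\int_{-\infty}^{\infty} dy\, e^{\nu_v\Delta t_i[\alpha_{i,v}(y)-1]}\Big\{\frac{1}{2\pi}\int_{-\infty}^{\infty} d\varphi_{iv}\, e^{i(y-s)\varphi_{iv}}\Big\}.
$$

The last integral is $\delta(y-s)$, so that

$$
P(\varphi) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{is\varphi}\,ds\prod_{i,v} e^{\nu_v\Delta t_i[\alpha_{i,v}(s)-1]}
= \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp\Big\{is\varphi+\sum_{i,v}\nu_v\Delta t_i[\alpha_{i,v}(s)-1]\Big\}\,ds.
$$

$C(\tau)$ then becomes proportional to

$$
\int_{-\infty}^{\infty} ds\,\exp\Big\{\sum_{i,v}\nu_v\Delta t_i[\alpha_{i,v}(s)-1]\Big\}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} d\varphi\, e^{i(s-1)\varphi}
$$

and, therefore,

$$
C(\tau) \propto \exp\Big\{\sum_{i,v}\nu_v\Delta t_i[\alpha_{i,v}(1)-1]\Big\}.
$$

This finally can be written with the use of Eq. (4.4) and
replacement of the sum by integrals, as

$$
C(\tau) \propto \exp\Big\{\int dv\int dt\,\nu_v[\alpha_{t,v}(1)-1]\Big\}. \tag{4.15}
$$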


We now apply this general formula to the special
case of impacts. In this approximation, all collisions are
assumed to be short enough so that those having their
time of closest approach in $\tau$ are also completed in $\tau$,
and one may disregard the others. This means, in terms
of $\varphi_{tvl}$, that

$$
\varphi_{tvl} = \begin{cases} 0 & \text{if } t \text{ is not in } \tau,\\ \varphi_{vl} & \text{if } t \text{ is in } \tau. \end{cases}
$$

In the second instance $\varphi_{vl}$ is independent of time, and
therefore $\varphi_{vl}$ represents the total phase change produced
by one particle with time of closest approach in
$\tau$ and with impact parameter $l$. Under these conditions

$$
\alpha_{t,v}(1) = 1 \quad\text{if } t \text{ is not in } \tau,
$$

$$
\alpha_{t,v}(1) = \alpha_v(1) = \int dl\, W(l)\,e^{-i\varphi_{vl}} \quad\text{if } t \text{ is in } \tau.
$$

Equation (4.15) then leads to

$$
C(\tau) \propto \exp\Big\{\tau\int dv\,\nu_v[\alpha_v(1)-1]\Big\}. \tag{4.16}
$$

This is the general form of $C(\tau)$ for an impact theory
based on scalarly additive perturbations. We return to
this equation later (Sec. V). Equation (4.16) reduces to
(4.9) under the proper conditions. To show this we
first assume that all perturbers have the same velocity
$v'$. This means we replace $n(v)$ by $n\,\delta(v-v')$, where $n$ is
the number of particles per cubic centimeter. In view
of Eq. (4.11) and because $\nu_v$ is now identical with our
former $\nu_c$, we then recover Eq. (4.9)

$$
C(\tau) \propto \exp\{\nu_c[\alpha(1)-1]\tau\}.
$$


This is the correlation function for perturbers all of 
which have the same velocity. This derivation shows 
that its validity rests on the premise of scalarly additive 
perturbations and does not require the restriction that 
the impacts fail to overlap in time; this feature was not 
evident from the simpler proof which led to Eq. (4.9). 
We return to calculation of the line intensity, employing
the correlation function given by Eq. (4.9). We
split $\alpha(1)$ into its real and its imaginary part, $\alpha = \alpha_1 + i\alpha_2$,
and put

$$
\nu_c(1-\alpha_1) = u_1, \qquad -\nu_c\alpha_2 = u_2 \tag{4.17}
$$

so that

$$
C(\tau) = e^{-(u_1+iu_2)\tau}. \tag{4.18}
$$

The calculation of $I$ in accordance with Eq. (4.2) is
then easy. One further detail has to be remembered. It
follows from (4.1) that $C(-\tau) = C^*(\tau)$, a relation
necessary for the reality of $I(\omega)$ but obscured by the
explicit form (4.18) which is valid for $\tau > 0$. Thus

$$
C(\tau) = C^*(-\tau) = e^{(u_1-iu_2)\tau} \quad\text{for } \tau < 0
$$

and by (4.2)

$$
I(\omega) \propto \int_{-\infty}^{0} e^{u_1\tau+i(\omega-\omega'-u_2)\tau}\,d\tau + \int_{0}^{\infty} e^{-u_1\tau+i(\omega-\omega'-u_2)\tau}\,d\tau
= 2u_1[u_1^{2}+(\omega-\omega'-u_2)^{2}]^{-1}.
$$

To make $\int I(\omega)\,d\omega = 1$ we must write

$$
I(\omega) = (u_1/\pi)[u_1^{2}+(\omega-\omega'-u_2)^{2}]^{-1}. \tag{4.19}
$$

This represents a dispersion curve with a shift

$$
u_2 = \nu_c\sum_j P(\varphi_j)\sin\varphi_j \tag{4.20}
$$

to the blue of the normal peak at $\omega'$, and a half-width

$$
w = 2u_1 = 2\nu_c\Big[1-\sum_j P(\varphi_j)\cos\varphi_j\Big]. \tag{4.21}
$$
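Equations (4.19) to (4.21) translate directly into a short routine. The sketch below assumes a discrete set of phase jumps φ_j with probabilities P(φ_j); the function names and the sample numbers are illustrative, not taken from the article. It evaluates the shift and width and confirms that the profile falls to half its peak value at a distance u₁ = w/2 from the shifted center.

```python
import math

def shift_and_width(nu_c, phases, probs):
    """Return (u2, w): shift from Eq. (4.20) and half-width w = 2*u1 from Eq. (4.21)."""
    u2 = nu_c * sum(p * math.sin(ph) for ph, p in zip(phases, probs))
    w = 2 * nu_c * (1 - sum(p * math.cos(ph) for ph, p in zip(phases, probs)))
    return u2, w

def line_profile(omega, omega_normal, u1, u2):
    """Normalized dispersion profile of Eq. (4.19)."""
    return (u1 / math.pi) / (u1 ** 2 + (omega - omega_normal - u2) ** 2)

if __name__ == "__main__":
    u2, w = shift_and_width(2.0, [0.3, 1.2], [0.7, 0.3])
    u1 = w / 2
    peak = line_profile(u2, 0.0, u1, u2)            # value at the shifted center
    half = line_profile(u2 + u1, 0.0, u1, u2)       # value one u1 away from the center
    print(f"shift u2 = {u2:.4f}, width w = {w:.4f}, half/peak = {half / peak:.3f}")
```

Here w is the full width at half maximum of the profile, which is the sense in which the text uses "half-width".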


Lindholm now falls back upon Eq. (1.5) for a computation
of the $\varphi_j$, which he identifies with the $\Delta\varphi(\rho,v)$
of that expression. This proficient and simple step is
warranted under two conditions: (a) the impacts are
sudden and do not coincide; (b) the perturber has a
classical path, so that $\epsilon[(\rho^{2}+v^{2}t^{2})^{1/2}]$ is meaningful. The
first has been discussed; the second is
to be examined in the light of [...]
Equation (4.21) distinguishes between phase changes
by the arbitrary index $j$; Eq. (1.5) classifies them by the
impact parameter $\rho$ which is the closest distance of
approach. The connection is this: If $\varphi_j$ is taken to be
$\varphi(\rho)$ then $P[\varphi(\rho)] = 2\pi\rho\,d\rho/q$, where $q$ is the total
collision cross section. On the other hand, $\nu_c = nqv$.
Therefore (4.20) and (4.21) become

$$
u_2 = nv\int 2\pi\rho\,d\rho\,\sin\Delta\varphi(\rho,v), \tag{4.22}
$$

$$
w = 2nv\Big[q-\int 2\pi\rho\,d\rho\,\cos\Delta\varphi(\rho)\Big]
= 2nv\int(1-\cos\Delta\varphi)\,2\pi\rho\,d\rho
= 4nv\int\sin^{2}(\tfrac{1}{2}\Delta\varphi)\,2\pi\rho\,d\rho. \tag{4.23}
$$

The integrations over $\rho$ are to be carried from 0 to
$(q/\pi)^{1/2}$. Because of the rapid decline of $\Delta\varphi$ with $\rho$ it is
often permissible to replace the upper limit by $\infty$. In
the case of long-range Coulomb interactions there is the
added difficulty of giving a meaning to $q$.

The evaluation of $\Delta\varphi$ proceeds in accordance with
(1.4); it is profitable only for simple force laws of the
form

$$
\epsilon(r) = \hbar\Omega_\sigma/r^{\sigma}. \tag{4.24}
$$

Here we have replaced the former $C_\sigma$ by $\hbar\Omega_\sigma$, to save
writing. For our purposes the form (4.24) is satisfactory,
for we shall be dealing mainly with linear Stark
effects (where $\epsilon = \hbar sF = \hbar se/r^{2}$, whence $\sigma = 2$) and quadratic
Stark effects (where $\epsilon = \hbar tF^{2} = \hbar te^{2}/r^{4}$, whence
$\sigma = 4$). Here $\epsilon$ is the difference in the Stark displacements
of the two states; it refers to the line, not to a given
state.
With (4.24) Eq. (1.5) yields

$$
\Delta\varphi = \Omega_\sigma\int_{-\infty}^{\infty} dt\,(\rho^{2}+v^{2}t^{2})^{-\sigma/2}
= \pi^{1/2}\,\frac{\Gamma[(\sigma-1)/2]}{\Gamma(\sigma/2)}\,\frac{\Omega_\sigma}{v\rho^{\sigma-1}}.
$$

In particular, for $\sigma = 2$

$$
\Delta\varphi = (\pi\Omega_2)/(v\rho) \tag{4.25}
$$

and for $\sigma = 4$

$$
\Delta\varphi = (\pi\Omega_4)/(2v\rho^{3}). \tag{4.26}
$$

We now compute (4.22) and (4.23) for these two cases.
The shift for the linear Stark effect is not observable
as such because the splitting pattern as a whole is
symmetric. Lindholm and Unsöld do not compute it for
that reason. Nevertheless, it is meaningful as a measure
of the shift of each Stark component, and it does
manifest itself in each wing of such lines as $H_\beta$ and $H_\delta$
which have zero intensity at the center. Since $q$ is not
well defined, the upper limit in the integrations of (4.22)
and (4.23) will be taken as $r_0$ [see Sec. III: $(4\pi/3)r_0^{3}n = 1$],
beyond which the Coulomb force is effectively screened.‖ When $\sigma = 4$ the
convergence is fast and the integrations may be carried
to $\infty$. For $\sigma = 2$ both integrations are straightforward
(see Lindholm); the forms below involve an expansion
of the integral sine and cosine. For $\sigma = 4$ a graphical
integration is necessary to determine the numerical
factors. Thus one finds:
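Linear Stark effect, $\sigma = 2$

$$
\text{Shift} = u_2 = 2\pi^{3}\Omega_2^{2}\,\frac{n}{v}\Big(\frac{vr_0}{\pi\Omega_2}-\frac{\pi}{4}+\cdots\Big), \tag{4.27}
$$

$$
\text{Half-width} = w = 2\pi^{3}\Omega_2^{2}\,\frac{n}{v}\Big(0.923-\ln\frac{\pi\Omega_2}{vr_0}+\frac{\pi^{2}\Omega_2^{2}}{24v^{2}r_0^{2}}+\cdots\Big). \tag{4.28}
$$

Quadratic Stark effect, $\sigma = 4$

$$
\text{Shift} = u_2 = 9.8\,\Omega_4^{2/3}v^{1/3}n, \tag{4.29}
$$

$$
\text{Half-width} = w = 11.4\,\Omega_4^{2/3}v^{1/3}n. \tag{4.30}
$$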
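The numerical factors in Eqs. (4.29) and (4.30), which the text obtains by graphical integration, can also be evaluated in closed form with Gamma functions. The sketch below is an independent check rather than the procedure used by the cited authors. It assumes Δφ = (ρ₀/ρ)³ with ρ₀ = (πΩ₄/2v)^(1/3), and uses the standard integrals ∫₀^∞ (1 − cos u) u^(−s−1) du = −Γ(−s) cos(πs/2) and ∫₀^∞ sin u · u^(−s−1) du = −Γ(−s) sin(πs/2) with s = 2/3. The function names are illustrative.

```python
import math

def phase_shift_factor(sigma: float) -> float:
    """K(sigma) in  Delta phi = K(sigma) * Omega_sigma / (v * rho**(sigma - 1)),
    obtained by integrating (rho^2 + v^2 t^2)^(-sigma/2) over all t."""
    return math.sqrt(math.pi) * math.gamma((sigma - 1) / 2) / math.gamma(sigma / 2)

def quadratic_stark_coefficients():
    """Coefficients of Omega_4^(2/3) * v^(1/3) * n in the sigma = 4 shift and width."""
    rho0_sq = (math.pi / 2) ** (2 / 3)      # rho_0^2 in units of (Omega_4 / v)^(2/3)
    g = -math.gamma(-2 / 3) / 3             # (1/3) * |Gamma(-2/3)| from the substitution u = x^-3
    width = 4 * math.pi * rho0_sq * g * math.cos(math.pi / 3)
    shift = 2 * math.pi * rho0_sq * g * math.sin(math.pi / 3)
    return shift, width

if __name__ == "__main__":
    print(phase_shift_factor(2), phase_shift_factor(4))   # pi and pi/2, as in (4.25), (4.26)
    print(quadratic_stark_coefficients())                  # close to (9.8, 11.4)
```

The ratio of the two coefficients is tan(π/3)/2 ≈ 0.87, independent of Ω₄, v, and n.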


In these formulas v is the speed of the perturbing 
particle (in the center of gravity system). 

Equation (4.27) is somewhat embarrassing: it makes
the shift depend directly upon the cutoff radius and
thereby emphasizes the uncertainty of our assumptions.
Numerical considerations even show that for electrons
as perturbers, (4.27) is generally much greater than
(4.28), and this exposes an inconsistency of the present
method. Table V shows the ratio $2u_2/w$ from (4.27) and
(4.28) for different temperatures and electron densities.
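The constant 0.923 in Eq. (4.28) equals 3/2 minus Euler's constant, which can be confirmed numerically. The sketch below assumes that the bracket of Eq. (4.28) is 2∫_X^∞ (1 − cos x)/x³ dx with X = πΩ₂/(v r₀), which follows from Eq. (4.23) with Δφ = πΩ₂/(vρ) and the upper limit r₀; the Simpson integrator and all names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329

def simpson(f, lo, hi, n):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (hi - lo) / n
    s = f(lo) + f(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(lo + k * h)
    return s * h / 3

def width_bracket_numeric(x_cut, x_max=400.0, n=400_000):
    """2 * integral from x_cut to infinity of (1 - cos x)/x^3 dx."""
    body = simpson(lambda x: (1 - math.cos(x)) / x ** 3, x_cut, x_max, n)
    tail = 1 / (2 * x_max ** 2)   # integral of 1/x^3 beyond x_max; the small cosine part is neglected
    return 2 * (body + tail)

def width_bracket_series(x_cut):
    """Small-X expansion: 3/2 - gamma - ln X + X^2/24."""
    return 1.5 - EULER_GAMMA - math.log(x_cut) + x_cut ** 2 / 24

if __name__ == "__main__":
    X = 0.05
    print(width_bracket_numeric(X), width_bracket_series(X), 1.5 - EULER_GAMMA)
```

The logarithm makes the width only weakly sensitive to the cutoff r₀, whereas the leading term of the shift in Eq. (4.27) is directly proportional to it.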

Burkhardt, Lindholm, and particularly Unsöld
couple an interesting consideration with the analysis
leading to Eqs. (4.29) and (4.30), where no convergence
difficulties are encountered. If $\sigma = 4$, Eqs. (4.22) and
(4.23) read

$$
u_2 = nv\int\sin(\rho_0/\rho)^{3}\cdot 2\pi\rho\,d\rho, \tag{4.31}
$$

$$
w = 4nv\int\sin^{2}\tfrac{1}{2}(\rho_0/\rho)^{3}\cdot 2\pi\rho\,d\rho, \tag{4.32}
$$

because in this instance $\Delta\varphi$, given by (4.26), takes the
form $(\rho_0/\rho)^{3}$ if we put $\rho_0 = (\pi\Omega_4/2v)^{1/3}$. This is the value
of $\rho$ (always in terms of our single-impact analysis) for
which the phase change $\Delta\varphi = 1$ and therefore corresponds
to Weisskopf's critical radius $\rho_c$.

Consider the integrands in (4.31) and (4.32). Clearly
$\sin(\rho_0/\rho)^{3}$ fluctuates rapidly about zero for small $\rho$ and
becomes monotone and decreasing for large $\rho$. The
initial fluctuations practically make $u_2$ depend on the
contributions it gets from large $\rho$. But as for (4.32),
its integrand is never negative; hence, the initial fluctuations
contribute greatly to $w$, while the small contributions
from large $\rho$ are less significant. Now the
division between large and small $\rho$ comes at $\rho = \rho_0$, the
point beyond which neither integrand oscillates.

‖ The Debye-Hückel radius would be a better limit; but the
results should not depend critically on this choice if they are to
be believed.

TABLE V. Electron broadening. Ratio of shift to half-width in the
linear Stark effect.

T (°K) \ n    10^12    10^13    10^14    10^15    10^16
10^2          8.6      3.5      2.6      1.4      1.05
10^3          22       11.7     6.3      3.5      1.9
10^4          58       30       16       8.6      3.5
10^5          158      81       42       22       11.7

All this suggests that perhaps a useful simplification
results if we integrate (4.31) from $\rho_0$ to $\infty$, and (4.32)
from 0 to $\rho_0$. Furthermore, we replace the sine in (4.31)
by its argument and revert to the original form for $u_2$,
Eq. (4.20). It reads

$$
u_2 = \nu_c\sum_j P(\varphi_j)\,\varphi_j = \nu_c\bar{\varphi}. \tag{4.33}
$$

The line shift, which arises primarily from small $\Delta\varphi$
(large $\rho$), is in effect the average number of phase shifts
per second.

The integration of (4.32) from 0 to $\rho_0$ changes the
numerical factor in (4.30) very little; hence, we may
conclude that, under conditions in which (4.32) is
valid, contributions from $\Delta\varphi$ smaller than 1 may be
ignored. This, however, is not the same as saying that
(4.32) is valid when $\Delta\varphi \gg 1$. Within the limits set by
the example of the second-order Stark effect, and for
force-law parameters $\sigma$ greater than 4, one arrives at
the qualitative rule that impacts within the optical
radius broaden, impacts outside the optical radius shift
the line. This somewhat precarious generalization is
belied by Eqs. (4.27) and (4.28), and by Table V;
hence it has no meaning for first-order Stark effects.
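The effect of the two truncations can be quantified with the same closed forms that reproduce the factors 9.8 and 11.4. The sketch below is an independent check under the assumption Δφ = (ρ₀/ρ)³; the function names are illustrative. It evaluates the part of the width integral (4.32) that comes from ρ < ρ₀, and the shift coefficient obtained from (4.31) when the sine is replaced by its argument and the integration starts at ρ₀.

```python
import math

RHO0_SQ = (math.pi / 2) ** (2 / 3)   # rho_0^2 in units of (Omega_4 / v)^(2/3)

def width_integral_total():
    """Integral over 0 < x < inf of [1 - cos(x^-3)] x dx, in closed form (x = rho/rho_0)."""
    return -math.gamma(-2 / 3) * math.cos(math.pi / 3) / 3

def width_integral_outer(terms=12):
    """Same integrand over x > 1 (rho > rho_0), from the power series of 1 - cos u, u = x^-3."""
    total = 0.0
    for k in range(1, terms + 1):
        # term (-1)^(k+1) u^(2k)/(2k)! integrated against u^(-5/3)/3 on 0 < u < 1
        total += (-1) ** (k + 1) / (math.factorial(2 * k) * (2 * k - 2 / 3)) / 3
    return total

def inner_fraction():
    """Fraction of the width that comes from impacts inside rho_0."""
    return 1 - width_integral_outer() / width_integral_total()

def shift_coefficient_exact():
    return 2 * math.pi * RHO0_SQ * (-math.gamma(-2 / 3)) * math.sin(math.pi / 3) / 3

def shift_coefficient_truncated():
    # sin(x^-3) -> x^-3 and integration from x = 1: integral of x^-2 from 1 to infinity is 1
    return 2 * math.pi * RHO0_SQ

if __name__ == "__main__":
    print(inner_fraction(), shift_coefficient_truncated(), shift_coefficient_exact())
```

Under these assumptions roughly 82% of the width integral comes from ρ < ρ₀, and the truncated shift coefficient is about 8.5 against the full value near 9.8.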

What are the conditions for validity of (4.22) and
(4.23) of which (4.31) and (4.32) were special cases?
We have derived them by assuming the impacts to be
sudden. A little reflection shows that the derivation is
essentially unchanged provided the collisions do not
overlap, i.e., are binary. The case of multiple collisions
has been treated by Lindholm (though in a way which
makes no allowance for the vector superposition of fields
in the Stark-effect problem and which remains within
the "classical path" approximation) who succeeded in
showing that the foregoing results of this section have
wider validity than the present analysis suggests. Some
further consideration is given by Krogdahl and by
Kolb.


B. Core and Wing Theorems for
Individual Collisions


Under the assumption of binary collisions the criteria
for the impact and the statistical theory take especially simple
forms.


3 M. Krogdahl, Astrophys. J. 110, 355
1. Core Theorem


The impact theory (see Sec. I) was established on the
assumption that the impacts are sudden and separated
in time. The general form for the intensity distribution
depends on integrals of the form $\int e^{i[\varphi-\Delta\omega\cdot t]}\,dt$. A
single impact, and therefore the change in $\varphi$ corresponding
to a single impact, are completed in a time
of the order of $\rho/v$. If the change in $\Delta\omega\cdot t$ over this time
interval is small compared to unity, i.e., if $\Delta\omega\,\rho/v \ll 1$,
the impact can be considered to be sudden. This
inequality can always be satisfied for sufficiently small
$\Delta\omega$, hence the impact theory is valid in the core of the
line (core theorem).

For single impacts the phase change is (see Sec. IIA)

$$
\Delta\varphi = \frac{C_\sigma}{\hbar v\rho^{\sigma-1}}\,\pi^{1/2}\,\frac{\Gamma[(\sigma-1)/2]}{\Gamma(\sigma/2)}.
$$

With this result we can derive a more specific criterion
for validity of the impact theory. If we assume that the
important impacts are those that occur within the
optical radius $\rho_c$ for which $\Delta\varphi \approx 1$, we have

$$
\rho_c \approx (C_\sigma/\hbar v)^{1/(\sigma-1)}.
$$

Combining this equation with the foregoing inequality
we have for the range of validity of the impact theory
in isolated collisions

$$
\Delta\omega \ll v(\hbar v/C_\sigma)^{1/(\sigma-1)}.
$$

In certain cases Kolb (reference 38) has shown that this relation
is valid under more general conditions than the restrictive
ones imposed for this simple derivation.


2. Wing Theorem


The wing theorem, Sec. I, shows that for sufficiently
large $\Delta\omega$ (measured from the normal line position) the
statistical theory holds, but it does not say beyond what
frequency it may be applied. When the line arises from
individual impacts, a very simple argument covers this
point. For there appears in the entire formalism only
one time (and hence one frequency) that is characteristic
of an impact, namely, the duration of the perturbation
$\tau_p$. Hence, there is only one frequency, $\tau_p^{-1}$,
to which appeal can be made as a critical limit. The
condition $\Delta\omega \to \infty$ may therefore be replaced by

$$
\Delta\omega\cdot\tau_p \gg 1. \tag{4.34}
$$


A more pictorial demonstration of the meaning of
this inequality has been given by several authors,
among whom Burkhardt seems most explicit and
circumspect. He pictures different collisions as producing
different rectangles on a perturbation energy vs
time plot, as drawn in Fig. 5. Distant passages have
large $\tau_p$ but small $\Delta\omega$, for in this instance we may
identify a specific $\Delta\omega$ within the observed line with a
specific passage. A given passage is assumed to affect
recorded frequencies in two ways, by virtue of the
phase jumps occurring at the beginning and the end of
a perturbation, and by the actual change of frequency
during the perturbation. The former leads to a formula
like (4.21) (in which $\nu_c$ is strictly twice the collision
frequency). Empirically, the resulting width is usually
small.

FIG. 5. Representation of collisions. Long and flat rectangles
represent gentle impacts of long duration; short and tall
rectangles represent energetic ones.

The detuning over an interval $\tau_p$ requires different
treatment in different cases. Taken by itself, a given
rectangle in Fig. 5 produces a line at the frequency $\Delta\omega$
and of Lorentz width $1/\tau_p$. Consecutive impacts yield
superpositions of such lines, and these superpositions
produce a statistical distribution about the different
values of $\Delta\omega$ if $1/\tau_p \ll \Delta\omega$, i.e., if the different impact
lines do not overlap. Here we discover again the
inequality (4.34).

This reasoning alone is perhaps not quite convincing;
for it will occur to those with experience in Fourier
analysis that the replacement of a set of continuous
perturbations by the rectangles of Fig. 5 is a risky
practice. Still this demonstration possesses merits, even
if rigor is not among them.

There is some likelihood of confusion in the criterion
(4.34). We noted, in the text following Eq. (4.32), that
the condition $\Delta\varphi > 1$ selects those encounters which
make important contributions to the impact width of
the line. This same condition is implied by (4.34). But
(4.34) is also the condition for validity of the statistical
theory. If impact broadening and statistical broadening
were mutually exclusive or contradictory effects we
should encounter here a paradox. This, however, is not
the case. Inequality (4.34) means what it says: it
permits the use of statistical theory. The earlier information
is also true; it means simply that an impact
theory, when carefully employed under these conditions
(which is in general more difficult to do), must give
approximately the same answer. The literature contains
many instances showing this to be true. Lindholm's
theory, for instance, leads to Margenau's statistical
formula for Van der Waals broadening, a case which
has been repeatedly discussed in this connection.

The overlapping of impact and statistical domains is
further illustrated by the following coincidence. We
saw that the line shift (4.33) arises from small phase
changes, which violate the criterion $\Delta\varphi \gg 1$. But if we
compute it under certain circumstances, it is nevertheless
the statistical shift. Assume, for instance, that
every $\varphi_j$ arises from a perturbation $\Delta E_j$ lasting a time
which is inversely proportional to the velocity $v$ (so
that the distance, $d$, is constant). Then
$\varphi_j = (\Delta E_j/\hbar)(d/v)$. Also, $\nu_c = nqv$ and by (4.33),
$u_2 = nqd\,\overline{\Delta E}/\hbar$, a result which displays the important
characteristic of all statistical theories inasmuch as it
does not depend on $v$. In fact, it is exactly the statistical
result. In Fig. 6, $\lambda$ is the mean free path. Along it
$\Delta E = 0$ except in a region of length $d$. Hence,

$$
\Delta\omega = \frac{\langle\Delta E\rangle}{\hbar} = \frac{1}{\hbar}\,\frac{0\cdot\lambda+\Delta E\cdot d}{\lambda} = \frac{nqd\,\Delta E}{\hbar}
$$

because $nq\lambda = 1$.


C. Some Numerical Estimates, Mainly
Regarding the Balmer Lines


The wing theorem for binary collisions says, in
effect, that for $\Delta\omega$ greater than $\omega''$, such that $\omega''\tau_p = 1$,
a statistical description is appropriate. But as we have
seen, $\omega''\tau_p$ also equals $\Delta\varphi$ if this simple picture holds.

We now consider the first-order Stark effect. Here $\Delta\varphi$
is given by Eq. (4.25). Putting

$$
\Delta\varphi = (\pi\Omega_2)/(v\rho_c) = 1
$$

we have $\rho_c = \pi\Omega_2/v$. The frequency displacement for
this $\rho_c$ is

$$
\omega'' = \Omega_2/r^{2} = \Omega_2/\rho_c^{2} = v^{2}/(\pi^{2}\Omega_2). \tag{4.35}
$$

This "statistical frequency limit" within the line is
independent of the perturber density.

FIG. 6. Δω vs distance travelled by perturber.

Let us now consider the half-width of the line which
would be calculated by means of impact theory, Eq.
(4.28):

$$
w = 2\pi^{3}\Omega_2^{2}\,n/v. \tag{4.36}
$$

The dependence on $v$ is interesting; the statistical
frequency limit moves out farther with increasing $v$,
whereas the impact width becomes smaller.

To make a comparison of (4.35) and (4.36) we need
the values of $\Omega_2$. These are discussed by Unsöld (whose
C is our $\Omega_2/2\pi$) and are given for the Balmer lines:

        Hα      Hβ      Hγ      Hδ
Ω2      3.96    10.35   20.5    27.6    (cgs units)

TABLE VI. Statistical wavelength limit Δλ'' in Angstroms for the
Balmer lines (after Unsöld).

T (°K) =           25 000   10 000   5000    3000
Hα   Electrons     580      230      110     70
     Protons       0.63     0.25     0.12    0.08
Hβ   Electrons     120      48       24      14
     Protons       0.13     0.05     0.03    0.02
Hγ   Electrons     48       19       9       6
     Protons       0.05     0.02     0.01    0.006
Hδ   Electrons     32       13       6       4
     Protons       0.03     0.01     0.007   0.004

As a typical case we take the $H_\alpha$ line, emitted at a
temperature of 10 000°K and an ion density $n = 10^{15}$
cm$^{-3}$. At that temperature $v$ (electron) $= 6.23\times10^{7}$
cm/sec and $v$ (H-ion) $= 20.6\times10^{5}$ cm/sec. From (4.35)
and (4.36) one then computes the following values.

$H_\alpha$ line broadened by
H ions: $w = 4.8\times10^{11}$ sec$^{-1}$, $\omega'' = 1.1\times10^{11}$ sec$^{-1}$.
electrons: $w = 1.6\times10^{10}$ sec$^{-1}$, $\omega'' = 9.8\times10^{13}$ sec$^{-1}$.
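These four numbers follow from Eqs. (4.35) and (4.36) and the quoted inputs. The sketch below assumes the forms ω'' = v²/(π²Ω₂) and w = 2π³Ω₂²n/v in cgs units; the constant and function names are illustrative. The computed values agree with the quoted ones to within a few percent, which is consistent with rounding.

```python
import math

def statistical_limit(v, omega2):
    """Eq. (4.35): omega'' = v^2 / (pi^2 * Omega_2), in sec^-1."""
    return v ** 2 / (math.pi ** 2 * omega2)

def impact_width(v, omega2, n):
    """Eq. (4.36): w = 2 * pi^3 * Omega_2^2 * n / v, in sec^-1."""
    return 2 * math.pi ** 3 * omega2 ** 2 * n / v

OMEGA2_H_ALPHA = 3.96   # cgs units, value given in the text
N_ION = 1e15            # cm^-3
V_ELECTRON = 6.23e7     # cm/sec at 10 000 K, from the text
V_H_ION = 20.6e5        # cm/sec at 10 000 K, from the text

if __name__ == "__main__":
    for label, v in (("H ions", V_H_ION), ("electrons", V_ELECTRON)):
        w = impact_width(v, OMEGA2_H_ALPHA, N_ION)
        limit = statistical_limit(v, OMEGA2_H_ALPHA)
        print(f"{label:9s}  w = {w:.2e} sec^-1   omega'' = {limit:.2e} sec^-1")
```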


For all these cases the Holtsmark half width is ~ 10! 
sec™!. The ions, if their broadening effect were computed 
by the impact theory, would show a half-width greater
than $\omega''$. But since for $\Delta\omega > \omega''$ the statistical theory is
valid one may, so long as one is not interested in the
intensity distribution for smaller $\Delta\omega$, disregard $w$ for
the ions. The electrons, however, produce an $\omega''$ well
beyond the limit of interest; hence, a statistical treatment
for them is not proper. On the other hand, $w$ for
electrons is very small. The important contribution to
the line structure comes from the statistical effects of
the ions beyond $\omega''$ and everything else, indeed the
entire role of the electrons, can be ignored. The actual
Holtsmark half width computed for the ions under the
conditions here assumed is of course greater than $w$
(it is about 8X10" sec). 

In cases of this sort it is sometimes claimed that the 
part of the line beyond w” can be treated statistically 
while the inner portion arises from impacts. There is 
no logical warrant for the second half of this statement 
despite the core theorem, for the limit of validity of 
the core theorem need not coincide with that of the 
wing theorem. Since the interior part has usually been 
of little interest in line analysis no damage is done by 
that pleasing supposition. 

Table VI, taken from Unsöld, shows the wavelength interval,
$\Delta\lambda''$, in angstroms from the line center, beyond which
the statistical theory is applicable; $\Delta\lambda'' = (\lambda^{2}\omega'')/(2\pi c)$
if $\lambda$ is the normal wavelength of the Balmer line. In all
these instances the limit in question is close to the
line center for proton broadening, so that Holtsmark's
theory may practically be applied throughout the
intensity distribution to the protons. The limit for
electrons is so far out that they may not be treated
statistically. But their impact widths are small, and
presumably their entire effect is therefore negligible.
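The electron entries of Table VI at 10 000°K can be approximately regenerated from Δλ'' = λ²ω''/(2πc) together with ω'' = v²/(π²Ω₂). In the sketch below the Ω₂ values and the electron speed are those quoted in the text, while the Balmer wavelengths and the speed of light are standard values supplied here as assumptions; the names are illustrative. The results fall within a few percent of the tabulated 230, 48, 19, and 13 Å.

```python
import math

C_LIGHT = 2.998e10      # cm/sec (standard value, not from the article)
V_ELECTRON = 6.23e7     # cm/sec at 10 000 K, from the text

# line: (wavelength in cm [standard values, assumed], Omega_2 in cgs units [from the text])
BALMER = {
    "H_alpha": (6563e-8, 3.96),
    "H_beta": (4861e-8, 10.35),
    "H_gamma": (4340e-8, 20.5),
    "H_delta": (4102e-8, 27.6),
}

def delta_lambda_angstrom(wavelength_cm, omega2, v):
    """Statistical wavelength limit in Angstroms from omega'' = v^2/(pi^2 Omega_2)."""
    omega_limit = v ** 2 / (math.pi ** 2 * omega2)
    return wavelength_cm ** 2 * omega_limit / (2 * math.pi * C_LIGHT) * 1e8

def table_row_10000K():
    return {line: delta_lambda_angstrom(lam, om2, V_ELECTRON)
            for line, (lam, om2) in BALMER.items()}

if __name__ == "__main__":
    for line, value in table_row_10000K().items():
        print(f"{line:8s} {value:7.1f} A")
```

Since ω'' scales as v², and hence linearly with temperature, the other columns of the table follow by multiplying these values by T/10 000.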

Our logic regarding the electrons contains a blind
spot, which we hope the remainder of this article will
in part remove. At this point we observe only the proportionality
of $w$ with $n$ in Eq. (4.36), a feature which
indicates that at higher densities the electrons do
become important. Had we chosen n= 10! cm™ and 
T = 10 000°K in the example of $H_\alpha$ [see text following
Eq. (4.36)], $w$ for electrons would have been greater
than $\omega''$ for protons.

Before concluding we take a glance at the second-order
Stark effect. Here $\Delta\varphi$ is given by (4.26), and if this is
to be 1, $\rho_c = (\pi\Omega_4/2v)^{1/3}$. The corresponding

$$
\omega'' = \Omega_4/\rho_c^{4} = (2v/\pi)^{4/3}\,\Omega_4^{-1/3}.
$$


A typical value of 9 is 2X10! cgs units. Because of 
the dependence of $\omega''$ on $v^{4/3}$ we encounter here a situation
like that presented in Table VI: protons may be treated
statistically fairly close to the center of the line, electrons
only beyond a large, uninteresting distance from
the center. But according to Eq. (4.30) the impact
width $w = 11.4\,\Omega_4^{2/3}v^{1/3}n$. It is proportional to $v^{1/3}$, not to $v^{-1}$
as in the first-order Stark effect. Impact broadening
therefore favors the electrons in the present instance.
A numerical example illustrates the situation. If 
T=10 000°K, w’=7X10" sec. Assume again n= 10% 
cm. This makes Awọ in the Holtsmark theory 
[Awo= 4 (Fo?/e*), (compare Sec. III) ] equal to 1.36 10° 
sec. For the frequency w”, where this theory becomes 
valid, the parameter w’’/Awo~500, and at this value 
the function W[[(w’”/Awo)?] (see Fig. 2) is already quite 
small. Hence the Holtsmark theory has little to say 
about the case. The impact width for electrons, how- 
ever, leads to w,=8X10" sec! which is the dominant 
contribution. 

The formula for the electron contribution to the line width in the case of the quadratic Stark effect may be written more explicitly. For a hydrogen-like atom (if Z=1) or an ion of charge Z with principal quantum number g one finds, approximately,


1/ g Nî ace? 
(e 
2\Z h 
g 6 
=14(=) X10 
Z 


g 4 
w= 6.6X 10-(=) mt. 


wigs tt AB TS 


Hence 
(4.37) 


V. DISAPPEARANCE OF LINES WITH HIGH 
QUANTUM NUMBERS 


Special consideration must be given to the Balmer lines involving high quantum numbers, and in general to lines emitted from highly excited states of atoms or ions within a plasma. The states near the continuum limit have a tendency to disappear for two reasons:

1. Broadening of the levels causes them to merge, and this effect begins among the excited states which lie close together on the energy scale. We speak of this as merging of states.

2. There is also an effective lowering of the continuum limit or, more precisely, an upward displacement of the higher levels because the electron in the atom does not move in a pure Coulomb field. The actual field in a plasma is subject to the Debye-Hückel cutoff. As a result, levels do not approach the energy zero with large quantum number g like $-1/g^2$, but there will be a finite distance between the energy of the level with the largest g (say g*) and energy zero. This has been recognized in connection with the problem of calculating partition functions, where actually both effects, 1 and 2, are important.

Effect 1 has been treated by Inglis and Teller, effect 2 most carefully by Unsöld and by Ecker and Weizel.

The best simple derivation of the formula for the merging of levels (differing only slightly from that of Inglis and Teller) and reliable comments on its precision are found in Unsöld, whose method is followed here. Our analysis will again be applied to the levels of hydrogen, for which $E_g = -e^2/2g^2a_0$, and therefore the undisturbed level separation at large g is


$$\Delta E = e^2/g^3 a_0.$$


Here $a_0$ is the first Bohr radius. The splitting of the levels in the linear Stark effect for the extreme components is approximately


$$\tfrac{3}{2}\, g^2 a_0 e F.$$


The most probable value for the ionic field is $4.2\,e n^{2/3}$; Inglis and Teller chose the smaller value $3.7\,e n^{2/3}$. Merging will take place when the splitting equals $\sim\Delta E/2$. Hence the last discernible level will have the quantum number g* which is defined by


$$\frac{1}{2}\,\frac{e^2}{a_0 (g^*)^3} = \frac{3}{2}\,(g^*)^2 a_0 \cdot 3.7\, e^2 n^{2/3}.$$


When logarithms are taken this becomes


$$\log_{10} n = 23.3 - 7.5 \log_{10} g^*. \tag{5.1}$$

This formula has often been used for the determination of the ion density $n$.
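
The merging condition above can be checked numerically. The following is a minimal sketch, not part of the original article: it assumes the standard value $a_0 = 5.29\times10^{-9}$ cm (not quoted in the text), and the function names are hypothetical. It recovers the constant 23.3 from the merging condition and evaluates Eq. (5.1) in both directions.

```python
import math

A0_CM = 5.29e-9  # first Bohr radius a_0 in cm (assumed standard value)


def inglis_teller_log_constant(field_coeff=3.7, a0=A0_CM):
    """log10 of K in n = K * (g*)**-7.5, derived from the merging condition.

    (1/2) e^2 / (a0 g^3) = (3/2) g^2 a0 * field_coeff * e^2 * n^(2/3)
    =>  n^(2/3) = 1 / (3 * field_coeff * a0^2 * g^5)
    """
    return -1.5 * math.log10(3.0 * field_coeff * a0 ** 2)


def ion_density(g_star, log_k=23.3):
    """Ion density n (cm^-3) from the last discernible quantum number, Eq. (5.1)."""
    return 10.0 ** (log_k - 7.5 * math.log10(g_star))


def last_level(n, log_k=23.3):
    """Inverse of Eq. (5.1): last discernible quantum number at ion density n."""
    return 10.0 ** ((log_k - math.log10(n)) / 7.5)


if __name__ == "__main__":
    print("log10 K from merging condition:", round(inglis_teller_log_constant(), 2))
    for g in (10, 15, 20):
        print(g, f"{ion_density(g):.2e} cm^-3")
```

Because $n$ varies as $(g^*)^{-7.5}$, an error of one unit in the last counted line changes the inferred density appreciably, which is why the constant need not be known to better than a few tenths in the logarithm.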

So far we have neglected the electrons. If only their statistical effect is considered (assuming that their impact width is small), Eq. (4.28), which represents the limit beyond which the statistical theory holds, tells what role is played by the electrons. For if $\hbar\omega''$ is much greater than the above $\Delta E/2$, they do not contribute to the ionic breadth when it equals $\Delta E/2$ and may therefore be neglected. But if


$$\hbar\omega'' = (\hbar v^2)/(\pi^2 C_2) < \Delta E/2$$


D. R. Inglis and E. Teller, Astrophys. J. 90, 439 (1939). [For experimental verification see F. L. Mohler, Astrophys. J. 90, 429 (1939).]
A. Unsöld, Z. Astrophys. 24, 355 (1948).
G. Ecker and W. Weizel, Ann. Physik 17, 126 (1956).
the electrons must be considered. For large g, the linear Stark constant $C_2$ [defined in Eq. (4.24)] has approximately the value $e^2 a_0 g^2/\hbar$. The value of $v$ for electrons is taken from


$$\tfrac{3}{2}kT = \tfrac{1}{2}mv^2.$$


Equation (5.1) is therefore subject to modification on account of the electrons when


$$\frac{\hbar}{\pi^2}\,\frac{3kT/m}{e^2 a_0 (g^*)^2/\hbar} < \frac{e^2}{2(g^*)^3 a_0}$$


or


$$T < \frac{\pi^2 m e^4}{6\hbar^2 k\, g^*} = \frac{5\times10^5\ {}^\circ\mathrm{K}}{g^*}. \tag{5.2}$$

How to include the electrons when this condition holds (and it holds in many cases of physical interest) is, in principle, a difficult problem. The custom seems to be to accord them a role equal to that of the ions and to replace $n$ in Eq. (5.1) by $2n$. Since this adds to $\log_{10} n$ only the amount 0.3, and the number 23.3 on the right of (5.1) is hardly certain to within that amount, it is not reasonable to worry about the electrons in this connection.
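
As a rough numerical check of condition (5.2), the sketch below evaluates the threshold temperature $\pi^2 m e^4/(6\hbar^2 k g^*)$ in cgs units. It is illustrative only: the constants are standard values not quoted in the text, the function name is hypothetical, and the form of the threshold is the reading of (5.2) that reproduces the printed figure of about $5\times10^5$ °K divided by g*.

```python
import math

# Standard cgs constants (assumed values, not taken from the article)
HBAR = 1.0546e-27   # erg s
M_E = 9.109e-28     # electron mass, g
E_ESU = 4.803e-10   # electron charge, esu
K_B = 1.3807e-16    # Boltzmann constant, erg/K


def electron_threshold_temperature(g_star):
    """Temperature (deg K) below which electrons modify Eq. (5.1), per (5.2)."""
    return math.pi ** 2 * M_E * E_ESU ** 4 / (6.0 * HBAR ** 2 * K_B * g_star)


if __name__ == "__main__":
    for g in (1, 10, 20):
        print(g, f"{electron_threshold_temperature(g):.3g} K")
```

For g* near 15 the threshold is a few times $10^4$ °K, so the condition is indeed met in many laboratory and stellar plasmas.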

The second effect, the drowning of the higher levels in the continuum, is treated by Unsöld in a very simple schematic way. The Coulomb potential in which an electron moves never reaches the value zero because another positive ion is situated at a finite distance from it. The electron may slide over into the trough of the other ion, even at a negative total energy, much in the manner in which high atomic levels are depleted by a strong electric field. If the depth of the transfer channel is made to correspond to the mean distance between atom and nearest ion, the highest permitted quantum number g* is given by


4r \ ` 
(g*)?= (Z4/8/6a0) (>) ; (5.3) 


With this simple model, g* does not depend on the 
temperature because it ignores the electrons. 

The treatment by Ecker and Weizel involves a solu- 
tion of the Schrödinger equation for an electron moving 
in a Debye field (or, if the reader is a nuclear physicist, 
a Yukawa field) of the form 


$$-(Ze^2/r)\exp(-r/D) + \text{const}.$$
The Debye radius is
$$D = [kT/4\pi n e^2(1+Z)]^{1/2}$$

T=10 000°K 


=10-° cm for | (5.5) 
n=10'* cm“ 


(5.4) 


in terms of Z, the charge on an ion. The result obtained is simple and agrees with the plausible expectation that the largest orbit of the excited electron shall be smaller than D:


$$(g^*)^2 a_0/Z < [kT/4\pi n e^2(1+Z)]^{1/2}. \tag{5.6}$$


For Z=1 this means


$$g^* < 3\times10^4\,(T/n)^{1/4} \tag{5.7}$$


if g* is the greatest possible quantum number.

FIG. 7. Disappearance of levels in hydrogen. [Ordinate: maximum permitted quantum number, g*; one curve is labeled "Merging of levels (Inglis-Teller)".]
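
The Debye-radius criterion can be evaluated directly. The sketch below is illustrative only: it reads Eq. (5.6) as $(g^*)^2 a_0/Z < D$ with $D = [kT/4\pi n e^2(1+Z)]^{1/2}$, uses standard cgs constants that are not quoted in the text, and the function names are hypothetical.

```python
import math

# Standard cgs constants (assumed values, not taken from the article)
K_B = 1.3807e-16    # erg/K
E_ESU = 4.803e-10   # esu
A0_CM = 5.29e-9     # first Bohr radius, cm


def debye_radius(temperature, density, z=1):
    """Debye radius D in cm for temperature in deg K and density in cm^-3."""
    return math.sqrt(K_B * temperature /
                     (4.0 * math.pi * density * E_ESU ** 2 * (1 + z)))


def max_quantum_number(temperature, density, z=1):
    """Largest g* for which the orbit size (g*)^2 a0 / Z stays below D."""
    return math.sqrt(z * debye_radius(temperature, density, z) / A0_CM)


if __name__ == "__main__":
    for n in (1e16, 1e17, 1e18):
        d = debye_radius(1.0e4, n)
        print(f"n={n:.0e}  D={d:.2e} cm  g*<{max_quantum_number(1.0e4, n):.1f}")
```

With T = 1 and n = 1 the second function returns the numerical coefficient of the $(T/n)^{1/4}$ law for Z = 1, which comes out close to $3\times10^4$.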

Equation (5.7) is plotted for three different temperatures in Fig. 7. On the same figure we have drawn Eq. (5.1). Unsöld's Eq. (3) [Eq. (5.3) of this paper], which is not included, specifies values of g* which fall slightly below the dotted curve. For high temperatures and low densities the merging of the levels according to Inglis and Teller determines the highest permitted quantum number, whereas in the other extreme the drowning of levels in the continuum is the decisive effect.

The latter has been studied in another way, employing straightforward perturbation theory, in a recent, unpublished calculation. This led to the result


$$(g^*)^2 = 0.86D, \tag{5.8}$$


which reduces the limiting quantum number slightly below the value given by Eq. (5.6).

Perturbation theory is not strictly applicable since for the last level the perturbation energy is of the same order of magnitude as the unperturbed energy. For this reason we present here the essentials of a parallel variational calculation‡ in which an exponential screening factor β is employed as variable parameter in connection with hydrogen state functions of principal quantum number g. It may be asked why this method should be expected to work, since variation of β might push a hydrogen level initially assigned to the quantum number g down to energies corresponding to lower values of g. The answer is that variation of β does not alter the number of nodes, which is controlled by g, and that therefore a crossing from one g to another cannot occur.

In view of the Debye-Hückel shielding effect thi 




D. Kelly and H. Margenau, Progress Report, October 1, Contract Nonr 609(22).
‡ Contributed by D. Kelly.
atomic electron moves in a potential The variational constraint, d£,/03 =0, leads to 


1 4 B—Z+ 26S (8)/a6=0. (5.13) 
i v)=-ze(- = —) rib p mO. Ps 

j r b+D The principal quantum number of the last level (g*) is 
i V= given by the equation Z,*=0. This gives 

; -Ze expl—(r-2)/D] $ 

: Vo(r)= = ee y, 2Z— 3—2ZBS(8)=0. (5.14) 


1+2/D r 


Elimination of Z through (5.10) and (5.11) gives an 
equation involving only £, 
1— 26S (8)+6 (3S (6)/38)=0, 


which has B=1.27 as a root. Thus 


The maximum of the radial charge distribution falls 
at r= D, signifying that there is a considerable fraction 
of the surrounding ion cloud inside the “Debye sphere.” 
We assume 0D. As a consequence we can use V(r) 


(5.15) 


ac aE 


eee oe © 


=Vo(r) with little error. In view of these assumptions 
concerning 6/D and V(r) we write the Schrédinger 
equation for the optical electron in the form 


y= — (h?/2m)Vp—Ze(e7!?/r)y= Ey. 


Choosing functions ¥, which satisfy the Schrödinger 
equation for the hydrogen atom of nuclear charge 3 
for quantum number g we compute the integral 


(5.9) 


E,= f Yo Hydr 


and requires 04,/03=0. (8 is the variable charge 
number, Z the fixed ion charge number.) Since 


— (#/2m) Vp = [E+ (Be/r) Wo 
and E,°= 37En/¢* we have 


2 B— Ze"? 
A ar, (5.10) 
& r/ao 


where Ey = —e/2a9= — 13.53 ev. Expanding e—"” in a 
power series yields 


B— ZeD 
f >) |V l°dr 
r/ao 


K=4o/D 


((xr/a0)")= f [Yo 2(kr/a0) "dr. 


The values of ((xr/ao)") are given by Condon and 
Shortley.“ For S states, 


E,=Enl{l(—3°+282)/¢")—22Z«S (6)} 
=xg?/B 
356° 63 


3 D eee: 
so- — i 102 960 


(5.12) 
1+ 
nd G. H. Shortley, Theory of Atomic Spectra 


fc ae ge University Press, London, 1957). 
(Cambric 


B=[(g*)?a0]/3D=1.27. 
Reintroducing the true ion charge number, Z, we find 


which differs very little from the perturbation result, 
Eq. (5.8). 

We might also inquire how the departure from the hydrogen level structure affects the merging of lines. For this purpose we can apply perturbation theory to all but the last few levels, using hydrogen state functions with $l = l_{\max} = g-1$, since it is for these substates that the Stark splitting is extreme. These functions prove to be very simple and it is not necessary to resort to the expansion in $\langle(\kappa r/a_0)^n\rangle$. The calculation shows that the first order correction to the hydrogen levels is independent of g and the second order term is so small as to change g* as given by (5.1) less than one percent.

There are other corrections. Edmonds has calculated, for instance, how the effective mean distance between ions is altered by the Debye shielding effect. For present concerns, relating to the disappearance of the lines, this consideration is of secondary importance.


VI. BROADENING OF DEGENERATE LEVELS 


Recently, Kolb has treated the broadening of 
hydrogen lines in a manner which is more careful than 
previous treatments in its consideration of the diffi- 
culties of degeneracy and which furthermore attempts 
to include both ions and electrons in their simultaneous 
actions upon the radiating atom. 

We present a synopsis of Kolb’s calculations after 
first discussing certain fundamental ideas of the method 
and also its principal results. The details given here 
differ from Kolb's in this respect: our functions [see Eq. (6.8)] are solutions of the unperturbed Schrödinger equation, whereas Kolb's are said to be adiabatic
functions. The latter choice leads to difficulties which 
our treatment avoids. Let us recall the meaning of 
“adiabatic” in the first and second sense discussed in 
Sec. IIB. The collisions a hydrogen atom can experience 
with an electron can be divided roughly into two extreme 
groups, namely, (1) close collisions that turn the atom 
around (adiabatic in the second sense) and (2) distant 


4 F. N. Edmonds, Astrophys. J. 123, 95 (1956). 
or weak collisions that do not turn the atom (adiabatic in the first sense). We assume that under certain specifiable conditions the weak collisions are mainly responsible for the broadening, so that the analysis can be restricted to collisions of type 2. One can now speak of adiabatic and diabatic in the following sense: an adiabatic collision leaves the atom in, or returns it to, its original state (permitting the energy to change while the collision is in process); a diabatic collision causes transitions to other final states.

Principal conclusions are (a) the electron effects are 
an important factor in the broadening of the hydrogen 
lines; (b) the adiabatic and diabatic electron effects are 
of comparable magnitude. 


A. Mathematical Foundations 


The energy radiated and absorbed per second due to transitions between states m and the state n (provided $E_n^0 > E_m^0$) is given by


$$I_n(\omega) \propto \sum_m [\rho_m(0) - \rho_n(0)] \lim_{T\to\infty} \frac{1}{T} \left\langle \left| \int_0^T dt\, \mu_{nm}(t)\, e^{-i\omega t} \right|^2 \right\rangle_{\rm Av}. \tag{6.1}$$


Here $\rho_m(0)$ and $\rho_n(0)$ are Boltzmann factors for the states having unperturbed energies $E_m^0$ and $E_n^0$; $\mu_{nm}(t)$ is the dipole matrix element defined by


$$\mu_{nm}(t) = \int \chi_n^*(t)\, \mu\, \chi_m(t)\, d\tau, \tag{6.2}$$


where $\mu$ is the dipole moment operator. The functions $\chi_n$ satisfy

$$i\hbar\, \dot{\chi}_n(t) = [H_0 + H_1(t)]\, \chi_n(t), \tag{6.3}$$

provided $H_0$ is the Hamiltonian for the atom and $H_1(t)$ the interaction between the atom and the perturbing ions and/or electrons. Equation (6.1) is valid if $E_n^0 > E_m^0$. For states (m,n) such that $E_m^0 > E_n^0$, $e^{-i\omega t}$ must be replaced by $e^{+i\omega t}$.

Equation (6.1) was derived by calculating the increase in population of the nth state due to the action of the light. Since $E_n^0 > E_m^0$ the first term, $\rho_m(0)\,|\int \mu_{nm}(t)\, e^{-i\omega t}\, dt|^2$, is seen to represent absorption, i.e., transitions from m to n; the second term, $\rho_n(0)\,|\int \mu_{nm}(t)\, e^{-i\omega t}\, dt|^2$, represents induced emission (the electromagnetic field was not quantized), i.e., transitions from n to m. Therefore $I_n(\omega)$ is the net absorption. The average in Eq. (6.1) is to be carried out over different collisions.

Suppose that the states n and m are degenerate in the absence of perturbations. For the level n, we call the collection of states f and label the individual states $e_j$; for the states m we use i and $a_i$. Occasionally, we omit the subscripts on e and a. Equation (6.1) thus becomes
1 
Tyla) = E [oas(0) pes = 
iai 2m 
1 T 2 
x lim — f dieja; (t) i®t » c (6.4) 
Too Th 0 Ag 
Defining 
1 T 2 
I eja5=— lim iS dieja; (eit (6.5) 
2r T-x2 A) Ke 


we now sum over all degenerate f states, obtaining 
De Iglo) & Vik{Veailpai(0)—per(0) Merai}. (6.6) 


The term in brackets represents transitions between the 
groups of 7 and f states. The first term represents ab- 
sorption, and we define the absorption coefficient J; as 


g= ae Loa: (0) erai]. (6.7) 


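In keeping with the weak-collision hypothesis, a solution of Eq. (6.3) is written as


$$\chi_a(t) = \sum_{a'} C_{a'a}(t)\, \varphi_{a'}^0 \exp\left\{ -\frac{i}{\hbar} \int_0^t [E_{a'}^0 + (H_1)_{a'a'}]\, dt' \right\}, \tag{6.8}$$


where $\varphi_a^0$ and $E_a^0$ are eigenfunctions and energies of $H_0$, and $(H_1)_{a'a}$ is given by


$$(H_1)_{a'a} = \int \varphi_{a'}^{0*} H_1 \varphi_a^0\, d\tau.$$

When Eq. (6.8) is substituted into Eq. (6.3), we obtain

$$\dot{C}_{a'a}(t) = -\frac{i}{\hbar} \sum_{a'',\, a'' \ne a'} C_{a''a}(t)\, (H_1)_{a'a''} \exp[-i\{\omega_{a''a'}^0 t + P_{a''a'}\}] \tag{6.9}$$


upon using the following abbreviations:


$$\omega_a^0 = E_a^0/\hbar, \qquad \omega_{a''a'}^0 = \omega_{a''}^0 - \omega_{a'}^0, \qquad P_{a''a'} = P_{a''} - P_{a'}, \qquad P_a = \frac{1}{\hbar} \int_0^t (H_1)_{aa}\, dt. \tag{6.10}$$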


We take as an initial condition that only the state a is present at t=0. With weak collisions only the initial state is appreciable at a later time, i.e.,


$$|C_{aa}(0)| \approx |C_{aa}(t)| \approx 1; \qquad |C_{a'a}(t)| \ll 1 \ \text{for}\ a' \ne a.$$

Hence, Eq. (6.9) can be written


$$C_{a'a}(t) = -\frac{i}{\hbar} \int_0^t (H_1)_{a'a} \exp[-i(\omega_{aa'}^0 t' + P_{aa'})]\, dt'. \tag{6.11}$$
In this approximation Eq. (6.8) takes the form

$$\chi_a(t) = \varphi_a^0 \exp[-i(\omega_a^0 t + P_a)] + \sum_{a',\, a' \ne a} C_{a'a}(t)\, \varphi_{a'}^0 \exp[-i(\omega_{a'}^0 t + P_{a'})]. \tag{6.12}$$


When Eq. (6.12) and a similar one for $\chi_e(t)$ are used to calculate $\mu_{ea}(t)$ the result is


$$\mu_{ea}(t) = \mu_{ea}^0 \exp[i(\omega_{ea}^0 t + P_{ea})] + \sum_{e',\, e' \ne e} \mu_{e'a}^0\, C_{e'e}^*(t) \exp[i(\omega_{e'a}^0 t + P_{e'a})] + \sum_{a',\, a' \ne a} \mu_{ea'}^0\, C_{a'a}(t) \exp[i(\omega_{ea'}^0 t + P_{ea'})], \tag{6.13}$$


where $\mu_{ea}^0 = \int \varphi_e^{0*} \mu\, \varphi_a^0\, d\tau$ and terms of the order $C^2$ have been neglected. Equation (6.13) together with Eq. (6.5) determines the absorption coefficient. We are now ready to consider two special cases of interest.


B. Adiabatic Approximation 


The off-diagonal elements are neglected in this approximation, only the first term in Eq. (6.13) being used. This simplifies the theory so that it can easily be applied to any two states (a,e). The resulting approximation is general enough to include both electrons and ions and serves as a guide for the more complicated theory below, which includes the off-diagonal elements.

On substituting Eq. (6.13) into Eq. (6.5) we have, neglecting the off-diagonal elements,


$$I_{ea} = \frac{|\mu_{ea}^0|^2}{2\pi} \lim_{T\to\infty} \frac{1}{T} \left\langle \left| \int_0^T dt\, \exp[-i(\Delta\omega_{ea}^0 t - P_{ea})] \right|^2 \right\rangle_{\rm Av}, \tag{6.14}$$


where $\Delta\omega_{ea}^0 = \omega - \omega_{ea}^0$. Equation (6.14) can be put into the usual correlation form (see Sec. IV)


$$I_{ea} = \frac{|\mu_{ea}^0|^2}{\pi}\, \mathrm{Re} \int_0^\infty d\tau\, \exp(i\Delta\omega_{ea}^0 \tau)\, \langle \exp[-iP_{ea}(\tau)] \rangle_{\rm Av}. \tag{6.15}$$


The term $P_{ea}$ contains interactions involving the ions and electrons. We therefore write


$$P_{ea} = P_{ea}^{i} + P_{ea}^{e}$$


the ion and electron averages can be 


(6.15) 


sume that 


herefore permit the 
independent. With 
where F is the instantaneous ion field, e the charge on the ion, and $\Omega^{ea}$ are terms connected with the Stark shift of the states (ea).

The electrons are treated by the phase shift (impact) approximation. Consider


(expl —iP ea" (7) ])av 


ofif ranna) 


w 


(6.17) 


where H,’ is the interaction of the atom with all the 
electrons in the plasma. In the evaluation of (6.17) we 
employ Eq. (4.15), following Kolb. This does not 
require that the impacts are sudden or isolated. It is 
applicable to scalarly additive perturbations, that is 
perturbations for which 


ECDs Ay. (6.18) 


The sum here is taken over all the electrons contained 
in some effective volume. The phase change during the 
time interval r produced by a single electron depends 
on the velocity v, the time of closest approach ¢; and 
on some other parameter /. Since 


Hy= X Nli p, DH (iv), (6.19) 
i,v,l 


where 9U(z,v,/) is the number of electrons characterized 
by (z,v,l) in the effective volume, Pea? becomes 


Pati E inl) | CGD) 
t,v,0 0 


— (Hi(i,2,l))aa]dt, (6.20) 


Peat= 2 Nli) Pi, v, ı(7), 


Pi, v 1(T)=h f [ (Ai (4,0,2)) ee— (Hi (4,0,1)) aa ldt. (6.21) 


Using the results developed in Sec. IV we find 
(expl — iP .a°(T) ]) av 


=exp| oR? f domda], (6.22) 


ne(v) being the number of electrons with velocity v per 
cubic centimeter. The effective volume of the system is 
assumed to be a sphere of radius R and A 


a,(1)= f dW (etev, (6.23) 


Here $\varphi_{v,l}$ is the total phase change produced by an electron of velocity v and parameter l which has its time of closest approach in the time interval τ. Now suppose that the parameter l labels two impact variables ρ and σ, ρ being the distance of closest approach and σ some other characteristic variable defined later. The probability distribution functions for these are


$$W(\rho)\, d\rho = 2\pi\rho\, d\rho/\pi R^2; \qquad W(\sigma)\, d\sigma = d\sigma/\Sigma, \tag{6.24}$$


where Σ is the total cross section for σ. Then


ee f f MOLO cA an (O25) 


and hence 


(expl —iP Co) Dumenn| 2e fon. (daof ado 


x f (do/2) (exp(—ien ne) (6.26) 


Finally, we replace $n_e(v)$ by $n_e W(v)$, understanding by $n_e$ the number of electrons per cubic centimeter and by W(v) the Boltzmann distribution function; we write $\varphi_{ea}(\infty)$ for $\varphi_{v,\rho,\sigma}$, letting (ea) indicate the two states involved and omitting the (ρ,v,σ) dependence; the symbol (∞) means that the phase change produced by one particle is completed. One then finds


(exp — iP ea°(z) Iw =expL— (u Hiu:)r], 


where 


= u‘ Re o R 
( )= ( Jorn: f ow (odo f pdp 
— ug“ Im 0 0 


2 do 
x f —{expl—igea()]—1}. (6.28) 
0 a 


(6.27) 


Substitution of Eqs. (6.27), (6.16) and (6.15%) into 
(6.15) results in 


__ lhea? 
Neg Re IP exp| i| Awa? — t" 


26% 


= ~r) T— U‘ “|e (6.29) 
e Av (ions) 


MAP ( 
T (Awa? — 


We shall return to this equation. At present we wish 

- to develop a formula including the off-diagonal ele- 
ments, which are taken to be zero in the treatment 
above. 


46% 
QotF/e— uz)? + (u)? Av (ions) i 
(6.30) 
C. Diabatic Effects in Degenerate Systems


The ions are neglected, so that the atomic states are degenerate, in this approximation. The resulting formulas are useful because they indicate the magnitude of the diabatic effects as far as electron broadening is concerned. In particular, the results of the present development, when applied to the Lyman-α line, and then compared with the adiabatic approximation (when ions are neglected), clearly demonstrate that the adiabatic theory can contain large errors. The method of this section can be generalized to nearly degenerate systems, degeneracy being removed by inclusion of the ions, but the results are difficult to handle and have not been carefully studied.

In the weak collision approximation we assume that $|P_{ea}| \ll 1$ so that Eq. (6.13) can be written in the form


Leat (t) = hea? exp (iwat) IEEVERRES 


0 

He'a 
I 
Mea a 


Heal q 
Cot tZ Cac. (O50) a 
Hea 


, 


Since | P..°|1 and the sum are small, we can write 


Hea’ (f) = hea? EXP (iwea%) Exp (iea), (6.32) 


with 
Hera’ k 
OE] qE — Ce tH 


Mea a Hea 


Cur} (6.33) 


When this is substituted in Eq. (6.5) there results 
leatl? 1 
| Hea?| i 


Lea= im — 


Dep BSP 


T 2 
x if dt exp(—1Aw,2%) expC) . (6.34) — 
0 Ay 


In the weak collision approximation, then, Eq. (6.11) A 
for Cara becomes mee 


4 t 
Cais z all (Ai) ar exp(—iPaar’)di' 
0 
ba 


if sak 
= f (duet. (635) 


If one assumes, as in the adiabatic case, scalarly ad diti 
perturbations ‘[see Eq. (6.19)] then, in view of 
(6.35), 


Pics D EUG L) Ki, v, b 


t,0,1 


=~. 


kin (= 0 i) ulz antn . 


m E7 6 “i 
eel 


(6.36) 


parab 
Equation (6.34) is formally the same as Eq. (6.14). 
On the other hand, Eqs. (6.36) are similar to Eq. (6.21). 
We can therefore use the results of the adiabatic 
treatment when the proper replacements are made. Our 
results in that case do not contain the effects of the 
ions. Instead of Eqs. (6.35) we thus find 


2 


uy 


(Awea?— uz‘)? + (um) ? 


ao 


Re R 
( Jen f vW (v)dv f pdp 
Im 0 “0 


= do 
x f —Lexpl—ikea()J-1], (6.38) 


Hea? 
| (6.37) 


ca 


— 16% 
E) 


T 


( 


subject again to the use of the phase shift approximation, which accounts for our writing $\kappa_{ea}(\infty)$ in place of the $\kappa_{i,v,l}$ appearing in Eq. (6.36). The condition $|P_{ea}| \ll 1$ is violated after a long enough time, and Eq. (6.37) is valid only in the wings. In applications, since $|\kappa_{ea}(\infty)| \ll 1$, we retain only the first two nonvanishing terms in the expansion of $e^{-i\kappa_{ea}}$. In the wings (6.37) becomes

$$I_{ea} = u_1^{ea}/[\pi(\Delta\omega_{ea}^0)^2], \tag{6.39}$$
and 


o R Zdo 1 
u*= Irn, f W (2)edo f pdp ii =r 
0 0 0 22 2h? 


+o 2 
| S UE pede E paraded] (640) 


with the understanding that Hı refers to one electron 
with specified p, v, o. We shall now apply the formulas 
of this and the previous section to line broadening by 
electrons. 
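
The passage from a full impact profile to its wing form can be illustrated numerically. The sketch below is not from the article: it assumes that the impact profile is a Lorentzian of half-width $u_1$ and shift $u_2$, with the dipole factor set to unity, and compares it with the wing expression $u_1/[\pi(\Delta\omega)^2]$ of Eq. (6.39). Function names are hypothetical.

```python
import math


def impact_profile(delta_omega, u1, u2=0.0):
    """Lorentzian line shape with half-width u1 and shift u2 (unit dipole factor)."""
    return (u1 / math.pi) / ((delta_omega - u2) ** 2 + u1 ** 2)


def wing_profile(delta_omega, u1):
    """Far-wing limit of the Lorentzian, the form of Eq. (6.39)."""
    return u1 / (math.pi * delta_omega ** 2)


if __name__ == "__main__":
    u1 = 1.0
    for dw in (1.0, 3.0, 10.0, 100.0):
        ratio = impact_profile(dw, u1) / wing_profile(dw, u1)
        print(f"delta_omega = {dw:6.1f} u1   full/wing = {ratio:.4f}")
```

The ratio equals $\Delta\omega^2/(\Delta\omega^2 + u_1^2)$ when $u_2 = 0$, so the wing form is accurate to about one percent once $\Delta\omega$ exceeds ten half-widths.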


D. Electron Broadening in the Adiabatic 
Approximation 


In this section, the adiabatic theory of part B is applied to broadening by electrons only. Here "adiabatic" means that we retain only the first term in Eq. (6.13). This calculation, although restrictive, is useful in estimating (a) the error introduced by the electron collisions that fail to satisfy the conditions for an impact theory; (b) the effects of close collisions. The value of $\varphi_{ea}(\infty)$, defined by Eq. (6.21) except that the integration is now extended over all time, is (with suitable changes of indexes)


$$\varphi_{ea}(\infty) = (2\Omega^{ea}/\rho v)\cos\delta, \tag{6.41}$$


where

$$\Omega^{ea} = \tfrac{3}{2}\,(e^2 a_0/\hbar)\,\{[n^e(k_1^e - k_2^e)] - [n^a(k_1^a - k_2^a)]\}. \tag{6.42}$$


In this expression $n^e$, $k_1^e$, $k_2^e$ are quantum numbers, in parabolic coordinates, of the state e, and δ of Eq. (6.41) is the polar angle describing the location of the position vector of closest approach ρ in some fixed coordinate system; δ is the σ parameter mentioned earlier. Equation (6.28) then gives


uU“ = 
; 2 Pm 
m= Dane | AWW (odo f pdp 
0 0 
sin (2Q2‘*/pv) 


if 
(2Q2*/pv) 


where $\rho_m$ is a cutoff distance and is chosen as the Debye length of Sec. II. With the definition


$$\delta_m^{ea} = 2\Omega^{ea}/\rho_m v,$$

Eq. (6.43) then becomes


| (6.43) 


o W (v) 
um“= Srne) f in| 0.20044 Ind,“ 
0 v 
E 
+. | (6.44) 
240 


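The velocity average is approximated by replacing the $v^{-1}$ outside of [ ] by its Boltzmann average $\langle v^{-1}\rangle = (4/\pi)(1/\langle v\rangle)$ in Eq. (6.44). We then obtain, finally,


$$u_1^{ea} = (32/\langle v\rangle)\, n_e\, (\Omega^{ea})^2\, G(\delta_m^{ea}), \tag{6.45}$$

$$G(x) = 0.2094 - \tfrac{1}{6}\ln x + x^2/240 - \cdots$$

and $\delta_m^{ea}$ is redefined as

$$\delta_m^{ea} = 2\Omega^{ea}/(\rho_m \langle v\rangle). \tag{6.45a}$$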


Let us now return to the two points mentioned at the 
beginning of this section. 

(a) According to the criterion for the validity of the phase shift approximation (Sec. IV B), the velocity must satisfy

$$v > (\Omega^{ea}\,\Delta\omega)^{1/2} = v_c \tag{6.46}$$


for the present results to be acceptable. One can estimate 
the error resulting from extending the velocity inte- 
gration from zero to infinity by computing the con- 
tribution to (v) from velocities less than ve. One obtains 


(O O <te i m (Aw) Qs‘ 
(w) i. 4kT 


For 10 000°K, and for H$_\beta$ (4861 A) at 30 A from the line center, 90% of $\langle v^{-1}\rangle$ comes from velocities which satisfy the phase shift approximation.

(b) It is reasonable to treat close collisions by the Lorentz formula [Eq. (1.2)]. This leads to an $I_{ea}$ which is similar to Eq. (6.30) (the ion contribution being again neglected) except that $u_1^{ea}$ is replaced by $\tau_c^{-1}$, $\tau_c$ being the mean time between collisions. Hence the suggestion that $u_1^{ea}$ can be represented, in an approximate way, as a sum of two terms


$$u_1^{ea} = (u_1^{ea})_{\rho>\rho_c} + (u_1^{ea})_{\rho<\rho_c}, \tag{6.47}$$


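where $(u_1^{ea})_{\rho>\rho_c}$ is the contribution from the weak collision theory, computed according to Eq. (6.32) except that ρ is now allowed to vary from $\rho_c$ to $\rho_m$. Here $\rho_c$ is the value of ρ which roughly divides the impacts into strong (close) and weak ones, chosen so that


$$\delta_c^{ea} = 2\Omega^{ea}/(\rho_c\langle v\rangle) = 1 \tag{6.48}$$

or

$$\rho_c = 2\Omega^{ea}/\langle v\rangle. \tag{6.49}$$

In view of Eq. (6.41), $\delta_c^{ea}$ is the maximum phase shift that an electron with velocity $\langle v\rangle$ and distance of closest approach $\rho_c$ can produce. But


$$(u_1^{ea})_{\rho<\rho_c} = \tau_c^{-1} = \pi\rho_c^2\, n_e \langle v\rangle = (4\pi/\langle v\rangle)\,(\Omega^{ea})^2\, n_e \tag{6.50}$$

and

$$(u_1^{ea})_{\rho>\rho_c} = (32 n_e/\langle v\rangle)\,(\Omega^{ea})^2\,[G(\delta_m^{ea}) - G(1)] = (32 n_e/\langle v\rangle)\,(\Omega^{ea})^2\,[-0.0042 - \tfrac{1}{6}\ln\delta_m^{ea} + \cdots]. \tag{6.51}$$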

Comparing (6.50) and (6.51), the contribution due 
to close collisions can be neglected if 


(21%) p<pe 3 1 
ES eee (6:52) 
(241%) p>pe 4 In(1/5m‘%) 
that is, if 5 
In(1/8mt*)>1.5. (6.53) 


When (6.53) is satisfied, the quantities expressed by 
Eqs. (6.51) and (6.45) are approximately equal. 

Example: If ne= 10%, T=20 000°K and Q/A =1.1 
(where A = 3eao/%) then In(1/6,,°%) ~6. 
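
The relative weight of close and weak collisions can be sketched numerically. The code below is illustrative only and rests on a stated assumption: Eqs. (6.50) and (6.51) are read as $(4\pi/\langle v\rangle)\Omega^2 n_e$ and $(32 n_e/\langle v\rangle)\Omega^2[G(\delta_m) - G(1)]$, with $G(x) = 0.2094 - \frac{1}{6}\ln x + x^2/240$ (the coefficient of the logarithm is a reconstruction from the small-phase expansion, not a legible figure in the scan). Function names are hypothetical.

```python
import math


def g_function(x):
    """G(x) as read from Eq. (6.45); the 1/6 log coefficient is an assumption."""
    return 0.2094 - math.log(x) / 6.0 + x ** 2 / 240.0


def u1_close(omega_ea, n_e, v_mean):
    """Close-collision (Lorentz) width, the form of Eq. (6.50)."""
    return 4.0 * math.pi * omega_ea ** 2 * n_e / v_mean


def u1_weak(omega_ea, n_e, v_mean, delta_m):
    """Weak-collision width with rho between rho_c and rho_m, Eq. (6.51)."""
    return 32.0 * n_e * omega_ea ** 2 / v_mean * (g_function(delta_m) - g_function(1.0))


def close_to_weak_ratio(delta_m):
    """Ratio of the two contributions; independent of omega, n_e and <v>."""
    return u1_close(1.0, 1.0, 1.0) / u1_weak(1.0, 1.0, 1.0, delta_m)


if __name__ == "__main__":
    for log_inv_delta in (2.0, 4.0, 6.0, 10.0):
        ratio = close_to_weak_ratio(math.exp(-log_inv_delta))
        print(f"ln(1/delta_m) = {log_inv_delta:4.1f}   close/weak = {ratio:.3f}")
```

Under this reading the ratio falls off as the reciprocal of $\ln(1/\delta_m)$, so the close collisions become a small correction only when the logarithm is large.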


E. Broadening of the Lyman-α Line by Electrons


In this section we treat the broadening of the Lyman lines by electrons only. Both adiabatic and diabatic effects are considered. The wave functions used are the Stark wave functions with the Z axis fixed in space


$$\varphi_+ = \frac{1}{\sqrt{2}}(\varphi_{200} + \varphi_{210}), \qquad \varphi_- = \frac{1}{\sqrt{2}}(\varphi_{200} - \varphi_{210}), \qquad \varphi_{21\pm1}, \qquad \varphi_{100},$$


the subscripts on φ denoting, in spherical coordinates, the quantum numbers nlm.

We use Eqs. (6.39) and (6.40), dropping $\sum_{a'}$ in Eq. (6.40) because of the nondegeneracy of the ground state (100). Letting a designate the state (100), e the states


$$(21\pm1),\ (+),\ (-),$$


one finds


ut O= Uy 10 


141211 ,100 — upm 10 = 0, 


(6.54) 
so that 
To11, 100= I 21-1, 100= 0, 
uzt! up 100 
T4.,100=—————;_ I 100 = ———.,__ (6.55) 
1 (Aw, 100)? a (Aw, 100)? 
urt O= yu O=} |u, 100°| (6€7a0/vph), 
and 
6e?’ao 2 pm 6e7a9 
( )=2mnef oW (hd f odo( ). (6.56) 
pvh 0 p vph 


The off-diagonal elements contribute twice as much as 
the diagonal elements to Eq. (6.55). We may therefore 
write Eq. (6.54) in the form 


3 (urt) ad 


(6.57a) 


Ihe, 100 = , 
T(Aw, 100)” 


3 (umaa 
1 (Aw_, 100)? 


where $(u_1^{\pm,100})_{ad}$ is given by Eq. (6.40) with only the diagonal term, $\mu_{\pm,100}^0 (H_1)$, appearing in the sum. Finally, then,


(Cram) l= (u1) 
_ le 100|? 144a et ex 
D 


3 h? 


(6.57b) 


Me, 100 = 


“u("). os 


This result for $(u_1^{\pm,100})_{ad}$ agrees with the corresponding one obtained from the adiabatic theory of part D of this section, Eq. (6.51) with $\Omega^{ea}/A = 2$ and definitions of (6.49), (6.45a), and (6.42). Thus a pure adiabatic treatment (part D) which ignores collision-induced transitions underestimates the broadening by a factor of 3, a conclusion which is important later when an attempt is made to introduce both ions and electrons. Such an attempt succeeds simply only in an adiabatic theory. But with the information now at hand one can estimate the contribution to $u_1^{ea}$ from the off-diagonal elements (diabatic effects) by using the results of this section.


F. Simultaneous Broadening by Ions and Electrons 
in the Adiabatic Approximation 


In this section both electrons and ions are taken into account in the adiabatic approximation to the theory. The basic formulas are Eqs. (6.27), (6.28), and (6.30). The u's are treated in the manner explained in part D. So far as the ions are concerned the averaging process indicated in Eq. (6.30) can be performed with the use of the Holtsmark distribution for the field F since the ions are to be treated in the statistical approximation. Thus, define W(Q) to be the probability distribution for the quantity Q = F/e; Eq. (6.30) then becomes


_ leel 
Mew 


C W (9)dQ 
u“ 


6.59 
QQ)? (u15)? ( 2 ) 


‘eee 


and, according to the adiabatic theory for electrons, $u_2^{ea}$ is zero, and $u_1^{ea}$ is independent of Q. Using the Holtsmark distribution for W(Q) and adding together a pair of lines that satisfy $\Omega^{ea} = -\Omega^{e'a'}$ [this simplifies the results and makes $I(\Delta\omega)$ symmetric] one gets after some mathematical manipulation (see Kolb)


2 |ue DIE 
m (Q243)? 


AW ea E Awe E 
x | ao sin( Jas cos( ) (6.60) 
N2£* Qt 


where $n_i$ is the number of ions per cubic centimeter. This equation can be expanded for large and small frequencies $(\Delta\omega_{ea}^0)$. We present here only the large frequency expansion, since the other is subject to too many invalidating uncertainties. For convenience we make the following changes in notation:


Zea =4.5209°n3/A, A =Je?a0/h, 


Lea= 


ie gdt exp[_(—4.21221;¢!— uE) /Q2** ] 


i.e., $Z_{ea}$ is the Stark shift (in radians per second) resulting from the mean Holtsmark field strength, and


$$B = \Delta\omega_{ea}^0/Z_{ea};$$

i.e., B is the shift in units of $Z_{ea}$, while

$$R = Z_{ea}/u_1^{ea}$$

is a measure of the relative importance of ion and electron broadening. With these changes in notation, for large $(\Delta\omega_{ea}^0)$, Eq. (6.60) becomes


o (—1)"7 
a R(B4+1/R2) =? (n—1)! 
PE (n+1)/2) cos{{ (3n—1)/2} tan(BR)] 


Equation (6.61) reduces to the Holtsmark distribution for ions if we set $u_1^{ea} = 0$ ($R = \infty$), for in this limit


= (—1) = T[(3n+1)/2] 
=2 (n—1)! 
Xcos[{(3n—1)/4}r]. (6.62) 


Eq. (3.11b) when the coefficients are 
) reduces to the results of the 


 Ia(B)dB= {ead |24B- 


. (6.61) 
Ben) 12 


az a(B)dB= == | fir) us- X Bn) 12 


aye | ras 
The electron effects are revealed by comparison of Eqs. (6.61) and (6.62). In addition to the correction factors in each term of the sum in Eq. (6.61), there is a term that depends only on the electrons. In the wings of the line this is dominant since it is of the order $B^{-2}$ whereas the mixed electron-ion effects are of the order $B^{-5/2}$. This feature is present in all theories which attempt the fusion of statistical and impact features: the intensity in the statistical theory for large B is proportional to $B^{-5/2}$, that in all impact theories to $B^{-2}$.

Equation (6.61) is applicable to any two pairs of states (ea), (e'a') for which $\Omega^{ea} = -\Omega^{e'a'}$ since it neglects collision-induced transitions (off-diagonal elements of C). Formally, Kolb does generalize Eq. (6.1) to include the off-diagonal elements. This leads to mathematical difficulties because the $u^{ea}$ are then dependent on the ion field strength, which complicates the ion-averaging process. The dependence arises through the $C_{a'a}$ (where $a' \ne a$) (taken as zero in the adiabatic approximation). Under certain conditions, these can be neglected. If, for instance, the average ion field is large the degenerate states are split far enough apart so that the off-diagonal elements vanish. Under other conditions, the term in $C_{a'a}$ that causes the difficulty, namely $\exp(i\omega_{aa'}t)$ (where $\omega_{aa'}$ is the separation of two degenerate levels due to the ions), can be replaced by unity. This will be true if the collisions are fast enough, i.e., if the collision time is small compared to $(\omega_{aa'})^{-1}$. In this latter case, one can employ the adiabatic theory of Sec. E and estimate the diabatic contribution to $u_1^{ea}$ from a knowledge of the diabatic electron effects, using the procedure given in the treatment of the Lyman-α line. That is, one first determines $u_1^{ea}$ (adiabatic and diabatic) by considering only electrons and then uses this $u_1^{ea}$ in the formula for broadening by ions and electrons in the adiabatic approximation.

As an example of this procedure, Kolb considers the
following situation: $n_e=n_i=10^{16}$ cm$^{-3}$, $T=15\,000$
°K, and $\Omega^{\epsilon\alpha}/A=10$. While an analysis of this line to
determine the diabatic effects has not been carried out,
Kolb estimates that they are of the same order of
magnitude as for the Lyman-α line. He takes the
contribution of the weak collisions to be twice the adiabatic
contribution, Eq. (6.51):




16 ne 
(11°) p >p ~> — (R2)? nda (6.63) 
3 (v) 
and for the close collisions 


(t1°*) p <= [e] (6.64) 


assuming that 
Us°*= (u1) p< pot (11°) p> pee 


The result is shown graphically in Fig. 8. 
FIG. 8. Line contours produced by different agencies (after Kolb).


This result may be compared with that obtained 
from the classical theory. Lindholm’s result [see Eq. 
(4.28) ] is 


- (11°*) g =a ;/2= r? (Qet*)?(n./(0)) 

X [0.923 —In (T/m) ]. 
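We have replaced the cutoff $r_0$ by the Debye cutoff $\rho_m$
and $v$ by $\langle v\rangle$. The ratio is then

$$\frac{u_1^{\epsilon\alpha}}{(u_1^{\epsilon\alpha})_L}
 = 0.345\,\frac{\ln(\rho_m\langle v\rangle/\Omega^{\epsilon\alpha})+0.485}
               {\ln(\rho_m\langle v\rangle/\Omega^{\epsilon\alpha})-0.212}.
 \qquad (6.65)$$

For large $\ln(\rho_m\langle v\rangle/\Omega^{\epsilon\alpha})$, therefore,

$$u_1^{\epsilon\alpha}\approx 0.345\,(u_1^{\epsilon\alpha})_L. \qquad (6.66)$$

The quantity $\ln(\rho_m\langle v\rangle/\Omega^{\epsilon\alpha})$ is in general
large enough so that the difference between Eqs. (6.66) and (6.65) is
within the uncertainties of Kolb's method.

For the condition which led to Eqs. (6.63) and (6.64),

$$\ln(\rho_m\langle v\rangle/\Omega^{\epsilon\alpha}) = 3.25.$$

Equation (6.65) then gives

$$u_1^{\epsilon\alpha}\approx 0.43\,(u_1^{\epsilon\alpha})_L.$$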


This is the line width of a single Stark component,
which is unobservable. The disparity between $u_1^{\epsilon\alpha}$ and
$(u_1^{\epsilon\alpha})_L$, however, will remain no matter how the
components are combined into a resultant contour. Because
of the numerous approximations and inherent uncertainties
involved in either of the classical-path calculations
it is difficult to say which formula is numerically
preferable, even though Kolb's approach is the more
circumspect and realistic one. Later we compare the
result of the quantum-mechanical treatment with $(u_1^{\epsilon\alpha})_L$
(after compounding of Stark states) and find closer
agreement.
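The dependence of the ratio in Eq. (6.65) on the logarithm can be made concrete with a short numerical sketch. It assumes only the form of Eq. (6.65) as printed, with the coefficients 0.345, 0.485, and 0.212; the function name and the sample values of the logarithm other than 3.25 are illustrative choices, not taken from Kolb.

```python
import math


def kolb_to_lindholm_ratio(log_term: float) -> float:
    """Evaluate Eq. (6.65): u1 / (u1)_L versus ln(rho_m <v> / Omega)."""
    if log_term <= 0.212:
        # The denominator of Eq. (6.65) vanishes or changes sign here.
        raise ValueError("log_term must exceed 0.212")
    return 0.345 * (log_term + 0.485) / (log_term - 0.212)


if __name__ == "__main__":
    # 3.25 is the value quoted in the text; the others are illustrative.
    for log_term in (3.25, 10.0, 100.0):
        ratio = kolb_to_lindholm_ratio(log_term)
        print(f"ln = {log_term:6.2f}   u1/(u1)_L = {ratio:.3f}")
```

For the logarithm 3.25 the sketch returns about 0.42, close to the figure quoted above, and the approach to the limiting value 0.345 of Eq. (6.66) is slow because the correction falls off only as the inverse of the logarithm.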


VII. EXPERIMENTAL RESULTS AND CLASSICAL 
INTERPRETATIONS 


A. Experimental Determination of 
Plasma Properties 


Among the important physical parameters characterizing
a plasma are the concentrations of various ions
$n_i$, the electron concentration $n_e$, and the temperature $T$.
In a simple plasma, i.e., one containing but a single
type of ion, $n_i=n_e$. The remarks of this section are
confined to this case. In complex cases, Dalton's law
of total pressure must be employed in addition to the
equations discussed below,$^{46}$ but otherwise nothing
fundamentally new emerges in the analysis of lines from
complex plasmas.

A common temperature T exists for all constituents 
of a plasma if it is in thermal equilibrium. This is not 
necessarily true in the presence of agencies (discharge 
currents, ionizing radiations, shocks in rarefied media) 
which are not part of the thermal mechanism. In such 
instances it is customary to assign different temperatures 
to different components (neutral molecules, ions, elec- 
trons, photons) and to find ways of measuring each. 
The limitations of such an approach should, however,
be apparent; for the external agencies just mentioned
do not always produce a Maxwellian distribution of
particle speeds,$^{47}$ and a temperature is definable only
for a Maxwellian distribution. In a discharge plasma at 
low electron densities (7.-<10' cm) the temperature 
concept is known to break down, but it becomes mean- 
ingful once more at higher densities when long-range 
Coulomb forces among the ions and electrons begin to 
randomize the distribution toward its canonical form. 
This report does not deal with the pathologies of T, and 
we assume for the most part that a common T exists. 

It has been standard practice to determine $n_i=n_e$
by measuring the intensity distribution of the Balmer
lines and comparing it with the Holtsmark formula.
Since the temperature does not enter in this procedure,
simplicity strongly recommends it. Recent investigations
by Edels and Craggs, Lochte-Holtgreven and
Nissen, and by Olsen and Huxford are based on it.
The risks involved in this method are evident from our 
earlier theoretical considerations (and from numerous 
experimental findings of more recent date) : reliance on 
the Holtsmark theory alone is permissible at low ion and 
electron densities (7.<10!® cm™). At high densities, 
repulsion between perturbing ions produces a smaller 
intensity in the line wing (where measurements and 
theory are usually compared), but the electrons add to 
it. It is likely that a fortunate, rough cancellation
between these opposing errors has sometimes led to
correct values of $n_i$ even when the method was not
strictly valid.

Ion densities obtained by Edels and Craggs and by
Lochte-Holtgreven and Nissen in hydrogen arcs are
of the order $10^{16}$ cm$^{-3}$, in a range where the simple
method can hardly fail. Those reported by Olsen and
Huxford are somewhat higher (condensed flash discharge)
and very probably subject to some error. One
should also mention that "fitting a Holtsmark contour"
is not a wholly obvious matter. It involves the composition
of many Stark-broadened component lines into
a single distribution with appropriate weights. The
weighting has not always been performed correctly;
there are differences, for example, in Olsen and
Huxford and Craggs and Hopwood (factor 2 for
perpendicular Stark components).

46 See, for instance, W. Lochte-Holtgreven, Temperature,
Measurement and Control in Science and Industry (Reinhold
Publishing Corporation, New York, 1955), Vol. 2. Also
Reports on Progress in Physics 21, 312 (1958).

When $n_i$ has been determined, it is a relatively easy
matter to find $T$ from the Saha equation (see the following),
the relative intensity of different Balmer lines,
and other known effects involving $n_i$ and $T$.$^{46}$ However,
since line broadening is strictly also a function of
$T$ (mainly through the electron effect), and since we are
furthermore interested in testing line-broadening theories,
it behooves us to discuss briefly some available
methods for determining $n_i$ and $T$ that are independent
of the details of plasma broadening.


1. Saha-Eggert Equation (Mass-Action Law) 


This is a relation connecting $n_e$ and $T$ as follows:

$$\frac{n_i n_e}{n_0} = \frac{2Z_i}{Z_0}\,
  \frac{(2\pi m_e kT)^{3/2}}{h^3}\,
  \exp\!\left[-\frac{\chi-\Delta\chi}{kT}\right]. \qquad (7.1)$$

$Z_0$ and $Z_i$ are partition functions for neutral atoms
and ions, $n_0$ and $n_i$ their concentrations. For protons,
$Z_i$ has the classical form $(2\pi M_p kT)^{3/2}/h^3$; for neutral H
atoms, however, $Z_0$ consists of two factors, the one just
written (since $M_H\approx M_p$) and the sum of states
$\sum_n g_n \exp(-E_n/kT)$ with weights $g_n=2n^2$ and $E_n$
= hydrogen energy for principal quantum number $n$.
This sum diverges because $E_n$ remains bounded while
$g_n$ grows as $n^2$. Physically, however,
this divergence is prevented by the drowning of
levels discussed in Sec. V. A good practice, established
on theoretical and experimental grounds, is to cut off
the sum at $n=6$ or 7.

The quantity $\chi$ is the ionization potential for the
ion, and $\Delta\chi$ is a correction resulting from the drowning
of the high levels or, to put it another way, from the
lowering of the ionization potential in the presence of
ions. It has been computed by Unsöld, Weizel and
Ecker, and others (see Sec. V).
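The following sketch evaluates the right-hand side of the Saha-Eggert relation (7.1) for hydrogen. It is a minimal illustration under stated assumptions: the translational factors of $Z_i$ and $Z_0$ are taken to cancel (since $M_H\approx M_p$), the internal partition function of the proton is set to unity, the sum of states is cut off at a chosen `n_max` (7 by default, following the practice mentioned above), and the lowering $\Delta\chi$ is an input that must be supplied from elsewhere. The function names and the CGS constants are choices made for this example.

```python
import math

K_B = 1.380649e-16       # Boltzmann constant, erg/K
H_PLANCK = 6.62607015e-27  # Planck constant, erg s
M_E = 9.1093837e-28      # electron mass, g
EV = 1.602176634e-12     # erg per electron volt
CHI_H = 13.598 * EV      # ionization potential of hydrogen, erg


def neutral_sum_of_states(temperature: float, n_max: int = 7) -> float:
    """Sum of 2 n^2 exp(-E_n/kT), with E_n measured from the ground state."""
    kt = K_B * temperature
    return sum(
        2 * n * n * math.exp(-CHI_H * (1.0 - 1.0 / n**2) / kt)
        for n in range(1, n_max + 1)
    )


def saha_rhs(temperature: float, delta_chi: float = 0.0, n_max: int = 7) -> float:
    """Return n_i n_e / n_0 in cm^-3 from Eq. (7.1) for hydrogen.

    delta_chi (erg) is the lowering of the ionization potential; it is
    an assumed input here and defaults to zero.
    """
    kt = K_B * temperature
    z_ion_internal = 1.0  # bare proton: no internal states
    z_neutral_internal = neutral_sum_of_states(temperature, n_max)
    electron_factor = (2.0 * math.pi * M_E * kt) ** 1.5 / H_PLANCK**3
    boltzmann = math.exp(-(CHI_H - delta_chi) / kt)
    return 2.0 * z_ion_internal / z_neutral_internal * electron_factor * boltzmann


if __name__ == "__main__":
    for temperature in (8.0e3, 1.0e4, 1.23e4, 1.5e4):
        print(f"T = {temperature:8.0f} K   n_i n_e / n_0 = {saha_rhs(temperature):.3e} cm^-3")
```

Because $\chi-\Delta\chi$ and $T$ sit in the exponent, small changes in either alter the result by large factors, which is the sensitivity referred to later in this section.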

J. D. Craggs and W. Hopwood, Proc. Phys. Soc. (London)

Equation (7.1) holds in thermal equilibrium. Strictly
speaking, this means that radiative processes as well as
corpuscular collision processes, both those generating
ions and destroying ions (recombination), must be
balanced in detail. If, for instance, radiation escapes
partly from the medium, the equation is not exact.
There is some evidence that for low $n_e$ photoionization
and recombination by triple collisions are not fully
effective. The Saha-Eggert equation is then impaired,
and Elwert$^{53}$ has proposed its replacement by


Ne=8.4X10° kT 


— —x/kT 
E xI 5 


No nx g 
where $g$ is a number around 2 given in reference 53.

Saha's equation is sometimes used when the temperatures
of different plasma components are different.
It is then said to define an ionization temperature $T_i$.
The physical meaning of this concept is far from clear;
still it may be a useful parameter. But if the energy
distribution of the electrons is not Maxwellian (e.g., 
discharges with »,~10" cm™) the Saha equation may 
be vastly in error. E. Dewan$^{54}$ has derived the analog
of (7.1) for electron distribution functions given in
reference 47 and found departures from Eq. (7.1) by
large factors.


2. Absolute Intensity of a Line 


The temperature alone is involved in the formula for 
the absolute intensity of the Balmer lines (in emission): 


reh fn? En— Eo 
I = — nil exp( — =) 
MA? kT 


(7.3) 


($\lambda$ = wavelength of line, $f$ = oscillator strength; the quantum
number $n$ and $E_n$ refer to the upper state of the line; $l$ is the
thickness of the radiating layer, which must be small,
i.e., optically thin). If $I$ and $n_0$ can be determined with
precision, this formula is the most direct and reliable
means for determining $T$.


3. Relative Intensity of Different Lines 


Two Balmer lines, one originating in the state of
principal quantum number $n'$, the other in state $n$, and
of frequencies $\nu_{n'}$ and $\nu_n$, have an intensity ratio
if all excited states are in thermal equilibrium. $A_n$ is
the coefficient of spontaneous emission. Again, in the
absence of external agencies, and at high electron-ion
concentration even in their presence, this condition is
satisfied. Recent measurements in a hydrogen arc
discharge by Edels and Craggs, who compared H$_\alpha$,
H$_\beta$, and H$_\gamma$, have revealed large departures from
formula (7.4).
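As an illustration of how a two-line intensity ratio yields a temperature, the sketch below assumes the usual Boltzmann form for optically thin emission, $I_{n'}/I_n = (A_{n'} g_{n'} \nu_{n'} / A_n g_n \nu_n)\exp[-(E_{n'}-E_n)/kT]$. This form is an assumption of the example and is not copied from formula (7.4). The transition probabilities entered for H$_\alpha$ and H$_\beta$ are placeholder values for demonstration only, and the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K


@dataclass(frozen=True)
class Line:
    a_coeff: float   # spontaneous emission coefficient A, s^-1
    weight: float    # statistical weight g of the upper state
    nu: float        # line frequency, s^-1
    e_upper: float   # energy of the upper state, eV


def _prefactor(first: Line, second: Line) -> float:
    return (first.a_coeff * first.weight * first.nu) / (
        second.a_coeff * second.weight * second.nu
    )


def intensity_ratio(first: Line, second: Line, temperature: float) -> float:
    """Ratio I_first / I_second for thermally populated upper states."""
    boltzmann = math.exp(-(first.e_upper - second.e_upper) / (K_B_EV * temperature))
    return _prefactor(first, second) * boltzmann


def temperature_from_ratio(first: Line, second: Line, ratio: float) -> float:
    """Invert intensity_ratio for T; requires a positive measured ratio."""
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    log_term = math.log(_prefactor(first, second) / ratio)
    return (first.e_upper - second.e_upper) / (K_B_EV * log_term)


# Placeholder inputs (assumed for illustration, not vetted data).
H_ALPHA = Line(a_coeff=4.4e7, weight=18.0, nu=4.568e14, e_upper=12.087)
H_BETA = Line(a_coeff=8.4e6, weight=32.0, nu=6.167e14, e_upper=12.748)

if __name__ == "__main__":
    ratio = intensity_ratio(H_BETA, H_ALPHA, 12300.0)
    print(f"I(H_beta)/I(H_alpha) at 12 300 K: {ratio:.4f}")
    print(f"recovered T: {temperature_from_ratio(H_BETA, H_ALPHA, ratio):.1f} K")
```

Since the upper levels of neighboring Balmer lines lie closer together than $kT$ at these temperatures, a relative error in the measured ratio is magnified by roughly $kT/(E_{n'}-E_n)$ in the derived temperature.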


53 G. Elwert, Z. Naturforsch. 7a, 432, 703 (1952).
54 E. Dewan, Dissertation, Yale, 1957 (unpublished).
55 See reference 48, p. 562.
4. Intensity of the Balmer Continuum


The intensity of radiation at a wavelength $\lambda$ below
the Balmer continuum limit of 3646 Å is given by the
formula$^{55}$
where $n_0$ is again the concentration of H atoms, $l$ the
layer thickness, $E_n$ = energy of the $n$th quantum state
= $-Rhc/n^2$, and $R$ = Rydberg constant. The last two
terms in [ ] represent contributions of higher continua
in the region of the Balmer continuum.


5. Temperature in Shocks 


Shocks are an increasingly important agency for
producing high temperatures of short duration, yet
long enough to emit measurable spectral lines. Temperatures
and ion densities obtainable in shocks surpass
by far the limitation of most electrical discharges.$^{57}$
Formulas for the temperature in shocks may be derived
from the Rankine-Hugoniot equations; they are given in
Turner's dissertation$^{57}$ and, in a more elaborate manner,
in Courant and Friedrichs.

Of the five methods thus far surveyed only the first
presents means for determining $n_e$; the remainder are
independent ways of finding $T$. The Saha equation
involves both $n_e$ and $T$; hence it must always be coupled
with one or more of the later formulas. Unfortunately, a
determination of $n_e$ from Eq. (7.1) is highly inaccurate
because $\Delta\chi$ and $T$ appear in an exponent; the error in
$T$ resulting from an uncertainty in a given value of $n_e$
would be much less.

A higher degree of confirmation can be given to a
Saha $n_e$ if use is made of the Inglis-Teller formula (Sec.
V), which involves $n_i$ ($=n_e$) and only $n_i$ directly.
Although it depends on line widths, details do not enter
that equation, and the action of electrons affects it in
a minor way. But of course it yields only an estimate.


B. Some Recent Experimental Results 


The need for modifications of the statistical theory
because of the impacts of electrons was recognized by
Unsöld and Lochte-Holtgreven at Kiel where, in consequence,
a systematic series of experiments was
recently performed on the Balmer series by Jürgens,$^{56}$
Griem, Dieter-Henkel, and Bogen. High-intensity
carbon electrode arcs maintained in a water channel
and supporting currents up to 200 amp were used as
sources of radiation, the first six series members being
observed for various values of temperature and electron
density. Although their experiments were done independently,
the sum total of work of the Kiel group
exhibits a large measure of unity, justifying a discussion
of their findings as a whole.

56 G. Jürgens, Z. Physik 134, 21 (1952).

57 See, for instance, Petschek, Rose, Glick, Kane, and Kantrowitz,
J. Appl. Phys. 26, 83 (1955); E. B. Turner, Dissertation,
University of Michigan, 1956.

In these experiments, the perturbations arise from 
protons and electrons acting together in the discharge; 
these always occur in equal numbers so that ne= np 
where np is the density of protons per cubic centimeter. 

In order to allow the application of a theory of 
broadening to given experimental contours, values of T 
and ne must first be determined, for as we have seen 
in Sec. II, the validity of a particular broadening theory 
in general depends upon the values of these parameters. 
Several independent methods for determining the 
temperature were employed, such as: measurement of 
the relative intensities of the successive lines of the 
series emitted from optically thin portions of the 
discharge, measurement of the degree of inversion of 
the lines when emitted from optically thick portions of 
the discharge, and variation of the intensity of the 
series continuum. The temperatures calculated by
these several methods were found (doubtless in part by
good fortune) to agree within a few percent, the values
ranging from about $5\times10^3$ to $1.4\times10^4$ °K. The
determination of $n_e$ was made by the single method of
matching the observed contours to an intensity dis- 
tribution. Two distributions were used, the first being 
the statistical distribution of Holtsmark and the 
second a distribution formed by smearing the statistical 
distribution over the Lindholm impact distribution. 

The Holtsmark fitting is performed in the far wings
of the line where the inequality, Eq. (4.27), holds.
Large discrepancies between this distribution and the
observed contours were found, especially in the cases
of H$_\alpha$ and H$_\gamma$, both of which have undisplaced central
components in their Stark patterns which contain large
fractions of the total intensities of the lines (38% for H$_\alpha$
and 16% for H$_\gamma$). The H$_\beta$ line, having no undisplaced
component, lends itself best to a fit by the statistical
distribution. An application of the conditions, Eq. (5.2),
to the last resolved member of the series shows that the
electrons may be neglected as statistically broadening
particles at the temperatures calculated. Hence the
perturber density determined by the statistical fit is
interpreted as that of the protons only.

The dashed curves of Fig. 9 show the contours as
found for the first three lines of the series along with
their Holtsmark wing fits; these curves were taken from
Jürgens and correspond to $T=12\,300$ °K and
$n_e=8.4\times10^{16}$/cm$^3$.

According to these findings, the experimental contours
agree with the statistically calculated ones beyond
30 Å, and the agreement occurs for a value of $n_e$
identical with that obtained from the Saha equation.
This was disquieting to theorists when the data appeared,
for it seemed to show the old but unjustifiable
belief in the unimportance of the electrons (at these
wavelengths) to be correct. But the later work of P.
Bogen has changed this picture.

In his painstaking investigation, which includes a
good discussion of errors, Bogen first measures the
absolute intensity of the H$_\delta$ line and uses formula
(7.3) to find $T$. This requires knowledge of $n_0$, the
number of hydrogen atoms, which for the water-stabilized
arc employed in these experiments is not
readily at hand. It is obtained from a previous study
of $n_0$ as a function of $T$ (with use of the Saha
equation). Bogen then substitutes T in the Saha equa- 
tion and obtains ne=2.1X 1018 cm™ for the most fully 
documented example. When the intensity in the wing 
of the Balmer lines is plotted one obtains for each of 
_ the Balmer lines a graph such as Fig. 10. In the indi- 
cated wavelength range the experimental curve lies 
Seiten two Holtsmark curves, one drawn for n,=2.2 
< 16 cm, the other for twice that value. For Hz, 
not for the higher Balmer lines, the experimental 
re approaches the upper Holtsmark curve at about 
A, and the indication is that the higher Balmer lines 
rise show this feature, but the approach takes place 
o gher wavelengths. Bogen concludes that for large 
oltsmark theory is correct for all lines, provided 

ns and electrons have independent additive 
: e chooses 27 ad of 7e as the number 
‘he range of correctness coincides with 
é ie binary statist theory (see Sec. 
III A) is valid. Electron effects appear to be confined
to the center of the Balmer lines, and they extend to
greater distances from the line center in the higher
series members.

We end this review of experimental results with a
graphical summary of the shock tube work at Michigan.
The latest data of this group may be found in the
dissertation of Turner,$^{57}$ analyzed by Kolb, from whose
thesis Fig. 11 is taken. Again the experimental curve
falls outside the Holtsmark distribution for the proper
$n_e$ at large distances from the line center. The dashed
curve marked "ion and electron theory" represents
Kolb's calculation (see Sec. VI), adapted in a suitable
way to the Balmer lines, for which a complete calculation
is still missing.


C. Classical Interpretations 


The electrons must be included in an adequate 
theory. Their effect, according to all the foregoing 
developments, is most pronounced in the center of the 
lines. A simple way to incorporate it is to fold an impact 
distribution for the electrons into a statistical (Holts- 
mark) contour. This method, suggested in another 
connection by Margenau, Burkhardt, and doubtless 
others, was employed by Griem and Henkel in their 
data analyses. Griem adopts the use of the distribution: 
Here 2A is the half-width computed for the electron
impacts by the Lindholm theory; $W(x/F_0)$ is the statistical
distribution for a normal field strength $F_0$. This
is taken from the work of Verwey and Schmaljohann
(see Sec. III), which excludes the central undisplaced
component of a Balmer line. Hence the central component,
whose total intensity is $I_0$, must be added in
via the first term of Eq. (7.6), which shows only an
impact width. In Eq. (7.6), $\lambda$ is measured from the
normal position of the line. The bar over $W$ is to
indicate that a proper average over different Stark
components has been taken. Much better representations
of the data are obtained through the use of (7.6),
as can be seen from the curves of Fig. 12, which are
taken from the paper of Henkel.
A plausibility argument in favor of smearing the
distributions takes a form similar to the one used in
earlier sections. When a perturbation of the radiator
contains two components, one of which acts slowly and
the other quickly, the action of the slow one is to
produce discrete shifts in energy which may be considered
to be constant over time intervals in which
many broadening collisions involving the fast component
occur. Hence, in our example, each discretely
shifted frequency of the statistical theory, produced by
the protons, is broadened by the many impacts of the
electrons. The condition under which many electron
collisions occur during a time in which the ions are
essentially motionless is:

$$t_e \ll t_p, \qquad (7.7)$$

where $t_e$ is the time between successive electron collisions
and $t_p$ is the time during which a proton collision
occurs. For classically describable particles we have
the definitions,

$$t_e = 1/(n\pi d^2 v_e), \qquad t_p = d/v_p, \qquad (7.8)$$

where $d$ is defined by Eq. (2.5); $v_e$ and $v_p$ are, respectively,
the mean speeds of the electrons and the protons.
The inequality (7.7) thus becomes:


[(6.90)%rT3/n3 | (mp/m-)>1, (7.9) 


where mp is the mass of the proton and me is the mass 
of the electrons. This inequality holds for many im- 
portant applications of electron broadening, both 
stellar and terrestrial. 
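A quick numerical check of the inequality (7.7) follows directly from the definitions (7.8), since $t_e/t_p = v_p/(n\pi d^3 v_e)$. The sketch below assumes a common temperature for electrons and protons, so that the ratio of mean speeds is $(m_e/m_p)^{1/2}$. The length $d$ is left as an input because its definition, Eq. (2.5), is not reproduced here; the choice $d=n^{-1/3}$ in the example is purely hypothetical.

```python
import math

ELECTRON_TO_PROTON_MASS = 1.0 / 1836.15267


def collision_time_ratio(
    density: float, d: float, mass_ratio: float = ELECTRON_TO_PROTON_MASS
) -> float:
    """Return t_e / t_p = v_p / (n pi d^3 v_e) from the definitions (7.8).

    density is in cm^-3 and d in cm. A common temperature is assumed,
    so that v_p / v_e = sqrt(m_e / m_p).
    """
    if density <= 0.0 or d <= 0.0:
        raise ValueError("density and d must be positive")
    return math.sqrt(mass_ratio) / (density * math.pi * d**3)


if __name__ == "__main__":
    n = 1.0e16                # cm^-3, of the order met in the arc experiments
    d = n ** (-1.0 / 3.0)     # hypothetical choice of d, for illustration only
    ratio = collision_time_ratio(n, d)
    print(f"t_e / t_p = {ratio:.4f}  (folding argument needs this << 1)")
```

With this particular choice of $d$ the density drops out and the ratio is fixed by the mass ratio alone; any other definition of $d$ enters through its cube.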

It was shown in Sec. II that the classical-path 
assumption of the impact theory is not necessarily valid 
for collisions producing phase shifts less than unity. 
Such phase shifts, however, contribute most of the 
shift and broadening, as calculated by that theory, for 
the first order Stark effect [see Eqs. (4.27) and (4.28) 
and Table V]. Since this is the case, there exists no 
a priori justification for employing the Lindholm dis- 
tribution, as Griem does in his folding procedure; 
instead one should replace it with the results of a 
quantum mechanical calculation. For the quadratic
Stark effect this is not the case because the large
contributions to the broadening result from collisions
falling within the critical radius, as has been shown by
Unsöld$^{18}$ and in Sec. IV of this article.

The peculiarity in the behavior of the first-order
Stark effect is connected with the nonconvergence of
the broadening when collision parameters out to infinity
are considered. The arbitrary cutoff in the collision
parameter, taken usually at $d$ or $n^{-1/3}$, introduces an
uncertainty in the final broadening; but, more important
even, the nonconvergence leads to the phenomenon
of the most distant collisions producing the
greatest amount of broadening. Griem, in establishing
the applicability of the Lindholm distribution to his
broadening results, states that only phase shifts greater
than unity are important and points to the improbability
of multiple impacts within $\rho_e$, in order to validate
the single-impact theory. His arguments, therefore,
ignore the long-range contributions to the broadening.
In the cases where $d\le\rho_e$, the Lindholm theory may
indeed be used. At the temperature of the Kiel experiments
$T\approx10^4$ °K, and we find the values in Table VII
for $n^*$, the density above which all impacts occur within
$\rho_e$. Thus, for the observed densities, $n=5\times10^{16}$ cm$^{-3}$,
the Lindholm theory should not be expected to be
strictly valid, except for the very close collisions.


TABLE VII. Densities for which $d=\rho_e$ for the first-order Stark
effect in hydrogen.

Line          n* (cm^-3)
H_alpha       6.2  x 10^18
H_beta        1.6  x 10^18
H_gamma       5.48 x 10^17
H_delta       2.52 x 10^17

Aside from these details, there is a very general defect
of the folding procedure when it involves a statistical
distribution varying more rapidly with $\lambda$ than $\lambda^{-2}$, such
as Holtsmark's, in that it forces the asymptotic dependence
of intensities, as $\lambda\to\infty$, to be proportional
to $\lambda^{-2}$, which contradicts the wing theorem. Hence, one
should not expect a line folded in accordance with Eq.
(7.6) to agree with observations at very large $\lambda$. This
criticism also applies to the use of Eq. (6.30) and all
applications based on it.
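The asymptotic defect of the folding procedure can be seen in a small numerical experiment. The sketch below folds a Lorentz (impact) contour into a model statistical distribution whose wing falls off as the $-5/2$ power, and compares logarithmic slopes far from the center. The model distribution $\tfrac{3}{4}(1+|y|)^{-5/2}$ is a stand-in chosen only because it is normalized and has a Holtsmark-like wing; it is not the Holtsmark function, and the half-width, grid, and sample points are arbitrary choices for this example.

```python
import math


def statistical(y: float) -> float:
    """Normalized model profile with a wing proportional to |y|^(-5/2)."""
    return 0.75 * (1.0 + abs(y)) ** -2.5


def lorentz(x: float, half_width: float = 1.0) -> float:
    """Normalized impact (Lorentz) contour."""
    return (half_width / math.pi) / (x * x + half_width * half_width)


def folded(x: float, half_width: float = 1.0,
           y_max: float = 2000.0, step: float = 0.05) -> float:
    """Fold the Lorentz contour into the model profile (trapezoidal rule)."""
    n = int(round(2.0 * y_max / step))
    total = 0.0
    for i in range(n + 1):
        y = -y_max + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * statistical(y) * lorentz(x - y, half_width)
    return total * step


def log_slope(profile, x1: float, x2: float) -> float:
    """Logarithmic slope d ln I / d ln x estimated between x1 and x2."""
    return math.log(profile(x2) / profile(x1)) / math.log(x2 / x1)


if __name__ == "__main__":
    print(f"wing slope, statistical alone: {log_slope(statistical, 200.0, 400.0):.2f}")
    print(f"wing slope, after folding:     {log_slope(folded, 200.0, 400.0):.2f}")
```

In this model the unfolded profile keeps a slope near $-5/2$, while the folded one is pulled to a slope close to $-2$, because the Lorentz tail of the central region outlasts the statistical wing.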

Ecker has discussed the desirability of including in 
the calculation of intensities both the electron effect 
and the modification of the statistical theory at large 
distances. He dealt with the latter in reference 30,
where the parameter $\delta$ is introduced to effect the
correction. In reference 30b he folds an impact distribution
into a statistical one. But he does this for 5= and, 
therefore, obtains results which do not go beyond those 
of Griem and are in essential agreement with them. 

Another result obtained by the Kiel investigators 
demonstrates that electron collision effects increase at a 
more rapid rate than do the statistical effects as one 
proceeds to higher series members. This is in agreement 
with both the Lindholm theory and the quantum 
mechanical calculation. Difficulties in the inclusion of 
the electron effects are implied even in the early work 
of the Kiel group, as one may see from the following 
discrepancy. Henkel’s fit with the folded distribution 
leads to values of the perturber density which are about 
4 of those derived from the Holtsmark wing fits, i.e., 
the method employed by Jürgens. When this result is
applied to the data of Jürgens, which antedated the
suggestion for the use of (7.6), the Saha equation no
longer yields his measured value of the temperature.


VIII. QUANTUM TREATMENT OF THE
LYMAN-α LINE


The Lyman-α or resonance line of H, because of its
position in the spectrum ($\lambda=1216$ Å), has not been the
object of experimental line-width work. Its basic
nature and the simplicity of the spectral terms it
involves, however, have been inviting to theorists, and
calculations regarding its behavior have been made.
Much of Spitzer's work has reference to this line,
a review of which is found in Breen's article. An
extended study specifically of the effects of electrons in
broadening the Lyman-α line was made by Kivel,
Bloom, and Margenau.$^{13}$

Quantum radiation theory is applied from the beginning,
and numerous details of little practical importance
are discussed. In essence, however, three
rather typical effects are distinguished. The first is
called universal broadening. It results from a schematic
calculation in which only two levels are included, the
lower and the upper states of the line. In the case of L$_\alpha$,
the upper state is actually degenerate, but this degeneracy
is not taken into consideration at this stage
of the analysis. The results are therefore not specific
to the line in question and have a very general meaning.
They ignore the first-order Stark effect by not including
linear combinations of the degenerate states; and they
ignore the second-order Stark effect by neglecting
higher energy levels whose virtual excitation normally
accounts for the distortion of the radiating atom. What
the universal effect describes, then, is broadening by
electrons which are scattered nearly elastically, as is
apparent in the following.

Phys. Rev. 56, 39 (1939).

Polarization of the radiating atom is introduced by
properly combining the degenerate states of the upper
level. The four correct combinations are known from
the theory of the linear Stark effect; they are

$$\psi_+ = \frac{1}{\sqrt{2}}(\psi_{200}+\psi_{210}), \quad
  \psi_- = \frac{1}{\sqrt{2}}(\psi_{200}-\psi_{210}), \quad
  \psi_{211}, \quad \psi_{21-1}.$$

The subscripts on $\psi$ denote, in spherical coordinates, the
quantum numbers $nlm$. The first and second of these
correspond to different energies $[(-e^2/8a_0)\pm3ea_0F]$,
whereas the last two belong in first approximation to
the unchanged energy $-e^2/8a_0$. Moreover, the states $\psi_+$
and $\psi_-$ have permanent dipole moments in the field $F$.
Transitions between them involve, therefore, spatial
reorientations of the atomic dipoles occurring as a result
of the perturbing electrons. The effect of $\psi_{21-1}$ and $\psi_{211}$
is more subtle; it represents the induction of a dipole
moment by the passing electron. The calculation which
includes only $\psi_+$ and $\psi_-$ is easy because it does not
lead to divergent matrix elements. Its results on the
line width were termed "polarization broadening by
reorientation." The inclusion of $\psi_{211}$ and $\psi_{21-1}$ produces
divergences which must be dealt with more carefully.
It results in much larger effects, here called "polarization
broadening by induction."
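The statement that $\psi_\pm$ carry first-order shifts of magnitude $3ea_0F$ while $\psi_{211}$ and $\psi_{21-1}$ are unshifted can be verified with a few lines of arithmetic. The sketch below works in the basis $(\psi_{200},\psi_{210},\psi_{211},\psi_{21-1})$ and in units of $ea_0F$, using the standard hydrogen matrix element $\langle200|z|210\rangle=-3a_0$. The overall sign of the perturbation depends on the convention adopted for the field direction, which is an assumption here, so only the magnitudes and the opposite signs of the two shifts should be read from it.

```python
import math

# Basis order: psi_200, psi_210, psi_211, psi_21-1.
Z_200_210 = -3.0  # <200| z |210> in units of a0 (standard hydrogen result)


def perturbation_matrix() -> list:
    """First-order Stark perturbation in units of e a0 F (assumed sign)."""
    h = [[0.0] * 4 for _ in range(4)]
    h[0][1] = h[1][0] = Z_200_210
    return h


def first_order_shift(state: list) -> float:
    """Return the energy shift if `state` is an eigenvector, else raise."""
    h = perturbation_matrix()
    hv = [sum(h[r][c] * state[c] for c in range(4)) for r in range(4)]
    norm = sum(x * x for x in state)
    shift = sum(a * b for a, b in zip(state, hv)) / norm
    if any(abs(hv[r] - shift * state[r]) > 1e-12 for r in range(4)):
        raise ValueError("state is not an eigenstate of the perturbation")
    return shift


S = 1.0 / math.sqrt(2.0)
PSI_PLUS = [S, S, 0.0, 0.0]
PSI_MINUS = [S, -S, 0.0, 0.0]
PSI_211 = [0.0, 0.0, 1.0, 0.0]
PSI_21M1 = [0.0, 0.0, 0.0, 1.0]

if __name__ == "__main__":
    for name, state in (("psi_+", PSI_PLUS), ("psi_-", PSI_MINUS),
                        ("psi_211", PSI_211), ("psi_21-1", PSI_21M1)):
        print(f"{name:9s} shift = {first_order_shift(state):+.1f} e a0 F")
```

The uncombined states $\psi_{200}$ and $\psi_{210}$ are rejected by the check, which is the sense in which the linear combinations are the "correct" zero-order states.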

Finally, there is an effect connected with matrix
elements of the electron perturbation between the states
2p and 1s. These obviously represent quenching by
electron collisions, a problem for which the theory had
already been given by Wentzel.

Polarization and quenching are strongly dependent
on the nature of the quantum levels entering the calculation.
The results computed are, therefore, not
capable of immediate generalization to other cases and
are here only cited as numerical examples. The universal
effect, however, presents certain features of wider
interest, even though it is small in the case of the
Lyman-α line.


G. Wentzel in Handbuch der Physik (see reference 52), Vol.




The lengthy calculations in reference 13 concerning
the universal effect lead to the same result as a much
simpler picture, which we use here. While this analysis
standing by itself is perhaps not convincing, its plausibility
is fairly high. Instead of considering atom,
radiation field, and electron as the quantum system, we
concentrate attention on the electron alone and investigate
the change in its motion as it meets the radiating
atom. If it loses an energy $\Delta E$ while it passes the atom,
one may say that the atom gains $\Delta E$ and therefore
emits a frequency shifted to the blue by $\omega=\hbar^{-1}\Delta E$.
What this picture does not suggest is the potential
exerted by the radiating atom upon the electron as it
moves near it, for the atom has a finite probability of
being in the upper as well as the lower state during the
emission of radiation.

Here our earlier reflections are helpful. All through
Sec. I it was evident that the perturbation responsible
for line widths is $\epsilon$, the difference between the upper
and the lower energy levels as a function of the position
of the perturbing particle. This is the quantity that
alters the phase of the radiation in the impact theories
and causes the shifts in the statistical theory. Reference
13 bears out the expectation that $\epsilon$, in the form proper
for a Coulomb interaction, should be used as the perturbation
in the present problem. If the Coulomb
interactions between the atomic electron at $\mathbf{r}_r$ and the
perturbing electron at $\mathbf{r}$ are denoted by $C(\mathbf{r}_r,\mathbf{r})$, and
$C_{ii}(\mathbf{r})$ is the diagonal element of $C(\mathbf{r}_r,\mathbf{r})$ with respect
to the $i$th state of the atom, then

$$\epsilon(\mathbf{r}) = C_{22} - C_{11}, \qquad (8.1)$$

provided 2 is the upper and 1 the lower state of the line.

In accordance with the usual procedure in (time-dependent)
perturbation theory it is now supposed
that the electron, before the interaction, has an energy
$E_i$ and its state is represented by a probability amplitude
$a_i$, whereas afterward these quantities are changed
to $E_f$ and $a_f$. We take the states to be those of a free
electron: $E_n=(\hbar^2/2m)k_n^2$, $\psi_n\propto\exp(i\mathbf{k}_n\cdot\mathbf{r})$. The amplitude
$a_f$ is then a function of $t$ as well as of $\omega_f$, and
the intensity of the spectral line at a frequency $\omega_{if}$
beyond its normal frequency is given by

$$I(\omega_{if})\propto\lim_{t\to\infty}|a_f(\omega_{if})|^2. \tag{8.2}$$


Strictly, the limit $t\to\infty$ is meaningless because the
radiative process lasts only for a finite time which is 
the reciprocal of the natural line width in cycles per 
second. This natural width is being neglected in the 
passage to an infinite limit. 
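The equations for the amplitudes are

$$i\hbar\dot a_i=\sum_f a_f\,\epsilon_{if}\,e^{i\omega_{if}t}, \tag{8.3}$$

$$i\hbar\dot a_f=a_i\,\epsilon_{fi}\,e^{-i\omega_{if}t}. \tag{8.4}$$

If we put $a_i=\exp(-\gamma_u t)$, integrate (8.4) and substitute
in (8.3) we find

$$\gamma_u=\sum_f\frac{|\epsilon_{if}|^2}{\hbar^2}\,\frac{e^{(i\omega_{if}+\gamma_u)t}-1}{i\omega_{if}+\gamma_u}. \tag{8.5}$$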


The sum over final states is converted to an integral in 
a well-known manner: 


$$\sum_f\;\to\;\frac{V}{(2\pi)^2}\int\!\!\int k_f^2\,dk_f\,\sin\theta\,d\theta,$$


where $\theta$ is the angle between the wave-number vectors
of the electron, $\mathbf{k}_i$ and $\mathbf{k}_f$, and $V$ is the volume per
electron. If $\omega_f$ is written for $\hbar k_f^2/2m$, the integration
over $\omega_f$ in (8.5) can be carried out according to the
formula® 


$$\int f(\omega_f)\,\frac{e^{[i(\omega_i-\omega_f)+\gamma_u]t}-1}{i(\omega_i-\omega_f)+\gamma_u}\,d\omega_f=\pi f(\omega_i),$$


and (8.5) results in

$$\gamma_u=\frac{Vm\pi k_i}{(2\pi)^2\hbar^3}\int|\epsilon_{if}|^2\sin\theta\,d\theta, \tag{8.6}$$

with the understanding that the remaining integrand is
evaluated at $k_f=k_i$. This shows that the scattering
which leads to $\gamma_u$ is essentially elastic.

Now $\epsilon_{if}$ is easily computed; it is expressible in terms
of other atomic integrals frequently encountered in
scattering theory, namely,

$$F_{ii}(\kappa)=\int\exp(i\boldsymbol{\kappa}\cdot\mathbf{r}_r)\,|\psi_i(\mathbf{r}_r)|^2\,d\mathbf{r}_r.$$

Thus

$$\epsilon_{if}=\frac{4\pi e^2}{V\kappa^2}\,(F_{22}-F_{11}),$$

and

$$\kappa^2=k_i^2+k_f^2-2k_ik_f\cos\theta. \tag{8.7}$$


From this last relation $\sin\theta\,d\theta$ is computed, and Eq.
(8.6) then gives

$$\gamma_u=\frac{2\pi e^4a_0^2m}{V\hbar^3k_i}\int_0^{x'}\left(\frac{F_{22}-F_{11}}{x}\right)^2dx. \tag{8.8}$$

The variable $x$ here stands for $\kappa^2a_0^2$ and $a_0$ is the first
Bohr radius; $F_{11}$ and $F_{22}$ refer, of course, to the 1s and
to the weighted mean of the 2 states of hydrogen. The
upper limit, $x'=4k_i^2a_0^2$, represents the maximum value
permitted by (8.7) when $k_i=k_f$.

Let us call the remaining integral in (8.8) $2^{-1}Y$; it
is a function of the initial energy of the perturbing
electron through $x'$. In Fig. 13 it is plotted for the Lα
line, $E_i$ being in units equal to the ground-state energy
of hydrogen. If now we replace $V$ in Eq. (8.8) by $n^{-1}$
and $\hbar k_i$ by $mv_i$, that equation reduces to

$$\gamma_u=nv_i\,\pi(\hbar/mv_i)^2\,Y(v_i). \tag{8.9}$$

FIG. 13. Graph of the efficiency factor Y vs electron
energy. See Eq. (8.9).
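The passage from (8.6) to (8.9) can be written out step by step. The following is a sketch, assuming the readings $\epsilon_{if}=(4\pi e^2/V\kappa^2)(F_{22}-F_{11})$ and $\gamma_u=[Vm\pi k_i/(2\pi)^2\hbar^3]\int|\epsilon_{if}|^2\sin\theta\,d\theta$ for the two preceding relations, and using $a_0=\hbar^2/me^2$; it is offered as a consistency check rather than as the authors' own derivation.

```latex
% Elastic limit k_f = k_i in (8.7):
\kappa^2 = 2k_i^2(1-\cos\theta)
\quad\Longrightarrow\quad
\sin\theta\,d\theta = \frac{d(\kappa^2)}{2k_i^2}, \qquad 0 \le \kappa^2 \le 4k_i^2 .

% Insert |\epsilon_{if}|^2 = (4\pi e^2/V\kappa^2)^2 (F_{22}-F_{11})^2 into (8.6):
\gamma_u
 = \frac{V m \pi k_i}{(2\pi)^2\hbar^3}\,
   \frac{16\pi^2 e^4}{V^2}\,\frac{1}{2k_i^2}
   \int_0^{4k_i^2} \frac{(F_{22}-F_{11})^2}{\kappa^4}\, d(\kappa^2)
 = \frac{2\pi m e^4}{V\hbar^3 k_i}
   \int_0^{4k_i^2} \frac{(F_{22}-F_{11})^2}{\kappa^4}\, d(\kappa^2).

% Change variable to x = \kappa^2 a_0^2, which gives (8.8):
\gamma_u = \frac{2\pi e^4 a_0^2 m}{V\hbar^3 k_i}
   \int_0^{x'} \left(\frac{F_{22}-F_{11}}{x}\right)^2 dx,
\qquad x' = 4k_i^2 a_0^2 .

% With e^4 a_0^2 = \hbar^4/m^2, V = n^{-1}, \hbar k_i = m v_i,
% and the integral written as Y/2, this is (8.9):
\gamma_u = \frac{2\pi n \hbar^2}{m^2 v_i}\cdot\frac{Y}{2}
         = n v_i\,\pi\left(\frac{\hbar}{m v_i}\right)^2 Y(v_i).
```

The factor $2^{-1}$ in the definition of $Y$ is what makes the final expression take the Lorentz form, a density times a velocity times a cross section.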


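From Eq. (8.4)

$$-i\hbar a_f=\epsilon_{fi}\,\frac{e^{-(\gamma_u+i\omega_{if})t}-1}{\gamma_u+i\omega_{if}},$$

because $a_i=e^{-\gamma_ut}$. Hence, (8.2) yields a dispersion curve
of half-width $2\gamma_u$; from (8.9), therefore

$$w_{1/2}(u)=2n[\pi(\hbar/mv_i)^2]\,Y(v_i)\cdot v_i. \tag{8.10}$$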
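As a numerical illustration of Eqs. (8.9) and (8.10), the short sketch below evaluates $\gamma_u$ and the half-width $2\gamma_u$ in CGS units. The function names, the electron concentration, and the value of the efficiency factor are assumptions made for the example only; in practice $Y$ would be read from Fig. 13.

```python
import math

HBAR = 1.0546e-27   # erg s (CGS)
M_E = 9.109e-28     # electron mass, g
EV = 1.602e-12      # erg per electron volt


def electron_speed(energy_ev):
    """Speed (cm/s) of a nonrelativistic electron with the given kinetic energy."""
    return math.sqrt(2.0 * energy_ev * EV / M_E)


def gamma_u(n, v, efficiency):
    """Eq. (8.9): gamma_u = n v pi (hbar / m v)^2 Y, in sec^-1."""
    cross_section = math.pi * (HBAR / (M_E * v)) ** 2 * efficiency
    return n * v * cross_section


def half_width(n, v, efficiency):
    """Eq. (8.10): half-width of the dispersion curve, 2 * gamma_u."""
    return 2.0 * gamma_u(n, v, efficiency)


if __name__ == "__main__":
    n = 1.0e13                  # cm^-3, hypothetical electron concentration
    v = electron_speed(0.54)    # about (1/25)(e^2 / 2 a0), expressed in ev
    Y = 1.0                     # hypothetical efficiency factor
    print(f"v = {v:.3e} cm/s, half-width = {half_width(n, v, Y):.3e} sec^-1")
```

At fixed $Y$ the width is linear in $n$ and falls as $1/v_i$, so any growth with electron energy has to come from the rise of $Y(v_i)$.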


A comparison of this result with Lorentz' formula, (1.3),
is interesting, for both are of the same form. Evidently
the collision cross section for this type of electronic
broadening is $\pi\lambda^2$ times an efficiency factor $Y(v_i)$,
$\lambda=\hbar/mv_i$ being the de Broglie wavelength (divided by $2\pi$).
The rise of $Y(v_i)$ from small values at small $v_i$ illustrates
again that a perturbing electron, if it is to produce effects
other than shifts described by a static theory, must have
sufficient velocity. The development here sketched, and also the
work in reference 13, relies upon the Born approximation.
When the energy of the impinging electrons is
of the order of magnitude of 1 volt, as in the examples
under study, this approximation may be greatly in
error and the numerical results (e.g., Fig. 13 and
Table VIII) may be inaccurate. When good experimental
data are at hand, recalculation with better wave
functions is desirable.

Meyerott and Margenau computed the analog of
Eq. (8.10) with the use of Lindholm's impact theory,
basing the work on (4.16). But instead of using a
schematic potential of the form (4.17) they employed
the potentials $\epsilon_2$ and $\epsilon_1$, which the electron actually
experiences in its approach to the hydrogen atom. The
result, plotted in Fig. 14, shows moderate agreement
with Eq. (8.10).

TABLE VIII. Broadening of Lα by electron impacts at an electron
concentration n = 10^(exponent not legible) cm^-3, mean energy
(1/25)(e^2/2a0) (~1/2 ev). The quantity w_{1/2}^0 is the natural
line width, 3.12 x 10^8 sec^-1.

Effect                            w_{1/2}/w_{1/2}^0
"Universal" broadening            0.13
Polarization by reorientation     0.027
Polarization by induction         2.7
Quenching                         (not legible)
Holtsmark broadening by ions      (not legible)

In several publications Sobel’man has shown that 
Eqs. (8.8) and (8.10) can also be derived in a simple 
way from scattering theory. His method is quite similar 
to that employed in reference 67, and he gives useful 
criteria for the applicability of the scattering approach 
to line-width problems. 

We now turn to the polarization effects, in which 
electrons cause transitions between the initially de- 
generate states. Here, the calculation is troubled by the 
occurrence of spurious divergences in the matrix ele- 
ments, divergences which can only be avoided by the 
choice of a cutoff radius like the d encountered in Sec. II. 
The uncertainties introduced by this somewhat arbi- 
trary procedure are not serious, however. Polarization 
brings the greatest contribution to the line width so far 
as the effect of the electrons is concerned. 

Kolb, on applying his classical-path theory to the Lα
line, has made a careful comparison between his results
and those of reference 13. While he reaches numerically
different conclusions (resulting from omission of one
matrix element in Kivel et al., and also from the fact
that Kolb averages the Stark components of Lα with
proper weight factors whereas Kivel et al. do not
average), he is able to show that even the polarization
effects can be calculated for the conditions under study 
with the use of classical-path theory instead of the 
completely quantum mechanical theory here sum- 
marized. The only effect not appearing in Kolb’s work 
is the one called polarization by reorientation, and this 
is seen to be very small. In Sec. IX it appears that a 
similar conclusion holds for the Balmer lines—a for- 
tunate fact likely to obviate the need for further 
detailed calculations. 

Finally, there is the width caused by quenching. 
This, for Lα, turns out to be negligible. In illustration
of orders of magnitude the various results are collected 
in Table VIII, where all quantities refer to a special 
situation that is approximately realized in the photo- 
sphere of the sun. The last line in the table represents 
the Holtsmark width produced by the ions which in 
this instance quite evidently outweighs all electron 
effects. 

In judging these results and in appraising the role of 
the electrons at higher charge densities and for other 
lines, two facts need to be borne in mind. First, electron 
effects increase with $n$, ion effects only with $n^{2/3}$ (see Sec.
III). Secondly, lines involving states of higher excitation
than Lα offer larger targets to the impinging electrons,
and this enhances their sensitivity to broadening 
impacts. The enhancement is, in fact, greater than the 
accompanying increase in static polarizability which 


accounts for the Holtsmark width. While this is not


apparent from the present discussion, it manifests itself 


6T. Sobel’man, Optika i Spektroskopiya 1, 617 (1956); 
Fortschr. Physik 5, 175 (1957). 
in our treatment of the Balmer lines in the following
section, where the electrons begin to play a role that 
is no longer subordinate to that of the ions at attainable 
concentrations. There the work is limited to the effect 
which was here found to be predominant, namely 
polarization. In the terminology of Sec. IV, the adia- 
batic hypothesis is abandoned and full consideration is 
given to the degenerate states that go into the formation 
of a broadened line. 
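The first of the two facts noted above, that electron widths grow as $n$ while Holtsmark ion widths grow only as $n^{2/3}$, implies that their ratio rises as $n^{1/3}$. The sketch below works out what that scaling implies; the reference density and reference ratio are hypothetical inputs, not values taken from Table VIII, and the temperature and velocity dependence is ignored.

```python
def electron_to_ion_ratio(n, n_ref, ratio_ref):
    """Electron-to-ion width ratio at density n, given the ratio at n_ref.

    Assumes only the scalings quoted in the text: electron width ~ n,
    ion (Holtsmark) width ~ n**(2/3), so the ratio grows as n**(1/3).
    """
    return ratio_ref * (n / n_ref) ** (1.0 / 3.0)


def crossover_density(n_ref, ratio_ref):
    """Density at which the two widths would be equal under the same scalings."""
    return n_ref / ratio_ref ** 3


if __name__ == "__main__":
    n_ref = 1.0e13      # cm^-3, hypothetical reference density
    ratio_ref = 0.1     # hypothetical electron/ion width ratio at n_ref
    print(crossover_density(n_ref, ratio_ref))
```

Because the exponent is only one third, a tenfold deficit in the electron contribution takes a thousandfold rise in density to make up on this argument alone.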


IX. QUANTUM THEORY OF THE 
BALMER LINES 

An entirely quantum-mechanical treatment of the
broadening of the Balmer lines due to electrons has been
given by Landwehr. A system composed of the radiator
(i.e., an atom or molecule), a number $N$ of
perturbing electrons, and the quantized radiation field,
all confined to a volume $V$, is described by the
Schrödinger equation

$$H\Psi=i\hbar\dot\Psi. \tag{9.1}$$

The Hamiltonian $H$ is decomposed into an unperturbed
part $H^0$ representing the total energy of the isolated
components of the system, and a part $H'$ which accounts
for their mutual interaction. We write

$$H^0=H_r^0+H_f^0+\sum_{j=1}^{N}H_{pj}^0, \tag{9.2}$$

where $H_r^0$, $H_{pj}^0$, and $H_f^0$ are, respectively, the unperturbed
Hamiltonians of the radiator, the jth perturbing
electron, and the radiation field. The perturbation

$$H'=J_{rf}+\sum_{j=1}^{N}C_{rj}, \tag{9.3}$$

where $J_{rf}$ is the interaction between the radiation field
and the radiator, and $C_{rj}$ the Coulomb interaction
between the jth electron and the radiator. The Coulomb
interactions between the perturbing electrons have been
neglected.

Landwehr's method of calculation is similar to that
of Weisskopf and Wigner for the problem of the
natural line width. It starts with an initial condition in
which an excited atom but no photons are present and
then derives the probability of finding a photon of a
given frequency after a time long enough so that the
atom has certainly radiated. At zero time the perturbations
are "turned on" suddenly. We then have at
$t=0$:

$$\Psi(0)=\psi_{n_0}^{r}\,\psi_{0}^{f}\,\psi_{\lambda_0}^{p},$$

where $\psi_{n_0}^{r}$ is the $n_0$th eigenstate of $H_r^0$, $\psi_0^f$ the null
state of $H_f^0$, and $\psi_{\lambda_0}^{p}$ the $\lambda_0$th eigenstate of $\sum_jH_{pj}^0$.
The function $\psi_{\lambda_0}^{p}$ is taken to be a product of plane
waves. The momentum distribution of the $N$ electrons
is assumed to be the equilibrium distribution of the
particles at the temperature $T$ (i.e., the Boltzmann
distribution).

The solution of (9.1), corresponding to the initial
conditions given above, can be expanded over the
eigenfunctions of $H^0$ to give

$$\Psi(t)=\sum_{n\lambda\nu}a_{n\lambda\nu}(t)\,\psi_n^r\,\psi_\lambda^p\,\psi_\nu^f\,\exp\left(-\frac{i}{\hbar}E_{n\lambda\nu}t\right), \tag{9.4}$$

with $a_{n_0\lambda_00}(0)=1$. The determination of the coefficients $a$
is extremely unwieldy; hence, it is assumed that a
satisfactory approximation can be obtained by neglecting
all matrix elements of $H'$ that do not involve the
initial state. This means that only the initial state can
radiate a photon and all collision-induced transitions
are from and into the initial state. With this assumption,
the equations for the amplitudes become


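$$\dot a_{n_0\lambda_00}(t)=-\frac{i}{\hbar}\sum_{n\lambda\nu}\Bigl(J_{rf}+\sum_{j=1}^{N}C_{rj}\Bigr)_{n_0\lambda_00;\,n\lambda\nu}\,a_{n\lambda\nu}(t)\,\exp\left[\frac{i}{\hbar}\bigl(E_{n_0\lambda_00}-E_{n\lambda\nu}\bigr)t\right],$$

$$\dot a_{n\lambda\nu}(t)=-\frac{i}{\hbar}\Bigl(J_{rf}+\sum_{j=1}^{N}C_{rj}\Bigr)_{n\lambda\nu;\,n_0\lambda_00}\,a_{n_0\lambda_00}(t)\,\exp\left[\frac{i}{\hbar}\bigl(E_{n\lambda\nu}-E_{n_0\lambda_00}\bigr)t\right]. \tag{9.5}$$

The approximations have not interfered with the
normalization, as the solutions of Eq. (9.5) retain the
property

$$\sum_{n\lambda\nu}|a_{n\lambda\nu}(t)|^2=1.$$

For $J_{rf}$ we take the single-photon operator

$$J_{rf}=-(e/m)\,\mathbf{p}_r\cdot\mathbf{A}_f,$$

where $\mathbf{p}_r$ is the momentum of the radiating
electron and $\mathbf{A}_f$ is the vector potential of the radiation field. The
Coulomb interaction, $C_{rj}$, is cut off at a distance $d$
from the center of the radiator and given by

$$C_{rj}=\begin{cases}\dfrac{e^2}{|\mathbf{r}_j-\mathbf{r}_r|} & \text{for } 0<r_j<d,\\[6pt] 0 & \text{for } r_j>d.\end{cases}$$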

It develops that the solution of (9.5) for the initial-state
amplitude is of the decaying exponential form
when the following conditions hold:

$$\frac{3}{2}\,\frac{kT}{\hbar}\gg\Gamma_c+\Gamma_f,\qquad \omega'\gg\Gamma_c+\Gamma_f.$$

Here $\omega'$ is again the frequency of the undisplaced line
and $\Gamma_c+\Gamma_f$ is the total decay constant of the initial
state, a sum of the contribution from electron scattering
$\Gamma_c$ and from spontaneous emission $\Gamma_f$. The inequalities
above are both satisfied over wide ranges of interest;
the case of electron broadening of the Balmer lines will
be analyzed with respect to them below.

There results, for the distribution of amplitudes corresponding
to frequency $\omega$ due to transitions from an
initial state $(n_0,\lambda_0,0)$ to a final state $(n',\lambda_0,\omega)$, the simple
formula

$$\frac{(\Gamma_c+\Gamma_f)/\pi}{(\omega-\omega')^2+(\Gamma_c+\Gamma_f)^2}. \tag{9.6}$$

Here $\Gamma_f=\sum_{n''}\Gamma^f_{n_0n''}$ is the total radiative decay
constant of the upper state, the sum being taken over
all states into which the atom can radiate;
$\Gamma_c=\sum_{n''}\Gamma^c_{n_0n''}$ is the total collisional decay constant
of the upper state, and the sum is over all states to
which the atom can pass by collisional transitions. The
$\Gamma_c$ depends on the transition matrix elements,

$$\Gamma^c_{n_0n''}\propto F(\kappa)\,\bigl|\langle n_0|\,1-\exp(i\boldsymbol{\kappa}\cdot\mathbf{r}_r)\,|n''\rangle\bigr|^2,$$

where $F(\kappa)$ is a function of $\kappa$, the wave-number difference
between the initial and final states of the
colliding electron ($\boldsymbol{\kappa}=\boldsymbol{\kappa}_i-\boldsymbol{\kappa}_f$), and $\mathbf{r}_r$ is the coordinate
of the atomic electron.

When degeneracy exists in the states of the radiator, 
the largest contribution to $\Gamma_c$ comes from matrix elements
of the states that are degenerate with the initial
state. In addition, $F(\kappa)$ is highly weighted for values
of $\kappa$ which correspond to elastic scattering. The evaluation
of $\Gamma_c$ is, therefore, restricted to those states that
are degenerate with the initial state. These matrix
elements may be summed approximately in a convenient
manner by using known sum rules.

The method leads to a spectral contour associated 
with a particular radiative transition between given 
initial and final states. In order to form a single spectral 
line, such as a Balmer line of hydrogen, it is necessary 
to combine the results obtained for each transition 
from one of the initial degenerate states with one of the 
final degenerate states. In this procedure each of the 
initial states is assumed to be equally populated and 
each transition is weighted by the relative transition 
probabilities per unit time in the dipole approximation. 
The line formed in this manner is symmetric about w’ 
but is not of strict resonance shape, being lower in the 
line core and higher in the wings. 
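The statement that a superposition of resonance contours of different widths is lower in the core and higher in the wings than a single resonance curve can be checked numerically. The sketch below is only an illustration: it superposes two unit-area profiles of the form (9.6) with hypothetical half-widths and equal weights (the actual components and weights come from the matrix elements and transition probabilities, which are not reproduced here) and compares the sum with a single profile of the same half-width and area.

```python
import math


def lorentzian(x, gamma):
    """Unit-area resonance profile of half-width gamma, the form of Eq. (9.6)."""
    return (gamma / math.pi) / (x * x + gamma * gamma)


def composite(x, gammas, weights):
    """Weighted superposition of resonance profiles, normalized to unit area."""
    total = sum(weights)
    return sum(w * lorentzian(x, g) for g, w in zip(gammas, weights)) / total


def half_width_at_half_max(profile, upper=1.0e6):
    """Bisection for the x > 0 where a symmetric, single-peaked profile
    falls to half of its central value."""
    target = 0.5 * profile(0.0)
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if profile(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical component half-widths and weights, for illustration only.
gammas = [1.0, 3.0]
weights = [1.0, 1.0]


def line(x):
    return composite(x, gammas, weights)


g_eff = half_width_at_half_max(line)

if __name__ == "__main__":
    for x in (0.0, g_eff, 10.0 * g_eff):
        print(f"x = {x:8.3f}  composite = {line(x):.5f}  "
              f"single = {lorentzian(x, g_eff):.5f}")
```

With these inputs the composite curve lies below the single resonance curve at the center and above it far out in the wings, which is the qualitative behavior described for the combined Balmer-line contour.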

The diagonal matrix elements of H’ have been 
neglected in Eq. (9.6). These represent shifts in the 
individual line components but are presumably small 
because of the uniformity of the perturber’s charge 
distribution. These neglected matrix elements, of 
course, are responsible for the line width and shifts in 
a purely adiabatic theory; however, in the extreme 
diabatic case (collision-induced transitions) they con- 
tribute only to the shift. 

Landwehr has applied this theory to the first three
Balmer lines. The half-width $\Gamma^c$ arising from collision
broadening of the atomic state $(n_0l_0m_0)$ is expressible
in the form


c 2rm\ ? 
Tnolomo = no enea 
kT 


h? 
A—Bln (=) 
ne 2mkTnoto 


(9.7)


+HCE(4/2mkTnèa?) 


where A, B, and C are constants dependent on $(n_0l_0m_0)$
and Ei(x) is the exponential integral of x.

Table IX, calculated for Me <10!° cm™ and 
T=10 °K, lists the ratio Tftomo/I'%22 for the first 
The form of Eq. (9.7) is strikingly similar to Eq.
(4.28) for the impact half-width. Indeed, when comparison
is made of the results of the two theories we
find exactly the same variation with temperature and
electron density, as shown in Fig. 15. The dotted line
is an application of the Lindholm theory, as adapted
to the Balmer lines by Griem. Griem treats the normally
degenerate states in a given excited level as one
state, using an average splitting factor for this composite
state. The quantum mechanical values of the half-width
of the lines exceed those of the classical impact theory
by hardly more than the possible error introduced by
Landwehr's use of sum rules in the evaluation of matrix
elements. Thus the two theories yield about the same
results, the major difference being that the Lindholm
theory produces a resonance contour and the quantum
mechanical theory a less peaked distribution.

The form of the variation of the collision damping
constant with $T$ and $n$,

$$\Gamma_c\propto n\,T^{-1/2},$$

is a quite general result and has been found also by
Rudkjobing$^{70}$ in a quantum mechanical calculation of a
different sort.


TABLE IX. Prolomo/Ino22 for ne=5X10!16 cm? and T=10' °K. 
Figure 16 shows the complete quantum contour of
Hg taken at ne=5X 10! cm and T=1.25X 10! °K as 
compared with a resonance curve (as predicted by 
Lindholm theory) of the same half-width and total 
integrated intensity. The solid curve in this figure
results from a superposition of curves based on Eq.
(9.6) for the different degenerate upper states.

Figure 17 compares the values of the half-width of
Hβ produced by electrons with that produced by ions
(Holtsmark theory) where the average broadening is
given by

$$\Gamma_H=(3\hbar/2m)\,7.8\,n^{2/3}.$$


For stellar applications at n< 10!ë cm™, T>5X 10? 
°K the electrons do not contribute much broadening of 
the line and may be omitted from consideration. In 
some interesting terrestrial applications, such as those 
discussed in Sec. VII B, the electrons do contribute 
large effects to the line. Electron broadening becomes 
inappreciable for extremely large values of the tem- 
perature if the number density of electrons is held fixed. 


70M. Rudkjobing, Ann. astrophys. 12, 229 (1949). 
X. SHIFTS PRODUCED BY ELECTRON
COLLISIONS**


Argon spectra from high-temperature plasmas pro- 
duced in shock tubes by Petschek, Rose, Glick, Kane, 
and Kantrowitz at Cornell proved surprising at first 
because the lines were not at their expected wavelengths.
It seemed that the shifts were greater than
could be accounted for by the usual Holtsmark theory
of static ion fields and the consequent Stark shifts.
Hence the results were attributed to line shifts induced
by the free electrons in the plasma. Baranger made a
quantitative calculation based on the theory of 
Lindholm (see Sec. IV) which showed that this inter- 
pretation was reasonable. 

At the same time the group at Yale had been
working out the details of the quantum mechanical
theory of electron effects. Their predictions of hydrogen
line broadening have been shown by Meyerott and
Margenau to agree with semiclassical broadening
theories. This agreement and the calculation by
Baranger led Kivel to expect that quantum
mechanics will also predict an electron-induced shift.
The situation in hydrogen is obscured because the
atomic levels are nearly degenerate, distant collisions
become important, and an artifact such as the Debye
shielding to cut off distant collisions must be employed
in the calculations. On the other hand, argon presents
the difficulty of unknown matrix elements. Helium,
having hydrogenlike atomic wave functions and nondegenerate
eigenvalues, is free of these two problems.
The experiment in pure helium, however, is not easy
because for elements of low molecular weight it is difficult
to heat gas to 20 000 °K and obtain the high free
electron densities ($10^{17}$/cm$^3$) required. Nevertheless,
Seay and Seely of Los Alamos have largely succeeded
in carrying it out by using a high explosive driver for
their shock tube. The preliminary results are in agreement
with the theory when an algebraic error in
reference 72 is corrected.

FIG. 18. Matrix element $|C(2p0\to2s0)|^2$ as a function of $x=a_0^2\kappa^2$.
The shaded portion is uncertain.

An interesting aspect of this work is that the Born 
approximation used in the theory gives a reasonable 
result. Though perhaps somewhat surprising, this 
reflects the fact that the approximation requires only
that the energy of the electron be large compared to
the atomic energy separation, and not the atomic
binding energies. Also, while the Born approximation 
fails to some extent for slow electron scattering, in our 
problem a small error in the electron trajectory is not 
serious. This same circumstance accounts in general 
for the success of the classical theories in line-broadening 
problems, i.e., the Born approximation is close to the 
classical correspondence limit. 


A. Quantum Theory 


The line shift might be expected as a matter of
course since it arises from the coupling of two states by
a perturbation (electron-atom interaction), and such
perturbations generally cause coupled states to separate.
To deal with this shift quantitatively, we
make a simple correction to the theory of the line width
in Sec. VIII where we used the approximation

$$\int f(\omega_f)\,\frac{e^{[i(\omega_i-\omega_f)+\gamma]t}-1}{i(\omega_i-\omega_f)+\gamma}\,d\omega_f=\pi f(\omega_i). \tag{10.1}$$

In this formula $f(\omega_f)$ is the collision matrix element
integrated over the scattering angle; it depends on
$\omega_f=(\hbar/2m)k_f^2$ through the quantity $\kappa$, defined in (8.7),
which appears in $\epsilon_{if}$ and therefore in $f(\omega_f)$. As an
example, we reproduce the sketch of $|C(2p0\to2s0)|^2$,
which is proportional to $f(\omega_f)$, as a function of the
momentum transfer $a_0^2\kappa^2=x$ given in reference 13
(see Fig. 18). The region of interest in helium is that
of large momentum transfers, since the atomic states
are not degenerate. Here the close collisions matter,
the Debye cutoff does not enter, and the $|C|^2$ dependence
on $x$ is known. For hydrogen, however, where
the energy levels are close together and the distant
collisions become important, the problem is more complicated
and is not worked out at this time.

The physical meaning of the decrease in the matrix
element with $\kappa$ is that the electron prefers to minimize
its momentum change. Consequently, if the electron
causes an atomic transition which requires energy
(excites the atom) some of this energy will come from
the photon (red shift) since the electron is stingy with
its own energy change. Similarly for de-exciting collisions
the photon will gain energy (blue shift). This is
in agreement with simple expectation based on first 
order perturbation theory where coupled levels repel 
each other. Figure 19 shows this schematically: the 
perturbation C changes the optical transitions J in the 
indicated manner. 

Now we return to Eq. (10.1). In view of the slow 
variation of f(w) with w, Eq. (10.1) is not strictly valid. 
A more careful evaluation yields an extra, imaginary 
term on the right, and this term gives rise to a shift, $\delta$,
of the line.

Quantitatively, the shift (assumed small compared
to energy separation; hence the formula given is not
valid for hydrogen) is a sum over all coupled states

$$\hbar\delta_i=(\hbar nv\sigma_\pi/3)\sum_n(\omega_{ni}/|\omega_{ni}|)\,(|\mathbf{r}_{ni}|/a_0)^2, \tag{10.2}$$

where $n$ = electron density, $v$ = electron velocity,
$\sigma_\pi=\pi(\hbar/mv)^2$, $|\mathbf{r}_{ni}|^2=|x_{ni}|^2+|y_{ni}|^2+|z_{ni}|^2$, $\psi_n$ = free
atom function for eigenvalue $E_n$ with space vector $\mathbf{r}$,
and $\hbar\omega_{ni}=E_n-E_i$.

The error in Bethe was the use of $|z_{ni}|^2$ instead of
the correct $|\mathbf{r}_{ni}|^2$, as given above. It is interesting that
the shift does not depend on the magnitude of the
energy separation if the electron has enough energy to
excite the transition. For a Maxwell-Boltzmann distribution
of velocities we find ($\delta$ is in units of sec$^{-1}$):

This assumes the Born approximation even for the
zero energy electrons.

FIG. 19. Energy levels in a plasma.


B. Application to Helium 


As a sample application we consider the He line
4s → 2p at λ = 4713 A. This is one of the lines studied
by Seay and Seely at Los Alamos and has a measured
shift to the red estimated to be 25 A when the electron
density is n=4X10"/cm? and the temperature is 
T=20 000 °K. This shift is far too large to be accounted 
for by ions, which produce only a few angstroms. On 
the other hand, Eq. (10.3) predicts a line shift of 
about 35 A, as shown below. Considering that the 
measurement is preliminary and that the calculation 
uses the Born approximation, this seems to establish 
the electron effect. 

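To evaluate Eq. (10.3) we note that the sum over
the magnetic quantum number removes the angular
dependence; i.e.,

$$\sum_{m'}\bigl|\mathbf{r}_{nlm}^{\,n'\,l+1\,m'}\bigr|^2=\frac{l+1}{2l+1}\bigl(R_{nl}^{\,n'\,l+1}\bigr)^2$$

and

$$\sum_{m'}\bigl|\mathbf{r}_{nlm}^{\,n'\,l-1\,m'}\bigr|^2=\frac{l}{2l+1}\bigl(R_{nl}^{\,n'\,l-1}\bigr)^2.$$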
The radial integrals R are tabulated for hydrogenic 
wave functions in the quantum mechanical literature. 

For the 4s → 2p transition the most important term
in the curly brace is the 4s → 4p term for which
$(R_{4s}^{4p})^2=540$, $(l+1)/(2l+1)=1$ and the exponential in
Eq. (10.3) is 0.9360, so that there is a contribution of
+505 (to the red); similarly the transition 4s → 5p
contributes +56; 4s → 6p, +8.4; 2p → 3d, −4 (to
blue); and 2p → 2s, +9. The net of all terms with
principal quantum number <8 is 573. The corre- 
sponding wavelength shift for n=4X10" and T 
=20 000 °K (kT =1.72 ev) is 


$$\Delta\lambda=\lambda^2\delta/2\pi c=35\ \text{A}.$$
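The conversion between an angular-frequency shift $\delta$ and a wavelength shift used here is easy to check numerically. The sketch below (function names are chosen for illustration only) inverts $\Delta\lambda=\lambda^2\delta/2\pi c$ to find the $\delta$ that corresponds to 35 A at 4713 A, and confirms the quoted value of $kT$ at 20 000 °K.

```python
import math

C_LIGHT = 2.998e10        # speed of light, cm/s
BOLTZMANN_EV = 8.617e-5   # Boltzmann constant, ev per degree K
ANGSTROM = 1.0e-8         # cm


def wavelength_shift(delta, wavelength_cm):
    """Delta lambda = lambda^2 delta / (2 pi c); delta in sec^-1, lengths in cm."""
    return wavelength_cm ** 2 * delta / (2.0 * math.pi * C_LIGHT)


def angular_frequency_shift(shift_cm, wavelength_cm):
    """Inverse relation: delta = 2 pi c Delta lambda / lambda^2."""
    return 2.0 * math.pi * C_LIGHT * shift_cm / wavelength_cm ** 2


delta = angular_frequency_shift(35.0 * ANGSTROM, 4713.0 * ANGSTROM)
kT_ev = BOLTZMANN_EV * 20000.0

if __name__ == "__main__":
    print(f"delta = {delta:.3e} sec^-1, kT = {kT_ev:.2f} ev")
```

A shift of 35 A at this wavelength corresponds to a $\delta$ of roughly $3\times10^{13}$ sec$^{-1}$.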


C. Comparison with Lindholm's Theory


It is interesting to compare this result with that of 
Lindholm’s theory. From Eq. (4.22) we find for the line 
shift according to Lindholm: 


uz= 9.8 vin. 


Considering only the main contributions of 4s—4p to 
the polarization, one finds 


| Z4s, 4p|? €?/ao =| 


u= renead) 
ao hcAvov 


(10.4) 


Since $\alpha=1/137$, $|z_{4s,4p}|^2=180a_0^2$, $\Delta\nu_{4s,4p}=919$ cm$^{-1}$, the
line shift for the same values of $n$ and $T$ as in B is
about 24 A. We have written (10.4) so that it can be
compared with the quantum result, Eq. (10.3), which
reads, with retention of the same matrix element,


5 {2 24s, 25] 
=mrnva" | 2—4 — } |. 
ao? v 


This gives a shift of 28 A.

Lindholm's theory assumes that the perturber can be
localized sufficiently long so that it is reasonable to
consider a polarized atom. This is in contrast with our
calculation, which assumes the particles to be moving
so quickly that the atom is at all times essentially free.
Also our treatment, which uses the Born approximation,
is restricted to perturbers with kinetic energy large
compared to the atomic energy level separations
($\Delta\nu_{4s,4p}$).

These two extreme assumptions predict the
effects, since the physical situations being considered
lie between their separate provinces. Their agreement
makes the magnitude of the result more secure, especially
since they are reasonably close to the experimental
value.


(10.5) 
I. INTRODUCTION


X-ray spectroscopy provides the most precise and
reliable information about the general electronic
structure of atoms. This method is likewise powerful
in elucidating certain features of the general electronic
structure of solids. Metals, semiconductors, and insulators
may all be studied alike.

In principle, x-ray emission spectra give a more or
less direct view of the outer occupied electronic levels
or, rather, of the corresponding energy states of the
solid, e.g., the valence band of states; and absorption
spectra provide a similar view of the states corresponding
to the unoccupied levels. With a combination
of emission and absorption spectra, one hopes to see
the forbidden energy gap characterizing nonmetals,
and to see the Fermi energy for metals. However, what
one actually sees is often dominated by the local
discrete states set up as a consequence of the inner
vacancy. The most difficult feature of this
method is that the unperturbed or normal solid-state
band spectra and the local discrete state spectra
are superimposed, and careful experimental measurements
and a careful interpretation are required. The
interpretation is much more difficult than has been
generally recognized, but also offers greater possibilities.

Another feature of the method is that, in interpreting 
emission or absorption spectra, due account must be 
taken of the relative transition probabilities. These 
probabilities involve the wave-function 
(symmetries) of the initial and final states, in accord 
with the well-known selection rules. Although it is 
commonly supposed that the admixture of the various 
wave-function symmetries is rather great in the 
valence and conduction bands of solids, it is still not 
complete. However, with a combination of spectra 
characterized by two or more different high-energy 
states, for example with the K(1s) and the L_III(2p)
spectra, we can determine (again in principle since the 
interpretation is very difficult) the energy positions, 
widths, and shapes of the energy levels or bands, and 
the degree of occupancy of the corresponding outer 
electronic levels or bands, of any amenable solid. 

For the investigation of its x-ray emission spectra the 
solid specimen is prepared as an anode of the x-ray 
tube, or as a secondary radiator for fluorescence radi- 
ation. For absorption spectral measurements it is 
prepared as a thin x-ray absorber. All the elements in 
the periodic table (except hydrogen and helium) can be 
studied by this method, although for many of them the 
variety of practical high-energy states is restricted. 

Wavelengths employed in this work range from about 
0.5 A (~25 000 ev) to about 800 A (~15 ev). There 
is thus brought into play a wide variety of both old 
and new experimental techniques, techniques involving 
the radiation source, diffraction of the beam for mono- 
chromatization, dispersion and resolving power, and 
devices for precise measurement of the x-ray relative 
intensity. 
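The photon energies quoted alongside the wavelength range follow from $E=hc/\lambda$. A minimal sketch of the conversion, assuming the rounded value $hc\approx12\,398.4$ ev·A (the function name is illustrative):

```python
HC_EV_ANGSTROM = 12398.4   # h*c in ev * angstrom (rounded)


def photon_energy_ev(wavelength_angstrom):
    """Photon energy in ev for a wavelength given in angstroms."""
    return HC_EV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    for wavelength in (0.5, 800.0):
        print(f"{wavelength:7.1f} A  ->  {photon_energy_ev(wavelength):9.1f} ev")
```

Both ends of the range agree with the rounded figures in the text, about 25 000 ev at 0.5 A and about 15 ev at 800 A.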

It is to be understood that this brief review is not 
intended to be exhaustive; it is intended to introduce 
and to orient the fundamental aspects of the subject 
in its present state of development for those physicists, 
chemists, metallurgists, etc., who have not had time 
to keep up with it. Other reviews of the subject have
recently appeared$^{1-6}$ emphasizing the historical development
and most of the experimental techniques, so
we place our greatest emphasis upon current (and
perhaps future) problems of interpretation.

1 A. E. Sandström, "Experimental methods of x-ray spectroscopy:
ordinary wavelengths," in Handbuch der Physik, S. Flügge/Marburg,
editor (Springer-Verlag, Berlin, 1957), Vol. 30,
pp. 78-245.

2 D. H. Tomboulian, "The experimental methods of soft x-ray
spectroscopy and the valence band of the light elements," in

Before launching into typical x-ray spectra and the 
interpretational problems, we first (a) mention the 
more general aspects of the method, (b) emphasize 
some experimental factors that must be appreciated 
before interpretation should be undertaken, and (c) 
point out certain essential features of the x-ray energy 
level diagram that have so far received inadequate 
attention. 


II. GENERAL ASPECTS 
Advantages and Disadvantages 


As compared with other methods for studying the 
electronic structure of solids, the x-ray spectroscopic 
method has five significant general advantages: 


(1) All types of solids can be studied (metals, semi- 
conductors, insulators, pure elements, compounds, alloy 
systems, etc.). 

(2) Details of the structure are obtained by the use 
of a relatively narrow high-energy state as a scanning 
probe. 

(3) A wide energy range of the structure is covered 
giving widths, shapes, and energy positions of many 
states and bands of states. 

(4) The degree of admixture or hybridization of 
wave-function symmetries is revealed by the use of 
two different types of inner electron vacancies. 

(5) The spectra are relatively insensitive to the 
chemical impurities, surface states, or other types of 
lattice defects (except the defect due to the inner
electron vacancy itself) which are so troublesome in 
most solid-state work.$^{7,8}$ Two important general


Handbuch der Physik, S. Flügge/Marburg, editor (Springer-Verlag,
Berlin, 1957), Vol. 30, pp. 246-304.

3 C. H. Shaw, “The x-ray spectroscopy of solids,” in Theory of 
Alloy Phases (American Society of Metals, Cleveland, 1956),
pp. 13-62.

4 Y. Cauchois, Les Spectres de Rayons X et La Structure Electronique
de la Matière (Gauthier-Villars, Paris, 1948).

5 H. Niehrs, Ergeb. exakt. Naturw. 23, 359 (1950).

6 Landolt-Börnstein Tables (Springer-Verlag, Berlin, 1955), 
sixth edition, Vol. 1, Part 4, pp. 769-867. Many experimental 
curves are reproduced without any critical comment. 

7 Since the inner electron vacancy (the essential vacancy for an 
x-ray transition) involves a relatively high energy, its production 
is not much affected by the low energies that characterize the 
bonds in the crystal lattice. It is most improbable that an appreci- 
able fraction of the observed x-ray transitions takes place in the 
immediate neighborhood of a chemical impurity or other lattice 
defect (exclusive of the defect due to the inner vacancy itself) 
unless these defects are present in an unusually high concentration. 

8 Another type of impurity, however, is a serious problem. This 
is the matter of being sure that the material under study is really 
of the crystalline form desired and is really free of volume and 
surface contaminants. In emission from an x-ray tube anode, 
only the first hundred or thousand angstroms of material is 
sampled, a sampling depth that depends upon the material, 
geometry and voltage of the anode. The effective temperature of 
the emitting volume of the anode is difficult to know or to control, 
especially with a nonmetallic anode, and some physical or chemical 
change may be unwittingly induced. A thin layer of oxide or 
disadvantages of the method are (a) the theoretical and
experimental techniques for quantitative work are rather
exacting, and (b) the problems of separating the two 
superimposed types of spectra, viz., the unperturbed 
solid-state band spectra and the local discrete state 
spectra, have not been generally solved. 


The theoretical aspect deserving special mention is 
the atrocious many-electron nature of x-ray transitions. 
Numerous simplifying approximations must be adopted 
for reasons of expediency and the final effects of these 
approximations remain unknown. Comment must also
be made on the very short times involved in x-ray
transitions,9 of the order of 10 to 108 sec, making
it important in most instances to take time-depend-
encies into account in the theoretical descriptions of
the pertinent states and transitions. On the experi-
mental side, the measurements must be made with the
best techniques and with the highest accuracy in order 
to allow the necessary corrections to be applied with 
reasonable conviction. These corrections are listed later. 

Many spectra have been recorded in the literature for 
which the appropriate corrections are very large and
have not been made. 

With regard to the second disadvantage, studies of 
the local discrete states themselves (which we shall call 
“excitation” states10) provide valuable information
about the perturbation due to the inner vacancy; such 
a perturbation becomes an additional tool in solid-state 
analysis. The excitation states are most easily recog- 
nized in absorption spectra with semiconductors and 
with insulators, although they are probably present 
also in some emission spectra. They are very much akin 
to those commercially-important states (in optical 
fluorescence) associated with an impurity atom.11


sulfide, or a time-dependent layer of tungsten or of a carbo- 
naceous deposit on the anode surface, is often a trial. In x-ray 
emission in fluorescence, the sampling depth is somewhat greater, 
the temperature is much easier to know and to control, and 
generally the physical and chemical state is much better known. 

9 An initial x-ray state may decay either radiatively or non- 
radiatively. If it decays nonradiatively, an Auger electron, rather 
than an x-ray photon, is emitted. The probability for radiative 
decay is called the fluorescence yield (it should be called the 
“radiative” yield). Values and discussion of this yield are given 
by Broyles, Thomas, and Haynes, Phys. Rev. 89, 715 (1953); E.
H. S. Burhop, The Auger Effect (Cambridge University Press,
London, 1952). For any state of energy less than about 15 kev,
the Auger process is the more probable one and, hence, dominates
the state lifetime; and for states of even smaller energy, the
radiative probability becomes rapidly less, being less than 0.1
for states of energy less than about 4 kev.

10 The term “exciton” is commonly used in refer
type of excitation states in absorption spectra, es 
ultraviolet absorption. “Exciton” is a theoretical term tk
specifically to a coupled electron and positive hole t 
nally proposed by Frenkel, travel as a unit mor 
around in the lattice. This term has acquired so 
aura of the theoretical one-electron model th 
be misleading in practice, especially in the interpr 
species, Further comments on the one-electron’ 

ater. 

11 The “impurity atom” in the presen’
of as one of atomic number Z+1 in-
taining an atom of atomic number . 
Excitation states are discussed further in connection
with the energy level diagrams. 


Two Energy Regions 


Requiring high resolving power (an energy resolution 
of about 0.1 ev) and accurate measurements of relative 
intensity (of the order of one percent), this spectroscopic 
method divides experimentally into two energy regions, 
with a strong tendency for a corresponding dichotomy 
of the subject as it appears in the literature. 

The division is essentially according to the type of 
monochromator: For the region roughly 0.5 to 18 A, 
reasonably perfect crystals (calcite, quartz, mica, etc.) 
are used either on a two-crystal spectrometer (some- 
times called a double crystal spectrometer) or on a 
focusing instrument in which a single crystal is carefully 
bent to focus the diffracted beam. For the region 
roughly 15 to 800 A, ruled concave gratings are used 
at grazing incidence in a Rowland mount. Vacuum 
instruments, or instruments whose x-ray path (includ- 
ing any transmission windows) is otherwise made 
easily penetrable, are employed for wavelengths greater 
than about 2.8 A. 
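The wavelength limits quoted above can be put on the energy scale used elsewhere in this paper. The following is a minimal sketch, not part of the original text: it assumes the standard relation E = hc/λ with hc ≈ 12398.42 ev·A, and the function names are illustrative only. It also shows what an energy resolution of 0.1 ev implies for the resolving power E/ΔE at a given wavelength.

```python
# Assumed constant (standard value, not taken from the text): h*c in ev * angstrom.
HC_EV_ANGSTROM = 12398.42


def photon_energy_ev(wavelength_angstrom):
    """Return the photon energy in ev for a wavelength given in angstroms."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_ANGSTROM / wavelength_angstrom


def resolving_power(wavelength_angstrom, delta_e_ev=0.1):
    """Return E / dE, the resolving power implied by an energy resolution dE."""
    return photon_energy_ev(wavelength_angstrom) / delta_e_ev


if __name__ == "__main__":
    # Limits of the crystal and grating regions, plus the vacuum threshold.
    for wavelength in (0.5, 2.8, 18.0, 800.0):
        print(wavelength,
              round(photon_energy_ev(wavelength), 1),
              round(resolving_power(wavelength)))
```

On this scale the crystal region runs from roughly 25 kev down to about 0.7 kev, so a fixed 0.1 ev resolution is a far more demanding resolving power at the short-wavelength end than at the long-wavelength end.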

Another difference between the two regions has to do 
with the method of measuring the x-ray relative 
intensity: The precise and reliable photon-counting 
techniques, especially with the proportional counter,” 
are used most advantageously in the 0.5 to 18 A range. 
The less accurate photographic techniques are used in 

the longer wavelength range. However, much work in 
the former range, especially pioneering work, has been 
photographic, and some success has been achieved with 
photon counters in emission work in the latter range. 
In either range, the photographic method is far better 
suited for quick surveys of wide spectral regions since 
only a single exposure is required. It is well known, 
of course, that the photographic method is well adapted 
to precise determinations of wavelengths of well- 
resolved symmetrical lines in either range, but a difficult 
and often impractical procedure must be set up for 
each exposure in order to translate the photographic 
blackening to a reliable intensity (or absorption) scale. 
This fact seriously hampers accurate photographic 
studies of such spectral properties as widths, shapes 
and relative intensities. Indeed, reports of such prop- 
erties, if a reliable scale has not been established, are 
of very little or no quantitative value. 

A third difference between the two regions is in the 
type of radiation used in absorption work. Continuous 
x-radiation from a conventional type x-ray tube is 
evidently too feeble to be practical at the longer wave-
lengths. Work to date, with one exception, has been
done with optical spark lines. An excellent source of
continuous radiation is now available from high-energy
centripetally accelerated electrons in the orbit in a
synchrotron.13 This type of orbit radiation may soon
replace the radiation from conventional x-ray tubes in 
x-ray spectroscopy in both wavelength regions, the beam 
being used directly in absorption work and being used 
to produce by fluorescence the characteristic emission 
spectra of any material placed in the beam. 

Incidentally, the availability of the continuous 
radiation from the synchrotron “x-ray tube” will no
doubt further stimulate the development of photon or 
electrical counters for the long wavelength region since 
this type of measurement system now becomes useful 
in absorption work as well as in emission. Because of 
the short pulses of radiation from the synchrotron, 
such counters will probably have to be arranged to 
measure an average number of photons per second by 
means of some integrating device. 


III. RESOLVING POWER AND CORRECTIONS 


In either energy region, two kinds of resolving power 
are involved: first, that due to the so-called “spectral
window” of the instrument,15 and, second, that due to
the effective width and shape of the high-energy state 
in the x-ray transition. The central region of the 
spectral window, in the case of the two-crystal spec- 
trometer, is determined essentially by the lattice type 
and degree of perfection of the monochromator crys- 
tals.16 The tail regions of the window are usually 


13 D. H. Tomboulian and P. L. Hartman, Phys. Rev. 102,
1423 (1956); D. H. Tomboulian and D. Bedo, J. Appl. Phys.
29, 804 (1958).

14 L. G. Parratt, Rev. Sci. Instr. 30, 297(L) (1959).

15 “Spectral window” refers to the effective over-all spectral
response of the instrument. If the beam incident upon the spec- 
trometer has a flat spectrum, the spectral window may be inter- 
preted as the function which gives the probability that a photon 
of energy $\nu$ will be counted when the spectrometer is set at
energy $\nu_s$. For example, the spectral window of a two-crystal
spectrometer includes the effects of (a) the geometry of the 
horizontal and vertical collimating slits, (b) fluorescent radiation 
and the diffuse or Compton or other spurious scattering (especially 
from the second crystal), (c) any higher order spectra, (d) any 
frequency-dependent attenuation of intensity in the x-ray path 
in the spectrometer, (e) the relative wavelength sensitivity of the 
intensity-measurement system, as well as (f) the diffraction 
pattern of the two crystals in the ($n_A$, $+n_B$) position of operation.
The true spectrum is distorted by the spectral window by an 
amount that depends upon the window’s width and shape—the 
somewhat remote tails of the window’s shape cannot be ignored. 

The dispersion of the instrument, viz., the spread of an x unit or 
of an ev in Bragg angular measure or in linear measure along a 
photographic plate, should not be confused with resolving power, 
although with some types of instruments dispersion and resolving 
power are closely related. 

16 Calcite crystals are the traditional favorites, but recently
quartz crystals have been shown to have higher resolving power 
and comparable percent reflection. [Adell, Brogren, and Haegg- 
blom, Arkiv Fysik 7, 197 (1954)]. Practical attainment of the 
higher resolving power requires skillful operation of a well- 
designed instrument. For example, with the two-crystal spec- 
trometer with good quartz crystals, angular increments as small
as 0.1 sec of arc must be measured accurately and the axes of the 
two crystals must be maintained parallel to within a few seconds 
of arc for all angular settings including both the parallel and the
determined by a combination of the so-called thermal
diffuse scattering and the slit geometry. In the grating 
spectrograph, the window is usually dominated by the 
width of the slit. In the case of the focusing bent- 
crystal instrument, the window is due to some combi- 
nation of the crystal, slit, scattering, and focusing 
imperfections. 

In the 0.5 to 18 A range, the over-all effective
resolving power is usually determined, not by the
spectral window, but by the width of the high-energy
state. This state width in the 0.5 to 18 A range is of
the order of 0.2 to 8 ev at half-maximum, depending 
upon the particular state and atomic number. On the 
other hand, in the 40 to 800 A range, the slit is usually 
the more important; and it is of the order of 0.01 to 
0.5 ev wide. In the longer wavelength range, the 
“smearing” due to the over-all resolving power is often 
negligibly small, but this is never so in the 0.5 to 18 A 
range. 
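The state widths quoted above can be related to state lifetimes. The sketch below is an illustration added here, not a procedure from the text: it assumes a purely lifetime-broadened (Lorentzian) state, for which the full width at half-maximum Γ and the mean lifetime τ satisfy Γτ = ħ, and it uses the standard value ħ ≈ 6.582 × 10⁻¹⁶ ev·sec. Any Gaussian or multiplet contribution to the observed width is ignored, so the result is only a lower bound on the lifetime.

```python
# Assumed constant (standard value, not taken from the text): hbar in ev * sec.
HBAR_EV_SEC = 6.582e-16


def lifetime_sec(width_ev):
    """Mean lifetime of a state whose Lorentzian full width at half-maximum is width_ev."""
    if width_ev <= 0:
        raise ValueError("width must be positive")
    return HBAR_EV_SEC / width_ev


if __name__ == "__main__":
    # The two ends of the width range quoted for the 0.5 to 18 A region.
    for width in (0.2, 8.0):
        print(width, lifetime_sec(width))
```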

The rather large width of the high-energy state in 
the shorter wavelength range has been considered by 
many investigators to be a serious disadvantage, often 
over-compensating for (a) the ease and efficiency of 
producing the x-rays, (b) the convenience of the more 
penetrating x-rays, and (c) the precise measuring 
techniques of the photon counters. Recently, however, 
methods have been developed17 for correcting the
observed spectra for both kinds of resolving power if 
(a) the experimental curves are very accurately 
recorded, and (b) if the width and shape of both the 
spectral window and the high-energy state are accu- 
rately known. With the two-crystal spectrometer the 
window shape (exclusive of the remote tails) is usually 
intermediate between Gaussian and Lorentzian; with 
the ruled grating it is usually more nearly Gaussian. 
The shape of the high-energy state is commonly taken 
to be Lorentzian, as due to lifetime, but it also contains 
a Gaussian component due to lattice vibrations and 
often a width due to unresolved multiplet states as well. 
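The "smearing" described above can be illustrated numerically. The sketch below is an assumption-laden illustration, not the correction method of Porteus and Parratt: it models the high-energy state as a pure Lorentzian and the spectral window as a pure Gaussian (the text notes that real shapes are more complicated), convolves the two on a uniform energy grid, and measures the resulting width. All function names, the grid step, and the grid span are hypothetical choices.

```python
import math


def lorentzian(x, fwhm):
    """Unit-area Lorentzian of the given full width at half-maximum."""
    half = fwhm / 2.0
    return (half / math.pi) / (x * x + half * half)


def gaussian(x, fwhm):
    """Unit-area Gaussian of the given full width at half-maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def smear(curve, kernel, step):
    """Convolve a sampled curve with a sampled, odd-length kernel centered at zero."""
    half = len(kernel) // 2
    out = []
    for i in range(len(curve)):
        total = 0.0
        for j, weight in enumerate(kernel):
            k = i - (j - half)
            if 0 <= k < len(curve):  # zero padding outside the sampled range
                total += curve[k] * weight
        out.append(total * step)
    return out


def fwhm_of(xs, ys):
    """Full width at half-maximum of a single-peaked curve, by linear interpolation."""
    peak = max(ys)
    half = peak / 2.0
    center = ys.index(peak)
    i = center
    while i > 0 and ys[i] > half:
        i -= 1
    left = xs[i] + (half - ys[i]) * (xs[i + 1] - xs[i]) / (ys[i + 1] - ys[i])
    j = center
    while j < len(ys) - 1 and ys[j] > half:
        j += 1
    right = xs[j - 1] + (half - ys[j - 1]) * (xs[j] - xs[j - 1]) / (ys[j] - ys[j - 1])
    return right - left


def smeared_width(state_fwhm, window_fwhm, step=0.05, span=20.0):
    """Width (ev) of a Lorentzian state after smearing by a Gaussian window."""
    n = int(round(span / step))
    xs = [i * step for i in range(-n, n + 1)]
    state = [lorentzian(x, state_fwhm) for x in xs]
    m = int(round(4.0 * window_fwhm / step))  # kernel covers +/- 4 window widths
    kernel = [gaussian(j * step, window_fwhm) for j in range(-m, m + 1)]
    return fwhm_of(xs, smear(state, kernel, step))


if __name__ == "__main__":
    print(smeared_width(1.0, 0.5))
```

The smeared width exceeds the width of either component but is less than their sum, which is why both the window and the state shape must be known accurately before any correction is attempted.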


Corrections to Observed Spectra 


In general, several different corrections must be made 
to the directly observed spectra in order to obtain a 
reasonably reliable view of the energy level structure 
of the solid. The most important corrections in the 0.5 
to 18 A range are commonly those for resolving power, 
both for the spectral window and for the high-energy 
state. In the 15 to 800 A range, often the conversion
from photographic blackening to x-ray intensity is the 
most important correction. In either range, other 
corrections are also involved. Briefly, the effects for 
which corrections must be made, and in the order of 
procedure for making them, are (a) nonlinearity of the 
intensity scale, (b) instrumental resolving power (spec- 


17 J. O. Porteus and L. G. Parratt (to be published). Preliminary
reports appear in Bull. Am. Phys. Soc., Ser. II, 2, 55 (1957);
ibid. 3, 263 (1958).
tral window), (c) self-absorption by the material
emitting the x-rays, both of the emission line or lines 
and of the ubiquitous continuous radiation, (d) back- 
ground under the particular emission contour being 
studied, (e) satellites (or other “spurious” lines) in 
emission, and (f) width and shape of the high-energy 
state either in emission or in absorption. It is presumed8
that the material emitting or absorbing the x-rays is 
clean and of known physical and chemical state. In 
case of mixed materials, correction for the foreign 
material must also be included in (c), (d), and (e). 

The relative importance of each of these six correc- 
tions, as well as the details of the correction procedures, 
depends upon the particular spectrometer or spectro- 
graph used and upon the particular spectra under 
study. Unfortunately, even a reasonably adequate 
discussion of these corrections has not been attempted 
in most of the published work in this subject to date, 
and this fact has greatly impeded progress in interpre- 
tations. 

It is significant that now it is possible, in either
wavelength range, albeit difficult in some instances, to
obtain “experimental” curves in which essentially the
product $N_s(E)\,T(E)$ is plotted vs the energy $E$. In this
product, $N_s(E)$ is the density of those states, including
the excitation states, having the appropriate wave-
function symmetry for the particular transition accord-
ing to the radiative selection rules, and $T(E)$ is the
transition probability. [Logically, $T(E)$ includes the
effects of wave-function symmetry, but it is customary,
because of the use of the energy level diagram, to place
the subscript s or p, etc., on $N(E)$.] If one desires a
view of the normal energy band structure of the solid,
$N(E)$, unperturbed by the inner electron vacancy and
regardless of the wave-function symmetries of the
states, corrections for two additional effects must be
made: (g) excitation spectra and (h) variations in the
transition probabilities, with due regard for wave-
function symmetries,18 for the pertinent transitions


18 For a free atom, the transition probability is proportional to
$\nu\,|M_{i,f}|^2$ in emission and to $(1/\nu)\,|M_{i,f}|^2$ in absorption, where
$M_{i,f} = \int \Psi_f^{*}\,(\partial/\partial x)\,\Psi_i\,d\tau$, $\nu$ is the radiation frequency, $\Psi_i$ and $\Psi_f$
are the wave functions of the initial and final states, and the
integration is over all space. For a solid, a similar expression 
applies but now the wave functions, especially for a state which 
contains an outer vacancy (e.g., the final state in emission) or 
an extra outer electron (e.g., the final state in absorption), are 
not at all well known, and, in particular, they cannot be assumed 
to have one-electron properties or even a convenient symmetry 
for purposes of integration. 

As a part of the wave function of the final state in an absorption 
transition, the wave function of the ejected electron, if it is 
ejected completely free of the positive hole it leaves behind, is 
commonly assumed to be that of a plane wave in the lattice. If
this assumption is valid, then in the transition probability, the
linear momentum of this plane wave electron must be expressed
as an angular momentum about the atom from which the electron
is ejected.

In emission, the transition probability is proportional 
from (or to) each and all states in the bands. With
these eight corrections (a) through (h), one would 
obtain the density of states curve N (E) vs E. But we 
do not yet know how to make corrections (g) and (h) 
satisfactorily. For example, each solid-state energy level 
or band contains an admixture of several different 
symmetries, and correction (h) presumes knowledge of 
this admixture. At present the most reliable method 
of obtaining this knowledge is from experimental x-ray 
spectroscopy with two or more high-energy states of 
different symmetries. However, in many respects, the 
degree of admixture is as interesting as the fully 
corrected N (E) vs E curve itself. 
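The relation between a recorded curve and the quantity sought can be summarized schematically. The expression below is a sketch under stated assumptions, not a formula from the text: it supposes that both kinds of resolving power act as convolutions, writes $W$ for the spectral window and $L$ for the shape of the high-energy state (both symbols introduced here and taken to be normalized to unit area), writes the product of the symmetry-selected density of states and the transition probability as $N_s(E)\,T(E)$, and ignores corrections (a), (c), (d), and (e).

```latex
% Schematic model only. W (spectral window) and L (high-energy state shape)
% are assumed symbols, each normalized to unit area.
I_{\mathrm{obs}}(E) \;\approx\; \int W(E - E')
  \left[ \int L(E' - E'')\, N_s(E'')\, T(E'')\, dE'' \right] dE'
```

In this picture, corrections (b) and (f) amount to undoing the two convolutions, while corrections (g) and (h) concern the passage from $N_s(E)\,T(E)$ to $N(E)$, which the text notes cannot yet be made satisfactorily.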

Before discussing a few typical experimental spectral 
curves, let us prepare generally for some of the problems 
of interpretation. We devote considerable time to the 
qualitative details of the energy level diagram since it 
is here that confusion is most apt to arise. 


IV. ENERGY LEVEL DIAGRAM 
Terminology 


At least to the uninitiated there seems always to be 
confusion among the terms “state,” “energy level,” 
“energy level diagram,” “energy band,” “electronic 
shell,” “orbital,” “ordinary x-ray state,” “x-ray satellite 
state,” “ionization state,” “excitation state,” and 
“electronic level.” Let us first try to clarify these terms. 
This is a major step in the understanding of an energy 
level diagram. 

“State” refers to an assemblage of particles in a 
particular configuration in a system. The system may 
be a gaseous atom, a polyatomic molecule, or a localized 
part of a large piece of liquid or solid material. In a 
liquid or solid, the degree of localization is determined 
by the lifetime of the state in question and by the 
Franck-Condon principle. The energy of a state is also 
called the energy level in implicit or explicit reference 
to an energy level diagram. Such a diagram is discussed 
in detail later. An energy band in a solid (or in a liquid 
or in a polyatomic molecule) refers to a band of energy 
levels extremely closely spaced. 

Those electrons that are described by the same wave- 
function characteristics (except for the direction of 
electron spin) are said to make up, or to be in, one 
“electronic shell” or “orbital.” For example, the $2p_{1/2}$
electrons (called $L_{\mathrm{II}}$ or sometimes $L_2$ electrons) comprise
one electronic shell or orbital (called the $L_{\mathrm{II}}$ or
$L_2$); so do the $3d_{3/2}$ electrons (called $M_{\mathrm{IV}}$ or $M_4$); etc.
The term shell or shells is used especially for the 
appropriate group or groups of electrons or orbitals in 
a monatomic gaseous atom and for the inner electrons 
in a polyatomic molecule, a liquid or a solid, but not 
very often in reference to the outer electrons in a
nonmonatomic system. 

It follows that each state of a many-electron assem- 
blage—a gas, liquid, or solid—refers to a configuration 
of many constituent electronic orbitals. It also follows 
that only an electronic shell or orbital (but not a state
or band of states or an energy level) can be said to be 
occupied or unoccupied, i.e., by one or more electrons. 

An ordinary x-ray state always refers to an electron 
configuration in the system in which there is one and 
only one inner electron vacancy. For this reckoning, 
any bound electron in the system, except a valence 
electron, is an inner electron. This definition of an 
ordinary x-ray state unfortunately excludes the state 
corresponding to a vacancy in the valence orbitals. In 
spite of this exclusion, the valence state is, for obvious 
reasons, always included in the x-ray energy level 
diagram of ordinary levels. 

Two or more inner vacancies correspond to a second-
or higher-order x-ray state, sometimes called an x-ray satellite
state.20 Higher order or satellite states are usually
emphasized only in the emission of satellite lines, but 
they are also of direct interest in absorption spectra. 

Any ordinary x-ray state may be an ionization state 
or it may be one of various types of excitation states. 
It is an ionization state if the ejected electron is com- 
pletely free from the Coulomb field of the atom (or ion) 
in question, and if there is no abnormal configuration 
of the valence electrons. If the inner electron is ejected 
from its shell but is not completely free, it remains 
bound in an orbital outside of the occupied valence 
orbitals. In this event the system is in one type of 
excitation state. Excitation states are discussed in 
detail later. 

In this terminology, we refer to the type of many- 
electron assemblages actually encountered in the inter- 
pretation of any and all x-ray spectra. Insofar as the 
gross features of the inner part of the atom are concerned,
an x-ray state may be thought of as a “one-electron
vacancy state.” One of the theorems in atomic
spectroscopy says that a one-vacancy system behaves 
in many respects like a hydrogenic or one-electron 
system. In hydrogen, it suffices to treat the energy of 
the system as being the same as the energy of the one 
electron, and no distinction is made between the 
electron energy and the system energy. Because of the 
difficulty of theoretical calculations of a many-electron 


19 We use the term “valence electron” to refer to any electron
in the outermost band of electronic orbitals that is normally
partially or fully occupied. In this use, we do not separately distinguish
a conduction electron in a metal, nor heed the special role of
certain valence electrons in the physical or chemical bond in
molecules or solids.

20 This is the conventional definition of an x-ray satellite state,
but the exclusion of a valence vacancy as the second vacancy does
not mean that the energy of the state is not affected thereby. In
fact, if the second vacancy is in what we later call an excitation
valence orbital, especially if it has s-type symmetry, it produces
a very significant change in the energy of an inner-vacancy state.
We shall later refer to such an altered state as one type of x-ray
system, the theory of the one-electron approximation
is used as the basis of the theory for the many-electron 
atoms and for the solid state. It is common in solid-state 
lore to refer to a change in a single electronic level as 
being simply the change in the energy of the system. 
This, of course, would be valid if, as the electron shifts 
in its momentum or energy position, there is no shift 
of any other electron in the system. Actually, in most 
physical phenomena when examined in some detail, 
including x-ray spectra, the interaction between elec- 
trons is so great as to invalidate this one-electron 
approximation. 

The symbol terminology of the one-electron model, 
viz., s, p, d, etc. (lower-case letters) to indicate electrons,
electronic orbitals, bands of orbitals, states, energy
levels, and energy bands, is usually retained in x-ray
spectroscopy in spite of the intrinsic many-electron
nature of the problems.21 These one-electron symbols
are retained with the implicit understandings that (1) 
they refer to the principal jumping electron in the 
x-ray transition, (2) they do not indicate an absence 
of interaction between this principal electron and other 
electrons in the system, and (3) in applying the radi- 
ative selection rules to determine the allowed transi- 
tions, the assumption or approximation is made that 
the change in the angular momentum of the entire 
system is given by the angular momentum change of 
just the principal jumping electron, i.e., that no other 
electron changes its angular momentum. It must be 
realized that retention of the one-electron symbols 
carries along a tendency to confuse the many-electron 
system energy levels with the one-electron system 
energy levels; every effort must be made to avoid this 
confusion. 


“One-Electron-Jump” Diagram 


Figure 1 presents a rough sketch of a conventional 
so-called energy level diagram (a) for a metal and (b) 
for an insulator. Although this type of diagram contains 
some confusion and is inherently inadequate, as implied 
generally above and as pointed out specifically below, 
it is the one that has been most used in this subject to 
date and therefore deserves discussion. 

By definition, each level in the diagram is supposed 
to denote the energy of the entire system when one 
electron is missing from the particular electronic shell 
or orbital specified in the energy level terminology. It 
is supposedly a diagram of x-ray ionization states. A 
few so-called excitation levels are sketched in for the 
case of the insulator although, as we shall see, inclusion 
of these states in this type of diagram is somewhat 
abortive. The principal feature intended, of course, is 
that the transition arrow give the energy of the photon 


21 Upper case letters are often used in referring to theoretical
x-ray satellite states corresponding to multiple inner vacancies.
In this case the electron configuration is presumed to be
completely known and to behave as a monatomic many-electron
system.
FIG. 1. Conventional x-ray “energy level” diagram (arbitrary
scale) for each of two solids. This type of diagram was devised 
for a combination of a one-electron-vacancy model and a one- 
electron model; it is incomplete and misleading (see text). 


emitted or absorbed. With a linear energy scale, this 
energy is interpreted as being the length of the arrow; 
however, for convenience in x-ray spectroscopy where 
a wide range of energies is involved, a logarithmic scale 
is usually used. 

The confusion in this diagram appears in both 
atomic and solid-state work, but it is in solid-state 
work that it becomes especially serious.22 We shall
presently propose another type of diagram, one that 
is more complex but that, we believe, has the necessary 
features to avoid the pitfalls of misinterpretation 
inherent in the diagram of Fig. 1. 

One type of confusion is the following. The lower 
part of the “occupied” region of the diagram is just an 
inversion23 of the diagram conventionally used as a
satisfactory approximation in interpreting the gross
features of x-ray spectroscopy of the “inner” or high-
energy levels. But, in the “occupied” and “unoccupied”
regions near the valence and conduction bands, we 
have deviated from the one-vacancy system and have 
drawn in the normal or unperturbed solid-state bands 
in which the effect of the vacancy is ignored. In fact, 
we usually treat the upper or solid-state part of the 
diagram as a one-electron system instead of as a one- 
vacancy system. This composite leads to difficulties 
when transitions between the lower and upper parts 
are considered. One difficulty becomes readily apparent 


22 It may be argued that, in a general sense, any diagram is
merely a shorthand aid to memory, and if someone finds a partic- 
ular scheme useful as a memory aid, it cannot be properly called 
inadequate or inconsistent for him. But diagrams somehow have 
a penchant for becoming pedagogical and interpretational aids 
as well as memory aids, and a poor diagram is often definitely 
harmful. 

23 All the bound stationary states of a hydrogenic or one-electron
system are of negative energy, and all unbound or continuum 
states are positive. In the case of a one-vacancy system, the 
bound states are properly reckoned as positive. Accordingly, 
several years ago, before “solid-state spectroscopy,” the x-ray 
diagram was drawn with the K state at the top instead of at the 
bottom. The arrangement with the K state at the top is
conceptually better for “atomic” x-rays but it does not allow
conveniently for the continuum of states. The inverted arrangement
of Fig. 1 has been used in deference to the convention in solid-state
work.


when it is realized that the initial state for x-ray 
absorption is actually one for which the system is in 
its normal unexcited state, i.e., in its “single-valued” 
ground state,24 whereas in the diagram the energy of
this initial state is drawn as some one of many different 
energy levels in the unoccupied part. This is evidently 
a confusion of the energy level idea with a part of the 
electron configuration in the system. 

As just implied, the diagram of Fig. 1, with the 
direction of the transition arrow reversed, is better used 
to indicate the initial and final positions of the principal 
jumping electron. Such a one-electron-jump diagram 
could purport to give the energy values of the initial 
and final states on the argument that each electron 
configuration, when completely specified, corresponds 
to a definite energy of the system. The difficulty in this 
use of the diagram is that there are generally many 
different electron configurations associated with each 
inner vacancy, and there may be several different 
electrons jumping in each transition. For example, it 
is clear that an excitation energy level exists only 
because of, and during the lifetime of, the inner electron 
vacancy. Consequently, when an electron jumps from 
any occupied orbital to fill the inner vacancy, the 
excitation level ceases to exist (or changes to a different 
excitation level), and any electron in the excitation 
orbital must also jump to some new orbital. In the 
special event that the inner vacancy is filled by the 
excitation electron, i.e., by the electron in the excitation 
orbital, the final vacancy is in no sense in the excitation 
energy level as this level appears in the diagram.
Energy-wise, where did the electron jump from? No
transition arrow can be drawn in the conventional
diagram to indicate this jump.

X-Ray Excitation States: BEE and VEC Types 


A proper x-ray energy level diagram should have a
single energy level corresponding to the ground state
(which is conveniently called zero energy) as the initial
state for each and every x-ray absorption transition;
the final state, being one in which the ejected
electron is in any one of many otherwise unoccupied
electronic orbitals or in the continuum, should be
indicated as one of many high-energy levels. For
example, for K absorption, the diagram should show
a variety of final-state energies as a type of splitting
of the conventional K state, including the continuum
K level as well as the several discrete K levels. Any
of these many K levels may be the terminus of the
arrow representing a K-absorption transition.
Similarly, each one of the other conventional
levels (L_I, L_II, L_III, M_I, M_II, etc.) should
show several discrete levels and a continuum.
[...] which the ejected electron resides in [...]
To the many possible states of this sort, we assign the
descriptive name “bound-ejected-electron” (BEE) exci-
tation states.

Bound-ejected-electron excitation states are com- 
monly produced in x-ray and ultraviolet absorption 
but are believed to be only rarely produced by bom- 
bardment with cathode electrons in an x-ray tube anode. 
Such states probably dominate the irregular structure 
observed near the absorption edges, e.g., see Fig. 2 for 
a curve for a monatomic gas,²⁵ Fig. 3 for a polyatomic
gas, and Fig. 4 for a solid. But, especially in solids,
other types of excitation states may also be involved.
Since 1920, any observed structure near the “main”
absorption edge has been known as Kossel structure.²⁶

Another type of excitation state was implied earlier, 
viz., a state in which the system has some abnormal 


FIG. 2. K-absorption spectra (uncorrected) for gaseous argon
[L. G. Parratt, Phys. Rev. 56, 295 (1939)]. Abscissa: energy (ev);
the discrete lines are labeled 1s-np, n>3. The structure is
dominated by bound-ejected-electron excitation states; the
“main” absorption edge occurs at the series limit of these states.


configuration of the valence electrons. For many 
materials, it is highly probable that only the so-called 
normal configuration of valence electrons is significantly 
involved. But for some materials, e.g., solids having 
s-type valence electrons and perhaps an incomplete 
inner shell (transition metals), the “abnormal”
configurations are believed to be important. Let us look
at the valence electrons more closely.

25 The correction for instrumental resolving power as reported
by L. G. Parratt [Phys. Rev. 56, 295 (1939)] is not quite right
as judged by the results of the new correction procedure. The
curve in Fig. 2 is not corrected for resolving power.
J. A. Soules and C. H. Shaw [Phys. Rev. 43, 470 (1959)] also
report this absorption spectrum but they evidently have an
inadvertent “thickness effect” in their curve. Indeed, many or most
of the x-ray absorption curves reported in the literature suffer
from this effect. For a discussion of the thickness effect, see
Parratt, Hempstead, and Jossem [Phys. Rev. 105, 1228 (1957)].

26 A. H. Compton and S. K. Allison, X-Rays in Theory and
Experiment (D. Van Nostrand Company, Inc., New York, 193

As the inner vacancy is being produced, a change 
occurs in the Coulomb field in which all the electrons 
of the system find themselves. No doubt, the final 
configuration of all the electrons would be essentially 
unique (i.e., only the normal configuration) for a given 
inner vacancy if the inner vacancy were formed very 
slowly and if the electrons were granted a long time in 
which to readjust to the altered field. But such deliber- 
ation is not allowed. The change in the Hamiltonian of 
the system occurs in less than about 10⁻¹⁸ sec, and in
consequence of this change all the electrons are forced 
to change their wave functions. They do so as rapidly 
FIG. 3. K-absorption spectra (uncorrected) for gaseous chlorine
[Stephenson, Krogstad, and Nelson, Phys. Rev. 84, 806 (1951)].


as they are able. Generally, it may be presumed that 
the inner electrons shift rather quickly to the respective 
new orbitals closer to the nucleus of the atom in 
question, but the slow-moving valence electrons may 
require more time. The allotted time is about the
lifetime of the inner vacancy, of the order of 10⁻¹⁸ sec
for K states. At the termination of this allotted time
some of the valence electrons may have succeeded in
following the new orbitals and the others may have
been obliged to go to an unoccupied excitation orbital²⁷
or to the unoccupied part of the continuum. The energy 
of the system depends upon the particular configuration 


27 To illustrate, the 4s valence electron of monatomic gaseous 
potassium may go, upon production of the 1s hole, to the 5s 
orbital of the new system (i.e., according to the new Hamiltonian) 
leaving a vacancy in the new 4s orbital. 
FIG. 4. L_III absorption spectra (uncorrected) for metallic silver
[L. G. Parratt, Phys. Rev. 54, 99 (1938)].


of these electrons, and different configurations yield 
different energies. Without commenting further about 
their relative production probabilities,²⁸ we shall call
these possible different states “valence-electron-con-
figuration” (VEC) excitation states. They are reckoned
as excitation states whether or not the inner electron is 
ejected completely free of the atomic field, although of 
course the energy depends also upon the particular 
final position of the ejected electron. 

Figure 5 is a rough sketch which symbolizes in simple 
fashion the type of different electron configurations 
involved in the bound-ejected-electron and valence- 
electron-configuration excitation states for a solid. The 
FIG. 5. Symbolic representation of different electron
configurations in x-ray excitation states. For clarity, a restricted
number of electron orbitals is shown. More than one valence
electron and more than one ejected electron may be involved.

28 There are many eigenvalues in the solution of Schrödinger’s
equation for a many-electron system following an abrupt change
in the Hamiltonian, and we have mentioned only a few of them.
This is done with the implied assumption that only a few have
significant probabilities of production.

Also, we have chosen to use a semiclassical description of the
states (in terms of specific electrons), rather than a general
quantum mechanical description; the latter, although formulated
exactly in equations, does not have much practical meaning
without solution of the as-yet-unsolved many-body problem.
Partial solutions of the quantum mechanical problem [...]
best-known wave functions, are nevertheless very much [...]

open circles indicate alternative possible positions of
the electron in question, although more than one 
electron in each case may be involved. 

As this discussion is intended to apply to all kinds of 
materials (gases, liquids, and solids) we must distinguish 
between the valence electrons associated with the 
particular atom in question and the other valence 
electrons of the system as a whole. In addition to the 
possible shift of the atomic valence electrons to orbitals 
closer to the nucleus, many of the system’s other 
valence electrons may also move in a little closer. This 
feature for a solid is included symbolically in Fig. 5. 
The two types of shifts are in competition and they 
interact with each other since the objective of each is 
to shield the charge of the positive hole and thus to 
destroy the shift motivation. For example, insofar as 
the far-out excitation orbitals are concerned (corre- 
sponding to discrete energy levels in, or near the 
bottom of, the continuum), the in-flow of system elec- 
trons in a metal is expected to be much more important 
than in an insulator. The shielding thus effected by the 
system’s electrons in a good metal is expected to inhibit 
especially the relative importance of the bound-
ejected-electron excitation states in absorption,²⁹⁻³¹ as
well as to alter the relative importance of the valence- 
electron-configuration excitation states in both absorp- 
tion and emission. 

But whether the system’s valence electrons are thus 
effective or not, the new valence orbitals may be present 
though unoccupied. In this event, it is possible that an 
outer electron may change its mind, so to speak, after 
the formation of the inner vacancy and drop into the 
unoccupied orbital before the allotted lifetime of the 
inner vacancy has expired. The likelihood of such a 
change of mind may be negligibly small unless it 
involves an Auger-type transition in which another 
electron in the system is shifted farther away from the 
nucleus. Good metals are especially interesting in this 
regard because of the ease with which Auger transitions 
of small energy may take place, the Auger electron 
being merely moved the necessary amount across the 
Fermi surface. 

An emission spectrum in which valence-electron- 
configuration excitation states may be involved is 
shown in Fig. 6, a spectrum studied by Schnopper* 
with a two-crystal spectrometer. This spectrum is the 
Kβ1,3 region for metallic manganese, the transitions
supposedly being K → M_II,III where the M_II,III
electrons occupy the shells just inside the partially
complete M_IV,V, N_I shells and the valence electrons in

29 Y. Cauchois and N. F. Mott, Phil. Mag. 40, 1260 (1949);
and R. Landshoff, Phys. Rev. 55, 631 (1939).

30 J. Friedel, Phil. Mag. 43, 153, 1115 (1952); Proc. Phys. Soc.
(London) B45, 769 (1952); [...] Phys. 3, 446 (1954).

31 [...] L. Jossem, Phys. Rev. 84, 362 (1951);
[...] Phys. Rev. 97, 916 (1955); J. Phys. Chem. Solids 2, 67 (1957).

32 Small-energy Auger transitions in a metal are discussed by
[...] [Proc. Phys. Soc. (London) A62, 806 (1949)].
FIG. 6(a). The observed Kβ1,3 emission and the M_II,III states
for metallic manganese. The observed contour is corrected first
for the instrumental resolving power and then for the K state
to give the M density-of-states contour.

FIG. 6(b). The M density-of-states contour of Fig. 6(a) resolved
arbitrarily into conventional M_II and M_III components, the
residue being indicated as excitation states. In the upper figure
the M_II and M_III components are Lorentzian in shape; in the
lower figure they are Gaussian.


this metal. The directly observed spectral contour [...]
resolving power,³⁴ and then corrected for the “resolving
power” due to the K state³⁵ by the method of Porteus
and Parratt.¹⁷ The solid curve thus represents the
product of the density of p-type states and the transition
probability, viz., N(E)T(E). The N(E)T(E) curve is
then shown in the lower two parts of the figure as
resolved into three components, M_II and M_III, and
the remaining broad low-energy component indicated
as excitation states. This resolution has been carried
out first with the assumption that the M_II,III states
each have a Lorentzian shape, and then a Gaussian
shape.³⁶ The excitation states are commonly referred
to as the final state for the β′ line.

The wavelength of a so-called Kβ′ component has
been reported from photographic measurements by
several investigators as near the low-energy side of the
low broad component, and this component is commonly
known as the β′ line. Three possible explanations of the
origin of this β′ line have been proposed but none of
them seems able to account for its rather large relative
intensity, about 1.4 times the integrated intensity under
the β1 line as shown in Fig. 6. The first explanation takes
it to be a complex satellite corresponding to the double-
electron-jump transitions K L_II,III → L_I M_IV,V.³⁷,³⁸ No
calculations of the expected relative intensity according
to these transitions have been made, but it is generally
expected that double-jump transitions are
far less probable than single-jump transitions. For
manganese, the single-jump satellites K L_II,III →
L_II,III L_II,III, viz., the Kα3,4 satellites, are only one
percent as intense as the parent Kα1 line.³⁹ The second
explanation assigns the β′ component to Compton and
Raman scattering of the β1,3 doublet as this radiation
attempts to emerge from the x-ray tube anode in which
it is generated.⁴⁰,⁴¹ This requires far more self-absorption
in the anode than is reasonable, and, anyway, such a
scattered component is not observed accompanying
x-ray lines in general, e.g., the Kα1,2 lines. The third


34 This correction presumes that the central part of the
instrumental spectral window is given by the experimental (1, −1)
curve which has a full width at half-maximum of 0.6 ev and which
has a shape between a Lorentzian and a Gaussian, and that the
remote tails (the “flat” component) of the window are a little
higher than the Lorentzian. The flat component is best obtained
from an analysis of the thickness effect in an absorption curve.

35 The K state is assumed to be of Lorentzian shape having a 
full width at half-maximum of 1.05 ev. 
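The correction for the K state reflects the fact that an emission line carries the width of both its initial and its final state. As a hedged illustration only (this is not the Porteus and Parratt procedure), the sketch below folds a Lorentzian K state of full width 1.05 ev into a Lorentzian final state and checks numerically that the two widths add. The 1.2-ev final-state width is borrowed from the resolution discussed in the text; the grid step, the integration span, and all function names are assumptions made for the example.

```python
import math


def lorentzian(x, fwhm):
    """Unit-area Lorentzian centered at zero."""
    half = fwhm / 2.0
    return (half / math.pi) / (x * x + half * half)


def folded_profile(x, fwhm_a, fwhm_b, step=0.02, span=60.0):
    """Numerical fold of two Lorentzians: integral of L_a(t) * L_b(x - t) dt."""
    n = int(round(span / step))
    total = 0.0
    for i in range(-n, n + 1):
        t = i * step
        total += lorentzian(t, fwhm_a) * lorentzian(x - t, fwhm_b)
    return total * step


def full_width_half_max(profile, step=0.01, limit=10.0):
    """Scan outward from the peak at zero and interpolate the half-height point."""
    peak = profile(0.0)
    half_height = peak / 2.0
    x_prev, y_prev = 0.0, peak
    x = step
    while x <= limit:
        y = profile(x)
        if y <= half_height:
            frac = (y_prev - half_height) / (y_prev - y)
            return 2.0 * (x_prev + frac * (x - x_prev))
        x_prev, y_prev = x, y
        x += step
    raise ValueError("half height not reached within the scan limit")


K_WIDTH = 1.05      # ev, K-state width assumed in the text
FINAL_WIDTH = 1.2   # ev, one of the final-state widths used in the resolution

observed = full_width_half_max(lambda x: folded_profile(x, K_WIDTH, FINAL_WIDTH))
print(f"folded width = {observed:.2f} ev; sum of widths = {K_WIDTH + FINAL_WIDTH:.2f} ev")
```

For Lorentzian shapes the folded width equals the sum of the component widths, which is why removing the K-state contribution narrows the contour by about 1 ev. For Gaussian or intermediate shapes the widths do not add linearly.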

36 In each case, the resolution was made on the assumptions
(a) that the full widths at half-maximum of the M_III and M_II
states are 1.2 and 2.4 ev, respectively, and (b) that the relative
“intensity,” or state-density, at the maximum ordinate is 4 to 1.
The two M widths are estimated from the radiative and Auger
transition probabilities, and from a sort of “best fit” criterion in
effecting the actual resolution. The relative density follows from
the ratio of quantum weights, viz., 2 to 1, and from the ratio of
widths.
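The 4-to-1 ratio of maximum ordinates can be checked with a short calculation. The sketch below assumes that the component of quantum weight 2 is the one with the 1.2-ev width and that the integrated area of each component is proportional to its quantum weight; both are assumptions made for the illustration, as are the function names. The peak height of a profile of fixed area varies inversely with its width, for Lorentzian and Gaussian shapes alike.

```python
import math


def lorentzian_peak(area, fwhm):
    """Peak height of a Lorentzian of given area and full width at half-maximum."""
    return 2.0 * area / (math.pi * fwhm)


def gaussian_peak(area, fwhm):
    """Peak height of a Gaussian of given area and full width at half-maximum."""
    return 2.0 * area * math.sqrt(math.log(2.0) / math.pi) / fwhm


# Assumed pairing: weight 2 with the 1.2-ev width, weight 1 with the 2.4-ev width.
WEIGHT_NARROW, WIDTH_NARROW = 2.0, 1.2
WEIGHT_BROAD, WIDTH_BROAD = 1.0, 2.4

ratio_lorentzian = (lorentzian_peak(WEIGHT_NARROW, WIDTH_NARROW)
                    / lorentzian_peak(WEIGHT_BROAD, WIDTH_BROAD))
ratio_gaussian = (gaussian_peak(WEIGHT_NARROW, WIDTH_NARROW)
                  / gaussian_peak(WEIGHT_BROAD, WIDTH_BROAD))

print(f"Lorentzian peak ratio: {ratio_lorentzian:.2f}")
print(f"Gaussian peak ratio:   {ratio_gaussian:.2f}")
```

Both ratios reduce to (2/1.2)/(1/2.4) = 4, independent of the assumed line shape, which is consistent with the same 4-to-1 ordinate ratio being used for the Lorentzian and the Gaussian resolutions.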

37 According to this transition, which Sawada³⁸ interprets as
(K → L_I) + (L_III → M_IV,V), this satellite has no obvious
“parent” line. Nevertheless, it is customarily called a low-energy
satellite of Kβ1,3.

38 M. Sawada, Sci. Mem. Kyoto Imper. Univ. A15, 43 (1932).

39 L. G. Parratt, Phys. Rev. 50, 1 (1936).

40 H. P. Hanson and J. Herrera, Phys. Rev. 105, 1483 (1957).

41 F. Bloch, Phys. Rev. 48, 187 (1935).
explanation is that splitting of an x-ray state may
occur as a result of interaction (angular momentum
coupling) between the inner vacancy and an incomplete
outer shell of electrons, e.g., the M_IV or M_V shell for
the transition elements scandium through nickel.⁴²
In general, this type of splitting is expected to be very 
small and to contribute at most only to the width and 
asymmetry of the level or of an emission line involving 
the corresponding state. 

It seems that most if not all of the broad low-energy 
component in Fig. 6 arises from valence-electron- 
configuration excitation states. Preliminary calculations 
have shown⁴³ that, according to this interpretation, the
energy intervals and the relative intensity may be of 
the correct order of magnitude. It is suggested that the 
observed relative smoothness of this component, i.e., 
the absence of much obvious structure, is due to the 
profusion of very closely spaced excitation states and 
to the continuum of energies of Auger electrons in a 
metal. 

A similar low-energy satellite, Kβ″, is observed
accompanying the Kβ2,5 lines for elements near
manganese.³⁸,⁴⁴,⁴⁵ Both the Kβ′ and the Kβ″ satellites seem
to depend somewhat alike on the chemical bonds of the
emitting material. It is quite possible that most or all
of this Kβ″ component should also be assigned to the
role of excitation states of the valence-electron-
configuration type. Later, it is proposed that most if not
all of the asymmetry of the Kα1,2 lines for these
materials is also due to this type of excitation state.

It may be pointed out in our general discussion of 
excitation states that, in principle at least, another type 
of excitation state may arise due to the capture of a 
stray electron that may be just passing through the 
region, or of an electron from a neighboring atom. 
Stray capture is very improbable, but it might be 
considered under certain circumstances, especially for 
a metal. Capture of another atom’s electron, called a 
cross-transition, has been discussed by several investi- 
gators, especially for insulators, but its relative proba- 
bility has not been well evaluated. Finally, further 
comment should be made on the state in which there 
are two or more inner electron vacancies. As stated 
earlier, such a multiple-inner-vacancy state is the initial 
state for satellite emission and is of interest in absorp- 
tion spectra. According to our terminology, these states 
are not excitation states unless one or more of the 
FIG. 7. Part of an x-ray energy level diagram (qualitative) for
an insulator. Only a few (of many) discrete states are shown for 
each type of inner vacancy; each K state, for example, corresponds 
to a particular electron configuration of all the electrons in the 
system. All the many discrete levels in a given group, e.g., the 
K group, are involved in absorption spectra, but only one or a 
few are ordinarily involved significantly in emission spectra 
(see text). A different continuum of states is shown for each 
type of inner vacancy. 

The Ritz-combination principle, according to which the energy 
of an emission line is the simple difference between two absorption- 
edge energies, has only qualitative validity; it is not generally 
valid in detail. 


vacancies gives rise to an abnormal configuration of the 
valence electrons. 

Three other phenomena could also be mentioned in 
this section, all of them of interest in absorption spectra 
only. These are the interactions of the ejected electron 
with the other atoms and electrons in a polyatomic 
system. These interactions—plasma oscillations, and 
scattering or diffraction—are better discussed later as 
factors in the transition probabilities, rather than 
treated as giving rise to separate states. 


X-Ray Energy Level Diagram 


Figure 7 presents a part of a qualitative energy level 
diagram which, it is believed, has provisions for the 
necessary and sufficient complexity to interpret ordinary 
x-ray spectra. It is designed to avoid the more obvious 
difficulties encountered with the conventional one- 
electron-jump type of diagram which is shown in Fig. 1. 
Figure 7 is drawn for a solid, specifically for an insulator ; 
a similar diagram may be drawn for a metal—the zero 
energy would appear at or near the bottom of the V 
(valence) continuum; i.e., near the top of the V, band 
as this band appears in the figure. As compared with 
Fig. 1, the level arrangement has been inverted; the 
K group of levels, having the greatest energy, and the 
energy being positive, appears at the top. 
Since the range of spectral energies of primary
interest in solid-state work is about the same for all
x-ray spectral regions, the diagram is drawn on a linear
energy scale. For purposes of illustration, scale sections
[...] K, L_III, and V (valence) groups of levels [...]
possible discrete levels in each case (except the V case,
see later) are shown. A logarithmic scale would again
be convenient if the relative energies of many groups
of levels of a solid containing an atom of high atomic
number were to be indicated coherently in the diagram,
but, then, to show the multiplicity of discrete levels
due to the splitting, a magnified insert of each group
of high-energy levels (e.g., the K group or the L_III
group) would be required.

The actual separations of the peak positions, the 
degree of overlap of adjacent states, as well as the 
relative prominence of the individual members of a 
split group, are generally not yet known. The diagram 
of Fig. 7 must be recognized as being merely qualitative 
in these regards. And a sensible terminology for the 
various levels in each group must await further knowl- 
edge. 

Both emission and absorption transitions are indi-
cated. The energy of the system in emission does not
include the energy of the ejected electron if it has 
escaped completely from the system, i.e., escaped from 
the local region of interest in the solid. But if it has 
not escaped, it and its energy must be included as part 
of the system. Although, in principle, all of the discrete 
states and a few of the continuum states are potentially 
significant as initial states in emission, it turns out that, 
in ordinary practice with a conventional x-ray tube, 
the probability is generally high for a transition from 
only one initial state of a split group (i.e., from only 
one K state or from only one L_III state, etc.). However,
this high probability for the involvement of only one
state in each of the two complex groups of states in
emission is not correct when the state in question is of
low energy, i.e., of energy less than perhaps 20 ev. This 
point arises again in reference 70. 

Let the K level marked “K,” refer to the electron 
configuration in which, conventionally, the K electron 
has been removed to ‘‘infinity” in the solid and left 
there with zero kinetic energy. (The condition of zero 
kinetic energy is of interest in absorption transitions 
only.) It is suggested that this is the significant K level 
in the ordinary K-emission spectra, whatever is the 
configuration of electrons appropriate to this level. 
(“Ordinary K emission spectra” refers to spectra from 
K states of single inner-vacancy produced by high-
energy cathode electron bombardment as in a conven-
tional x-ray tube.) In this sense, we may call K, the
“unexcited K level.”⁴⁶ All of the other discrete K levels
refer to K states in which a different electron configu- 
ration obtains. This different configuration, as already 
pointed out, may be one in which the ejected K electron 
is still bound, albeit lightly, to the positive hole it left 
behind, and/or in which a different number of outer 


46 The terminology here becomes a trifle subtle: This particular 
K state, K,, is an ionization state by definition; all other discrete 
K states are excitation states, some being excited (but not 
ionized) on each of two or more counts and some being excited [...]

electrons failed to jump down closer to the atomic
nucleus as the electric field changed upon the production 
of, or subsequent to the production of, the K-electron 
vacancy.⁴⁷ The continuum of levels indicated by the
diagonal lines above K, in Fig. 7 represents the theo-
retical normal density-of-states in the so-called unoccu-
pied continuum, but also in this region are some
unmarked discrete levels.⁴⁸ In the case of a solid,
“normal” refers to the theoretical model in which there 
is no perturbation due to the inner vacancy; any 
perturbation effects are presumed to be included in the 
discrete levels or bands. The theoretical normal con- 
tinuum has maxima and minima in it, and these are 
supposed to correspond to “Kronig structure” in x-ray 
absorption spectra, as discussed later. 

One measurement of the difference between the 
energy of K, and about the lowest of the bound- 
ejected-electron excitation states is afforded by a 
comparison of the energy required to produce the K 
state when (1) a large excess energy is available in the 
process and (2) when no excess energy is available. In 
the case of metallic copper, the former energy is 
8985.5 ev* and the latter is 8980.7 ev'. 
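The interval implied by these two copper values can be written out explicitly. This is simple arithmetic on the numbers just quoted; the symbol ΔE is introduced here only for illustration.

```latex
\Delta E = 8985.5\ \text{ev} - 8980.7\ \text{ev} = 4.8\ \text{ev}
```

On this reading, the lowest bound-ejected-electron excitation state in metallic copper lies roughly 4.8 ev below the unexcited K level.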

Again, what has been said about K states also applies 
in general to states in which the vacancy is in some other 
shell or orbital, including those states in which the 
vacancy is in one of the valence orbitals. But the 
valence states are significantly different. 

By definition, the ordinary x-ray energy level diagram 
gives the energy of the system when one and only one 
electron has been lost from the shell or orbital desig- 
nated in the level terminology. This means, in a strict 
sense, that the normal unperturbed valence band of the 
solid should not be included in the diagram. But the 
assumption is often made that a vacancy in a normal 
valence orbital does not significantly perturb the perti- 
nent region of the solid, and hence that it does not 
significantly perturb the energy or shape of the normal 
valence band. So, with this assumption, we include the 
normal valence band in the diagram. This is the band 
labeled “V,” for which the ejected electron is at 
“infinity” with zero kinetic energy. 

By identifying V, with the normal valence band, we 
have pre-empted excitation states, or bands of states, 
of the valence-electron-configuration type. Unless the 
perturbation, which is assumed in the previous para- 
graph to be negligible, is not negligible (or does not 
remain negligible as the number of valence vacancies 
increases slightly), all excitation bands are degenerate 
and are the same as V,. The other V bands indicated 


47 For example, the proposed states involved in Mn Kβ′ emission
are among those symbolically represented as ρ and σ states, i.e.,
K_ρ,σ → (M_II,III)_ρ,σ, although these states are not intended to be
interpreted so specifically in this generalized diagram.

48 In a more general sense, when our concepts include the
interactions among all the electrons and nuclei in the solid, no
distinction in kind need be made between the continuum states
and the discrete states either below or in the continuum.
in Fig. 7—V., V,—are representative of the various
kinds of bound-ejected-electron excitation bands. 

The width and shape of the valence band refer to the 
density of states regardless of admixed wave-function 
symmetries and regardless of the probability of position 
of the positive hole in the band. But, in particular, it 
may be that in some cases in absorption the position 
probability is sharply peaked near the top of the band, 
whereas in emission the vacancy may be more likely 
near the bottom.²⁹,³⁰,³¹ Even if it were preferentially at
the bottom initially, e.g., after the jump of a valence 
excitation electron (from an excitation orbital inside 
the normal valence orbitals) to an inner vacancy, the 
hole may tend to “move up” in the band as a conse- 
quence of a second valence electron jump, in this 
instance within the band. Such a double-jump radiative 
transition is perhaps not very probable. However, it 
is conceivable that the effective position of the final 
valence hole may depend upon the particular solid and 
in some instances may even be time dependent. If 
such differences in the position probability are real, 
they are not properly a feature of the diagram per se, 
but rather are part of the transition probability 
correction (h) as listed earlier. 

Another time-dependency feature in the relative 
transition probabilities has to do with radiationless 
(Auger) transitions or with the interaction with lattice 
vibrations, i.e., with phonons. If an inner vacancy state 
lives long enough for significant Auger or phonon 
interaction to take place, readjustment of the configu- 
ration of the outer electrons may occur in such a way 
as (a) to quench or partially quench some of the 
excitation states in favor of others or of the unexcited 
ionization state, and (b) to alter the wave-function 
symmetry of an outer bound electron and thus to 
introduce somewhat different excitation states. These 
changes in relative transition probabilities are in addi- 
tion to the Auger broadening of states owing to reduc- 
tion of lifetime, and also to broadening due directly to 
lattice vibrations as discussed later. 

Of course, if the readjustment of the outer electrons 
is not part of the radiative transition, the angular 
momentum of the system must be conserved in the 
radiative transition itself. The angular momentum, 
which in absorption includes the angular momentum of
any ejected electron no matter where it is, must be

It is common practice in ultraviolet absorption spectroscopy
with insulators to depict the first valence excitation band (called
an “exciton”) as a narrow level instead of as a band. This practice
presumes that the position probability of the positive hole in
the band is sharply peaked, usually taken as at the top of the
band because of the calculated theoretical shapes of the
energy [...] (one-electron model calculations) [...]
an application of the principle [...]
principle of conservation of the [...]

However, Landsberg calculated that the [...]
the valence positive hole by an Auger or rad[...]
certain metals might take place in less t[...]
short time. If the desire for the hole to m[...]
double-jump radiative transition may not [...]

FIG. 8. Extended K-absorption structure (uncorrected)
for metallic copper.


reckoned with respect to a reference axis passing through 
the local atom. 

In the case of atoms in an electrically polarized en- 
vironment, e.g., in a complex ion in a solution or in a 
solid, many or all of the orbitals of the bound-ejected- 
electron may be significantly compressed in radii and 
constrained within a region bounded by a strong local- 
ized potential barrier (perhaps by an electrical double- 
layer). For such atoms, the close-in structure in the
absorption spectra is likewise appreciably altered and
made much less sensitive to the more remote atomic 
environment. 

Finally, it must be recognized that the diagram of 
Fig. 7 refers to ordinary x-ray states, and that different 
but similar diagrams must be drawn for x-ray states 
involving two (or more) inner electron vacancies. Of 
course, both emission (satellite) and absorption transi- 
tions may be indicated on such diagrams for the 
multiple-inner-vacancies. 


Absorption Spectra: Extended Structure


To recapitulate, the important x-ray states in absorp- 
tion spectra are (a) bound-ejected-electron excitation 
states, (b) valence-electron-configuration excitation 
states, (c) multiple-inner-vacancy states, and (d) the 
continuum states, including the effects of plasma oscil- 
lations and of scattering or diffraction of the ejected 
electron in the x-ray system. But, generally, the 
unexcited ionization state, viz., K, or L,, etc., is not
significantly involved in absorption spectra close (within
say ±20 ev) to an absorption discontinuity, notwith-
standing widespread elementary notions to the contrary.
For the final state in absorption, the geometrical extent
of the x-ray system always includes the ejected electron
and is therefore usually much larger than is the extent
for a state in emission. All the high-up continuum
levels are in the diagram for absorption interests only.
For emission all these continuum levels become simply
K, or L,, etc., since for them the ejected electron is
reckoned simply as having escaped from the system;
the energy with which it escaped is of no interest in [...]

For a monatomic gaseous absorber (e.g., see Fig. 2)
all K-excitation states of the bound-ejected-electron 
type appear as levels below the K-ionization level when 
these levels are presented in a diagram of the type of 
Fig. 7. In a solid, the K, level is proposed as the closest 
equivalent to the atomic K ionization level. In Fig. 7, 
one excitation level is drawn above K, (or L,, etc.); 
this is partly arbitrary and partly due to the fact that, 
in a solid, K, does not have the simple ‘‘series limit” 
meaning that it has in a monatomic gaseous atom.⁵¹

In absorption, the relative prominence of the different 
types of excitation states, compared with the continuum 
of states, depends upon the particular absorbing ma- 
terial and the particular inner vacancy. For example, 
for crystalline potassium chloride, discussed later, the 
excitation states dominate in the region near the 
“main” K-absorption edge, whereas for metals the 
excitation states are less prominent. Extended absorp- 
tion structure, due to scattering or diffraction of the 
ejected electron, is observed for all polyatomic systems,⁵²
and, for solids, it seems to be more prominent for 
metals than for insulators. 

Figure 8 illustrates the extended structure typical of 
crystalline metals. Numerous irregular undulations are 
observed on the high-energy side of the apparent edge. 
Although very important in the first 20 ev, the excita- 
tion states cannot be responsible for much of the struc- 
ture 50 or more ev beyond this region. The observed 
irregularities beyond about 20 ev are commonly referred 
to as Kronig structure in honor of the first investigator 
to propose their interpretation (in 1931).?° According to 
Kronig, zero absorption is predicted if the ejected 
electron has such a combination of speed and direction 
in the crystal lattice that it suffers Bragg reflection⁵³
from the lattice planes; and a maximum in absorption 
is predicted if the ejected electron does not have such 
a combination and does not undergo Bragg reflection. 
The predicted abrupt changes in absorption become 
smoothed into mere undulations when various directions 
in the crystal lattice are simultaneously taken into 
account. In the band theory of solids, as it developed 
subsequent to about 1928, a “forbidden” region of 
energy corresponds to just those values of vector 
momenta of the electron that allow Bragg reflection 
(the boundary of a Brillouin zone). An allowed energy 


51 There is a strong tendency in the literature for the assumption 
that a similar or identical series-limit meaning holds in all kinds 
of absorption spectra. See reference 1 and G. H. Wannier, Phys. 
Rev. 52, 191 (1937). Also, Vainshtein, Barinskii, and Narbutt, 
Zhur. Eksptl. i Teoret. Fiz. 23, 593 (1952). For a solid, a series- 
limit for bound ejected-electron excitation states no doubt exists 
but it is not, in general, coincident with or very near to the 
bottom of the conventional solid-state unoccupied continuum, 
nor is it the same limit for different inner vacancies or for different 
atom species in the solid. This lack of coincidence is due mostly 
to the respective wave-function extents and symmetries, but 
partly to the time factors. 

52 H. Peterson, Z. Physik 80, 258 (1933).

53 The general scattering or diffraction of the ejected electron
becomes Bragg reflection in a crystal if the electron is presumed

band is that for which the electron can travel freely
through the lattice without reflection. Thus, after the 
spectra have been corrected for the various effects (a) 
through (h) listed earlier, and if the entire intensity 
pattern is considered, instead of merely the positions 
of the maxima and minima, the observed Kronig 
structure is just the x-ray absorption view of the 
theoretical normal density of states in the overlapping 
unoccupied bands of the solid. 
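
The Bragg condition underlying Kronig's picture can be put in numbers with a short sketch. It assumes a free electron (no inner potential, normal effective mass) travelling normal to a set of lattice planes, so that reflection of order n occurs when n wavelengths equal twice the plane spacing. The function name and the 2.0 Å spacing used in the example are illustrative assumptions, not values taken from the text.

```python
# Physical constants (SI).
H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron rest mass, kg
EV = 1.602176634e-19     # joules per electron volt


def bragg_energy_ev(d_angstrom, order=1, sin_theta=1.0):
    """Kinetic energy (ev) at which a free electron satisfies
    n * wavelength = 2 * d * sin(theta).

    d_angstrom: lattice-plane spacing in angstroms (hypothetical input).
    sin_theta:  sine of the Bragg angle; 1.0 means motion normal to the planes.
    """
    wavelength_m = 2.0 * d_angstrom * 1e-10 * sin_theta / order
    # de Broglie relation: E = h^2 / (2 m wavelength^2)
    return H ** 2 / (2.0 * M_E * wavelength_m ** 2) / EV


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(bragg_energy_ev(2.0, order=n), 1))
```

For a 2.0 Å spacing the first three orders fall near 9, 38, and 85 ev, which indicates the energy scale on which undulations of this origin would be spaced.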

In Kronig theory, the inner vacancy is presumed not 
to perturb the normal bands in any way. It is probable 
that the influence of this perturbation is such as to 
complicate seriously any Kronig interpretation of ob- 
served structure within 30 or so ev of the apparent 
edge, and to be a significant factor as far out as 75 or 
100 ev. Also, in Kronig theory, the ejected electron is 
presumed to be a plane wave, i.e., an electron rather 
far removed from the vacancy site. Actually, for an 
x-ray state of short lifetime, the ejected electron does 
not have time enough to travel very far from the 
vacancy site (a 50-ev electron of normal effective mass 
would have time to travel a total distance of only about 
4 Å in $10^{-16}$ sec), and the plane wave presumption is
perhaps not very good except for the very extended
absorption structure.⁵⁴
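
The travel-distance estimate in the parenthesis above can be checked with a few lines. The sketch assumes a nonrelativistic electron of normal (free-electron) mass and a lifetime of 1e-16 sec, of the order implied by level widths of a few ev; the function and parameter names are illustrative only.

```python
import math

M_E = 9.1093837015e-31   # electron rest mass, kg
EV = 1.602176634e-19     # joules per electron volt


def travel_distance_angstrom(energy_ev, lifetime_s, effective_mass_ratio=1.0):
    """Distance (angstroms) covered at constant speed during lifetime_s
    by an electron of kinetic energy energy_ev."""
    mass = effective_mass_ratio * M_E
    speed = math.sqrt(2.0 * energy_ev * EV / mass)   # m/s, nonrelativistic
    return speed * lifetime_s * 1e10


if __name__ == "__main__":
    print(round(travel_distance_angstrom(50.0, 1e-16), 1))
```

A 50-ev electron moves at about 4.2e6 m/sec, so in 1e-16 sec it covers roughly 4 Å, comparable with a single interatomic distance.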

An extension of Kronig’s interpretation has been 
offered by Hayasi.⁵⁵ According to Hayasi, if the Bragg
reflection occurs at a Bragg angle near 90°, the ejected 
electron remains in the vicinity of the inner vacancy as 
in a type of standing wave pattern. An increase in the 
absorption probability results because of the greater 
overlap of the wave functions of the initial and final 
states. Thus, Hayasi predicts an absorption maximum 
for some of the energies at which Kronig predicts a 
minimum, or at least an increased absorption at some 
of Kronig’s minima. In the final state, a “quasi- 
stationary” state as Hayasi calls it, the ejected electron 
is, in a sense, bound to the positive hole it left behind, 
and, in another sense, the electron is in a forbidden 
energy region in the solid. Insofar as the electron is 
bound to the positive hole, the final state is one form 
of what we have called an excitation state. Hayasi, 
like Kronig, ignores the lattice perturbation due to 
the inner vacancy. Moreover, for a short-lived inner- 
vacancy state, the ejected electron does not have time 
to establish itself as a very respectable standing wave.⁵⁶
For comparison with experiment we need more detailed 
and quantitative calculations on the basis of Hayasi’s 
proposal, with inclusion of the vacancy perturbation, 


54 In a recent paper, Shiraiwa, Ishimura, and Sawada [J.
Phys. Soc. Japan 13, 847 (1958)], discuss a modification of Kronig’s
theory by allowing the amplitude of the wave of the ejected 
electron to decrease rapidly with distance from the parent atom. 

55 T. Hayasi, Sci. Repts. Tôhoku Univ. 33, 123, 183 (1949);
34, 185 (1951); 36, 225 (1952). 

56 To the degree that the general position of the Hayasi electron 
is localized, the uncertainty principle argues further against its 
being a plane wave with well-defined momentum. The Hayasi 
reflection at best should be considered as “quasi.” 


the time-dependencies, and all directions of ejection in 
the lattice. 

Efforts to explain the observed details of absorption 
curves on the basis of Kronig’s interpretation have 
resulted in some but not complete success. Hayasi’s 
modifications may possibly help. The excitation states 
will probably account for most of the structure in the 
close-in regions. But it is quite likely that additional 
processes are involved in the complete explanation. It 
has been suggested that a correlation exists between the 
x-ray absorption structure and the discrete loss of 
energy of electrons as they penetrate thin solid foils.⁵⁷
We may distinguish two types of discrete energy loss. 
The first is in the production of multiple inner-vacancy 
states. In any type of absorber, the interaction of the 
electromagnetic field with the system may result in the 
ejection of two (or more) electrons from an atom 
instead of just one electron. One or both ejected 
electrons may come to rest in an orbital or orbitals 
still bound to the two positive holes left behind. Or the
second electron may come from a different atom in the
system. In any of these cases, the secondary electron-
jump absorbs a discrete amount, or at least a minimum
amount, of energy and provides an additional way in 
which the system can absorb the energy of the incident 
x-ray photon. (The minimum amount of energy ab- 
sorbed by the secondary jump is essentially zero only 
for the special case of an electron at the Fermi surface 
in a metal.) When the photon energy and system are 
such that the probability for the secondary jump is 
reasonably high, the absorption should exhibit an 
increase.⁵⁸ There are many potential secondary electrons
in a solid, and the discrete amounts of energy become 
very numerous when more than one secondary electron 
and more than one atom are considered. The absorption 
structure due to the production of multiple-vacancy 
states may indeed show only one or two maxima and 
an accompanying broad absorption tail of unresolved 
components. 

The second process for a discrete loss of energy is by 
the induction of plasma oscillations. The ejected elec- 
tron as it traverses the crystal interacts with the 
loosely bound valence electrons in such a way that the 
electrons oscillate collectively. The energy of the oscil- 
lation is taken from the absorbed photon via the 
kinetic energy of the ejected electron. If more than one 
mode of oscillation is induced, the electron loses more 
than one discrete unit of loss energy, and if more than 
one oscillation of a given mode is induced, the electron 
loses a multiple of a discrete unit. As in the case of 


57 E.g., see Leder, Mendlowitz, and Marton, Phys. Rev. 101, 
1460 (1956); Revs. Modern Phys, 28, 172 (1956); E. J. Sternglass, 
Nature 178, 1387 (1956); D. Pines, Revs. Modern Phys. 28, 184 
(1956); R. H. Ritchie, Phys. Rev. 106, 874 (1957). 

58 Production of multiple-vacancy states in gaseous absorbers 
should be about as probable as in, say, ionic crystalline absorbers. 
Reports indicate very little absorption structure for gases that 
can be attributed to multiple vacancies. For this reason these 
states may not play a very significant role in absorption, but the 
argument is not generally conclusive. 
ejection of a second electron, plasma-oscillation pro-
duction represents an additional way in which the 
system can divert the energy of the x-ray photon. An 
increase in the absorption probability results for those 
particular photon energies properly tuned. 

The energy spectrum of x-ray photo-ejected K elec- 
trons from metallic copper has been analyzed magnetically⁵⁹
and found to consist of a secondary maximum
and a long low-energy tail. It is not yet clear which of 
the two types of inelastic scattering of electrons is 
primarily responsible for the observed structure. 

As implied earlier, these two discrete-loss processes 
also take place when no primary inner electron vacancy 
is produced; in fact, this is normally the situation in the 
experiments of penetration of electrons through thin 
solid foils.⁵⁷ But the probability with which each process
takes place is believed to be increased when the primary 
inner vacancy is made at the same time. For example, 
in producing the multiple-vacancy state, the primary 
electron ejection may, by the abrupt change of the 
Hamiltonian of the system, trigger the secondary 
electron jump. This may be especially so in case the 
secondary electron is a valence electron. (In case the 
second vacancy is in the valence band, the state would 
be, according to our terminology, a valence-electron- 
configuration state.) 

In each of the aforementioned possible factors in the 
interpretation of the extended structure in absorption, 
a reference “zero energy” is involved. It is usually very 
difficult or impractical to disentangle this zero energy 
from the observed structure with the requisite accuracy 
to make definitive correlations between experiment and
theory. To date, only the maxima and minima in the
structure have been considered. It would be much more 
informative if complete absorption patterns were 
correlated. Then we could better see what absorption 
remains to be accounted for by additional processes 
such as discrete loss. 


Widths and Lifetimes of X-Ray States 


With a sufficiently open energy scale, preferably 
linear, the width (and the gross shape) of at least one 
member of each group of levels may be conveniently 
shown on one side of the diagram of Fig. 7. It should 
be mentioned that the level width and shape in the 
case of crystalline solid materials should include the 
effects of lattice vibrations at a particular temperature. 
These effects are discussed in some detail later in 
connection with the interpretations of the spectral
curves for potassium chloride. Typically, lattice vibra-
tions at a temperature of about 300°K contribute a few
tenths of an ev (or less) to the full width at half-
maximum, and contribute a Gaussian factor in the
shape. The numerical values of widths indicated in
Fig. 7 are about right for chlorine in crystalline potas-
sium chloride if the effects of lattice vibrations are
neglected.

59 Sokolowski, Nordling, and Siegbahn, Arkiv Fysik 12, 301
(1957).

The widths of the energy levels corresponding to a 
few inner-vacancy-states, excluding the effects of lattice 
vibrations, are given in Table I. These widths, deduced 
from measurements with the two-crystal type of spec- 
trometer whose resolving power is rather well known, 
are believed to be essentially atomic values, i.e., unper- 
turbed by (or else corrected for) the physical or chemical 
environment. Corrections have been made for the 
instrumental spectral window, for the backgrounds, 
and for the overlapping lines. 

Determination of the lifetime of a state follows from 
the width according to the uncertainty relation $\Delta t\,\Delta E$
$\approx h/2\pi$; a full width at half-maximum of $\Delta E = 1$ ev
corresponds to a lifetime of $\Delta t = 6.6\times10^{-16}$ sec.
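
As a small worked illustration of this width-to-lifetime conversion, the sketch below applies the relation lifetime = (h/2π)/width. The function name is an assumption made for the example; the constant is the reduced Planck constant expressed in ev sec.

```python
HBAR_EV_S = 6.582119569e-16   # h / (2 pi) in ev sec


def lifetime_from_width(width_ev):
    """Lifetime (sec) of a state whose full width at half-maximum is width_ev,
    taken from the uncertainty relation lifetime * width = h / (2 pi)."""
    return HBAR_EV_S / width_ev


if __name__ == "__main__":
    for width in (0.5, 1.0, 7.5):
        print(width, "ev ->", lifetime_from_width(width), "sec")
```

On this reckoning a width of 1 ev gives about 6.6e-16 sec, and a width of 0.5 ev gives about twice that.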

The K width for gaseous argon, 0.5±0.05 ev, ob-
tained from the K-absorption transition K → N_II,III
(or 1s → 4p), is perhaps the most reliable value in the
table, and this serves as a sort of anchor point in the
determinations of some of the other values.⁶⁰ This argon
K width is reasonably consistent with the combination 
of (a) experimental values of the fluorescence yield, and
(b) a calculation for the sum of the theoretical radiative
transition probabilities.⁶¹ The K and L_III widths for
gold were taken from the width of the respective
absorption edge.” This is admittedly not a reliable 
method for obtaining a level width because of the 
unknown role of component structure due to the 
excitation states. However, it happens that both of 
these gold widths are in fairly good agreement with the 
calculated theoretical values.®:® Furthermore, for the 
gold L_III width, the consistency argument of Richt-
myer, Barnes, and Ramberg is reasonably persuasive. 
(This argument considers the widths of many different 
emission lines.) The same type of argument supports 
the silver values. For every element in the table 
other than argon and gold, the state widths were 
obtained from the Kα1 emission line. All emitters were


TABLE I. Full widths at half-maximum of x-ray atomic energy 
levels corresponding to the states indicated (in ev). 
60 The argon absorption curve is being restudied (with higher
resolving power, quartz crystals instead of calcites) to obtain
still more reliable data on the state shape and width, and on the
relative transition probabilities of the series 1s → np, n ≥ 4.

61 D. L. Dexter and W. W. Beeman, Phys. Rev. 81, 456 (1951).

62 F. K. Richtmyer, Revs. Modern Phys. 9, 391 (1937); Richt-
myer, Barnes, and Ramberg, Phys. Rev. 46, 843 (1934).

63 L. Pincherle, Physica 2, 596 (1935).

64 L. G. Parratt, Phys. Rev. 54, 99 (1938).

65 As judged now with benefit of the new correction procedures,”
a small over-correction for the instrumental spectral window 
was reported in reference 64 for most of the silver L series lines. 
The corrections reported in reference 62 for the gold L lines are 
solids® except krypton which was gaseous.⁶⁶ For
titanium, manganese, and copper, the Kα1 line is
markedly asymmetrical and correction for this asym- 
metry is discussed in the next paragraph. In each case, 
the corrected line width was divided between the K 
and L_III states, each state being assumed to be a
singlet and of Lorentzian shape. In making the division, 
two criteria were kept in mind: (1) the radiative 
components of the K widths should be a reasonably 
simple function of atomic number Z, and (2) the ratio 
of the width of the K-radiative component divided by 
the total K width should agree reasonably well for each 
Z with the experimental value of the fluorescence 
yield. The K-radiative widths so obtained are listed 
in Table II. The K fluorescence yields by this method 
are also listed. It turns out that the dependence of the 
radiative width on the atomic number is in good accord
with the theoretical $Z^4$ prediction. (The radiative
component of the L_III width is very small, much less
than the uncertainty in the total L_III width.)

As stated in the foregoing, the Kα1 lines for titanium,
manganese, and copper (and, of course, for some other 
elements not included in the table) are asymmetrical in 
shape. When the widths and asymmetries of these lines 
are each plotted vs atomic number through the range 
Z=16 to 42, it becomes obvious that the apparent 
excess width is closely related to the asymmetry.” 
Illustration of the line shape, after correction for 
everything except asymmetry, is given in Fig. 9. It is 
proposed that the excess width on the low-energy side 
is essentially attributable to transitions between exci- 
tation states of the valence-electron-configuration type, 
just as in the case of the β′ structure in the Mn Kβ1,3
region discussed earlier.⁶⁹ Tentative calculations show


TABLE II. K-radiative widths (in ev) and fluorescence yields.

Atom    16S    18A    22Ti   25Mn   29Cu   32Ge   36Kr   42Mo   47Ag   79Au
K_rad   0.04   0.07   0.2    0.33   0.65   1.0    1.7    3.6    6.0    50
ω_K     0.1    0.14   0.22   0.31   0.43   0.5    0.57   0.72   0.8    0.93
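
Since the fluorescence yield is the ratio of the K-radiative width to the total K width, the two rows of Table II together imply a total K width for each element. The sketch below carries out that division; the dictionary and function names are assumptions made for the example, and the element symbols are those read from the table heading.

```python
# Values as read from Table II (K-radiative width in ev, fluorescence yield).
K_RAD_EV = {"S": 0.04, "A": 0.07, "Ti": 0.2, "Mn": 0.33, "Cu": 0.65,
            "Ge": 1.0, "Kr": 1.7, "Mo": 3.6, "Ag": 6.0, "Au": 50.0}
OMEGA_K = {"S": 0.1, "A": 0.14, "Ti": 0.22, "Mn": 0.31, "Cu": 0.43,
           "Ge": 0.5, "Kr": 0.57, "Mo": 0.72, "Ag": 0.8, "Au": 0.93}


def total_k_width(symbol):
    """Total K width (ev) implied by yield = radiative width / total width."""
    return K_RAD_EV[symbol] / OMEGA_K[symbol]


if __name__ == "__main__":
    for symbol in K_RAD_EV:
        print(symbol, round(total_k_width(symbol), 2))
```

For argon the quotient is 0.5 ev and for silver 7.5 ev, matching the argon K width and the silver K-level width quoted elsewhere in the text.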


66 E. Wilhelmy, Z. Physik 97, 312 (1935).

67 Broyles, Thomas, and Haynes, Phys. Rev. 89, 715 (1953);
E. S. Burhop, The Auger Effect (Cambridge University Press,
London, 1952).

68 The report by W. H. Zinn [Phys. Rev. 46, 659 (1934)] of a
$Z^4$ dependence of the total widths is based on measurements of
the widths of absorption edges that are evidently replete with
excitation structure. These measurements are not reliably inter-
preted in terms of level widths. Typical of this difficulty, Zinn
used the “corrected” K-level width for silver as 25 ev, whereas
its actual value is near 7.5 ev. However, the $Z^4$ prediction is
expected to be fairly good for reliable total widths when the
Auger contribution is small, as for K widths for Z > 45.

69 Some of the asymmetry may be due to the other effects
mentioned in connection with the Kβ1,3 region, viz., unidentified
satellites, Compton and Raman scattering, and interaction
between the inner L_III vacancy and the incomplete M_IV,V
electron shells. However, these effects are here presumed to be of
negligible importance; but whether they are or not is not relevant
to the present argument for obtaining the level width. The
pertinent assumption is that all such effects distort the line
K_I → (L_III)_I on only one side.


FIG. 9. Asymmetric Kα1 “line.” The line K_I → (L_III)_I
is presumed to be symmetrical (see text).


that the energy separations between K_I and the
pertinent K excitation states, between (L_III)_I and the
pertinent L_III-excitation states, and between the
(M_II,III)_I and the pertinent M_II,III-excitation states, are
such that now we observe mere asymmetry instead of
the plateau region observed in the Kβ1,3 region. The
line from the transition K_I → (L_III)_I is presumed to
be symmetrical and to be obtained by subtracting the
mirror image of the high-energy side of the maximum
ordinate of the corrected observed curve from the
low-energy side. It is the width of this symmetrical
component that is used in deducing the K and L_III
level widths.⁷⁰
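
The mirror-image construction just described can be sketched for a line sampled at equal energy steps. This is only an illustration of the procedure under simple assumptions (a single maximum, uniform sampling, no noise); the function name and the sample numbers are hypothetical.

```python
def split_asymmetric_line(intensity):
    """Split a sampled line into a symmetrical component and a low-energy excess.

    intensity: ordinates at equally spaced energies, low energy first.
    The symmetrical component is the high-energy side plus its mirror image
    about the maximum ordinate; the excess is what remains on the low side.
    Points farther from the peak than the shorter side extends are left at zero.
    """
    peak = max(range(len(intensity)), key=intensity.__getitem__)
    reach = min(peak, len(intensity) - 1 - peak)
    symmetric = [0.0] * len(intensity)
    excess = [0.0] * len(intensity)
    symmetric[peak] = intensity[peak]
    for k in range(1, reach + 1):
        high_side = intensity[peak + k]
        symmetric[peak + k] = high_side
        symmetric[peak - k] = high_side          # mirror image
        excess[peak - k] = intensity[peak - k] - high_side
    return symmetric, excess


if __name__ == "__main__":
    observed = [1, 3, 6, 10, 5, 2, 1]            # made-up ordinates
    print(split_asymmetric_line(observed))
```

The width of the returned symmetrical component is then twice the half-width measured on the high-energy side.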

In support of this analysis of the asymmetry are the 
facts that the widths so corrected for asymmetry have 
a fairly smooth dependence on atomic number over a 
range of Z that includes both symmetrical and asym- 
metrical lines, and that the radiative component of the 
K width varies as $Z^4$. In support of the contention that
these widths are essentially atomic, i.e., relatively free 


70 A final caution may be expressed in regard to the numerical
widths of x-ray energy levels. If, as is supposed, the excitation 
states are responsible for asymmetry in an emission line, these 
states may also contribute to the width of the symmetrical 
component of the line. In other words, the widths as listed in 
Tables I and II for single energy levels may be a little too great. 
If the symmetrical part of an apparently single energy level is 
itself the sum of several closely spaced narrower components, 
then, even if each component is of Lorentzian shape, the sym- 
metrical component may be expected to be non-Lorentzian, 
perhaps more nearly Gaussian. Thus, in the question as to whether 
the width in Tables I and II is about right for a single state or 
refers to the width of a composite of excitation states, we find two 
experimental features of interest: (1) the relative magnitude of 
the width, and (2) the shape of the state. As judged from present 
information, the K and L_III state widths as listed seem to refer
in each case to essentially singlet states, but the evidence is not
clear in the cases of M, N, etc., states. E.g., the M_II,III states of
Fig. 6 are wider than the K state for manganese, and the M_II,III
states may well be Gaussian in shape. These M states may be
wide because of (a) a high probability for Auger transitions,
(b) phonon interactions (discussed later), or (c) static interactions
of neighboring atoms (solid-state broadening).
FIG. 10. L_III emission and absorption spectra for metallic
sodium, magnesium, and aluminum. The same energy-interval
scale is used for all the curves; zero is taken as at the estimated
position of the Fermi energy. The curves are taken from various
authors.⁷¹


of solid-state effects, is the fact that the gases argon 
and krypton are included in this atomic number 
dependence without distinction from the solids. 


V. EXPERIMENTAL CURVES AND 
INTERPRETATIONS 


The work in this subject is far too extensive and 
intensive to be summarized adequately in a few pages. 
In the preceding sections, only a few topics were 
selected for discussion; and, in this section, only a few 
additional curves are selected as illustrative of the 
experimental results and interpretations. 

The interpretations are not clean-cut, in general. 
For the most part, the corrections listed earlier have 
not been made to the observed spectral curves; and, 
in particular, the recognition and use of excitation states 
is relatively new. A few satisfactory features can be 
pointed out, but the reader is warned to expect some 
confusion. An interpretational “break-through,” how- 
ever, may not be far away. 


Sodium, Magnesium, and Aluminum 


First, we mention briefly the L_III spectra of metallic
sodium, magnesium, and aluminum. These spectral
curves, reproduced in Fig. 10 from various authors,⁷¹
were obtained with sufficiently high resolving power
that no correction for this effect need be made. Also,
great care was taken (except possibly in the case of
sodium absorption) to translate the observed blackening
of the photographic plates to an x-ray intensity (or
absorption) scale.

71 The emission curves are taken from W. M. Cady and D. H.
Tomboulian, Phys. Rev. 59, 381 (1941); the absorption curves
from H. M. O'Bryan, Phys. Rev. 57, 557, 995 (1940), J. R.

There are several features of interest in these curves, 
but to see them in perspective we first recall the general 
electronic aspect of these solids. The electron configu- 
ration of atomic sodium, atomic number 11, is $1s^2 2s^2 2p^6 3s$
in which the 3s electron is nominally in the solid-state 
valence band, i.e., in the occupied part of the conduction 
band. For magnesium, atomic number 12, we find two 
nominal 3s electrons; and for aluminum, atomic number 
13, we add one more electron, this time in the nominal 
3p band. It is the combination of 3s and 3p bands from 
which the electron comes to fill the inner L_III vacancy
in the L_III emission transition.

The dipole selection rule dictates that only an electron 
having wave-function symmetry of s type or d type, as 
viewed from the L_III vacancy, may fall radiatively into
the L_III vacancy (p-type symmetry). Hence, to the
extent that the nominal 3p band is involved, there must
be admixture of s- and/or d-type symmetries in it.⁷²
The selection rule actually applies to the wave functions 
of the states, but, as pointed out earlier, interpretation 
of this rule in terms of the one-electron model suffices 
if the primary jumping electron is the only electron in 
the entire system whose angular momentum changes. 
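
In the one-electron form used here, the dipole rule is simply that the orbital quantum number of the jumping electron changes by one unit. The sketch below encodes only that statement; the table of symmetry labels and the function names are illustrative assumptions.

```python
# Orbital angular momentum quantum number for each symmetry label.
ORBITAL_L = {"s": 0, "p": 1, "d": 2, "f": 3}


def dipole_allowed(initial, final):
    """True if a one-electron jump between the two symmetries changes l by 1."""
    return abs(ORBITAL_L[initial] - ORBITAL_L[final]) == 1


def symmetries_that_can_fill(vacancy):
    """Symmetry labels from which an electron may fall radiatively into vacancy."""
    return [label for label in ORBITAL_L if dipole_allowed(label, vacancy)]


if __name__ == "__main__":
    print(symmetries_that_can_fill("p"))   # vacancy of p-type symmetry
    print(symmetries_that_can_fill("s"))   # vacancy of s-type symmetry
```

For a vacancy of p-type symmetry only s- and d-type electrons qualify, whereas a vacancy of s-type symmetry (as in K emission) admits only p-type electrons.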

In the electron configuration for sodium, the nominal 
3s band is half occupied; hence, sodium is a good 
conductor. The 3s band in magnesium is fully occupied 
and, hence, since magnesium is a good conductor, the 
3s and 3p bands must overlap. Aluminum is a metal 
because the 3p band is only partially occupied. Mag- 
nesium has a different crystal structure than sodium 
and aluminum, but theoretical calculations suggest that 
the overlap is similar for all three of these metals, the 
principal difference being in the measure of occupancy. 
The so-called reduced width of each emission band 
(i.e., the base width corrected for the “tail” as indicated 
by the respective dashed curve in the figure) is con- 
ventionally taken as a measure of the occupancy of 
the 3s, 3p bands. The energy widths of the 3s, 3p 
bands, as obtained directly from the measured bands, 
are 3.05, 6.4 and 10.6 ev, respectively, and this is 
interpreted as showing the increase of electron occu- 
pancy as we go from sodium to aluminum. It is possible 
that the role of excitation states is not important in 
the width and shape of these curves, but, on the other 
hand, it is conceivable that the valence-electron- 
configuration type of excitation states do influence the 
curves in some manner similar to that discussed earlier 
for the Mn Kβ1,3 region. If the excitation states are
important, then only the relative magnitudes of the 
observed widths may be significant in showing relative 


72 Although for quadrupole radiation, p to p transitions are
allowed, this radiation is very feeble compared with dipole
radiation.

73 It is also important to note that, at least in hydrogenic
theory, the transition probability for p to s is much less than

electron occupancy, not the numerical magnitudes
themselves. 

The narrow spike observed at the high-energy side of 
the emission band for both magnesium and aluminum 
is conventionally explained as a sharp rise in the 
density of 3p states just before the top of the 3s band 
is reached.⁷⁴ The spike is markedly higher for aluminum.
This explanation requires that there be a very large 
admixture of s- and/or d-type electrons in the bottom 
part of the 3p-electronic band. The theoretical density 
of states, according to the one-electron model, with 
complete disregard for the wave-function symmetries, 
i.e., for the transition probabilities, is indeed much like 
the observed emission curves for all three metals with 
the Fermi energy appropriately placed. But the calcu- 
lations have not yet succeeded in giving the degree of 
admixture of wave-function symmetries of the electrons, 
or of indicating the role of the possible excitation states. 
These facts should be taken into account before the 
explanation of the spike or of the general shape of the 
curve is considered to be satisfactory. 

Next we note that the edge of each of the observed 
emission bands at the Fermi energy is abrupt, as 
expected from the degenerate-electron gas theory if the 
inner electron vacancy causes no perturbation. Indeed, 
when the small corrections are made for the spectral 
window of the spectrograph and for the width of the 
initial L_III state, the remaining width of the observed
emission edge has been attributed to the effect of the
thermal motion of the electrons of the emitting
anode.⁷⁵,⁷⁶ Reckoned as between the 5% and 95%
points on the edge, the temperature width as deduced 
from the Fermi distribution formula is 0.06 ev at 110°K, 
and is 0.45 ev at 670°K. The experimental widths at 
each of several temperatures in this range are found to 
be in agreement with these theoretically predicted 
values.7775 However, these three metals are most 
unusual, apparently unique, in that their Lrrr spectra 
exhibit this abrupt emission edge, an edge that corre- 
sponds so nicely to the Fermi surface predicted theo- 
retically for unperturbed metals. This prediction for 


74 For sodium, there is no evidence in the curve of Fig. 10 for 
the 3s, 3p band overlap according to this interpretation, but a 
suggestion of such overlap is seen in Skinner’s curves [H. W. B.
Skinner, Phil. Trans. Roy. Soc. London A239, 95 (1940)]. Also,
this suggestion appears in a recent work by Fisher, Crisp, and 
Williams (private communication). 

75 H. W. B. Skinner, Phil. Trans. Roy. Soc. London A239, 95
(1940).

76 This is one of the very few cases in the long wavelength
region in which correction for either type of resolving power has 
been attempted. 

77 Incidentally, it is a truly remarkable experimental accom- 
plishment to know the temperature of the local regions of the 
anode emitting the x-rays, and also to carry out the necessary 
photographic procedures across such a steep edge to obtain a 
reliable intensity curve. 

This interpretation of the edge width does not take into account
the effects of lattice vibrations, discussed later in connection with
the potassium chloride curves, nor of self-absorption in the
x-ray tube anode.
metals is based on an assumption that the Fermi
surface does not touch a zone boundary.⁷⁸
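
The temperature width of the edge, reckoned between the 5% and 95% points, can be estimated directly from the Fermi distribution function. The sketch below assumes that the edge shape is the bare Fermi function with no instrumental or level-width broadening; the function name and default arguments are illustrative only.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant, ev per degree K


def fermi_edge_width_ev(temperature_k, low=0.05, high=0.95):
    """Energy interval (ev) over which the Fermi function
    f(E) = 1 / (exp((E - E_F)/kT) + 1) falls from `high` to `low`."""
    # Solving f = p for E gives E - E_F = kT * ln((1 - p) / p).
    upper = math.log((1.0 - low) / low)      # energy (in kT) where f = low
    lower = math.log((1.0 - high) / high)    # energy (in kT) where f = high
    return K_B_EV * temperature_k * (upper - lower)


if __name__ == "__main__":
    for t in (110.0, 670.0):
        print(t, round(fermi_edge_width_ev(t), 3))
```

The interval is 2 ln 19, or about 5.9, times kT. At 110°K this gives about 0.056 ev, in line with the quoted 0.06 ev; at 670°K the same expression gives about 0.34 ev, so the quoted 0.45 ev may rest on considerations beyond this bare formula.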

The long tail on the low-energy side of each of these 
emission bands (some 8% or more of the total band 
intensity) is at least partially accounted for, according 
to Skinner” and Landsberg,” by Auger transitions that 
occur as the final-state positive hole “bubbles up” from 
near the bottom of the occupied part of the valence or 
conduction band of electronic “levels.” In this expla- 
nation, the Auger process greatly reduces the lifetime 
of each of the low-lying levels in the band, and the 
width of each level is correspondingly increased; the 
tail is thus simply the lower side of the symmetrical 
spread of each low-lying level. According to this Auger 
explanation, the intensity in the tail appears at the 
expense of intensity in the rest of the band, particularly 
in the lower half. Hence, if a large part of the tail were 
to be thus explained, the corresponding redistribution 
of intensity in the band proper would somewhat 
invalidate the popular comparison of the reduced band 
shape with the free-electron parabolic relation $I/\nu^2$
vs $E^{1/2}$ (or vs $E^{3/2}$ in the case of K-band emission). Other
proposals to account for this tail, or part of it, have 
also been made.⁷⁹

In Seitz’s proposal⁷⁹ the tail may be due to a series of
occupied excitation orbitals just below the valence 
band. Accordingly, a photon of slightly lower energy 
is emitted if the primary-jumping electron comes from 
an occupied excitation orbital instead of from the 
normal valence band. In these two possible types of 
transitions, the initial states of the system are presum- 
ably the same, and the final state in each case is one in 
which the positive hole is in the valence band. Hence, 
the proposal must be essentially that the energy position 


78 It is often said that lithium, sodium, and potassium are our 
most nearly ideal metals, ideal in the sense of meeting some of 
the conditions assumed in a theoretical treatment. Lithium K 
emission does not show an abrupt edge [D. E. Bedo and D. H. 
Tomboulian, Phys. Rev. 109, 35 (1958)], and this may possibly 
be accounted for, according to Dr. V. Heine (private communi- 
cation), if the Fermi surface touches a zone boundary. Potassium 
K emission, although recorded [E. L. Jossem and L. G. Parratt, 
Phys. Rev. 98, 1151 (1955)], awaits corrections (especially for 
the resolving power due to the K state, for self-absorption and 
for the background due to the Kβ1 line) before the answer can be 
definite. There is a suggestion of the type of distortion observed 
with lithium. Lack of abruptness in these cases may possibly be 
attributed, in addition to the distortion due to the Fermi surface 
touching the zone boundary, to the role of excitation states or to 
the nominal s and d types of symmetry in the conduction band. 
Since the final state in K emission must have p-type symmetry, 
the experimental curves tell us the actual (admixed) p-type 
distribution only. 

Dr. S. E. Williams, University of Western Australia, reports 
(private communication) a new observation that the L_III emission 
of potassium metal may also be abrupt, having a width of “0.22 
ev including the instrumental width at 1A resolution.” He 
estimates that his potassium target was at a temperature near 
its melting point. 

R. H. Kingston [Phys. Rev. 84, 944 (1951)] reports the 
M-emission edge of potassium metal to be rather sharp also. 

79 F. Seitz, Modern Theory of Solids (McGraw-Hill Book 
Company, Inc., New York, 1940), pp. 436-439; Cady and 
Tomboulian™; S. Raimes, Phil. Mag. 5, 727 (1954); and D. 
Pines, Solid State Physics (Academic Press, Inc., New York, 
1955), Vol. 1, pp. 367-450. 

of the positive hole is lower in the band when the 
jump occurs from one of the excitation orbitals than it 
is when the jump occurs from the normal band.™ This 
sort of argument is also discussed (with an indefinite 
conclusion) in the case of K emission of potassium 
chloride by Parratt and Jossem* where the excitation 
spectra are presumed to be more intense than the 
normal band spectra. [As to the expected relative 
intensity of these two types of spectra, note that the 
transition probability (independent of relative quantum 
weights) in each case is proportional to the degree of 
overlap of the wave functions of the initial and final 
states, and that the overlap is greater for the excitation 
spectra simply because the excitation orbital is closer 
(more tightly bound) to the nucleus. ] 
If Seitz’s proposal were interpreted as implying 
different initial and final states for the two different 
radiative transitions, as different valence-electron- 
configuration excitation states, the explanation of the 
tail would be essentially the same as is discussed earlier 
for the Kβ′ “tail” in the MnKβ1,3 region of Fig. 6, and 
suggested also for the Kβ″ satellite accompanying the 
Kβ2,5 lines and for the asymmetry of the Kα1,2 lines. 
Indeed, with this interpretation, the significance of the 
“reduced width” scheme by which the tail is separated 
from the main emission band, becomes artificial, and 
we are really talking about the shape of the emission 
as a whole. 
It is conceivable that, if the Skinner-Landsberg 
explanation of the low-energy tail is even qualitatively 
correct, the symmetrical Auger-broadening of the states 
in the band must be sufficient also to produce at least 
some of the small but definite tail observed on the 
high-energy side of the band, even though the states in 
the uppermost part of the band are themselves negli- 
gibly broadened. However, Skinner has proposed 
another explanation,’® viz., that this high-energy tail is 
due to the “satellite” band L_III M → MM, i.e., to the 
transition between an initial state containing both an 
L_III and an M vacancy and a final state containing 
two M vacancies. According to the usual satellite 
argument for states of double (or multiple) inner 
vacancy, the unscreened or residual electrical field, 
which is present as a consequence of the extra inner 
vacancy, produces a little extra energy in both the 


80 The discussion given by Seitz” is in terms of the one-electron- 
jump type of “energy level” diagram. He discusses further 
whether or not the normal valence band “broadens” as a conse- 
quence of the inner electron vacancy enough to “absorb” the 
excitation levels and thereby to obliterate the excitation spectra. 
This is presumably not a question of different energies of the 
initial state; the initial state is presumed to be of the same energy 
regardless of whether an excitation electron or a normal valence 

initial and final states in such respective amounts that 
the emitted satellite photon has a slightly greater 
energy. Note, however, that the extra M vacancy in 
the case of any of the present three metals is a vacancy 
in the valence or occupied part of the 3s, 3p electronic 
bands, and we have presumed above that if the vacancy 
is in the normal valence band it does not alter the 
energy of this band. Such a vacancy in the normal 
valence band also does not alter the energy of the 
initial L_III-vacancy state either. If these presumptions 
are valid, the L_III M → MM band is negligibly shifted 
and is indistinguishable in shape from the parent 
L_III → M band. 

But, if the extra vacancy in the initial state is in an 
M-excitation orbital, then, according to our termi- 
nology, the state is called an excitation state of the 
valence-electron-configuration type, and the energy of 
the state and of the emitted photon is shifted. By the 
usual satellite argument, the satellite occurs on the 
high-energy side of the parent line or band, whereas 
we have discussed the excitation spectra of this type 
as shifted to the low-energy side. In the latter case, 
we have presumed that the normal or unexcited (Lrrr)r 
and V, states of ionization in the energy level diagram 
are those for which there exists one (or more) vacancies 
in the valence excitation orbitals. When less than the 
normal number of vacancies is found here, the state is 
an excitation state, and photon emission therefrom is 
of lower energy than is the normal line or band. In 
other words, in terms of the earlier discussion of the 
manganese K-emission spectra, if the definition of a 
satellite state were to be extended to include a valence 
orbital vacancy as the second vacancy (instead of 
another inner vacancy), the MnKβ1,3 lines would be 
satellites of the Kβ′ parent line or lines, and most of 
the Kα1 line would be a satellite of the low-energy 
“asymmetrical” component. Or, in terms of the present 
L_III spectra, if the excitation orbitals play a role 
similar to that proposed for manganese, the observed 
high-energy spike with its abrupt edge may be a 
satellite of the low-energy part of the emission. It 
seems better to retain the definition of a satellite state 
as given earlier. 

The relative intensity of the L_III M → MM band, as 
proposed by Skinner without excitation orbitals and 
regardless of its energy position, would be seriously 
reduced (as suggested by Skinner) by the rapid transfer 
of the M vacancy in the initial state from the atom in 
question to a neighbor and then out of the x-ray system. 

The time for such a transfer depends upon the position 
of the M vacancy in the valence band, being probably 
much less than the L_III lifetime if the vacancy is at or 
very near the top of the band. One Auger transition, 
L_II → L_III M, always places the extra M vacancy 
very near the top; the L_II-L_III energy difference is 
very small, e.g., it is 0.3 ev in the case of magnesium,’ 
and therefore the M vacancy thus produced occurs 
within 0.3 ev of the Fermi surface. But there are other 
Auger transitions that more frequently place it lower 
in the band. 

Instead of the extra vacancy being in the M band, 
it may in some cases be in the L_II or L_III shell, as 
produced by the Auger transition K → LL. In this 
event, the satellites L_II L → ML or L_III L → ML are 
shifted well beyond the Fermi energy and are observed’ 
completely separated from the parent band. In shape, 
these satellites are a small replica of the parent 
L_II,III → M. 

The probability for Auger decay of the L_I state is 
especially great, and no L_I-emission band for any 
of these metals has ever been reported. Also, the 
Auger transitions L_II → L_III M and L_II → MM, which 
“rob” intensity from the L_II-emission band, cause the 
L_II band to have only about 1/20 of the intensity of the 
L_III emission for Mg, whereas, on the basis of quantum 
weights, we would expect an intensity ratio of 1/2. 

As mentioned in the foregoing, some fraction of the 
intensity thus “robbed” from the L_II emission, and 
likewise some of the intensity “robbed” from the K 
and L_I emission, reappears as part of the L_III(M)^n → 
M(M)^n band where n ≥ 1; in fact, most of the apparent 
L_III → M band is of emission for which one or more 
extra M vacancies of the normal valence type are 
present. There is no point in calling most of this 
emission a satellite or satellites. 

Now we turn to the absorption spectra of these three 
metals. The first point to be made is that the high- 
energy edge in emission and the low-energy edge in 
absorption agree in position (at the Fermi energy) as 
may be theoretically expected of a good metal in the 
absence of excitation structure. Such agreement is 
sometimes found for those good metals for which the 
proper edges can be unambiguously identified, but the 
proper identification is usually very difficult or im- 
practical. In fact, because of the prospective role of 
excitation states, the agreement in the present uncor- 
rected experimental curves is perhaps better than we 
should expect. 

Concerning this agreement vs disagreement of the 
emission and absorption edges at the Fermi energy, 
two effects should be noted in addition to the role of 
excitation states. First, if the lifetime of the initial 
state in emission is rather long, significant phonon- 
interactions may take place and shift the energy of this 
state before the emission transition occurs. The energy 
shift is expected to be such as to increase the high- 
energy side of the emission and, hence, produce dis- 
agreement in the positions of the respective edges. 
Second, if the lifetime of the initial state allows relaxa- 
tion of the crystal lattice (to relieve the stress set up 
by the inner vacancy), the energy of both the initial 
and final states are thereby changed, usually, but not 
always, reduced. The energy of the inner state is 


81 The dependence of state energy on the lattice spacing is 
mentioned later in connection with temperature effects in po- 
tassium chloride. 

expected to be changed more than that of the outer 
state, and, hence, such relaxation would also produce 
disagreement in the respective edges. The lifetime of 
the Mg L_III state is probably greater than 10 sec 
(Skinner reports’® the width at 5% maximum to be 
0.02 ev), and this is approaching the time required for 
significant phonon interactions. The lattice relaxation 
time is a little longer, about 10~-” or 10-" sec. These 
phonon and relaxation effects are not expected to be 
important when the initial state lifetime is less than 
about 107! sec.@ 
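The connection between the width of a state and its lifetime that underlies this paragraph is the uncertainty relation, width × lifetime ≈ ħ. As a rough sketch only, the conversion can be written as below. It treats the quoted width as if it were the full natural width of the state, which is an assumption, since the figure attributed to Skinner refers to the width at 5% of maximum; the result is therefore an order-of-magnitude guide. The function names are hypothetical.

```python
HBAR_EV_S = 6.582e-16  # reduced Planck constant in ev*sec


def lifetime_from_width(width_ev):
    """Lifetime (sec) of a state whose natural width is width_ev (ev)."""
    return HBAR_EV_S / width_ev


def width_from_lifetime(lifetime_s):
    """Natural width (ev) of a state whose lifetime is lifetime_s (sec)."""
    return HBAR_EV_S / lifetime_s


if __name__ == "__main__":
    # A width of 0.02 ev corresponds to a lifetime of about 3.3e-14 sec.
    print(lifetime_from_width(0.02))
```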

There seems to be no obvious explanation of the 
observed shapes of these three absorption curves beyond 
the edges. From the conventional interpretation of the 
emission spectra in terms of the overlapping 3s, 3p 
bands, we would expect sodium to show strong absorp- 
tion corresponding to the high-energy emission for 
magnesium and aluminum. Perhaps it does, but the 
evidence suggests the presence of additional absorption 
structure in the range 0 to 5 ev. The magnesium 
absorption spike, at less than 1 ev from the Fermi 
energy, seems to be narrower than would be expected 
on the argument of the aluminum and the 3 band, 
and the aluminum absorption shows no clear maximum 
in the range 0 to 5 ev. 

The fact of two different absorption curves in the case 
of aluminum raises further questions. One of these 
curves was recorded with the aluminum condensed by 
evaporation technique onto a Zapon substrate; in the 
other case the aluminum foil was self-supporting. The 
vacuum-deposited specimen shows a relatively strong 
broad absorption band beyond about 5 ev above the 
Fermi energy. There is reason to believe” that this 
strong absorption occurs in a metal-substrate interface 
region roughly 500 A thick. It is possible that some 
metal atoms were more or less deeply embedded in the 
material of the substrate, and that some outer electrons 
were stripped off the metal atoms in the embedding 
process. Thus, the neighborhood environment of the 
boundary aluminum atoms or ions may be quite 
different than in bulk aluminum metal. There is 
possibly established, therefore, a wide variety of 
excitation and ionization states that combine in effect 
to give the strong but very broad absorption. This 
type of strong absorption is observed also for celluloid 
and polystyrene substrates, and indeed for other 
metals than aluminum. 

In general, the interpretation of these spectra, either 
emission or absorption, is not as yet complete. In this 
section, we make no effort to present an exhaustive 
discussion nor a final interpretation. Our purpose is to 

82 Kostarev [A. I. Kostarev, Zhur. Eksptl. i Teoret. Fiz. 
628 (1952); or see M. A. Blokhin, Bull. Acad. Sci. USSR, Phys. 
Ser. 20, 701 (1956), English translation] proposed that because of 
lattice relaxation, every valence emission curve should be shifted 
to higher energies in order better to match the edge. 
Kostarev, however, does not discuss the fact that many 
states have a lifetime much less than the relaxation time. For 
these short-lived states, his arbitrary shift would be out of 

Fig. 11. K, L_III, and M_III spectra for metallic nickel, copper, 
and zinc. The curves are taken from various authors.⁸³ 


illustrate typical features of interpretation, and to this 
end we choose different spectra to emphasize different 
features. 


Nickel, Copper, and Zinc 


Figure 11 shows a composite of the K, L_III, and 
M_III spectra for three more metals, viz. nickel, copper, 
and zinc.⁸³ These curves are also normalized and plotted 
on the same energy-interval scale to facilitate com- 
parison of shapes. The zero of the energy scale is 
arbitrarily taken in each case as at about the first 
inflection point of the uncorrected absorption curve;⁸⁴ 
the Fermi energy in a good metal is expected to be 
somewhere near this position. This presumes, as in the 
case of the sodium, magnesium, and aluminum, (1) 
sufficient admixture of wave-function symmetries that 
the different inner vacancies will not, by the selection 
rule, affect the position of this inflection point, and (2) 
that excitation states are of negligible importance in 
this part of these absorption spectra. Neither of these 
presumptions is generally valid, but there appears now 
to be no better zero-position criterion. Also, it should 
83 Curves for this figure were taken as follows: K emission from 
J. A. Bearden and C. H. Shaw, Phys. Rev. 48, 18 (1935); K 
absorption from W. W. Beeman and H. Friedman, Phys. Rev. 
56, 392 (1939); L emission from Spielberg, Soules, and Shaw, 
to be published; L absorption from A. Sandström, Nova Acta 
Reg. Soc. Sci. Upsaliensis 9, No. (1935), and C. B. van den 
Berg, thesis, Groningen University, 1957; M emission from 
Skinner, Bullen, and Johnston, Phil. Mag. 45, 1070 (1954); and 
M absorption from Tomboulian, Bedo, and Neupert, J. Phys. 
Chem. Solids 3, 282 (1957). 

84 Conventionally, this position in the uncorrected absorption 
curve is taken as the energy of the absorption edge (although the 
first inflection point in the transmission curve is sometimes used, 
… theoretical justification). 

be realized that just what is called the first inflection 
point often depends upon the instrumental resolving 
power and the precision of the intensity measurements. 

The M emission and absorption, as directly observed, 
consist of overlapping M_II and M_III bands, but only 
the M_III bands are shown in Fig. 11. In making the 
necessary resolution, we have taken the energy sepa- 
ration between the M_II and M_III bands to be small, 
about 2.5, 2.2, and 2.0 ev for Zn, Cu, and Ni, respec- 
tively, in both emission and absorption. Further, we 
have assumed the relative intensities of the two absorp- 
tion bands to be 2 to 1 on the basis of quantum weights. 
In emission, however, the Auger transition M_II → 
M_III V, where V refers to the valence band (admixed 
3d and 4s bands), probably increases this ratio from 2 
to 3 or more (and very slightly broadens the M_II 
band); for present purposes, the M_III-emission bands 
in Fig. 11 are deduced from the observed curves with 
the presumption that the intensity ratio is about 3:1. 
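The resolution of two overlapping bands of this kind amounts to removing a shifted and scaled copy of one band from the observed curve. A minimal sketch follows; it is not the procedure actually used for Fig. 11. It assumes that the two bands have identical shape, that the weaker band lies at higher energy by a whole number of grid steps, and that the observed curve is sampled on a uniform energy grid that starts below the onset of the stronger band. The function name and the sample numbers are hypothetical.

```python
def split_doublet(observed, shift, ratio):
    """Recover the stronger band from observed = band + shifted band / ratio.

    observed : intensities on a uniform energy grid (lowest energy first)
    shift    : band separation in grid steps (a positive integer)
    ratio    : intensity of the stronger band relative to the weaker one
    """
    band = []
    for i, value in enumerate(observed):
        # The weaker band at point i is a scaled copy of the stronger band
        # taken `shift` steps lower in energy (zero before the onset).
        weaker = band[i - shift] / ratio if i >= shift else 0.0
        band.append(value - weaker)
    return band


if __name__ == "__main__":
    true_band = [0.0, 1.0, 3.0, 6.0, 4.0, 1.0, 0.0, 0.0]
    shift, ratio = 2, 3.0
    observed = [
        b + (true_band[i - shift] / ratio if i >= shift else 0.0)
        for i, b in enumerate(true_band)
    ]
    print(split_doublet(observed, shift, ratio))
```

With a grid step of 0.1 ev, for example, the 2.2-ev separation quoted for copper would correspond to a shift of 22 steps, and the presumed emission ratio of 3:1 to ratio = 3.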

None of the experimental curves has been corrected 
for the effects (b), (c), (e), (f), (g), and (h) as 
listed earlier. Although a correction (d) for background 
intensity has been made in each case by the respective 
investigators, this correction is often arbitrary. This 
arbitrariness is probably most pronounced when the 
background is either changing rapidly with energy (as 
in the K-emission curves because of the tail of the β1 
line) or is large and somewhat uncertain (as in the M_III 
emission). It is entirely possible that the so-called base 
widths of the K-emission bands in Fig. 11 are too great 
by 50%, and of the M_III bands too narrow by 50%.* 
If the M_III bands are as wide and complex as is reported 
by Bedo and Tomboulian,⁸⁵ it seems likely that, in 
accounting for these features, we must assign a promi- 
nent role to the excitation states. 

The correction for both kinds of resolving power is 
large in the K spectra, intermediate in the L spectra, 
and probably small or negligible in the M spectra. The 
correction for self-absorption by the anode of the x-ray 
tube has recently been reported to be very large, at 
least in certain K-emission spectra. To the extent 
that an emission band or line, after having been 
corrected for the instrumental resolving power, straddles 
an absorption edge, the effect of self-absorption is 
markedly asymmetrical and causes the emission curve 
to appear too low on the high-energy side. The self- 
absorption correction is no doubt important, especially 
in K spectra for which the jump-ratio at the absorption 
edge is great. The intensity of the high-energy 


85D. E. Bedo and D. H. Tomboulian, Bull. Am. Phys. Soc. 
Ser. II, 3, 192 (1958); see also Tech. Rept. “The M2,3 valence 
band spectrum of manganese” (July, 1958), Contract DA-30-115- 
ORD-669. 

86 We are currently in the process of repeating the work of 
Hanson and Herrera” on the K spectra of copper and nickel. 
The self-absorption correction is not as large as they reported 
since (1) they did not first correct the observed emission for 
instrumental resolving power, (2) they used an experimental 
curve for the absorption edge before it had been corrected for 
the instrumental resolving power, and (3) their correction for 

satellites is very much stronger than is apparent before 
the correction for self-absorption is made, and careful 
correction for the satellites becomes imperative. We 
would expect that, after all the pertinent corrections 
have been made, any remaining overlap of the emission 
and absorption spectra (at the Fermi energy) should be 
accounted for as the effects of thermal motion of the 
conduction electrons and of lattice vibrations in the 
emitting anode and in the absorber. 

We may remind ourselves that, by application of the 
selection rule to the principal jumping electron, the K 
emission refers to transitions to a final state having an 
electron vacancy of p-type symmetry in the valence band. 
In this application, the wave-function symmetry of the 
valence vacancy is viewed from the position of the K 
shell of the emitting atom. The K-absorption process 
leaves the system with one inner K vacancy and with 
the ejected electron having p-type symmetry whether 
it is in an excitation orbital or in some unoccupied part 
of the conduction continuum. In the Lrrr and Mrrr 
emission or absorption spectra, the final state vacancy 
or the ejected electron, respectively, has either s or d 
symmetry.⁸⁸ 

As a help in the interpretation of the experimental 
curves, we may first become acquainted with the 
theoretical prediction (one-electron model) of the 
density of states in the solids under study. Such a 
prediction, which neglects the perturbation of the inner 
vacancy, is sketched in Fig. 12. This is a “composite” 


[Figure 12: sketch of density of states versus energy (ev).] 

Fig. 12. “Theoretical” density of valence states for metallic 
nickel and copper, and (questionably) for zinc. This is a composite 
sketch from several different calculations based on the one- 
electron model. Estimated Fermi energies are indicated. 


the background radiation (the side of the nearby intense β1 line 
and the continuous radiation) was not quite right. 

87 The serious tangle of satellites as often reported in spectral 
curves recorded photographically [e.g., see Y. Cauchois, Phil. 
Mag. 44, 173 (1953), and references cited therein] is frequently a 
consequence of overexposure of the photographic film and a 
failure to correct properly to a linear intensity scale. But the 
correction for satellites is usually not an easy one because of the 
poorly known background intensity, a background that may 
contain essential structure itself. 

88 Again note that, at least in hydrogenic theory, the transition 
probability for p to s is much less than for p to d. 

of calculations by several investigators.⁸⁹ The double 
maximum in the 3d band is predicted by all the calcu- 
lations. The base width of the theoretical 3d band 
seems to be about 2.7 ev for nickel (Koster) and about 
3.4 ev for copper (Howarth). Extrapolation of these 
calculated results to zinc is probably not justified 
because of its different crystal structure. The Fermi 
energy is indicated for nickel and copper, and it is 
estimated for zinc. Ten electrons go into the bands for 
nickel, 11 for copper, and 12 for zinc. In Fig. 12, the 
Fermi energies are placed for the respective areas to be 
in this ratio. As drawn, the Fermi energy for copper is 
about 2 ev above the top of the 3d band and also about 
2 ev below the bottom of the 4p band. 
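Placing the Fermi energy by electron count, as described for Fig. 12, means filling the density-of-states curve from the bottom until the enclosed area equals the number of electrons. A minimal sketch of that bookkeeping is given below. The flat density of states used in the example is a toy assumption chosen so that the arithmetic is easy to follow; it is not the calculated curve for nickel or copper, and the function name is hypothetical.

```python
def fermi_energy(energies, dos, n_electrons):
    """Energy at which the integrated density of states reaches n_electrons.

    energies : increasing energy grid (ev)
    dos      : density of states (electrons per ev) at each grid point
    """
    filled = 0.0
    for i in range(1, len(energies)):
        de = energies[i] - energies[i - 1]
        slab = 0.5 * (dos[i] + dos[i - 1]) * de   # trapezoidal area
        if slab > 0.0 and filled + slab >= n_electrons:
            # Linear interpolation within the slab (exact only if the
            # density of states is constant across it).
            frac = (n_electrons - filled) / slab
            return energies[i - 1] + frac * de
        filled += slab
    raise ValueError("the bands hold fewer states than n_electrons")


if __name__ == "__main__":
    grid = [float(e) for e in range(11)]   # 0 to 10 ev
    flat = [2.0] * len(grid)               # toy band: 2 electrons per ev
    for n in (10, 11, 12):                 # electron counts quoted for Ni, Cu, Zn
        print(n, fermi_energy(grid, flat, n))
```

In this toy band the three electron counts give Fermi energies of 5.0, 5.5, and 6.0 ev, i.e., the enclosed areas stand in the ratio 10:11:12.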

Because of the dipole selection rule, we expect quite 
different shapes in the K emission on one hand and in 
the L_III and M_III emission on the other hand, neg- 
lecting excitation spectra in each case. Insofar as Fig. 
12 applies and insofar as the observed spectra show 
the normal unperturbed states of the solid, the L_III 
and M_III curves should both reflect the s and d band 
structure of Fig. 12, weighted heavily in favor of d 
because of the relative transition probabilities and 
because of the greater density of states, whereas the K 
curves should show merely the admixed p symmetry in 
these bands. (In K-quadrupole radiation, the transition 
of s symmetry to d symmetry is allowed but, as men- 
tioned in the foregoing, the intensity of this radiation 
is very faint relative to dipole radiation.) 

First consider the L_III and M_III emission. The 
successive shift of the observed L_III and M_III emission 
from the Fermi energy, as we go from nickel to zinc, 
seems to confirm the 3d-electronic band as most prob- 
ably containing the final electron vacancy. On the 
other hand, it is most disconcerting that there is no 
(or very little) evidence in these observed curves for 
the theoretically predicted double maximum in the 3d 
band. The full widths at half-maximum of the observed 
L_III-emission bands are about 2, 3, and 1 ev, respec- 
tively, and the predicted width at half-maximum of the 3d 
band is about 2 ev for nickel (Koster) and a little wider 
for copper (Howarth). No theoretical prediction is 
available for zinc. Even after allowing about 0.5 ev 
(very rough) in the observed curves for resolving power, 
etc., the width agreement is fairly good, somewhat 
unexpected and probably fortuitous in view of the 
double-peak disparity. The widths at the base are not 
in agreement. Moreover, the observed M widths at 
half-maximum are uniformly greater than 2 ev. There 
is no obvious reason in the one-electron model why the 
L_III-emission shapes should be different from the M_III 
shapes. And the very wide K bands are not easily 
explained, even after an estimated 3 ev (very rough) 

89 E.g., see E. Rudberg and J. C. Slater, Phys. Rev. 50, 150 
(1936); J. C. Slater, Phys. Rev. 49, 537 (1936); G. F. Koster, 
Phys. Rev. 98, 901 (1955); G. C. Fletcher, Proc. Phys. Soc. 
(London) A65, 192 (1952); D. J. Howarth, Proc. Roy. Soc. 
(London) A220, 513 (1953); C. Kittel, Introduction to Solid State 
Physics (John Wiley and Sons, Inc., New York, 1956), Chap. 12. 

allowance is made for the correction for resolving 
power, self-absorption and satellites. 

Considerable point is usually made of the apparent 
coincidence in the energy positions of the abrupt high- 
energy side of emission and the low-energy side of 
absorption for the metals sodium, magnesium, and 
aluminum. This coincidence is always sought for good 
metals, but it is not so readily seen for most metals. 
The curves for nickel, copper, and zinc are typical. 

In general, we find some spotty agreements in terms 
of the theoretical predictions based on the one-electron 
model; but more disagreements. Perhaps some of the 
apparent difficulties would disappear if proper correc- 
tions were made to the observed curves, when more 
sanitary conditions and materials are used, etc.” 

In particular, the correction for excitation spectra 
may be by no means negligible. For example, it is 
likely that the excitation orbital which forms inside the 
occupied valence band (see earlier discussion of the Kβ′ 
line for elements in this region of atomic numbers) is 
not similarly occupied in all K states in the emitting 
anode of the x-ray tube. If this is the case, then the K 
emission may be broadened as a result of the energy 
separations of overlapping excitation states, and also 
may be accompanied by the Kβ″ “satellites.”38.*4 

Similar arguments apply to the L_III and M_III 
emissions for which the energy separations of the 
respective excitation states are expected to be somewhat 
smaller than for the K states. Further, the relative 
probability of the jump of a normal valence electron, 
compared with the jump of an excitation valence 
electron, may possibly be greater in the L_III than in 
the M_III emission because of the overlap of the respec- 
tive wave functions and because the perturbation is 
stronger in the L_III case. If this is so, and if the final- 
state vacancy following the excitation electron jump is 


90 The curves reproduced in Fig. 11 are believed to be the most 
reliable reported to date for these spectral regions. However, the 
following is illustrative of the experimental situation. The K- 
emission curves of Bearden and Shaw⁸³ have been challenged by 
Hanson and Herrera“; the L_III-emission band of copper has been 
reported by several investigators with large disagreements 
(compare the two curves by Cauchois [Y. Cauchois, Phil. Mag. 
44, 173 (1953), and references cited therein] and by Shaw et al.⁸³ 
as shown in Fig. 24 of reference 3 with the curve in Fig. 8 of 
van den Berg⁸³); and the M_II,III emission is in similar trouble 
(compare Fig. 3 of Skinner et al.⁸³ with Fig. 3 of E. M. Gyorgy 
and G. G. Harvey [Phys. Rev. 87, 861 (1952); see also Phys. 
Rev. 93, 365 (1954)] and with Fig. 2 of D. E. Bedo and D. H. 
Tomboulian [Phys. Rev. 113, 464 (1959)], and see Bedo and 
Tomboulian⁸⁵ who suggest that the assumed background is often 
e Lyn-absorption curve of our Fig. 11 has 


Richtmyer [Phys. Rev. 34 
in which discrepancies 

always at a particular favored place in the valence 
band (as mentioned earlier as a possibility), then we 
might expect the L_III emission to be narrower than the 
M_III emission and to be displaced slightly in energy 
position. We may continue this speculation: If the 
favored position of the positive hole in the final state 
is indeed responsible for the L_III emission’s being 
narrower than the M_III, and also responsible for the 
relative energy positions of the L_III and M_III bands, 
then perhaps we may conclude that, after the excitation 
electron has jumped into the L_III vacancy, the positive 
hole is near the top of the 3d band. However, in the 
case of nickel, the L_III emission is only a little narrower 
than the M_III, and there is no significant difference in 
the peak positions. Such speculations are interesting, 
but very little significance should be attached to their 
conclusions at this time. 

Also contributing to a difference in the energy 
positions and shapes of the emission bands is the 
difference in lifetimes of the respective initial states. 
Thus, in these states, effects of phonon interactions 
and degrees of lattice relaxation are different. 

The L_III emission for nickel and copper shows in 
each case a small bump on the high-energy side of the 
Fermi energy. It is likely that this bump is a conven- 
tional satellite. It is possible, but perhaps not likely, 
that much of the asymmetry in the L_III- and M_III- 
emission bands for copper and zinc is due to high-energy 
satellites. Satellites, being essentially atomic in char- 
acter, may generally be identified by observations with 
elements in a mono-step sequence of atomic numbers. 
Unfortunately, the necessary identification study of 
La; and My satellites in this range has not been made. 

Now consider the amazingly wide K-emission bands. 
These are usually identified by x-ray spectroscopists as 
the Kβ2,5 lines or bands, viz., the K → M_IV,V and 
K → N_II,III transitions, the first being nominal quad- 
rupole radiation and the second being very improbable 
because of the small number of N_II,III electrons 
(nominally no such electrons). These bands are very 
faint, having a peak of only about 1/100 of the peak 
intensity of the nearby Kβ1,3 lines, K → M_II,III, 
whereas the number of electrons in the nominal M_IV,V, 
N_II,III (i.e., 3d3/2,5/2) shells is 10 for copper and in the 
M_II,III (i.e., 3p1/2,3/2) is only 6. The K-emission bands 
are obviously severely suppressed but, even so, are 
more intense than would be expected for normal 
quadrupole radiation. We conclude that the emission 
is actually due to an admixture of p-type symmetry in 
the nominal 3d, 4s energy bands, but that this admixture 
is not very great or else the K bands would be more 
intense than they are. The relative feebleness of this 
Kβ2,5 radiation is believed to be an important point in 
regards the degree of admixture in general. 

Mott™ has suggested that, from an interpretation of 
the van der Waals forces, an appreciable percentage of 
admixed p symmetry may exist at and just below the 
Fermi energy in both nickel and copper. This is not 
believed to be likely because of the low intensity of the 
K emission, but if it were it would possibly account for 
the extra bump of intensity observed at -2 ev for 
copper, and it would presumably give rise to an abrupt 
edge in the nickel emission somewhat as is observed for, 
say, aluminum. Such an edge is not found for nickel. 

The observed K emission above the Fermi energy 
may perhaps be relatively too intense to be accounted 
for as satellites. It is barely possible that there is an 
appreciable number of stray electrons hovering around 
in the crystalline anode just above the Fermi surface; 
if so, and if enough of these hovering electrons drop 
into the K vacancies, we may be able to explain such a 
high-energy tail. It is also barely possible that such 
transitions take place in the case of the L_III and/or 
M_III emission. Such transitions, however, would seem 
to be very improbable.⁹³ 

Fig. 13. K emission and absorption spectra (uncorrected) 
for crystalline potassium chloride. [Curves labeled Kβ emission 
and K absorption.] 


The large intensity maximum in the K-emission band 
near the Fermi energy in zinc asks for explanation. As 
we have stated above, it is likely that the density of 
states and the Fermi energy as displayed in Fig. 12 
are not applicable to zinc. In particular, it may be 
that the low-energy tail of the 4p-electronic band is 
occupied in zinc, and that the observed intensity 
maximum in question reflects this occupation as dipole 
radiation. If this is so, and if excitation states do not 
invalidate the one-electron model, the zinc K emission 


The band structure of the transition metals is of 
considerable current interest, primarily because of their 
electrical and magnetic properties. In this interest, 
correlations of the results of x-ray measurements have __ 
been barely attempted% but the possibilities are very 
promising, especially after the observed curves for 
selected solid materials have been properly corrected 
for the various effects (a) through (h) as listed earlier. __ 


should exhibit the abrupt high-energy edge expected Potassium Chloride 
for metals under these conditions; perhaps it would if p : A: 
the corrections were properly made to the curve. But, Figure 13 shows uncorrected curves for an insulator. 


even so, it is probable that an energy gap of a few Here, the valence band in crystalline potassium 
volts would exist between the K emission and absorp- Chloride, nominally the Cl 3p electrons, is probably so 
tion edges. far removed from the Fermi energy that the K emission 
Turning now to the absorption curves, the “jump- band, designated ĝı, is very narrow; the observed full 
ratio” for nickel at the Mrrr edge is about 2 but width at half-maximum of bı is 1.21 ev. On the high- 
decreases to about 1.3 for copper and zinc. This may energy side of 61, there 1s observed the line z for 
be taken as evidence that the 3d-electronic band in chlorine and B” for ESN and also a rather long — 
nickel is not quite fully occupied, but that in both faint tail in each case. 8, or 8”, and the tail, are probably 
copper and zinc it is fully occupied. Figure 12 is, of 4 complex of satellites and, if so, are of no great interest a 
course, in accord with this evidence. to us now. ~ BY. 
Obvious component structure appears in the absorp- In absorption, the excitation states are no doubt 
tion curves near and above the position of the Fermi responsible for at least the most prominent maxima— 
energy. It is not clear whether this structure corresponds indeed, the overlapping of the excitation spectra and- 
simply to a variation in the unperturbed V.(EZ)T(Z) the unperturbed solid-state band spectra is so gre 
curve or is evidence of excitation states, plasma that we are at a loss to say where any feature of t 
oscillations and multiple-inner-vacancy states.” unperturbed solid-state spectra may be. a 
eR Stephenson [S. ‘T. Stephenson, Phys. Rev. 58, 877 (1940)] The absorption structure is roughly similar for either 


reports that much of this apparent satellite structure may be an the chlorine or the potassium inner vacancy althot 
inverse view of the Kronig structure brought out in the self- .——\-——— 
absorption process. If so, this structure should disappear when % E.g., see N. F. Mott and K. W. H. Stevens, Phil. M: 
proper correction for self-absorption is made. 1364 (1957). x. 
93 Many investigators have speculated about this possibility in % These curves are from the paper by P. 
attempts to explain the apparent satellite structure: e.g., see Y. The method used in reference 31 for correcting the emission 
Cauchois, Compt: rend. 201, 1359 (1935), T. Hayasi, Sci. Rept. widths for the instrumental resolving power is not valid at t 
Tôhoku Univ. 25, 785 (1936), 31, 1 (1942), and van den Bere wavelengths, as has been shown by the better method of 
% Photographically recorded CuK absorption curves** show 17. An interesting emphasis on the need of a proper cot 
= in the range 0 to 15 ev four wiggles (component structure) not the absorption curves is demonstrated by Parratt, 
indicated in Fig. 11. We are currently restudying this region with and Jossem [Phys. Rev. 105, 1228 (1957)] in 
a two-crystal spectrometer. “thickness effect.” SI tier 


[Figure 14; panel label: Unoccupied Levels (with one K electron
missing); horizontal axis: Energy (ev).]

Fig. 14. Density of p-type states (times the transition
probability) in crystalline KCl. The occupied electronic levels are
those for a crystal having an electron vacancy in the chlorine 3p
(valence) band; for the unoccupied levels, the vacancy is in a
chlorine K shell.


the details and the transition probabilities are consider- 
ably different in the two cases. These similarities and 
differences are expected when we consider the similar- 
ities and differences in the immediate neighborhoods of 
the respective atoms in the crystal. 

The chlorine curves for the other alkali chlorides also 
show different details in the absorption structure, but
all of these curves are characterized by pronounced 
maxima and minima. For LiCl and NaCl, the deep 
minimum that appears between the first two maxima 
for KCl and RbCl is effectively filled up, indicating 
either a smaller energy interval between the most 
prominent excitation states or else a greater number of 
such states. The structure for CsCl is again different, 
but the crystal structure of this chloride is body- 
centered-cubic whereas the other chlorides are of face- 
centered-cubic structure. The K-absorption curves for 
potassium in the potassium halides likewise show both 
similarities and differences. Similar strong absorption
maxima and minima for chlorine are also observed when
the chlorine is in a gaseous phase, either in HCl or in
free Cl2 (see Fig. 3); this might be expected since the
bound-ejected-electron excitation states are vestigial 
molecular or atomic states. 

Figure 14 presents the N(E)T(E) graph for potassium
chloride. For this graph the chlorine K data
shown in Fig. 13 have been fully corrected except for
corrections (e), (g) and (h). The corrections for both
kinds of resolving power were made by the method of
Porteus and Parratt (reference 17): First, the observed transmission
curve was corrected for the instrumental spectral 
window, and then the absorption curve (the logarithm 
of the thus-corrected transmission curve) was corrected 
for a Lorentzian K state. 
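
The order of the two corrections matters because the spectral window smears the transmitted intensity, not its logarithm. The following is a minimal sketch with invented numbers, not a reproduction of the Porteus and Parratt procedure: a single Lorentzian absorption line of width 0.6 ev and a Gaussian window of width 0.4 ev are assumed purely for illustration, and all function names are hypothetical. It shows that the apparent peak absorption falls further below the true value as the absorber is made thicker.

```python
import math

def lorentzian(e, center, fwhm):
    """Peak-normalized Lorentzian (value 1 at the center)."""
    half = fwhm / 2.0
    return half * half / ((e - center) ** 2 + half * half)

def gaussian_window(half_points, step, fwhm):
    """Discrete Gaussian spectral window, normalized to unit sum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    weights = [math.exp(-0.5 * ((k * step) / sigma) ** 2)
               for k in range(-half_points, half_points + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def observed_peak_absorption(peak_mu_x, line_fwhm=0.6, window_fwhm=0.4, step=0.01):
    """Smear the TRANSMISSION at the line center, then take -log of the result."""
    half_points = int(3.0 * window_fwhm / step)
    window = gaussian_window(half_points, step, window_fwhm)
    smeared_transmission = sum(
        w * math.exp(-peak_mu_x * lorentzian(k * step, 0.0, line_fwhm))
        for w, k in zip(window, range(-half_points, half_points + 1)))
    return -math.log(smeared_transmission)

def peak_ratio(peak_mu_x):
    """Apparent peak absorption divided by the true peak absorption (mu * x)."""
    return observed_peak_absorption(peak_mu_x) / peak_mu_x
```

Calling `peak_ratio` for a thin absorber (say a peak absorption of 0.1) and a thick one (say 3.0) gives a smaller ratio for the thick absorber, which is why the window correction is applied to the transmission curve before the logarithm is taken.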

Now, we see that the chlorine 3p-valence band is
indeed much narrower than has been customarily
believed from early theoretical calculations. (This problem
is discussed in reference 31.) A recent calculation
by Howland, however, concludes that the base-width
of this band is about 1.5 ev, in sensible agreement with
the curve in Fig. 14. But, more recently, Howland has
extended his theoretical calculations to predict a shape
of this band as well as its width; he predicts a
half-maximum width of 1.32 ev, a value much greater than
the observed 0.6 ev width, and he predicts a low-energy
shoulder in the band which is much more prominent
than the slight shoulder seen in Fig. 14.

Parratt, P ... and Jossem (to be published).
Kiyono, Sci. Repts. Tôhoku Univ. First Ser. 36, 1 (1952).
... Rogstad, and Nelson, Phys. Rev. 84, 806 (1951).

This feature, just referred to as a slight shoulder,
may be evidence of the spin-doublet states 3p1/2 and 3p3/2.
The separation between these two states in gaseous
atomic chlorine is about 0.11 ev and is expected to be
about the same in crystalline KCl. It is possible to have
two components of identical shape (near Gaussian) and
separated by 0.11 ev, each with a width of about 0.5 ev,
and with the 3p3/2 component on the high-energy side
being twice as intense as the 3p1/2 component, such that
their sum agrees with the curve of Fig. 14 within the
accuracy with which the curve is known. Another
interpretation of the slight shoulder would assign it to 
one or more excitation states of the valence-electron- 
configuration type, similar to the situation with the 
transition metals of the manganese group. 
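
The spin-doublet interpretation can be checked with a short numerical sketch. It assumes exactly Gaussian components (the text says only "near Gaussian"), takes the quoted width of about 0.5 ev to be the full width at half-maximum, and uses the quoted 0.11 ev separation and 2:1 intensity ratio; the function names and the energy grid are illustrative choices only.

```python
import math

def gaussian(e, center, fwhm):
    """Peak-normalized Gaussian with the given full width at half-maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((e - center) / sigma) ** 2)

def doublet(e, separation=0.11, fwhm=0.5):
    """Sum of two components: the stronger one (weight 2) on the high-energy side."""
    strong = 2.0 * gaussian(e, +separation / 2.0, fwhm)
    weak = 1.0 * gaussian(e, -separation / 2.0, fwhm)
    return strong + weak

def full_width_at_half_maximum(profile, lo=-3.0, hi=3.0, step=0.001):
    """Width of the region where the sampled profile is at least half its maximum."""
    n = int(round((hi - lo) / step))
    energies = [lo + i * step for i in range(n + 1)]
    values = [profile(e) for e in energies]
    half = max(values) / 2.0
    above = [e for e, v in zip(energies, values) if v >= half]
    return above[-1] - above[0]

doublet_fwhm = full_width_at_half_maximum(doublet)
```

Under these assumptions the summed profile is only slightly wider than a single component and is mildly asymmetric, which is the kind of slight shoulder the paragraph describes.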

From the absorption part of the spectrum, it is 
significant that the number of evident excitation states 
is larger than has been previously recognized or theo- 
retically discussed for x-rays. The prominent states 
here are no doubt of the bound-ejected-electron type. 

The question arises as to the meaning of the width 
of a single component excitation state as it appears in 
the “unoccupied” part of Fig. 14. This width is the 
residual width after correction of the true spectrum for 
the “atomic” K state which is less wide than any 
K-excitation state. If a full-width K-excitation state 
had been used in the correction procedure, the result 
would have been degenerate (a Dirac delta-function).
The saving difference between these two K state widths 
is to be found primarily in the effects of phonons in the 
lattice. This needs further discussion. 


Lattice Spacing and Phonons in KCl 


First, we point out an effect due to the average value 
of the lattice spacing. Then we discuss the effect of 
statistical time and/or spatial variations in this spacing. 
The first is relatively easy to discuss insofar as we have 
knowledge; the second, the effect of the phonons, is a
little more involved. 

The energy of any x-ray state, i.e., of a system 
containing an inner electron vacancy, depends upon the 
total electric field in which the vacancy finds itself. As 
the lattice expands, e.g., by an increase in temperature, 
the contribution of electric field due to the presence of 
neighboring atoms (or ions) is generally altered, and so
is the energy of the x-ray state. The magnitude of the
energy shift is probably small but is not easily deduced
theoretically. It is believed that the shift is generally
to a higher energy for x-ray states in atoms in a metal,
and probably also for the anion in an ionic crystal, but
to a lower energy for the cation in the ionic crystal.

L. P. Howland, Quarterly Progr. Rept. Solid-State and
Molecular Theory Group, MIT (January 15, 1958), p. 50.
A Dirac delta-function δ(x) is zero for x ≠ 0, infinite for ...

However, the ground state, i.e., the initial state in 
any absorption transition, is by definition not altered 
by the expansion of the lattice. Hence, the energy shift 
appears directly in the position of the main absorption 
edge. For a given solid it is expected that the shift is 
about the same for all states of energy greater than a 
few hundred ev; it is too difficult to predict the relative 
shift for states of less energy, e.g., for the valence state. 
Hence, for an emission line from a transition between
states each of energy greater than a few hundred ev,
the energy shift caused by the thermal expansion of the
lattice is expected to be small.

In absorption, a lattice spacing effect, in addition to 
the shift in energy of the main edge, comes into the 
Kronig and Hayasi structure since the energy positions 
in this structure depend upon Bragg reflection of the 
ejected electron in the lattice. The interference condi- 
tions in Bragg reflection are temperature sensitive since 
they depend upon the lattice spacing. The energy 
dependence of the interference conditions turns out to 
be an inverse quadratic function of temperature. 
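
The dependence on lattice spacing can be made explicit with a hedged worked relation. It assumes the simple free-electron Bragg condition of the Kronig type for a cubic lattice of spacing a; the symbols n (with n² the sum of the squared indices of the reflecting planes), θ (the angle between the electron direction and the plane normal), m (the electron mass), and α (the linear thermal expansion coefficient) are introduced here for illustration and are not taken from the text above.

```latex
% Free-electron Bragg condition: 2 (a/n) cos(theta) = lambda, with lambda = h / sqrt(2 m E)
\[
E_n \;=\; \frac{h^2\, n^2}{8\, m\, a^2 \cos^2\theta}
\qquad\Longrightarrow\qquad
\frac{\Delta E_n}{E_n} \;\approx\; -2\,\frac{\Delta a}{a} \;\approx\; -2\,\alpha\,\Delta T .
\]
```

On this simple picture the energy positions scale as the inverse square of the lattice spacing, so a thermal expansion of the lattice moves each structure feature toward the edge by a fraction roughly twice the fractional change in spacing.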

Incidentally, the argument may be advanced that a
sensitive method for studying the chemical bonds
around an atom in a solid, since these bonds affect the
lattice spacing, may be in interpretations of the energy
positions of (inner-state) → (valence-state) emission
(e.g., K → V), and/or of the inner-state absorption
edge, as the chemical composition of the solid is varied.
However, in compounds or alloys the role of the
excitation states of both the valence-electron-configuration
and the bound-ejected-electron types depends
upon the chemical composition. It is likely that in a
given alloy, even one having a uniform phase, the
so-called normal valence configuration may not be the
same for all atoms. Alterations in the excitation states
probably have a small effect on the peak positions of
emission lines but have possibly a large effect on the
absorption spectra. In any case, in such a chemical-bond
study, they must be considered in addition to the
effect due to the change in lattice spacing alone.


103 Interesting studies of the temperature effects in K emission 
and absorption for ferromagnetic alloys have been reported by 
V. A. Kazantsev [Bull. Acad. Sci. USSR, Phys. Ser. 20, 97 
(1956), English translation, and Doklady AN SSSR 101, 477 
(1955) ]. Although he gives simple interpretations in terms of the 
one-electron-jump type of diagram, the role of excitation states 
probably does not invalidate the general conclusions in regards 
the main trends of energy change. 

104 Nordling, Sokolowski, and Siegbahn report [Arkiv Fysik
13, 483 (1958)] that the apparent K and Ly, states for copper 
change in energy by 4.4 and 2.5 ev, respectively, from metallic 
Cu to CuO. 

105 Das Gupta reports [Phys. Rev. 80, 281 (1950)] that the
energy shift of the K → V or L_III → V emission peak correlates

Now consider the broadening of the energy of x-ray
states due to phonons or lattice vibrations. (Some 
authors object to calling this a thermal effect because 
lattice vibrations are present even at absolute zero of 
temperature.) It was mentioned earlier that only a 
localized region of the solid is effectively involved in the 
energy of each state. This region is, of course, as great 
as the pertinent electron orbits, including the effective 
orbit of the ejected electron. The localization is due to 
the very short lifetime of an x-ray state and to the 
relatively long time for a phonon (or any other likely 
energy-bearing agent) to travel through the lattice. 
This factor in localization is a consequence of the 
Franck-Condon principle. The x-ray system, therefore, 
refers to a region that may be only a very few lattice 
spacings in extent (excluding the ejected electron if it 
is in the remote part of the continuum). 

The energy of a state is affected by the relative 
positions of all the electrons and nuclei in this localized 
region, and lattice vibrations give rise to fluctuations 
in this energy. Of course, the observed spectra represent 
a statistical average of the spectral energies from all the 
compressed or expanded regions that are emitting or 
absorbing the x-rays; the effect generally is to broaden 
an emission line, an absorption line, or an absorption 
edge. The statistical vibration broadening in either 
emission or absorption, considering all vibration fre- 
quencies and localized regions, is essentially Gaussian 
in shape. 

For an ionic crystal, there are two kinds of lattice 
vibrations: (1) the so-called optical branch in which 
the adjacent positive and negative ions vibrate out of 
phase, and (2) the acoustic branch in which adjacent
ions vibrate approximately in phase. Vibrations fre- 
quencies extend from zero to about 10" cps. Hence, 
after a time greater than about 10-” or 10-® sec, a 
completely new time and spatial distribution of com- 
pressed and expanded regions is expected. 

For a metal crystal, the energy smear in an inner- 
vacancy state due to the vibrations in the electric field 
may be about a tenth of an ev or so at room tempera- 
ture, and for an ionic crystal it may be somewhat 
greater because of the optical phonons. The smear is 
conveniently expressed as the root mean square of the 


qualitatively with the heat of formation of a compound or alloy. 
This is in agreement with our expectations. However, in seeking 
quantitative agreement, Das Gupta assumes that the heat of 
formation is due entirely to the energy shift of the valence band 
alone, there being no shift in the inner state. This assumption 
does not seem reasonable. 

106 The electron vacancy in the x-ray state has some mobility
in the lattice; this feature also extends the local region of interest. 
Not much is known [T. Takeuchi, Progr. Theoret. Phys. 18, 421 
(1957) ] about the transfer of inner electron vacancies from one 
atom to the next, although this mobility is believed to be very 
small. 

107 The transverse mode of this type of vibration is excitable
by an electromagnetic field, such as a light wave; hence the term 
“optical.” We are concerned primarily with the longitudinal 
modes. The strongest polarization in the crystal occurs when 
adjacent ions vibrate 180° out of phase. 
amplitude of the vibrations of the state energy, and
this for a Gaussian shape of the state is a little less than
the half-width at half-maximum. For metallic potassium,
the smear so expressed is calculated to be 0.15 ev
at room temperature; for potassium chloride, the smear
is about 0.15 ev at 0°K, about 0.16 ev at 100°K, about
0.19 ev at the Debye temperature, 230°K, and perhaps
0.23 ev at 373°K. Such fluctuations are present in
the energy of each x-ray ionization state. The magnitude 
and phase of the energy fluctuations are about the same 
for all ionization states in a given atom in the solid for 
which the effective diameter of the vacancy orbital is 
small compared with the atomic volume, i.e., for all 
states whose radiative lifetimes are less than about 
10~ sec. 
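
The statement that the root-mean-square smear is "a little less than" the half-width at half-maximum follows from the Gaussian shape: the half-width at half-maximum is the rms value times the square root of 2 ln 2, about 1.18. The sketch below applies this conversion to the KCl smear values quoted above; the dictionary and function names are illustrative only, and an exactly Gaussian state is assumed.

```python
import math

# For a Gaussian, HWHM = sqrt(2 ln 2) * (rms deviation), roughly 1.18 * rms.
GAUSS_HWHM_PER_RMS = math.sqrt(2.0 * math.log(2.0))

# rms energy smear (ev) quoted in the text for KCl, keyed by temperature (deg K)
kcl_rms_smear_ev = {0: 0.15, 100: 0.16, 230: 0.19, 373: 0.23}

def gaussian_widths(rms_ev):
    """Return (half-width, full width) at half-maximum for a Gaussian of given rms."""
    hwhm = GAUSS_HWHM_PER_RMS * rms_ev
    return hwhm, 2.0 * hwhm

kcl_widths_ev = {temp: gaussian_widths(rms) for temp, rms in kcl_rms_smear_ev.items()}

for temp, (hwhm, fwhm) in sorted(kcl_widths_ev.items()):
    print(f"{temp:3d} K: HWHM = {hwhm:.2f} ev, FWHM = {fwhm:.2f} ev")
```

The half-widths obtained this way lie near 0.2 ev over the quoted temperature range.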

It follows, then, as a prediction, that any emission 
line arising from a transition between ordinary x-ray 
states both of which have radiative lifetimes short com- 
pared with about 10~* sec has an energy width that is 
rather insensitive to the lattice vibrations. But each ab- 
sorption line, band or edge is expected to be broadened, 
regardless of the lifetime of the final state. Radiative 
lifetimes less than about 10~ sec are expected for all 
x-ray states except those for atoms of the lowest atomic 
numbers and except for those states corresponding to a 
vacancy in the valence band. 

For a singlet state of very short lifetime, shorter than 
about 10~* sec, the shape is expected to be Lorentzian, 
as determined by the statistics of random-decay phe- 
nomena; and the shape of spectral lines (either emission 
or absorption) involving such states is also expected to 
be Lorentzian, and the shape of the main absorption 
edge, if the continuum of “unoccupied” states starts 
abruptly and is uniformly dense, is expected to be an 

arctangent. Indeed, within experimental error, the
Lorentzian shape is observed for emission lines between


108 These energy fluctuations are calculated from formulas
supplied by Professor A. W. Overhauser. For the longitudinal
modes in a monovalent metal,

[equation illegible]

where N = number of atoms/cm^3, ρ = density in g/cm^3, u = velocity
of longitudinal sound in cm/sec, T = absolute temperature,
Θ = Debye temperature, k = Boltzmann constant, and ζ = Fermi
energy; and for the longitudinal optical modes in an ionic crystal
of two types of ions,

[equation illegible]

where e = electronic charge in esu, n = optical index of refraction,
a = lattice constant, M1, M2 are masses of the two kinds of ions,
h = Planck constant, ω = average optical mode frequency, and the
other quantities are as defined in the foregoing. The particular
mode for which the equation is devised is probably the
singlet states both having very short lifetimes. However,
in emission involving a longer lived state, e.g., the
valence state, there is evidence that the tail on either
side of the observed line is less high than Lorentzian,
indicating a significant Gaussian contribution in the
final state. (For example, the valence state derived
from the β1 line in Fig. 14 shows this effect.) In the
longer wavelength range, Bedo and Tomboulian report
that the shape of the lithium K state, as interpreted
from the K emission, is more nearly Gaussian
than Lorentzian; this is to be expected if the K-state 
lifetime is of the order of, or greater than, about 10-" 
sec, a lifetime that is reasonable from other arguments.
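
The arctangent edge mentioned above is what results when a Lorentzian state is combined with a continuum of final states that starts abruptly and is uniformly dense. The following sketch, with an arbitrary state width of 0.5 ev and hypothetical function names, compares a direct numerical sum over the continuum with the closed-form arctangent; the upper limit of the continuum is truncated for the numerical sum, so agreement is only expected to within a small tolerance.

```python
import math

def lorentzian(e, center, fwhm):
    """Unit-area Lorentzian of full width at half-maximum fwhm."""
    half = fwhm / 2.0
    return (half / math.pi) / ((e - center) ** 2 + half ** 2)

def edge_numerical(e, edge, fwhm, span=500.0, step=0.01):
    """Absorption at photon energy e: trapezoid sum of Lorentzians centered on
    every final-state energy from `edge` up to `edge + span` (uniform density)."""
    n = int(round(span / step))
    total = 0.5 * (lorentzian(e, edge, fwhm) + lorentzian(e, edge + span, fwhm))
    for i in range(1, n):
        total += lorentzian(e, edge + i * step, fwhm)
    return total * step

def edge_arctangent(e, edge, fwhm):
    """Closed-form edge shape for an infinitely extended uniform continuum."""
    return 0.5 + math.atan((e - edge) / (fwhm / 2.0)) / math.pi
```

For example, `edge_numerical(0.7, 0.0, 0.5)` and `edge_arctangent(0.7, 0.0, 0.5)` agree closely, and at the edge position itself the absorption is half its asymptotic value.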

Now consider a short-lived excitation state of the 
bound-ejected-electron type. The sensitivity of such a 
state to lattice vibrations depends upon the diameter
of the orbit of the bound-ejected electron. If this 
electron remained very close to the position of the 
electron vacancy, its negative charge would effectively 
cancel the positive charge of the vacancy, and the 
energy of the state would suffer only minor fluctuations 
due to vibrations of the electric field. On the other 
hand, if the excitation orbit were large compared with 
the lattice spacing, such that the electric fields near a 
large number of atoms were sampled in the orbit, the 
lattice vibration sensitivity would be about the same 
as for an ordinary unexcited x-ray ionization state. It 
is believed that most excitation orbits are of an inter- 
mediate size for which no prediction of the lattice 
vibration sensitivity is made at this time. 

Another broadening effect in each excitation state, 
of either the valence-electron configuration or the 
bound-ejected-electron type, is due to the finite lifetime 
of the electron (or electrons) excitation orbital (or 
orbitals) as a consequence of all modes of decay other 
than the filling of the inner vacancy. The most likely 
types of such decay transitions are those due to Auger 
or phonon interactions. This type of broadening, 
Lorentzian in shape, is very small in the present KCl 
spectra, but the broadening by phonon interactions is 
possibly important when the inner vacancy has a longer 
lifetime, e.g., in absorption spectra in the long wavelength
region. However, even in the long wavelength
region, this broadening effect may be overpowered by
the vibration Gaussian contribution which is also a
consequence of the longer state lifetime.

109 Or, if the valence state, i.e., the final state in this Li K
emission, has a long lifetime, the emission may also be
predominantly Gaussian as a consequence of this lifetime and lattice
vibrations, independent of the shape of the K state. If the
proposal of Skinner-Landsberg for metals is correct, the lifetime of
the low-lying valence states for lithium is shortened by Auger
transitions and the shape of such a low-lying state in the band
becomes essentially Lorentzian.

110 This presumes that the dynamic phonon-electron interaction
is negligible during the short lifetimes of interest; but the
statistical static interactions cause a broadening which may increase
or decrease the effect in the ordinary unexcited x-ray ionization
state.

111 This broadening has been calculated by A. Radkowsky
[Phys. Rev. 73, 749 (1948)] and found to be appreciable


Excitation States for KCl in Absorption


Our discussion of the excitation states shown in Fig. 
14 may now be completed. The “atomic” K state used 
in the correction procedure was obtained, as discussed 
in connection with Table I, from Ka-emission lines 
and from the strong resonance absorption line in 
gaseous argon. Because of short lifetimes, this state 
does not contain the broadening effects of the phonons. 
So corrected, each component excitation state in Fig. 
14 is expected to show phonon width and to be essenti- 
ally Gaussian in shape, the width being roughly the 
same for all excitation states. The structure shown in 
Fig. 14 is consistent with this view. The band of
excitation states about 4 ev wide in the center of the
figure evidently contains many (at least six) overlapping
component states, and the very steep sides of
this band indicate component widths of the order of
0.2 ev. Such a width is in agreement with the calculated
phonon width given above for KCl at room temperature.
The rugged structure extends over a range of about 
10 ev, and the total number of excitation states is 
probably greater than 15. Beyond about 20 ev from 
the valence band, the density of states has relatively 
small undulations in it, roughly the sort of undulations 
expected in the theoretical Kronig structure. 

An experimental study of the chlorine K absorption
in potassium chloride has been carried out at temperatures
from 100°K to 373°K. Two of these curves in
Fig. 15, not corrected for resolving power, show: (1) a
general shift of the order of 0.2 ev to lower energies,
except for the first maximum in the region labeled
p-type continuum, (2) broadening of the excitation
states, and (3) some alterations in the component
structure. It seems reasonable to suppose from these
results that the excitation electron in the first excitation
state is essentially shared with the near-neighbor
potassium ions, in agreement with the usual theoretical
model.

[Figure 15; horizontal axis: Energy (ev), 2 to 16.]

Fig. 15. Temperature dependence in the chlorine K-absorption
spectrum of crystalline KCl.

112 Parratt, Hempstead, and Porteus (to be published).

113 The observed excitation states in ultraviolet absorption in
KCl are also temperature dependent as to both the apparent
energy-positions and widths [Hartman, Nelson, and Siegfried,
Phys. Rev. 105, 123 (1957)]. The vibrations in the electric field

A feature of great solid-state interest in a curve of 
the type shown in Fig. 14 is the magnitude of the 
energy gap between the normal valence band and the 
bottom of the conduction continuum. We discuss this 
feature now. Nominally, the bottom of the continuum 
refers to the states of s-type symmetry whereas Fig. 14 
gives the density of states of p-type symmetry. Final 
states of s (or d) symmetry are involved in the ultra- 
violet absorption, and the desired energy gap has long 
been interpreted from this absorption as 9.46 ev in 
crystalline potassium chloride. This interpretation,
although the numerical value of 9.46 ev may possibly
be changed to about 8.3 ev as a result of recent work,
would place the bottom of the s-type continuum at 
about the position of the central band of excitation 
states in Fig. 14. Admixture of p type in the nominal 
s-type states is no doubt present, and some fraction of 
the density of states in the region labeled excitation 
states in Fig. 14 should therefore be interpreted as part 
of the normal continuum. There is no obvious feature 
of the curve by which we can identify either the bottom 
of the continuum or even the existence of the continuum 
as some appropriate fraction under the excitation states. 
Possibly the deep minimum at about 16 ev above the 
valence states represents the bottom of the p-type 
continuum. States to the right of this minimum are 
designated p-type continuum states in the figure but 
we do not wish thereby to imply that the continuum 
necessarily begins here. 

In general spectroscopy, it is common to add the
energies of two or more selected lines to get the energy
of a third; e.g., (K → L_III) + (L_III → M_II) = (K → M_II).
By a similar argument, in which the energy
positions of absorption edges are included, we may
possibly derive some help in identifying the gap between
the valence states and the bottom of the continuum.
In KCl the valence states are the Cl M_III states, and
we may write the gap-equations as

\[
\begin{aligned}
\text{gap (to $p$-type continuum)}
 &= K_{\mathrm{abs}} - (K \to M_{\mathrm{III}}) \quad \text{from Fig. 14} \\
 &= M_{\mathrm{I\,abs}} + (L_{\mathrm{III}} \to M_{\mathrm{I}})
    + (K \to L_{\mathrm{III}}) - (K \to M_{\mathrm{III}}) \qquad (1)
\end{aligned}
\]


in the region of a valence orbital are believed to be too complex 
for even qualitative prediction at this time. 

114 E.g., see N. F. Mott and R. W. Gurney, Electronic Processes
in Ionic Crystals (Oxford University Press, London, 1948),
second edition, p. 95.

115 The arguments of E. A. Taft and H. R. Philipp [J. Phys.
Chem. Solids 3, 1 (1957)] may be applied to the experimental
curves of Hartman, Nelson, and Siegfried [Phys. Rev. 105, 123
(1957)] for potassium chloride.

116 The author is indebted to Professor E. L. Jossem for
developing this argument in the present application.

The argument is essentially the Ritz-combination principle
mentioned earlier as being valid in a gross description but not
valid in detail. Our present interest is gross.
\[
\begin{aligned}
\text{gap (to $s$-type continuum)}
 &= M_{\mathrm{III\,abs}} \quad \text{from ultraviolet absorption} \\
 &= L_{\mathrm{III\,abs}} + (K \to L_{\mathrm{III}}) - (K \to M_{\mathrm{III}}). \qquad (2)
\end{aligned}
\]

The only real help to be gained in this type of argument
lies in a less ambiguous recognition of the appropriate
M_I and L_III absorption energies than is the case for K
absorption in Fig. 14; the excitation spectra are expected
to be somewhat different because of the different
positions of the electron vacancy. The M_I state has
the same symmetry as the K state, viz., s type, but the
L_III state has p-type symmetry. The gap-equations
(1) and (2) are written separately for the p-type and
for the s-type continua because it is believed that the
experimental values may well be different for these
two gaps, and it is alleged theoretically that the
admixture of p-type symmetry at the very bottom of
the nominal s-type continuum is most unlikely.

The energies of all three chlorine emission lines in the
gap-equations have been accurately measured for KCl
(K → M_III = 2815.12 ev, L_III → M_I = 182.75 ev, and
K → L_III = 2622.23 ev), but the chlorine L- and
M-absorption edges for KCl have not been recorded. The
Cl M_I abs for LiCl has been determined by O’Bryan
who remarks that the absorption bands “show a width
from 2 to 4.5 ev at the head with less intense continuous
absorption extending to shorter wavelengths.” The
energy at the center of the strong absorption band is
29.7 ev. O’Bryan’s description of the absorption curve
also fits fairly well the Cl K-absorption curve for LiCl
as reported by Parratt et al. It is reasonable to
conclude that 29 ev is probably a minimum value for
the Cl M_I absorption energy for KCl. The Cl L_III
absorption energy is also to be deduced indirectly from
measurements, and, again, only an approximate value
can be obtained. We use the Cl L_III absorption energy
for gaseous CCl4, 200.6 ev, as reported by Prins, and
add to this energy about 3.5 ev for the following
reasons: 1.5 ev are added as a guess as to the separation
between the first strong absorption peak and the
appropriate absorption edge. The remaining 2 ev comes
from a comparison of the Cl K-absorption spectra for
gaseous CCl4 by Nilsson with the Cl K absorption
for KCl by Parratt et al.; the K edge shifts by about
2 ev in going from the gaseous CCl4 to solid KCl. We
assume that the L_III edge shifts by the same amount.
Thus, the approximate energy of the Cl L_III absorption
is taken to be 204 ev.

Substitution of these numerical values in Eqs. (1)
and (2) gives

gap to p-type continuum = 19 ev
gap to s-type continuum = 11 ev.

The difference between the “unexcited” x-ray level (the 
level designated earlier with the subscript 7) and the energy of 
the absorption edge is presumed in this argument to be roughly 
the same in the various cases. 
The uncertainty in the 11-ev value may be 2 or 3 ev, 
and the 8 or 9 ev previously reported for the s-type 
case may possibly be a better value than the 11 ev. 
The 19 ev for the p-type case was deduced as a minimum 
value, and it also may be uncertain by a few ev. The 
$M_{\mathrm{I}}$- and $L_{\mathrm{III}}$-absorption edge energies are based on 
uncorrected curves recorded photographically and, in 
some instances, with low instrumental resolving power. 
Nevertheless, the conclusion that there is a difference 
of several ev between the effective bottoms of the s- 
and p-type continua is experimentally the best we have 
to date. This conclusion may easily be altered to say 
that there is only a very small amount of p admixture 
in the difference region. 

These arguments are interesting and suggest desirable 
supplementary measurements for the future. At the 
present time, they are not sufficiently quantitative to 
provide a convincing basis for the energy-gap interpre- 
tation of the curve of Fig. 14; they do however tend to 
confirm the division of the curve into the general 
regions labeled excitation states and p-type continuum. 


Other Types of X-Ray Curves 


There are other ways in x-ray spectroscopy to study 
the electronic band structure of solids, but these we 
barely mention. 

One way is to record and interpret a curve of the 
emission intensity of a characteristic line, say the $K\alpha_1$ 
line, at each of various values of the anode voltage 
in the neighborhood of the threshold excitation 
voltage. This method purports to examine the unoccupied 
levels of the solid by inquiring as to just where and 
with what probability the inner electron can be ejected 
to them by cathode-electron impact. It is analogous to 
the x-ray absorption method except that the radiative 
selection rules do not apply. When scrutinized more 
carefully, the correlation between electron energy and 
the emitted x-ray intensity is as complex as the interpre- 
tations in absorption work: Excitation states, plasma 
oscillations, and multiple-inner-electron vacancy states 
are all involved as well as the normal unperturbed 
solid-state bands. 

It is interesting to speculate in this method as 
follows: With the spectrometer initially set at the peak 
of $K\alpha_1$ for about twice threshold voltage on the x-ray 
tube, a wavelength scanning (by means of small changes 
in the Bragg angle) at a constant voltage near threshold 
may show the peak at a different wavelength. And the 
line now may be of a different width or asymmetry. 
Perhaps if both voltage and wavelength were variables, 
more than one “$K\alpha_1$ line” would be observed near 
threshold voltage; if so, they would correspond to 
different radiative combinations of $K$ and $L_{\mathrm{III}}$ normal 
and excitation states. It is believed likely, however, 
that only one broadened $K\alpha_1$ line would be observed in 
this fashion because, if more than one were to appear, 
then, in an appreciable fraction of the emission 
transitions, (1) the bound-ejected electron would have to 
jump differently by a few ev or (2) the valence-electron 
configuration would have to change differently. With 
regard to the latter possibility, it may be noted that 
the time required in the formation of the perturbation 
due to the inner vacancy near threshold voltage is 
somewhat longer than it is with much excess voltage 
on the x-ray tube, and consequently the respective 
probabilities for producing the various excitation states, 
especially of the valence-electron-configuration type, 
are perhaps altered. 

Another method of studying the electronic band 
structure of solids is by the spectral details observed 
near the short-wavelength limit of the continuous 
x-radiation.¹² The cathode electron comes to rest, or 
nearly so, wherever it can find an acceptable unoccupied 
level in the solid. The x-ray intensity pattern gives 
information concerning the positions of the respective 
levels and the density of states. In this method, the 
excitation states are probably not involved, but the 
plasma oscillations and the multiple-vacancy states are 
present to complicate the interpretations. 

12 S. T. Stephenson, “The continuous x-ray spectrum,” in 
Handbuch der Physik, S. Flügge, editor (Springer-Verlag, 
Berlin, 1957), Vol. 30, pp. 368-370, and references cited 
therein. 


VI. FURTHER READING 


For further discussion of the experimental tech- 
niques, types and details of observed curves, variety of 
solids that have been studied to date, theoretical calcu- 
lations, different speculations in interpretations, etc., 
the reader is perforce referred to the literature. Only a 
very few examples have been selected for inclusion in 
this review. Fortunately, several summaries of the 
general subject have recently appeared, each giving 
extensive references.'~® 
1. INTRODUCTION 


Oxide semiconductors, such as rutile, differ from 
the elemental semiconductors, silicon and germanium, 
in several respects. Probably the most 
important is the high melting point, which increases 
the difficulty of obtaining specimens of high purity. A 
related effect is the high concentration of vacancies 
which is “frozen” into the specimens at elevated 
temperatures. Oxygen vacancies are believed to exceed 
the cation vacancies in rutile. The formation of an 
oxygen vacancy in the otherwise pure oxide provides 
a source of two electrons having an activation energy 
less than the energy required to raise an electron to 
the conduction band in the stoichiometric material. 
Thus the transport properties of the oxide are altered 
by departures from stoichiometry, just as they are by 
the presence of impurities. 

The second respect in which oxide semiconductors 
differ from silicon and germanium is the separation of 
the cations by the oxide ions. This separation leads to 
a reduction of the overlapping of the cation wave 
functions. It is undecided whether the wave functions 
of the cation 3d electrons are to be considered as over- 
lapping, or “noninteracting.” This question has been 
discussed by Verwey et al.,¹ Mott,² Zener and Heikes,³ 
and very recently by Morin.⁴ It seems probable that 
the titanium 3d levels (or band) do not overlap the 
bottom of the much wider 4s conduction band. To the 
extent that electrical conduction is due to 3d electrons, 
one expects a low mobility compared with the electron 
mobility in silicon and germanium. 

Finally, one has the difficulty of distinguishing 
between chemisorbed oxygen and oxygen from the 
adjacent bulk material. In other words, our concept 
of a surface is less precise than in the case of elemental 
solids. 

For these reasons, it is appropriate and timely to 
examine critically all the available literature on the 
physical properties of rutile. In the following pages 
these properties are interpreted in terms of two elec- 
tronic models. The attempted correlation of the various 
types of measurement (electrical conductivity, thermo- 
power, etc.) by the use of these models is not satis- 
factory. It was therefore decided to discuss many 
physical properties that are not obviously significant 
from the point of view of the two models. 


2. CRYSTAL STRUCTURE 
2.1. Introduction 


Titanium dioxide exists in three crystalline modifi- 
cations, rutile, brookite and anatase, all of which have 
been prepared synthetically. Titanium dioxide pre- 
cipitated from sulfate or chloride solutions at room 
temperature is amorphous even after drying at 110°C.⁵ 
At elevated temperatures anatase is precipitated from 
sulfate solutions, but hydrolysis by direct boiling of 
chloride solutions produces rutile as the initial crystal- 
line product. Other forms are converted to rutile when 
heated to temperatures between 700 and 920°C. The 
transition is not reversible. The stability of anatase is 
increased by the presence of 0.1% of certain anions.⁶ 
The preparation and mutual transformation of these 
oxides are discussed in Barksdale.⁷ 

1 Verwey, Haayman, and Romeyn, Chem. Weekblad 44, 705 
(1948). 

2 N. F. Mott, Can. J. Phys. 34, 1356 (1956); Nuovo cimento 
Suppl., Ser. 10, 7, 312 (1958). 

3 C. Zener and R. R. Heikes, Conference on Magnetism and 
Magnetic Materials, Boston, Massachusetts, October 16-18, 1956 
(Institute of Electrical Engineers, New York, 1957). 

4 F. J. Morin, Bell System Tech. J. 37, 1047 (1958). 

5 Seppo Wilska, Acta Chem. Scand. 8, 1796 (1954). 

FIG. 1. Crystal structure of rutile (TiO₂). (Courtesy of R. G. 
Breckenridge, W. R. Hosler, and The Physical Review.) 

Moore⁸ has described the method of preparation of 
rutile single crystals by the Verneuil flame fusion 
method. When removed from the furnace the crystals 
are opaque black because of a slight deficiency of 
oxygen, but have the rutile crystal structure. After 
heating in a stream of oxygen the crystals become 
transparent, with a slight yellow coloration. 


2.2. Crystal Structure 


The crystal structure of rutile is usually described 
in terms of an ionic model based on the ions Ti⁴⁺ and 
O²⁻. Although Syrkin and Dyatkina⁹ consider that 
rutile may be regarded as intermediate between ionic 
and molecular in type, one notes that the Ti-O 
separation in the O-Ti-O grouping is actually slightly 
greater than the remaining four Ti-O spacings. This 
increased separation has been explained¹⁰ as resulting 
from the repulsion between one oxygen ion and the 
four equidistant oxygen ions in the same octahedron. 

The oxygen ions are arranged in the form of somewhat 
distorted octahedra (Fig. 1). Each octahedron 
shares one edge with adjacent members of the chain. 
Alternatively, the crystal structure may be visualized 
as … in the same layer are parallel, and the chains in 
adjacent layers are perpendicular to one another and to 
the c axis. 


6 … Knoll and U. Kühnhold, Naturwissenschaften 44, … 

7 Jelks Barksdale, editor, Titanium, Its Occurrence, Chemistry, 
and Technology (Ronald Press, New York, 1951). 

8 C. H. Moore, Jr., Mining Trans. 184, 194 (1949). 

9 Y. K. Syrkin and M. E. Dyatkina, The Structure of Molecules 
and the Chemical Bond (Butterworths Scientific Publications, 
London, 1950). 

10 H. Remy, Treatise on Inorganic Chemistry … 
2.3. Unit Cell Parameters 


The unit cell of rutile is tetragonal. Recent 
determinations of the cell parameters have been made by 
Legrand and Delville,¹¹ and Baur.¹² The values obtained 
by Baur were $a$ = 4.594 ± 0.003 Å, $c$ = 2.959 ± 0.002 Å, 
$c/a$ = 0.6441, and $x$ = 0.306 ± 0.001. The meaning of the 
symbol $x$ will be clear from its use in the following 
section. 


2.4. Bond Lengths and Angles 


The titanium ions are at the positions (0, 0, 0) and 
(½, ½, ½), and the four oxygen ions are at the positions 
±($x$, $x$, 0) and ±(½ + $x$, ½ − $x$, ½). The bond distances 
and angles are given in Table I (see also Fig. 2). 
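
The two Ti-O bond lengths follow directly from these atomic positions and the cell parameters of Sec. 2.3. The sketch below is a minimal illustration, assuming Baur's values $a$ = 4.594 Å, $c$ = 2.959 Å, and $x$ = 0.306; the function and variable names are illustrative only.

```python
import math

# Cell parameters quoted in Sec. 2.3 (Baur): a and c in angstroms, x dimensionless.
a, c, x = 4.594, 2.959, 0.306

def distance(p, q):
    """Distance in angstroms between two points given in fractional coordinates
    of the tetragonal cell (edges a, a, c)."""
    dx = (p[0] - q[0]) * a
    dy = (p[1] - q[1]) * a
    dz = (p[2] - q[2]) * c
    return math.sqrt(dx * dx + dy * dy + dz * dz)

ti_corner = (0.0, 0.0, 0.0)
ti_center = (0.5, 0.5, 0.5)
oxygen = (x, x, 0.0)                    # the +(x, x, 0) oxygen position

d_long = distance(ti_corner, oxygen)    # equals sqrt(2) * x * a
d_short = distance(ti_center, oxygen)   # equals sqrt(2*(1/2 - x)^2 * a^2 + c^2/4)

print(f"Ti-O, two bonds per octahedron:  {d_long:.3f} A")
print(f"Ti-O, four bonds per octahedron: {d_short:.3f} A")
```

With these inputs the two distances come out close to the 1.988 and 1.944 Å quoted in Sec. 2.5 as the observed Ti-O separations.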


2.5. Interatomic Distances from Ionic Radii 


The ionic radii (Zachariasen quoted by Kittel) of 
the cation and anion are, respectively, $R_C$ = 0.60 Å 
and $R_A$ = 1.46 Å. 

In ionic crystals, the interatomic distance $D_N$ is 
given by the relation $D_N = R_C + R_A + \Delta_N$, where $N$ is 
the coordination number of the cation, and $\Delta_N$ is a 
correction which depends upon $N$. In rutile the cation 
has six nearest O²⁻ neighbors, hence $N$ = 6. From 
collected crystallographic data of other compounds, $\Delta_6$ is 
found to be zero, or more correctly, $R_C$ and $R_A$ are 
calculated on the basis that $\Delta_N$ = 0 for $N$ = 6. The Ti-O 
separation calculated in this way (2.06 Å) is somewhat 
greater than the observed values (1.944 and 1.988 Å), 
indicating a slight departure from the rule. It is usually 
assumed that such a discrepancy reflects the 
contribution of the covalent forces to the crystal bonding. 


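TABLE I. Bond distances and angles in rutile. [Only fragments 
of the table survive: the column headings “Bond,” “Type,” and 
“Number of bonds of its type per unit cell,” and the entry 
Ti-O, 1.944 ± 0.00… Å.] 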


F. A. GRANT 


2.6. Other X-Ray Investigations 


By the use of Weissenberg photographs, Baur¹² has 
been able to carry out a Fourier projection of the 
elementary cell of rutile on the plane perpendicular to 
[001] (Fig. 3). This figure shows that the electron 
density does not fall to zero at any point between the 
oxygen and titanium ions. The “bridge” between cations 
and anions is an indication of a covalent contribution 
to the bonding. These results have been corroborated 
by the results obtained by a group of Spanish 
workers.¹³,¹⁴ 


2.7. Madelung Constant 


Assuming purely ionic bonding and regular octahedra 
as the building blocks, Born and Bollnow¹⁵,¹⁶ carried 
out a Madelung calculation for rutile. This analysis 
predicted a maximum Madelung constant when the 
axial ratio $c/a$ was equal to 0.721. The observed ratio 
is $c/a$ = 0.6441. The calculated value of the Madelung 
constant is 4.816. 

FIG. 2. The atomic arrangement of the tetragonal crystal rutile. 
Large circles represent oxygen atoms, small circles titanium atoms. 
The numbers refer to the bond types listed in Table I. 

Making the assumption that the bonding is completely 
ionic (based on Ti⁴⁺ and O²⁻ ions), and taking 
into account also the repulsive contributions to the 
total energy, Lennard-Jones and Dent¹⁷ have calculated 
the expected value of the parameter $x$ (see “Bond 
Lengths and Angles”). In their method the same fixed 
separation $r$ was assumed for all six Ti-O bonds, and 
the potential energy of the crystal was calculated as a 
function of $x$. The value of the parameter $x$ necessary to 
minimize the potential energy was then obtained. 
Repeating this procedure for a number of values of $r$, 
a curve of $x$ versus $r$ was plotted. Using the experimental 
data available at that time, SnO₂ and PbO₂ were found 
to lie on the theoretical curve. Using Baur's¹² 
experimental value of $x$ = 0.306 ± 0.001 for rutile, one obtains 
a value of $r$ = 2.05 ± 0.03 Å from the theoretical curve, 
which is somewhat higher than Baur's experimental 
value of 1.988 ± 0.006 Å. Once again, one finds a 
discrepancy which probably reflects a covalent 
contribution to the bonding in rutile. 

13 Bru Villaseca, Cubero, and Vega, Anales real soc. españ. fis. 
y quim. (Madrid) 46A, 317 (1950). 

14 M. Cubero and Hernandez Montis, Anales real soc. españ. 
fis. y quim. (Madrid) 48A, 133 (1952). 

15 M. Born and O. F. Bollnow, Naturwissenschaften 13, 559 
(1925). 

17 J. E. Lennard-Jones and B. M. Dent, Phil. Mag. 3, 1204 … 


2.8. Neutron Diffraction 


Using thermal neutrons, Shull and Wollan¹⁸ have 
obtained Laue diffraction patterns of a number of single 
crystals, including rutile. 

3. X-RAY PROPERTIES 
3.1. Soft X-Ray Spectra 
Confirmation of the band structure theory of solids 
is obtained from soft x-ray emission experiments. The 
probability of a transition from an electronic state in a 
higher band to a vacant lower state depends, among 
other factors, on the number of electrons in the upper 
state. Consequently the distribution of intensity with 
wavelength provides information concerning the density 
of states as a function of energy. 

Soft x-ray absorption experiments also provide 
information concerning the energy levels in a solid. 
The K-absorption edge corresponds to the transition 
(consistent with the selection rules) of a 1s electron to 
the lowest empty level. Experimental values of the 
absorption coefficients are plotted against photon 
energy in Fig. 4. 

FIG. 3. Fourier projection of rutile on the [001] plane 
(after W. H. Baur¹²). 

FIG. 4. K-edge structures for oxides of some of the transition 
metals (after Hanson and Knight²⁰). 

18 C. G. Shull and E. O. Wollan, Naturwissenschaften 36, 291 … 

The distinguishing features of these curves are a 
small low-energy absorption, followed by a steep rise 
in absorption coefficient. On the high-energy side of the 
absorption edge the curve exhibits a fine structure 
characteristic of the environment of the atom (the 
type of bonding, etc.). 

The shapes of such curves are determined by the 
absorbing atoms, their immediate surroundings, and by 
the type of structure of which they form a part. Kiestra¹⁹ 
has summarized his conclusions in the following words: 


“(a) In a region close to the edge, the behavior of 
the absorption coefficient is mainly a property of the 
atom in question. 

“(b) In the following region the fine structure is 
determined by the immediate surroundings of the atom. 

“(c) In the region of highest energy, the fine struc- 
ture depends essentially on the whole crystal lattice. 


“For the transition metals and their compounds (a) 
comprises the region up to about 40 eV from the edge, 
(b) the region between about 40 eV and 150 eV, and 
(c) the region above 150 eV.” 
Recently, Hanson and Knight²⁰ have investigated 
the K-absorption edge structure of the fluorides, oxides, 
and sulfides of a number of transition metals. These 
workers found that oxides in which the metal assumed 
a higher valence exhibited greater low-energy 
absorption than the same metals in their lower valence form. 
In highly ionic compounds, the low-energy absorption 
is less than in compounds between elements exhibiting 
less difference in electronegativity. 

When it is suspected that there is an appreciable 
change in symmetry and crystal bonding, as in barium 
titanate at the Curie temperature, the investigation of 
the K-absorption edge of titanium might prove to be 
of interest. 


19S. Kiestra, “Conferen 
The fine structure of the K-absorption edge has been 
investigated for pure titanium,” titanium and 
titanium dioxide, titanium metal, rutile, anatase, 
and brookite, and the L-series x-ray spectra of the 
elements K(19) to Ge(32).2° A limited interpretation 
of the experimental results has been given by these 
authors. Moreover, no attempt has been made to 
correlate these results with the electron transport 
properties of these materials. O'Bryan and Skinner²⁷ 
have investigated the soft x-ray spectroscopy of many 
metallic oxides and halides (not including rutile). 


3.2. X-Ray Absorption 


Brewster lists values of the mass absorption 
coefficient of rutile. At the wavelengths 0.01, 0.1, 1, 
and 2 Å, the values of the mass absorption coefficient 
are 0.05434, 0.1790, 38.71, and 250.4, 
respectively. The linear absorption coefficient is equal 
to the product of the mass absorption coefficient and 
the density. 
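
As an illustration of this product rule, the sketch below converts the quoted mass absorption coefficients to linear absorption coefficients. Two quantities are assumptions not found in the text: the units of the mass absorption coefficient (taken here as cm²/g) and the density of rutile (taken here as a typical handbook figure of 4.25 g/cm³).

```python
# Mass absorption coefficients of rutile quoted above, keyed by wavelength in angstroms.
# ASSUMPTION: units are cm^2/g; the text does not state them.
mass_absorption = {0.01: 0.05434, 0.1: 0.1790, 1.0: 38.71, 2.0: 250.4}

# ASSUMPTION: density of rutile in g/cm^3 (typical handbook value, not from the text).
density = 4.25

# Linear absorption coefficient = mass absorption coefficient x density.
linear_absorption = {wl: mu * density for wl, mu in mass_absorption.items()}

for wl, mu_lin in linear_absorption.items():
    print(f"{wl:5.2f} A: {mu_lin:10.3f} per cm")
```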


4. CRITERIA OF BOND TYPE 
4.1. Introduction 


When the covalent contribution to the bonding is 
small, as in the alkali halides, a quantitative estimate 
of its effect may be made. When a quantitative estimate 
cannot be made, however, it is usual to make use of 
certain criteria of covalency, such as the type of crystal 
structure, atomic radii, bond lengths, position in the 
periodic table and electronegativity difference of the 
components, Pauling’s rules, etc. 

It should be emphasized that the following criteria 
of the extent of the ionic contribution to the bonding 
are not necessarily in close agreement. Their value is 
therefore more qualitative than quantitative in nature. 


4.2. Dielectric Constant 


The optical and static dielectric constants of covalent 
crystals are almost equal. In contrast, the static di- 
electric constant of highly ionic crystals is considerably 
greater than the optical dielectric constant. The values 
for rutile single crystals are 173 (static) and 8.4 (optical) in 
the c direction, indicative of a strong ionic character. 


21 Seljakob, Krasnikov, and Stellezkii, Z. Physik 45, 548 (1927). 
22 G. Okuno, Proc. Phys.-Math. Soc. Japan 18, 306 (1936). 
23 M. A. Blokhin, Doklady Akad. Nauk S.S.S.R. 95, 965 (1954). 
24 V. H. Sanner, Z. Physik 112, 430 (1939). 
25 I. B. Borovskii, Compt. rend. acad. sci. U.R.S.S. 26, 764 
(in English). 
26 … Arkiv Mat. Astron. Fysik 25A, No. 32 (1937). 
27 H. M. O'Bryan and H. W. B. Skinner, Proc. Roy. Soc. … 
4.3. Electronegativity 


Mulliken²⁹⁻³¹ defines the electronegativity of an atom 
as the mean of its ionization potential and its electron 
affinity, thereby making use of observable quantities. 
The concept of electronegativity is less precise when 
more than one valence electron is involved. 

Hannay and Smyth³² have derived a formula which 
enables the extent of the ionic character of a bond to 
be estimated from the electronegativities of the bonded 
partners. This approach yields a value of 43% ionic 
character for the Ti-O bond in rutile, which is about 
the same as that of HF and KI. 
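
The Hannay and Smyth estimate can be sketched numerically. The formula is not reproduced in the text; the form used below, 16|Δχ| + 3.5Δχ² percent, is the expression usually attributed to those authors, and the electronegativity values for titanium and oxygen are assumed Pauling-scale figures rather than numbers taken from this review.

```python
def percent_ionic_character(chi_a, chi_b):
    """Hannay-Smyth style estimate of ionic character, in percent,
    from the electronegativity difference of the two bonded atoms."""
    dx = abs(chi_a - chi_b)
    return 16.0 * dx + 3.5 * dx * dx

# ASSUMPTION: Pauling-scale electronegativities, not given in the text.
chi_ti, chi_o = 1.6, 3.5

ionic = percent_ionic_character(chi_ti, chi_o)
print(f"Estimated ionic character of the Ti-O bond: {ionic:.0f}%")
```

With these assumed inputs the estimate agrees with the 43% quoted above; a different choice of electronegativity scale would shift the result by a few percent.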


4.4. Magnetic Susceptibility 


The tetravalent titanium and the divalent oxygen 
ion both have noble-gas configurations. The temperature- 
dependent paramagnetic susceptibility of stoichiometric 
rutile should therefore be zero, whether the bonding is 
ionic or covalent. Feeble (Van Vleck) temperature-independent 
paramagnetism, however, is observed, 
possibly caused by “distortion due to interatomic 
forces.”³³ This distortion is evidence for a departure from 
ionic bonding, and probably reflects a covalent 
contribution. 


4.5. Bond Lengths 


Assuming ionic bonding, but taking into account 
repulsive forces, Lennard-Jones and Dent (see Sec. 2) 
have calculated the cation-anion separation in the plane 
perpendicular to the c axis expected for rutile-type 
crystals. The measured value of this bond length in 
rutile is somewhat shorter than the calculated value, 
and may be taken as evidence for a covalent contri- 
bution to the bonding. 


4.6. Fourier Projection 


Baur has carried out a Fourier projection of the 
elementary cell of rutile perpendicular to [001] (Fig. 
3). For purely ionic bonding one would expect the 
electron density to fall to zero at some point between 
the ions.** The considerably higher electron density 
between the oxygen and titanium ions in Fig. 3 is evidence 
for a covalent contribution to the bonding. 


4.7. Solubility 


The low solubility of rutile in polar solvents such as 
water, is taken as evidence for a considerable covalent 
contribution to the Ti—O bonding. 


29 R. S. Mulliken, J. Chem. Phys. 2, 782 (1934); ibid. 3, 573 … 

30 C. A. Coulson, Valence (Clarendon Press, Oxford, 1951). 

31 A. F. Wells, Structural Inorganic Chemistry (Oxford University 
Press, New York, 1950). 

32 N. B. Hannay and C. P. Smyth, J. Am. Chem. Soc. 68, 171 
(1946). This formula also appears in references 30 and 31. 

33 L. F. Bates, Modern Magnetism (Cambridge University Press, 
New York, 1951), pp. 44 and 272. 
4.8. Pauling’s Rules 


One may consider the reduction in stability in the 
rutile structure resulting from the sharing of one edge 
by adjacent octahedra (Pauling’s third rule) to indi- 
cate a departure from purely ionic bonding. This 
reduction in stability is probably compensated by a 
corresponding increase in the covalent contribution to 
the bonding, made possible by this structure. 


5. PARAMETERS FOR THE EVALUATION 
OF EXPERIMENTAL DATA OF 
SEMICONDUCTORS 


5.1. Introduction 


It is desirable, whenever possible, to present experi- 
mental data in terms of parameters derived from a 
(more or less simple) model. In the present case, neither 
the accuracy nor the consistency of the available 
experimental data permits a choice to be made between 
the commonly used models. For this reason, the 
properties of rutile are discussed in terms of two simple 
models. 

In this section, these two models are examined, and 
the experimentally observed quantities are expressed 
in terms of such parameters as the ionization potential 
of the donors, the forbidden energy gap, the effective 
mass of the electrons, etc. These equations are then 
used as an indication of the appropriate manner in 
which to plot the experimental data. In the following 
sections the experimental data pertaining to the electron 
transport properties of rutile are introduced, and the 
parameters derived from these data are given. In a 
later section, the values of these parameters obtained 
from magnetic susceptibility, electrical conductivity, 
thermopower, and other measurements, are sum- 
marized. This summary may then be examined for 
indications of the most satisfactory model to be used 
in discussing rutile. 

It is assumed (as in Fig. 5) that there is a forbidden 
gap of several electron volts between the valence and 
conduction bands. Donor sites are believed to lie one- or 
two-hundred millivolts below the bottom of the 
conduction band. There may be overlapping of donor wave 
functions leading to the formation of donor bands. In 
this case the effective mass is expected to be much 
higher in the donor band than in the conduction band. 

The various symbols used in later sections are defined 
below: 


$n_0 = 2(2\pi m k T_0/h^2)^{3/2}$ = the constant 2.52×10¹⁹ 
electrons/cc, defined for $T_0$ = 300°K. This quantity 
is the so-called “effective number of states.” 

$n_c$ = the concentration of electrons in the conduction 
band (per cm³). Electrons in donor bands, which may 
incidentally contribute to the conduction, are excluded. 

$M = (m_e/m)$ = the density-of-states effective electron 
mass. 

$t = (T/300)$ = the absolute temperature divided by 
300°K. 

$e$ = the electronic charge in coulombs. 

$\eta$ = the height of the Fermi level in joules, measured 
from the bottom of the conduction band. 

$k$ = the Boltzmann constant (units depend upon the 
context; joules per degree when transport properties are 
in question). 

$T$ = the absolute temperature in °K. 

$N_d$ = the concentration of donor atoms (per cm³). 

$E_d$ = the energy of the donor sites with respect to the 
bottom of the conduction band, in joules. 

$b$ = the electron mobility in cm² volt⁻¹ sec⁻¹. 

$E_g$ = the energy gap, i.e., the separation of the valence 
and conduction bands, in joules. 

We take the criterion for the applicability of 
Maxwell-Boltzmann statistics to be $-\eta/kT > 3$. 

FIG. 5. Two types of band structure assumed in the discussion 
of the transport properties of rutile. 

35 R. C. Evans, An Introduction to Crystal Chemistry (Cambridge 
University Press, New York, 1952). 

36 W. Crawford Dunlap, Jr., An Introduction to Semiconductors 
(John Wiley & Sons, Inc., New York, 1957). 
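
The numerical value of the “effective number of states” can be checked from its definition, $2(2\pi m k T_0/h^2)^{3/2}$ at 300°K with the free-electron mass. The sketch below is a minimal check; the physical constants are rounded modern values supplied here as assumptions, so the result differs slightly from the 2.52×10¹⁹ figure of the text.

```python
import math

# ASSUMPTION: rounded modern SI values of the physical constants.
m_e = 9.109e-31   # free-electron mass, kg
k_B = 1.381e-23   # Boltzmann constant, J/K
h = 6.626e-34     # Planck constant, J s
T0 = 300.0        # reference temperature, K

# Effective number of states: 2 * (2*pi*m*k*T0 / h^2)^(3/2), first per m^3.
n0_per_m3 = 2.0 * (2.0 * math.pi * m_e * k_B * T0 / h**2) ** 1.5
n0_per_cm3 = n0_per_m3 * 1.0e-6

print(f"n0 = {n0_per_cm3:.3e} per cm^3")
```

The result is about 2.5×10¹⁹ per cm³, in agreement with the constant quoted in the list of symbols to within the precision of the constants used.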


5.2. Donor Sites in Rutile 


Nonstoichiometry is believed to result in the presence 
of oxygen vacancies, together with twice that number 
of electrons, probably on titanium sites.† No distinction 
will be made between nonstoichiometric rutile and rutile 
containing pentavalent impurities. In both cases the 
formation of donor sites will result. The presence of 
acceptor levels is considered if and when necessitated 
by experimental evidence. Two cases are considered 
corresponding to the two models: (a) noninteracting 
hydrogen-like donor sites, at sufficiently small 
concentration, in which case a single s electron is considered 
to move in the field of a net positive charge, and (b) 
interacting donor sites, when the overlapping of the 
donor wave functions is appreciable. This interaction 
may lead to the formation of a donor band, and 
possibly to associations of hydrogen molecule-… 
vacancies, in which two electrons are considered to 
move in the field of a net positive charge of two. 

† The experiments of Gray and McCain indicate that … 
of rutile in an atmosphere of hydrogen leads to a loss i… 
the rutile. This result is evidence for the creation of oxygen 
vacancies rather than hydrogen interstitial atoms. T. … 
Defect Solid State (Interscience Publishers, Inc., …), 
p. 287. 
The ground state of the donor sites (or the midpoint 
of the donor bands) is assumed to lie at an energy level 
$E_d$ with respect to the bottom of the conduction band 
($E_d$ is expected to be a negative number). The simple 
model of spherical energy surfaces and a single 
density-of-states mass is applied to the conduction band. The 
effective electron mass is assumed to be greater in the 
donor band than in the conduction band. These models 
may have to be modified if evidence is found of a second 
source of electrons (i.e., if oxygen vacancies behave as 
helium-like centers). 

Because of the random distribution of donors, the 
donor bands may have “tails” so that some overlapping 
of the conduction band may occur. However, this 
modification is not introduced into the mathematics 
because of the obvious difficulties. It can be shown³⁷ 
that these models lead to the following results: 


$$n_c = n_0 M^{3/2} t^{3/2} \exp(\eta/kT). \tag{5-1}$$ 


5.2.1. Noninteracting Donor Sites 


We usually make the additional assumption that the 
Fermi level $\eta$ lies more than $3kT$ higher than the donor 
impurity levels.‡ This assumption implies that less than 
one impurity site in twenty is ionized, and leads to the 
following relation: 


$$n_c = \tfrac{1}{2} N_d \exp[(E_d - \eta)/kT], \tag{5-2}$$ 


where $N_d$ = the concentration of donor atoms (per cm³), 
and $E_d$ = the energy of the donor sites with respect to 
the bottom of the conduction band. 

If Eqs. (5-1) and (5-2) are combined, one obtains the 
following expression for the concentration of conduction 
electrons in terms of the ionization potential of the 
donor impurity sites: 


$$n_c = \left(\frac{n_0 N_d}{2}\right)^{1/2} M^{3/4} t^{3/4} \exp(E_d/2kT). \tag{5-3}$$ 
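
The step from Eqs. (5-1) and (5-2) to Eq. (5-3) can be verified numerically: multiplying the first two relations eliminates $\eta$ and gives $n_c^2 = \tfrac{1}{2} n_0 N_d M^{3/2} t^{3/2} \exp(E_d/kT)$. The sketch below evaluates Eq. (5-3), recovers $\eta$ from Eq. (5-1), and confirms that Eq. (5-2) returns the same concentration. The sample values (donor density, donor depth, temperature) are hypothetical and chosen only for illustration; energies are expressed in ev rather than joules for convenience.

```python
import math

K_B_EV = 8.617e-5   # Boltzmann constant in ev per degree
N0 = 2.52e19        # effective number of states at 300 K, per cm^3 (value from the text)

def n_c_isolated_donors(T, N_d, E_d, M=1.0):
    """Eq. (5-3): conduction-electron concentration per cm^3.
    E_d is in ev and negative (donor level below the band edge)."""
    t = T / 300.0
    return (math.sqrt(N0 * N_d / 2.0) * M**0.75 * t**0.75
            * math.exp(E_d / (2.0 * K_B_EV * T)))

def fermi_level(T, n_c, M=1.0):
    """Eq. (5-1) solved for eta, in ev, measured from the conduction-band edge."""
    t = T / 300.0
    return K_B_EV * T * math.log(n_c / (N0 * M**1.5 * t**1.5))

# HYPOTHETICAL sample: 1e18 donors per cm^3 lying 0.15 ev below the band, at 100 K.
T, N_d, E_d = 100.0, 1.0e18, -0.15

n_c = n_c_isolated_donors(T, N_d, E_d)
eta = fermi_level(T, n_c)

# Consistency check: Eq. (5-2) evaluated with this eta must reproduce n_c.
n_c_from_52 = 0.5 * N_d * math.exp((E_d - eta) / (K_B_EV * T))

print(f"n_c from Eq. (5-3): {n_c:.3e} per cm^3")
print(f"n_c from Eq. (5-2): {n_c_from_52:.3e} per cm^3")
print(f"eta = {eta:.4f} ev, (eta - E_d)/kT = {(eta - E_d) / (K_B_EV * T):.2f}")
```

For this sample the Fermi level lies several $kT$ above the donor level and several $kT$ below the band edge, so both the assumption behind Eq. (5-2) and the Maxwell-Boltzmann criterion are satisfied.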


5.2.2. Donor Bands 


When the wave functions of the donor sites overlap, 
the energy levels of the isolated donors must be replaced 
by bands. Since there are $N_d$ energy levels in the band, 
and each level may accommodate two of the $N_d$ 
electrons, the donor band is only half filled at 0°K. Thermal 
excitations of electrons to the conduction band tend to 
lower the Fermi level as the temperature is raised. 
However, it can be shown that the Fermi level remains 
near the donor band provided that $N_d \gg n_c$. In this case 
it is appropriate to use the height of the Fermi level $\eta$ 
as an indication of the position of the donor band. For 
this purpose $\eta$ is expressed in terms of observable 
quantities by taking logarithms of both sides of Eq. 
(5-1) and multiplying by $kT$. 

37 See Eq. (5.33a) of reference 36. 

‡ Because of the high value of the dielectric constant of rutile, 
$E_d$ may prove to be so small that this assumption is justified only 
at very low temperatures, and for sufficiently high donor 
concentrations. The effect of the high dielectric constant will be 
counteracted to some extent if the effective mass ratio is also high. 

… reference 36. 

… for the factor ½ in Eq. (5-2) found in Ap… 
E. Spenke, Elektronische Halbleiter (Springer-Verlag, … 

… the factor 2, and the sign of $E_d$, this is Eq. (5-31) 
… is measured with respect to the … 


5.3. Interpretation of Hall Data 


The Hall constant $R$ for nondegenerate n-type 
semiconductors (discussed also in Sec. 8) is related to $n_c$ by 
the relation 


$$n_c = -r/eR = -6.25\times10^{18}\, r/R. \tag{5-4}$$ 


The magnitude of the coefficient $r$, which depends on 
the type of scattering, lies between 1 and 2. 


5.3.1. Noninteracting Donor Sites 
Combining Eqs. (5-3) and (5-4) one obtains


$$ \frac{1}{(-R)} = \text{constant} \times t^{3/4} \exp(E_d/2kT), \tag{5-5} $$

or

$$ \ln(-R) + \tfrac{3}{4}\ln t = K_1 - E_d/2kT, \tag{5-6} $$


in which $K_1$ is a constant which is dependent on M,
$N_d$, and on the coefficient r of Eq. (5-4). When the
quantity $[\ln(-R) + \tfrac{3}{4}\ln t]$ is plotted against reciprocal
temperature, one obtains $-E_d$ from the slope, and if one
knows $N_d$, and if M is temperature independent, one
obtains M from the value of the intercept, $K_1$, made
by this straight line.
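
The plot described by Eq. (5-6) can be sketched numerically. The following is a minimal illustration only: the function names are hypothetical, the Hall data are synthetic (generated from Eq. (5-6) itself, not taken from any measurement), and $E_d$ is taken as negative, i.e., measured downward from the conduction band edge, which is the sign convention implied by Eqs. (5-5) and (5-6).

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K


def fit_line(xs, ys):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def donor_energy_from_hall(temps_k, hall_coeffs):
    """Apply Eq. (5-6); returns (E_d in eV, intercept K_1)."""
    xs = [1.0 / temp for temp in temps_k]
    ys = [math.log(-r_hall) + 0.75 * math.log(temp / 300.0)
          for temp, r_hall in zip(temps_k, hall_coeffs)]
    slope, intercept = fit_line(xs, ys)
    # The slope of the plot against 1/T is -E_d / (2k).
    return -2.0 * K_B_EV * slope, intercept


# Synthetic (not measured) Hall coefficients built from Eq. (5-6).
E_D_TRUE, K1_TRUE = -0.30, 1.0
temps = [200.0, 250.0, 300.0, 350.0, 400.0]
hall = [-math.exp(K1_TRUE - E_D_TRUE / (2.0 * K_B_EV * temp)
                  - 0.75 * math.log(temp / 300.0)) for temp in temps]
e_d, k_1 = donor_energy_from_hall(temps, hall)
print(f"E_d = {e_d:.3f} eV, K_1 = {k_1:.3f}")
```

With real data the scatter about the straight line, and any curvature, would indicate how well the assumption of noninteracting donors holds.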


5.3.2. Donor Bands 


In Sec. 5.2.2 the height of the Fermi level has been
discussed in the case in which the overlapping of the
donor wave functions leads to the formation of donor
bands. Provided that $N_d \gg n_c$, it may be shown that the
Fermi level lies below, but near, the midpoint of the
donor band. It is reasonable to assume that the electron
mobility in the conduction band is much greater than
in the donor band. It is therefore appropriate to plot
the high temperature Hall data in terms of Eq. (5-1),
which may be rewritten with the known quantities on
the left-hand side, as follows:


$$ \ln n_c - \ln n_0 - \tfrac{3}{2}\ln t = \tfrac{3}{2}\ln M + \eta/kT. \tag{5-7} $$


At sufficiently low temperatures, however, as the con- 
centration of conduction electrons decreases, the effect 
of the electrons in the donor band is expected to 
dominate. Maxwell-Boltzmann statistics are inappli- 
cable to these latter electrons. 
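
As an illustration of Eq. (5-7), the sketch below solves it for the Fermi level $\eta$ given a conduction-electron concentration. The numerical value of $n_0$ is an assumption (the free-electron density-of-states concentration at 300°K, about 2.5 × 10^19 cm^-3); it is not stated in this section, and the function name and sample inputs are hypothetical.

```python
import math

K_B_EV = 8.617333e-5   # Boltzmann constant, eV/K
N0_CM3 = 2.51e19       # ASSUMED value of n_0 (cm^-3); not given in this section


def fermi_level_ev(n_c, temp_k, mass_ratio):
    """Solve Eq. (5-7) for eta (eV), measured from the conduction band edge."""
    t = temp_k / 300.0
    return K_B_EV * temp_k * (math.log(n_c / N0_CM3)
                              - 1.5 * math.log(t)
                              - 1.5 * math.log(mass_ratio))


# Illustrative inputs only: n_c = 1e18 cm^-3, T = 300 K, M = 1.
eta = fermi_level_ev(1.0e18, 300.0, 1.0)
print(f"eta = {eta:.4f} eV")
```

A negative result places the Fermi level below the conduction band edge, as required for the Maxwell-Boltzmann approximation used in Eq. (5-1).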


5.3.3. Hall Mobility 
The electrical conductivity may be expressed


$$ \sigma = n_c e b. \tag{5-8} $$

Combining Eqs. (5-4) and (5-8) one obtains an
expression for the mobility b in terms of the Hall coefficient
and the electrical conductivity σ:


$$ b = \sigma(-R/r). \tag{5-9} $$
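
Equations (5-4) and (5-9) can be applied directly to a pair of room-temperature measurements. The sketch below is illustrative: the function names are hypothetical, the Hall coefficient is assumed to be in cm^3/coulomb and the resistivity in ohm cm, and r is set to 1 for lack of other information.

```python
E_CHARGE = 1.602176634e-19  # electronic charge, coulombs


def carrier_concentration(hall_coeff, r=1.0):
    """Eq. (5-4): n_c = r / (e * (-R)), in cm^-3 for R in cm^3/C."""
    return r / (E_CHARGE * (-hall_coeff))


def hall_mobility(conductivity, hall_coeff, r=1.0):
    """Eq. (5-9): b = sigma * (-R) / r, in cm^2/(V s)."""
    return conductivity * (-hall_coeff) / r


# Illustrative inputs: resistivity 8 ohm cm, Hall coefficient -4 cm^3/C.
sigma = 1.0 / 8.0
n_c = carrier_concentration(-4.0)
b = hall_mobility(sigma, -4.0)
print(f"n_c = {n_c:.3e} cm^-3, b = {b:.2f} cm^2/(V s)")
```

The product $n_c e b$ reproduces the input conductivity, which is simply Eq. (5-8) read in reverse; the value of r cancels from that product.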


5.4. Interpretation of Electrical Conductivity Data 
5.4.1. Intrinsic Semiconductors 


If the density-of-states masses of electrons and holes
are equal, then the Fermi level is in the middle of the
energy gap, i.e., $\eta = -E_g/2$. Combining Eqs. (5-1) and
(5-8), one obtains


$$ \sigma = n_0 e b M^{3/2} t^{3/2} \exp(-E_g/2kT). \tag{5-10} $$


In this case the temperature dependence of the exponential
factor is usually so great in comparison with
that of the factors b and $t^{3/2}$ that Eq. (5-10) may be
written


$$ \sigma = K_2 \exp(-E_g/2kT). \tag{5-11} $$


$E_g$ may then be evaluated from the slope of a plot of
ln σ versus reciprocal temperature.
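
The slope evaluation implied by Eq. (5-11) can be written out for two points on the plot. This is a sketch with hypothetical function names; the conductivities are synthetic values generated from Eq. (5-11) with an assumed gap of 3.0 ev, not measurements.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K


def sigma_model(temp_k, k_2, e_gap):
    """Eq. (5-11): sigma = K_2 exp(-E_g / 2kT)."""
    return k_2 * math.exp(-e_gap / (2.0 * K_B_EV * temp_k))


def gap_from_two_points(t1, sigma1, t2, sigma2):
    """E_g from the slope of ln(sigma) against 1/T between two points."""
    return 2.0 * K_B_EV * math.log(sigma2 / sigma1) / (1.0 / t1 - 1.0 / t2)


# Synthetic conductivities at 800 K and 1200 K (assumed K_2 and E_g).
s1 = sigma_model(800.0, 1.0e3, 3.0)
s2 = sigma_model(1200.0, 1.0e3, 3.0)
e_gap = gap_from_two_points(800.0, s1, 1200.0, s2)
print(f"E_g = {e_gap:.3f} eV")
```

In practice a least-squares line through many points would be preferred to a two-point slope; the two-point form is used here only to make the factor of 2k explicit.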


5.4.2. Impurity Semiconductors 


Isolated donors.—In the case of noninteracting impurities,
one combines Eqs. (5-3) and (5-8) to give


$$ \ln\sigma = \tfrac{1}{2}\ln(n_0 N_d/2) + \ln e + \ln b + \tfrac{3}{4}\ln M + \tfrac{3}{4}\ln t + E_d/2kT. \tag{5-12} $$


Without some assumption concerning the temperature
dependence of the mobility b, $E_d$ cannot be evaluated
from conductivity data alone. (The assumption of a
single density-of-states electron mass for the conduction
band necessarily implies that M is considered to be
temperature independent.)

Breckenridge and Hosler (following Fröhlich)
attempt to interpret their data by using the following
relation for the mobility when the scattering is predominantly
due to optical lattice vibrations:


$$ b = A(\exp(\theta/T) - 1), \tag{5-13} $$


where θ is the Debye temperature. After the appropriate
substitution


$$ \ln\sigma - \tfrac{3}{4}\ln t - \ln(\exp(\theta/T) - 1) = K_3 + (E_d/2kT), \tag{5-14} $$


in which $K_3$ is a constant which depends on $N_d$, A, and
M. An assumption concerning the value of θ permits
the left-hand side of Eq. (5-14) to be evaluated from
conductivity measurements, and plotted against reciprocal
temperature. From the slope $E_d$ may be
determined. Since $K_3$ is dependent on the unknowns
$N_d$, A, and M, it is not feasible to attempt an interpretation
of the intercept of this plot in the absence of
further information.


R. G. Breckenridge and W. R. Hosler, Phys. Rev. 91, 793 (1953).
H. Fröhlich and N. F. Mott, Proc. Roy. Soc. (London) A171,

Actually (as shown in Sec. 8), the Hall mobility data
of Breckenridge and Hosler for slightly reduced rutile
are adequately represented above room temperature
by the empirical relation


$$ b = aT^{-2.5}, \tag{5-15} $$


where T is the absolute temperature, and a is a constant
independent of temperature. Substitution of this relation
into Eq. (5-12) enables one to write


$$ \ln\sigma + 1.75\ln t = K_4 + E_d/2kT. \tag{5-16} $$
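
The origin of the coefficient 1.75 in Eq. (5-16) may be made explicit by the following worked step. It is a sketch that assumes the mobility law $b = aT^{-2.5}$ of Eq. (5-15) and the definition $t = T/300$; the grouping of constants into $K_4$ is one consistent choice and is not quoted from the original.

```latex
\begin{aligned}
\ln b &= \ln a - 2.5\ln T = \ln a - 2.5\ln 300 - 2.5\ln t, \\
\ln\sigma &= \tfrac{1}{2}\ln(n_0 N_d/2) + \ln e + \ln b
            + \tfrac{3}{4}\ln M + \tfrac{3}{4}\ln t + E_d/2kT, \\
\ln\sigma + \left(2.5 - \tfrac{3}{4}\right)\ln t
          &= \underbrace{\tfrac{1}{2}\ln(n_0 N_d/2) + \ln e + \ln a
             - 2.5\ln 300 + \tfrac{3}{4}\ln M}_{K_4} + E_d/2kT .
\end{aligned}
```

The temperature dependence of the mobility thus enters only through the additional term in ln t, and the slope against reciprocal temperature remains $E_d/2k$.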


Donor bands.—At sufficiently low temperatures the
electrons would be expected to lie in the donor band.
The donor-band conductivity in this low-temperature
range should depend on temperature chiefly through
the mobility factor b. Little information would be
obtainable concerning the position of the donor band
with respect to the conduction band.

At higher temperatures the electrons in the con- 
duction band should make a greater contribution to the 
conductivity than the electrons in the donor band. 
Provided that information is available concerning the 
electron mobility, the data on electrical conductivity 
at higher temperatures may be used to give information 
concerning the Fermi level, and hence the approximate 
position of the donor band. Making the usual assump- 
tions concerning the applicability of Maxwell-Boltz- 
mann statistics and of the concept of a single effective 
electron mass to the electrons in the conduction band, 
we may combine Eqs. (5-1) and (5-8) to yield the
following relation:


$$ \ln\sigma - \ln n_0 e - \tfrac{3}{2}\ln t = \tfrac{3}{2}\ln M + \ln b + \eta/kT, \tag{5-17} $$


in which the quantities on the left-hand side are all
known. When the assumption implied in Eq. (5-13) is
valid, we may combine Eqs. (5-13) and (5-17) and
express the result in the form


$$ \ln\sigma - \tfrac{3}{2}\ln t - \ln(\exp(\theta/T) - 1) = K_5 + \eta/kT. \tag{5-18} $$


When the $T^{-2.5}$ relation [Eq. (5-15)] is valid, one
may write


$$ \ln\sigma + \ln t = K_6 + \eta/kT. \tag{5-19} $$


5.5. Interpretation of Thermopower Data 
5.5.1. Impurity Semiconductors 


Noninteracting donors.—Equating Eqs. (5-1) and
(5-2) one obtains


$$ n_0 M^{3/2} t^{3/2} \exp(\eta/kT) = \tfrac{1}{2} N_d \exp[(E_d - \eta)/kT], \tag{5-20} $$

which may be expressed in the form


$$ \tfrac{1}{2}\ln n_0 + \tfrac{3}{4}\ln M + \tfrac{3}{4}\ln t + \eta/kT = \tfrac{1}{2}\ln N_d + E_d/2kT - \tfrac{1}{2}\ln 2. \tag{5-21} $$


Noting that


$$ Q = (\eta - Q^*)/eT \tag{5-22} $$


[see Eq. (7-2)], where Q* is approximately equal to
2kT, one obtains

$$ eQ/k + \tfrac{3}{4}\ln(T/300) = K_7 + E_d/2kT, \tag{5-23} $$


in which the constant $K_7$ includes Q*/kT, the
impurity concentration term, and the effective mass
term, which are assumed to be temperature independent.
The thermopower, Q, is defined in Sec. 7.1.

Donor bands.—At sufficiently low temperatures, the 
contribution of the electrons in the conduction band is 
negligible. In this temperature region, the electrons in 
the donor band are expected to exhibit degeneracy, 
and the dynamic term Q*/kT of Eq. (5-22) should make 
the major contribution to the thermopower. This 
contribution (approximately 2k/e) is almost tempera- 
ture independent. 

At higher temperatures, where the electrons in the
conduction band make the main contribution to the
thermopower, the height of the Fermi level is related
to the thermopower through Eq. (5-22). Replacing Q*
by its approximate value 2kT, one obtains


$$ E_F = \eta/e = (Q + 2k/e)T. \tag{5-24} $$


Thus when the donor sites interact to form donor bands 
one expects an almost temperature-independent value 
for QT in the higher temperature range. At still higher 
temperatures where thermal excitation raises a large 
proportion of the electrons from the donor band to the 
conduction band, the Fermi level falls below the donor 
band with increasing temperature, and Q should tend 
to a constant value. 
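
Equation (5-24), with Q* replaced by 2kT, can be evaluated as follows. The sketch uses hypothetical function names and an illustrative thermopower value (not a measurement); it also evaluates the nearly temperature-independent dynamic term $-2k/e$.

```python
K_OVER_E = 8.617333e-5  # k/e in volts per degree


def dynamic_term_uv_per_deg():
    """The contribution -2k/e to the thermopower, in microvolts per degree."""
    return -2.0 * K_OVER_E * 1.0e6


def fermi_level_from_thermopower(q_uv_per_deg, temp_k):
    """Eq. (5-24): eta/e = (Q + 2k/e) T, in volts (numerically eta in eV)."""
    return (q_uv_per_deg * 1.0e-6 + 2.0 * K_OVER_E) * temp_k


# Illustrative input only: Q = -600 microvolts per degree at 400 K.
dyn = dynamic_term_uv_per_deg()
eta = fermi_level_from_thermopower(-600.0, 400.0)
print(f"dynamic term = {dyn:.1f} uV/deg, eta = {eta:.4f} eV")
```

A measured thermopower whose magnitude is close to that of the dynamic term alone would correspond to a Fermi level near the band edge, where the nondegenerate treatment is no longer safe.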


5.5.2. Correlation of Thermopower and 
Conductivity Data 


Impurity semiconductors.—When the temperature is 
sufficiently high for the contribution of the electrons 
in the conduction band to dominate the contribution 
to the conductivity of the electrons in the donor band, 
the correlation between thermopower and conductivity 
data is independent of the interaction between donor 
sites. Equations (5-1) and (5-8) may be combined to
yield

$$ \ln\sigma = \ln(n_0 e) + \ln b + \tfrac{3}{2}\ln M + \tfrac{3}{2}\ln t + \eta/kT. \tag{5-25} $$

When the foregoing assumption, Eq. (5-13), concerning
b is permissible, one obtains


$$ \ln\sigma - \tfrac{3}{2}\ln t - \ln(\exp(\theta/T) - 1) = K_8 + \eta/kT, \tag{5-26} $$


in which $K_8$ is a constant which depends on A and M.
The thermopower Q may be expressed in the form
$Q = (\eta - Q^*)/eT$, where Q* is a quantity of the order of
magnitude of 2kT.§ Whence

$$ \ln\sigma - \tfrac{3}{2}\ln t - \ln(\exp(\theta/T) - 1) = K_8 + (eQ/k) + Q^*/kT. \tag{5-27} $$


If an assumption is made concerning θ, the left-hand
side of this equation may be evaluated from conductivity
measurements. However, the quantity $K_8$
involves both the effective mass ratio M and the
mobility factor A. Hence, in the absence of further
information, no correlation between electrical conductivity
and thermopower is possible.|| To the extent
that the term Q*/kT is temperature independent,
however, it should be possible to correlate dQ/dT with
the rate of change of the left-hand side with temperature.

§ References are given in

A similar conclusion is reached when the mobility b 
is inversely proportional to a power of the absolute 
temperature, as in Eq. (5-15). 


5.5.3. Correlation of Thermopower and Hall Data 


Correlation of thermopower Q with the Hall coefficient
would involve one fewer variable (b), and would
therefore be preferable to the correlation with electrical
conductivity.


6. ELECTRICAL CONDUCTIVITY 
6.1. Rutile Single Crystals (Intrinsic) 


Cronemeyer* has measured the intrinsic conductivity 
of rutile single crystals in the temperature range 300° 
to 1400°C, using specimens approximately 1 × 1 × 10
mm in dimensions. The four-terminal method of
measurement was used (see his Fig. 3); hence, the 
measured conductivity should be independent of the 
electrode material. However, the conductivity was 
found to be field dependent when the field strength was 
greater than 10 v/cm for the intrinsic conductivity, 
and greater than 0.1 v/cm for reduced (presumably 
nonstoichiometric) samples. 

Cronemeyer’s results may be represented by the
following equations, in which σ represents the electrical
conductivity in (ohm cm)⁻¹, and T is the absolute
temperature:


Direction perpendicular to the c axis

623° to 1123°K
$$ \ln\sigma = 7.92 - 17\,600/T \tag{6-1} $$

1123° to 1673°K
$$ \ln\sigma = 11.10 - 21\,200/T. \tag{6-2} $$

Direction parallel to the c axis

773° to 1223°K
$$ \ln\sigma = 8.43 - 17\,600/T \tag{6-3} $$

1223° to 1673°K
$$ \ln\sigma = 11.30 - 21\,200/T. \tag{6-4} $$


|| However, in some materials, such as NiO, it is possible to
equate the density of states with the number of cations. This
simplification, which results from the localization of the cation
wave functions, facilitates a correlation of thermopower and
conductivity data without requiring a knowledge of the effective
electron mass. See F. J. Morin, Bell System Tech. J. 37, 1047
The respective activation energies calculated from
these data are 1.53, 1.84, 1.53, and 1.84 ev. For intrinsic
samples, the energy gaps are assumed to be approximately
twice these values. In deriving these results the
temperature variation of the $(T/300)^{3/2}$ and mobility
factors of Eq. (5-1) have been neglected.

Cronemeyer associates the 3.06-ev energy gap 
obtained in this manner with the optical absorption 
edge at 3.02 ev. 
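
The arithmetic connecting Eqs. (6-1) to (6-4) with the quoted activation energies can be checked with the short sketch below. The variable and function names are hypothetical, and a present-day value of Boltzmann's constant is assumed, so the results agree with the quoted 1.53 and 1.84 ev only to within about 0.02 ev. The sketch also locates the temperature at which the low- and high-temperature branches intersect.

```python
K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K (modern value, an assumption)

# (intercept, coefficient of 1/T in degrees K) from Eqs. (6-1) to (6-4)
PERP_LOW, PERP_HIGH = (7.92, 17600.0), (11.10, 21200.0)
PAR_LOW, PAR_HIGH = (8.43, 17600.0), (11.30, 21200.0)


def activation_energy_ev(fit):
    """Activation energy k * (coefficient of 1/T) for ln(sigma) = A - B/T."""
    return K_B_EV * fit[1]


def crossover_temperature(low, high):
    """Temperature (K) at which two branches give the same conductivity."""
    return (high[1] - low[1]) / (high[0] - low[0])


e_low = activation_energy_ev(PERP_LOW)
e_high = activation_energy_ev(PERP_HIGH)
t_perp = crossover_temperature(PERP_LOW, PERP_HIGH)
t_par = crossover_temperature(PAR_LOW, PAR_HIGH)
print(f"activation energies: {e_low:.2f}, {e_high:.2f} eV")
print(f"branch crossovers: {t_perp:.0f} K (perp.), {t_par:.0f} K (par.)")
```

The computed intersections fall close to the stated limits of the temperature ranges (1123° and 1223°K), which is a useful consistency check on the coefficients as transcribed.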

A more recent investigation of the conductivity of 
single crystals of rutile has been carried out by Y. L. 
Sandler.*® Preliminary published results indicated a 
somewhat higher conductivity in the temperature range 
250° to 700°C, and a somewhat smaller energy gap of 
2.8 ev. 


6.1.1. Field Strength Dependence 


Below 900°C, Cronemeyer’s measurement of the 
intrinsic conductivity of rutile showed some field 
strength dependence for fields greater than 10 v/cm. 
Measurements made at 820°C showed an increase in 
conductivity of about fifty percent when the field was 
increased to about 20 v/cm. Nonlinearity apparently 
begins at an even lower field strength (0.1 v/cm) in 
reduced samples. 


6.1.2. Current versus Time 


Sandler’ has observed the current versus time 
characteristics of rutile single crystals. Measurements 
were made at temperatures up to 720°C for various 
applied steady voltages. Only curves for 654°C are 
given in the report. Although it was not explicitly 
stated, the two-terminal method was probably used, 
with evaporated gold electrodes. The discussion seems 
to indicate that the specimens were in an atmosphere 
of pure oxygen. To ensure that the observed time- 
dependent effects were not due to surface conduction, 
a guard-ring electrode was used. 

The curves were characterized by relaxation times
that decreased as the applied voltage increased. For
1.06-v potential difference, the relaxation time was
about two minutes. The author considered that the
data might be explained in terms of space-charge
polarization and field emission.


6.1.3. Ionic Conductivity


Large currents could be passed through the rutile 
crystals for hours without destructive effect. Hence 
Cronemeyer concluded that the conductivity of rutile 
was essentially electronic. 


6.1.4. Dependence on Area of Cross Section 


The conductivity was observed to increase by a 
factor of fifteen when the cross-sectional area was 
increased from 3X10 to 6X10? cm*. Cronemeyer
suggested the possibility that this was due to “carrier
injection with predominant recombination at the
surface.”

6.1.5. Other Phenomena 


Sandler‘ has investigated the effect on the con- 
ductivity of rutile crystals of the heating during the 
process of depositing the evaporated gold electrodes.
Electrodes were formed at 100 and 300°C. The conductivity
ceased to be an exponential function of
reciprocal temperature below 270°C, and below 500°C,
respectively.

In the presence of water vapor, the conductivity was 
considerably increased. By the use of a guard electrode, 
it was shown that the effect was not due to surface 
conduction, but was probably due to the effect of 
moisture on the crystal-electrode interface. 


6.2. Rutile Ceramic (Intrinsic) 


Dependence of electrical conductivity on oxygen
pressure is probably associated with departures from
stoichiometry.¶ The conductivity of rutile containing
intentionally introduced oxygen vacancies is discussed
in a later section. The workers referred to in this section
made no deliberate attempt to reduce their specimens
more than the degree of reduction incidental to the
conditions of measurement.

Earle*® measured the electrical conductivity of 
pressed disks of rutile as a function of the pressure of
the surrounding oxygen. His results may be expressed
in the form


$$ \ln\sigma = A - B\ln p, \tag{6-5} $$


where σ is the conductivity in (ohm cm)⁻¹, p is the
oxygen pressure in atmospheres, and A and B are
temperature dependent parameters.

At pressures below 30 mm of mercury B was found
to be […] at 626°C, and equal to 1/3 in the temperature
range 755° to 968°C. Above 30 mm of mercury, B was
equal to 1/4. At 50 mm of mercury and 1000°K the
conductivity was 1.6 × 10⁻⁸ (ohm cm)⁻¹. The activation
energy was about 1.7 ev, and appeared to be independent
of the oxygen pressure. Evaporated gold electrodes
were used.

Earle investigated the possibility of ionic conductivity,
and believes that its contribution was negligible
under the conditions of his experiment.

Hauffe⁴⁷ also investigated the dependence of the
electrical conductivity of rutile on the pressure of the
surrounding oxygen gas. His results may be represented
by the same formula, but with different values of A
and B. At 800°, 900°, and 1000°C, the values of B were,
respectively, 1/4.8, 1/4.4, and 1/5.3.

¶ The effect of oxygen pressure on surface conductivity is
important. The effects of oxygen and other gases on the surface
conductivity of silicon and germanium are discussed in
Semiconductor Surface Physics, R. H. Kingston, editor (University of
Pennsylvania Press, 1956).

Combining measurements made at these three temperatures,
and using the approximate value of 1/4 for B,
Hauffe’s results may be represented roughly by the
relation


$$ \ln\sigma = -\tfrac{1}{4}\ln p - 22\,400/T, \tag{6-6} $$


in which p is expressed in atmospheres, and T is the
absolute temperature. The coefficient of 1/T yields an
activation energy of 1.93 ev.
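
The content of Eq. (6-6) can be illustrated numerically. In the sketch below the function name is hypothetical and the exponent of the pressure is taken as 1/4, the rounded value adopted in the text for Hauffe's measured values 1/4.8, 1/4.4, and 1/5.3.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K


def hauffe_ln_sigma(p_atm, temp_k):
    """Eq. (6-6): ln(sigma) = -(1/4) ln p - 22400 / T."""
    return -0.25 * math.log(p_atm) - 22400.0 / temp_k


# Effect of lowering the oxygen pressure sixteenfold at fixed temperature.
ratio = math.exp(hauffe_ln_sigma(1.0 / 16.0, 1173.0)
                 - hauffe_ln_sigma(1.0, 1173.0))
# Activation energy corresponding to the coefficient of 1/T.
activation_ev = K_B_EV * 22400.0
print(f"conductivity ratio = {ratio:.2f}, activation energy = {activation_ev:.2f} eV")
```

A sixteenfold reduction in oxygen pressure therefore only doubles the conductivity, whereas the exponential temperature term dominates the variation over the range studied.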

Gorelik*® has measured the electrical conductivity of 
“pure” sintered rutile, and rutile to which various 
amounts of the oxides of the alkaline earths had been 
added. His results for “pure” rutile may be expressed 
by the following formula in the temperature range 380° 
to 490°K: 

$$ \ln\sigma = 2.5 - 11\,700/T. \tag{6-7} $$


The corresponding activation energy is 1.02 ev. 
Gorelik’s investigation of the effect of a magnetic field 
on the electrical conductivity is discussed in another 
section. 


6.3. Reduced Rutile 


The term “reduced rutile” is used here to signify 
that an attempt has been made to introduce oxygen 
vacancies by heating the specimen in oxygen at reduced 
pressure, or in an atmosphere of hydrogen or some 
other reducing gas. It is implied that the specimens 
have been quenched so that the vacancy concentration 
is higher than that corresponding to the equilibrium at 
the temperature and pressure at which measurements 
are made. 

When an oxygen vacancy is created, the two electrons
associated with the ion must be retained in the crystal
to preserve electrical neutrality. There is some uncertainty
whether these electrons (a) are associated
with the oxygen vacancies to form helium-like donor
sites, or (b) convert two Ti⁴⁺ sites into Ti³⁺ sites.

In case (a) one, or possibly two, activation energies
are expected for electron transport phenomena when
the vacancy concentration is too low for the formation
of impurity bands. Because of the formation of donor
bands one expects the disappearance of the activation
energy at sufficiently low temperatures when the
vacancy concentration is high.


6.3.1. Experimental Results


Breckenridge and Hosler measured the electrical
conductivity of reduced rutile single crystals and
ceramics. The conductivity of these samples lay in the
range 0.03 to 10 (ohm cm)⁻¹, depending on the degree
of reduction, which was carried out in hydrogen at
[…] temperatures.


J. Exptl. Theoret. Phys. U.S.S.R. 21, 826 (1951). 
The curves of the logarithm of the conductivity versus
reciprocal temperature were fairly flat, and in most 
cases had negative slopes at 1000 and at 77°K. 

There are so many disposable parameters in the
simple model [Eq. (5-12)] (Debye temperature θ,
impurity concentration, effective mass, etc.) that it is
unlikely that its use in the interpretation of these data 
can yield significant information, except in conjunction 
with Hall data. The values of the electron mobility 
obtained from Breckenridge and Hosler’s conductivity 
and Hall data are discussed in Sec. 8. 

Kataoka and Suzuki⁵⁰ have made measurements of
the electrical conductivity and thermopower on reduced
rutile ceramic. When these results are plotted⁵¹ according
to Eq. (5-16) assuming isolated donor levels,
the slopes correspond to values of the donor ionization
potentials in the range 0.29 to 0.35 ev. An exception to
this result (0.23 ev) was the sample of highest conductivity
[20 (ohm cm)⁻¹] for which the explanation
could be partial degeneracy.

When the results are plotted⁵¹ according to Eq.
(5-19) assuming a donor band, the resulting values of
the Fermi level lie in the range 0.12 to 0.15 ev, with the
exception of the sample of highest conductivity which
yielded a value of 0.09 ev.
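
The reduction of Kataoka and Suzuki's data described in footnote 51 can be reproduced as follows. This is a sketch: the function names are hypothetical, the six fitted expressions and the approximation log t = 0.62 − 184/T are taken from footnote 51, the mobility is assumed proportional to $T^{-2.5}$ so that the coefficient of ln t is 1.75 in Eq. (5-16) and 1 in Eq. (5-19), and the energies are returned as magnitudes (depths below the conduction band).

```python
import math

K_B_EV = 8.617333e-5       # Boltzmann constant, eV/K
LN10 = math.log(10.0)

# Coefficient c (degrees K) in log10(sigma) = const - c / T, footnote 51.
SIGMA_SLOPES = [250.0, 550.0, 520.0, 500.0, 400.0, 475.0]
LOG_T_SLOPE = 184.0        # from log10(t) = 0.62 - 184 / T, footnote 51


def donor_ionization_ev(sigma_slope):
    """Eq. (5-16): the slope of ln(sigma) + 1.75 ln(t) against 1/T is E_d/2k."""
    return 2.0 * K_B_EV * LN10 * (sigma_slope + 1.75 * LOG_T_SLOPE)


def fermi_depth_ev(sigma_slope):
    """Eq. (5-19): the slope of ln(sigma) + ln(t) against 1/T is eta/k."""
    return K_B_EV * LN10 * (sigma_slope + LOG_T_SLOPE)


donor = [round(donor_ionization_ev(c), 2) for c in SIGMA_SLOPES]
fermi = [round(fermi_depth_ev(c), 2) for c in SIGMA_SLOPES]
print("isolated donors (eV):", donor)
print("donor band (eV):     ", fermi)
```

The first expression in footnote 51, which has the largest constant term and hence the highest conductivity, gives the exceptional values, and the remaining five fall within the ranges quoted in the text.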

Boltaks® and co-workers have investigated the con- 
ductivity of sintered specimens of rutile which had been 
reduced in hydrogen or in carbon monoxide. They found 
that the temperature variation of the pre-exponential
term in Eq. (5-12) could not be neglected. When
allowance was made for the $T^{3/4}$ factor in Eq. (5-12) a
linear plot of ln σ versus reciprocal temperature was
obtained. From the slope of the tangent to this curve,
the ionization potential $E_d$ of the donor sites was
estimated to be less than 0.2 ev at room temperature.
Apparently no consideration has been given to the
effect of the variation of the mobility b with temperature
[Eq. (5-12)].


6.3.2. Single Crystals versus Ceramics 


The measurements of Breckenridge and Hosler may 
be examined to compare the properties of polycrystal- 


50 S. Kataoka and T. Suzuki, Bull. Electrotech. Lab. Tokyo 18,
732 (1954).

51 Measurements of electrical conductivity made by Kataoka
and Suzuki may be represented within the experimental error by
the following expressions between 1000/T = 1.5 and 1000/T = 3.5:


Specimen    log10 σ
             2.12 − 250/T
             2.02 − 550/T
             1.72 − 520/T
             1.40 − 500/T
             0.60 − 400/T
            −0.13 − 475/T


In the same temperature range, log t = log(T/300) may be represented
approximately as a function of 1/T by the expression
0.62 − 184/T. Substitution of these expressions into Eqs. (5-16)
and (5-19) yields the activation energies and ionization potentials
given in the text. The mobility has been assumed to be proportional
to $T^{-2.5}$ (see Sec. 8.2).
TABLE II. Comparison of parameters for single crystals and ceramics reduced in hydrogen under approximately
the same conditions (from Table I of Breckenridge and Hoslerᵃ).


Specimen                        Reduction    Reduction      Resistivity  Hall         Activation  n          Effective
                                temperature  time           at 300°K     coefficient  energy ΔE              mass ratio
                                (°C)         (minutes)      (ohm-cm)     at 300°K     (ev)ᵇ                  of electrons
                                                                         (−R)                                mₑ/m
National Lead single crystal    800          5              8            4            0.12        1.4×10^?   43
National Bureau of Standards    800          5              1            0.3          0.14        29×10^?    55
  ceramic
Linde single crystals           800          indefinite     0.7          0.10         0.16        15×10^?    46
  (c direction)
  (a direction)                 800          15             1            0.10         0.15        13×10^?    114
National Bureau of Standards    800          average of     1            0.50         0.14        3.1×10^?   60
  ceramic specimensᶜ                         5, 10, 20 min


ᵃ See reference 49.
ᵇ See Eq. (8-2).
ᶜ The corresponding resistivity and Hall curves are almost coincident.


line specimens and single crystals. When this com- 
parison is made for specimens that have been reduced 
under similar conditions, there is no evidence for a 
significant difference between the single crystal and 
polycrystalline specimens with respect to mobility, 
effective mass, or activation energy. 

In Table II some of Breckenridge and Hosler’s 
results have been tabulated for the purpose of com- 
paring the parameters for single crystal and ceramics. 
The comparison is justified only for pairs of samples 
that had been reduced (in hydrogen) at the same 
temperature for approximately the same length of time. 

As an example of the difficulty of interpreting these 
results, one notes that the Hall coefficient of the 
National Lead single crystal (reduced 5 min at 500°C) 
is higher (by a factor of eight) than that of the ceramic 
specimen subject to the same reduction. However, the 
Hall coefficient of the Linde crystal is less than or equal 
(depending on the direction of the current with respect 
to the optical axis) to that of the ceramic specimen. 
Similarly, there appears to be no evidence that the 
effective mass ratio is greater for single crystals than 
for ceramic specimens. 


6.4. Rutile Containing Added Impurities 


6.4.1. Conductivity in Air at Normal Pressure 


The effect of impurities on the electrical conductivity
of rutile ceramics has been investigated by Hauffe,⁵³
by Grunewald,⁵⁴ and by Johnson.⁵⁵

Johnson’s samples were air-quenched from 1200 to 
250°C and measurements were made at the latter 
temperature. The results should be accepted with some 
caution because little attention was paid to the state of 


53 Hauffe, Grunewald, and Trinckler-Greese, Z. Elektrochem. 
56, 937 (1952). 
54 H. Grunewald, Ann. Physik 14, 121 (1954); Hauffe, reference 47.
55 G. H. Johnson, J. Am. Ceram. Soc. 36, 97 (1953).

reduction of the specimens, and the impurities may have
assumed valences other than those expressed in the 
formulas given. In every case the effect of the impurity 
approached a saturation value when the concentration 
exceeded about one or two mole percent. This saturation 
value is given in the following summary. 


6.4.2. Impurities of the Type X₂O


The oxides of lithium, potassium, and silver decreased
the conductivity of rutile, while that of sodium produced
an increase (Table III). The effect was small in
comparison with some of the results discussed in the
following.


6.4.3. Impurities of the Type XO 


The effects of admixture of the oxides of beryllium, 
magnesium, calcium, strontium, barium, zinc, cadmium, 


TABLE III. Effect of various impuritiesᵃ on electrical conductivity
of rutile ceramic at 250°C (after Johnsonᵇ).


Impurity    Change in        Impurity    Change in
            conductivity                 conductivity
WO₃         ×4000
MoO₃        ×<0.5
P₂O₅        ×1020            BeO         ×25
Sb₂O₅       ×833             MgO         ×<0.5
V₂O₅        ×27              CaO         ×0.07 (?)ᶜ
Ta₂O₅       ×4166            SrO         ×0.25
Nb₂O₅       ×5500            BaO         ×<0.95
                             ZnO         ×0.33
SiO₂        ×3.3             CdO         ×0.50
ZrO₂        ×0.9             NiO         ×0.33
B₂O₃        ×6               CoO         ×0.33
Al₂O₃       ×0.4             Li₂O        ×0.07
Fe₂O₃       ×<0.5            Na₂O        ×1.7
Cr₂O₃       ×3.5             K₂O         ×0.7
Ga₂O₃       ×0.4             Ag₂O        ×<0.7
Y₂O₃        ×0.1


ᵃ “Saturation” effect—usually about one mole percent impurity.
ᵇ See reference 55.
ᶜ See text.

nickel, and cobalt were investigated. The effect was
small, except in the case of beryllium (increase by a 
factor of 25), and calcium (decrease by a factor of 0.07). 
The effect of beryllium was considered to be possibly 
due to the donor electrons resulting from the formation 
of interstitials. In the case of calcium the explanation 
of the anomalously low value may be as follows. Com- 
parison of Johnson’s Table IV with his Fig. 1 suggests 
that the correct value for calcium is probably consider- 
ably higher than 0.07, a value which was apparently 
influenced by one low experimental point. 

Gorelik®® has investigated the effect of various con- 
centrations of the oxides of the alkaline earths on the 
electrical conductivity of rutile in the temperature 
range 380° to 490°K. Values of the activation energies 
obtained from plots of the logarithm of the conductivity 
versus reciprocal temperature lay in the range 0.78 to 
1.15 ev for additions of CaO and SrO. At a temperature
of 418°K the admixture of about three moles of CaO
per hundred moles of TiO₂ caused a fourfold increase in
conductivity. Further increase in CaO content caused
the conductivity to decrease until it reached one
quarter of its initial value when the CaO content
reached twenty-five moles per hundred moles of TiO₂.

One mole of SrO per hundred of TiO₂ yielded a
fifty-fold increase in conductivity over the value for
“pure” rutile. Further admixture of SrO caused a
decrease in conductivity, so that an admixture of 50
moles of SrO per 100 moles of TiO₂ resulted in a value
of conductivity only slightly higher than the value for
“pure” rutile. Gorelik interpreted his results in terms
of contributions from ionic and electronic conductivity. 
This interpretation was based partly on his investi- 
gation of the effect of a magnetic field on the conduc- 
tivity of these specimens, and is discussed further in 
another section. 


6.4.4. Impurities of the Type X₂O₃


With the exception of the oxides of boron and 
yttrium, the effect of oxides of this group is not pro- 
nounced. The low value assigned to the oxide of yttrium 
appears to be unduly influenced by a single experimental 
point, and one could well expect a value as high as 0.3 
from Johnson’s Fig. 1. 

Admixture of 0.5 mole percent of chromium oxide 
decreased the conductivity at 1100°C, but produced a 
sharp increase in conductivity at 500°C. Further 
increase in admixture of this oxide caused little change 
in conductivity. The apparent activation energy was 
reduced by a factor of three by the admixture of 0.5 
mole percent. This apparent change in activation energy 

appears to be associated with a different mechanism of 
conductivity. This conclusion is strengthened by the 
oxygen pressure dependence of the conductivity of this 


material, discussed in another section. 
6.4.5. Impurities of the Type XO₂


The effect of the oxide of silicon appears to be 
surprisingly high in view of the valence assumed by this 
element. 


6.4.6. Impurities of the Types X₂O₅ and XO₃


Some of the largest effects were produced by oxides 
in these groups. However, the low values corresponding 
to the oxides of molybdenum and vanadium are re- 
markable. The difference between the elements molyb- 
denum (atomic number 42) and niobium (atomic 
number 41) is interesting, as is the difference between 
the effect of the oxides of tungsten (atomic number 74) 
and tantalum (atomic number 73). 

Table IV summarizes the effect of adding certain 
oxides to rutile containing 0.5 mole percent of niobium 
oxide. The large effect of aluminum oxide is readily 
explained. The trivalent ion has a noble gas configu- 
ration, hence a comparatively large amount of energy 
is required for the removal of the fourth electron. 
Aluminum occupies a titanium site in the crystal struc- 
ture but remains in the trivalent state. The association 
of trivalent aluminum and pentavalent niobium ions 
preserves electrical neutrality without the formation of 
trivalent titanium ions. Hence the large reduction in 
the electrical conductivity of the niobium-doped rutile 
resulting from the addition of aluminum oxide. 

Unfortunately this explanation does not account for
the enormous difference between the effects of the other
oxides listed in Table IV. The fourth ionization potentials
are believed to lie in the range 50 to 64 ev
(compare with that of aluminum: 120 ev).

The effect is not closely related to the ionic radius,
since the value for trivalent gallium (0.62 A) is only
slightly smaller than the ionic radius of trivalent
chromium (0.64 A).

Gallium, indium, and chromium are known to exhibit 
more than one valence in the formation of oxides. The 
extreme values in the table are for the oxides of alumi- 
num and yttrium, both of which form only trivalent 
ions with noble gas configurations. A satisfactory theory 
of these effects would have to explain the remarkable 
difference between the effects of these oxides. 


6.4.7. The Results of Hauffe and Co-Workers 


Hauffe⁴⁷ and co-workers have found that the admixture
of a small amount of WO₃ to rutile increases
the conductivity enormously (page 136 of reference 47).
The effect is more pronounced at 600°C than at 900°C.
These workers have also investigated the effect of
admixture of the oxides of nickel, gallium, aluminum,
and chromium on the electrical conductivity of rutile
ceramic in the range 600 to 900°C. The effect of the
oxides of aluminum, nickel, and gallium was small, and 
produced no apparent change in the activation energy 
for the impurity range 0 to 2.0 mole percent. 
TABLE IV. Effect of 1 mole % of (other) impurity on rutile
containing 0.5 mole % Nb₂O₅ (σ = 3×10⁻⁴ ohm⁻¹ cm⁻¹) (after
Johnsonᵃ).


Impurity    Change in
            conductivity
Al₂O₃       ÷35 000
Y₂O₃        ÷58
Ga₂O₃       ÷80 000
In₂O₃       ÷990
Cr₂O₃       ÷ […]


ᵃ See reference 55.


6.4.8. Effect of Impurity Concentration 


Johnson attempted to estimate the percentage of 
foreign oxide that can be taken into solution at 1200°C 
and retained in this state during subsequent heat 
treatment, i.e., air-quenching to 250°C. This result 
showed a saturation effect in most cases at concen- 
tration above about 0.25 to 0.50 mole percent. 

The conductivity was highly dependent on the heat
treatment a sample received, and the rate of cooling
determined the degree to which “exsolution” occurred.
It would be of interest to establish the extent to which
these effects (a) are dependent on the formation of
impurity bands and (b) are determined by the solubility
of the added oxide in rutile. In case (a) it is
probable that there exists a correlation between these
results and the paramagnetic susceptibility of these
doped specimens.


7. THERMOPOWER 
7.1. Introduction 


The electrical circuit for the measurement of the thermopower (or
Seebeck coefficient) of a semiconductor S with respect to a metal M
is represented schematically in Fig. 6. The junctions are
maintained at temperatures T_1 and T_2, where T_2 is greater than
T_1. The thermopower of S with respect to M is defined as


    Q_S - Q_M = -\frac{d(V_2 - V_1)}{d(T_2 - T_1)}    (7-1)


when the net current in the circuit is zero. 

Since thermal emf’s are additive, it is meaningful to speak of the
absolute thermopower of a substance. This is particularly true of
semiconductors, since they exhibit much larger thermoelectric
effects than metals.

It is shown by Seitz⁵⁷ and by Domenicali⁵⁸ that the thermopower, Q,
consists of two terms,** and can be represented by


    Q = \frac{1}{e} \, \frac{\eta - Q^*}{T}.    (7-2)


Q* is the heat transport per electron, which in semiconductors is
approximately 2kT, and so this term contributes about -170 µV per
degree to the thermopower, at all temperatures. The first term on
the right-hand side of Eq. (7-2) has the same sign as the Fermi
potential of the electrons, η, so that when the Fermi level lies
below the bottom of the conduction band this term is negative.


57 Frederick Seitz, The Modern Theory of Solids (McGraw-Hill
Book Company, Inc., New York, 1940).

58 C. A. Domenicali, Revs. Modern Phys. 26, 237 (1954).

** It is assumed that these transport phenomena are determined
entirely by the electronic mobility, and not by that of holes.


FIG. 6. Circuit for the measurement of the thermopower of a
semiconductor S with respect to a metal M, using an electrostatic
(very high resistance) voltmeter V.
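
As a rough numerical check on the figure quoted above, the two terms of Eq. (7-2) can be evaluated directly. The sketch below assumes Q* = 2kT exactly and uses present-day values of k and e; the function names and the sample Fermi potential are illustrative only and do not appear in the original.

```python
# Minimal sketch of Eq. (7-2), Q = (1/e)(eta - Q*)/T, assuming Q* = 2kT.
K_BOLTZMANN = 1.380649e-23    # J per degree
E_CHARGE = 1.602176634e-19    # coulomb


def transport_term_uv_per_deg(heat_in_kt=2.0):
    """Return -(Q*/eT) in microvolts per degree for Q* = heat_in_kt * kT."""
    return -heat_in_kt * K_BOLTZMANN / E_CHARGE * 1.0e6


def thermopower_uv_per_deg(eta_ev, temperature_k, heat_in_kt=2.0):
    """Return Q in microvolts per degree for a Fermi potential eta in ev.

    eta is measured from the bottom of the conduction band, so it is
    negative when the Fermi level lies below the band edge.
    """
    fermi_term = eta_ev / temperature_k * 1.0e6  # ev/e is numerically volts
    return fermi_term + transport_term_uv_per_deg(heat_in_kt)


# Hypothetical example: Fermi level 0.1 ev below the band edge at 300 K.
example_q = thermopower_uv_per_deg(-0.1, 300.0)
```

With Q* = 2kT the transport term comes to about -172 µV per degree, in line with the rounded value of -170 given in the text; a Fermi level below the band edge makes the total more negative still.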

In practice, the small thermopower of the metal is 
usually neglected in comparison with the much larger 
value for the semiconductor, so that the measured 
potential difference is given by the expression 


    V_1 - V_2 = \int_{T_1}^{T_2} Q_S \, dT.    (7-3)


The interpretation of Qs in terms of the choice of band 
model is discussed in Sec. (5-5). 
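
Once Q_S is known as a function of temperature, the integral of Q_S over the temperature interval in Eq. (7-3) can be evaluated numerically. The following is a minimal sketch using the trapezoidal rule; the constant thermopower of -1 mv per degree is a hypothetical value chosen only to make the arithmetic transparent, and the function names are not from the original.

```python
def thermal_emf(q_s, t_cold, t_hot, steps=1000):
    """Integrate Q_S dT from t_cold to t_hot by the trapezoidal rule.

    q_s is a function giving the thermopower at a temperature; the result
    has the units of q_s multiplied by degrees.
    """
    width = (t_hot - t_cold) / steps
    total = 0.5 * (q_s(t_cold) + q_s(t_hot))
    for i in range(1, steps):
        total += q_s(t_cold + i * width)
    return total * width


def constant_q(_temperature):
    """Hypothetical temperature-independent thermopower, mv per degree."""
    return -1.0


emf_mv = thermal_emf(constant_q, 0.0, 90.0)  # junctions at 0 and 90 degrees
```

For a temperature-independent Q_S the integral reduces to Q_S(T_2 - T_1), which is why a single thermopower value can be quoted for a stated temperature range.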


7.2. Experimental Results 


Kataoka and Suzuki⁵⁹ have measured the thermopower of ceramic
rutile in various stages of reduction. Their high resistivity
specimens (3000 ohm cm at 300°K) exhibited a thermopower of
-1.03 mv/°C in the range 0 to 90°C. The thermopower, as expected,
decreased with increasing impurity content. The specimens of
lowest resistivity (0.05 ohm cm at 300°K) exhibited a thermopower
of -0.17 mv/°C between 0 and 100°C, falling to about -0.15 mv/°C
between 150 and 200°C. Some of their curves of thermopower versus
temperature were concave toward the temperature axis.

These workers claim to have established a correlation between
thermopower and the conductivity of the specimens. The difficulties
encountered in making such a correlation, due to the uncertainty in
the values of the mobility, the effective mass, and in the dynamic
term Q*/eT of the thermopower, are discussed in Sec. (5.5.2).
However, the thermopower may be plotted against reciprocal
temperature to yield a value for the ionization potential of the
donor sites, to the extent that the simple band model used in
deriving Eq. (5-23) is valid. This procedure, applied to the data
on page 735 of reference 59, yields values of E_d (noninteracting
donor model) in the range -0.02 to -0.13 ev. There appears to be no
correlation between the values obtained in this way and those
obtained from conductivity data, discussed in Sec. (6.3.1).

When the values of the Fermi level (donor band model) are obtained
from these same data by the use of Eq. (5-24), one obtains values
ranging from -0.31 ev to zero, the position of the Fermi level
rising with increasing donor concentration (as evidenced by the
resistivity measurements).

These workers have also shown that the thermopower of their
specimens is dependent on the ambient atmosphere. At a pressure of
10⁻[…] mm of mercury the thermopower of their specimens was more
highly temperature dependent than at atmospheric pressure.

Boltaks and co-workers⁶⁰ have investigated the thermopower Q of
sintered rutile that had been reduced in hydrogen or in carbon
monoxide. In all cases the thermal emf was negative. Here, Q was
found to decrease with increased reduction. The experimental values
lay in the range -250 to -415 µV per degree. Thus, Q decreased with
increasing temperature for the slightly reduced samples. The
thermopower of the highly reduced samples, however, increased with
increasing temperature. This latter effect may have resulted from
partial reoxidation.


59 S. Kataoka and T. Suzuki, Bull. Electrotech. Lab. (Tokyo)
18, No. 1, 732 (1954).

60 Boltaks, […], and Salunina, Zhur. Tekh. Fiz. 21, 532 (1951).


FIG. 7. Schematic diagram of apparatus for
Hall effect measurements.


8. THE HALL EFFECT


The Hall effect arises from the force experienced by a charge
moving in a magnetic field. A high resistance voltmeter is
connected to the electrodes P and Q on the lower and upper faces
(Fig. 7) of the specimen of uniform rectangular cross section, and
thickness t cm, in which a current I_x amp is flowing. The
electrodes P and Q are assumed to be placed so that they are at the
same potential before the application of a magnetic field. When a
magnetic field B_y gauss is applied in a direction normal to the
current I_x and parallel to the t direction, a potential difference
V_H (= V_Q - V_P) volts appears between the electrodes. The
quantity R = 10^8 V_H t / B_y I_x is defined as the Hall
coefficient. When lattice scattering predominates, the
concentration of charge carriers is related to the Hall coefficient
by the formula


    n = -7.40 \times 10^{18} \, G/R.    (8-1)


For n-type semiconductors, G is a function of the Fermi level which
varies between 1 for nondegenerate samples and 8/3π for completely
degenerate samples.⁶²,⁶⁴ When impurity scattering cannot be
neglected, this value of n must be corrected⁶³ by a factor r, which
has a value in the range 1 to 2 for ionized impurity scattering.
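
The definitions above translate directly into a short calculation. The sketch below is illustrative only: it takes the numerical factor of Eq. (8-1) to be 7.40 × 10^18 (that is, 3π/8e with R in cm³ per coulomb), which is an assumption about the printed exponent, and it omits the impurity-scattering factor r because the text does not say how r enters. The sample readings are hypothetical.

```python
import math

HALL_FACTOR = 7.40e18                    # assumed factor of Eq. (8-1), ~3*pi/(8e)
G_NONDEGENERATE = 1.0
G_DEGENERATE = 8.0 / (3.0 * math.pi)


def hall_coefficient(v_hall_volts, thickness_cm, field_gauss, current_amp):
    """R = 1e8 * V_H * t / (B_y * I_x), in cm^3 per coulomb."""
    return 1.0e8 * v_hall_volts * thickness_cm / (field_gauss * current_amp)


def carrier_concentration(hall_coeff, g=G_NONDEGENERATE):
    """Eq. (8-1): n = -7.40e18 * G / R, in carriers per cm^3."""
    return -HALL_FACTOR * g / hall_coeff


# Hypothetical readings: V_H = -1 mv, t = 0.1 cm, B = 5000 gauss, I = 1 ma.
r_example = hall_coefficient(-1.0e-3, 0.1, 5000.0, 1.0e-3)
n_example = carrier_concentration(r_example)
```

In the completely degenerate limit the product 7.40 × 10^18 × 8/3π is close to 1/e, so Eq. (8-1) reduces to the elementary relation n = -1/Re.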


8.1. Hall Data of Reduced Rutile 


Breckenridge and Hosler⁶¹ have made Hall measurements on samples of
rutile ceramic and single crystals which had been subjected to
various degrees of reduction in hydrogen. These workers then
attempted to interpret the values of the electron concentration n
obtained by the use of Eq. (8-1) in terms of two sources of
conduction electrons, one of which has a very small activation
energy, so that it was considerably ionized even at liquid air
temperature. Their results (Fig. 8) were interpreted in terms of
the following equation


    n = n_1 \exp(\Delta E_1/2kT) + n_2 \exp(\Delta E_2/2kT)    (8-2)


with ΔE_1 < ΔE_2 and n_1 < n_2.

Interpretation of ΔE_1 and ΔE_2 in terms of the ionization
potentials of isolated donor type impurities is complicated by the
effect of the (3/4) ln T correction term of Eq. (5-6). This
correction is important even for ΔE_2, whereas the significance of
the smaller ΔE_1 is almost entirely lost because of the magnitude
of the correction term.


61 R. G. Breckenridge and W. R. Hosler, Phys. Rev. 91, 793
(1953).

62 K. Shifrin, J. Phys. (U.S.S.R.) 8, 242 (1944).

63 W. Crawford Dunlap, Jr., An Introduction to Semiconductors
(John Wiley & Sons, Inc., New York, 1957); see Fig. 6.5.


Attempts have been made below to interpret the Hall data of
Breckenridge and Hosler⁶¹ under the assumption that the form of the
curves at higher temperatures (200 to 500°K) is determined by the
electrons in the conduction band, to which Maxwell-Boltzmann
statistics and the concept of a single effective electron mass are
assumed to apply. In the first case, the source of these electrons
is considered to be noninteracting donor sites at an energy E_d
with respect to the bottom of the conduction band. The appropriate
type of plot for the experimental data is suggested by Eq. (5-6),
and from the slope a value of E_d may be obtained.

In the second case, the appropriate type of plot is 
indicated by Eq. (5-7), in which no assumption con- 
cerning the source of electrons is implied. However, 
the slope of the plot yields the height of the Fermi level, 
which will lie near the center of the donor band if it 
exists. In this case the plot yields also the effective 
mass of the conduction electrons. 

At the temperatures considered, the effect of the 
magnetic field on the concentration of conduction-band 
electrons is believed to be entirely negligible. 

When the data of Breckenridge and Hosler’s Fig. 11 
are plotted in these two different ways for the tempera- 
ture range 250 to 500°K, the results are as follows. With 
the exception of one very low value, for curve (b), the 
values of E_d (noninteracting donors) lie in the range
-0.095 to -0.147 ev. With the same exception, the
values of the Fermi level (donor band model) lie in
the range -0.025 to -0.051 ev with respect to the
bottom of the conduction band. The corresponding 
values of the effective mass ratio are in the range 0.16 
to 1.6. Although Maxwell-Boltzmann statistics can 
scarcely be said to apply to these data, these values of 
the effective mass of the conduction electrons are much 
lower than the values usually quoted for rutile. 


8.2. Hall Mobility 


An electric charge moving in the x direction normal to a magnetic
field B_y “sees,” in addition, an electric field E_z, where


    E_z = v_x B_y    (mks units).


Since the mobility b is defined as v_x/E_x, we obtain


    E_z/E_x = b_H B_y    (mks units).


Hence the equipotential surfaces in a conductor are rotated by the
magnetic field through an angle φ, where tan φ = b_H B_y. In cgs
units


    \tan\phi = 10^{-8} \, b_H B_y.


This simple relation is somewhat obscured by the usual derivation
of the Hall mobility from the electrical conductivity and the Hall
coefficient


    \sigma = n e b
    -R = 1/ne
    (-R\sigma) = b_H.    (8-3)


The determination of b_H from simultaneous measurements of σ and R
is entirely equivalent experimentally to the measurement of tan φ,
so that the simple interpretation of the Hall mobility is
unaltered.


64 P. T. Landsberg, Proc. Phys. Soc. (London) 71, 69 (1958).


FIG. 8. Electron concentrations in specimens of rutile reduced
in hydrogen under the conditions specified (after Breckenridge
and Hosler⁶¹). Curves showing the temperature variation of the
factors T^{3/4} and T^{3/2} of Eqs. (5-5) and (5-7) are shown for
comparison. Ordinate: number of charge carriers per cm³.
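
The equivalence between the Hall mobility obtained from R and σ and the rotation of the equipotentials can be illustrated numerically. This is a minimal sketch under the cgs-practical conventions of this section (R in cm³ per coulomb, σ in ohm⁻¹ cm⁻¹, field in gauss); the function names and sample values are hypothetical.

```python
import math


def hall_mobility(hall_coeff, conductivity):
    """Eq. (8-3): b_H = -R * sigma, in cm^2 per volt second."""
    return -hall_coeff * conductivity


def hall_angle(mobility, field_gauss):
    """Angle phi, in radians, from tan(phi) = 1e-8 * b_H * B."""
    return math.atan(1.0e-8 * mobility * field_gauss)


# Hypothetical n-type sample: R = -10 cm^3/coulomb, sigma = 0.1 ohm^-1 cm^-1.
b_h = hall_mobility(-10.0, 0.1)
phi = hall_angle(b_h, 1.0e4)  # field of 10 000 gauss
```

For a mobility of the order of 1 cm² per volt second the rotation is only about 10⁻⁴ radian at 10 000 gauss, which is why the Hall voltage in such a sample is small compared with the applied potential drop.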

Figure 9 illustrates a typical Hall mobility curve derived from the
electrical conductivity and Hall data of Breckenridge and Hosler.
These workers attempted to interpret their results in terms of the
theory of scattering by optical lattice vibrations developed by
Fröhlich and Mott,⁶⁵ and by Fröhlich, Pelzer, and Zienau.⁶⁶ The
additional scattering at low temperatures was ascribed to the
effect of impurities as treated by Conwell and Weisskopf.⁶⁷

When an empirical relation for the temperature dependence of the
mobility is required to assist in the interpretation of
conductivity data above about 200°K, a T^{-2.5} law appears to be
satisfactory. For comparison purposes, a T^{-2.5} plot is included
in Fig. 9.


65 H. Fröhlich and N. F. Mott, Proc. Roy. Soc. (London)
A171, 496 (1939).

66 Fröhlich, Pelzer, and Zienau, Phil. Mag. 41, 221 (1950).


FIG. 9. Hall mobility in a sample of reduced rutile, after
Breckenridge and Hosler.⁶¹ The dotted curve, showing a T^{-2.5}
dependence, is included for comparison. The symbols […] and A refer
to the constants in Eq. (5-13).


8.3. Magnetoresistance 


Gorelik⁶⁸ has investigated the effect of a magnetic
field of 15 000 gauss on the electrical conductivity of 
“pure” rutile, and on specimens of rutile containing 
various proportions of the oxides of calcium and 
strontium. The entire change in conductivity did not 
occur immediately after the application of the magnetic 
field, but was practically complete after two minutes. 
The change in conductivity was small at room tempera- 
tures, but above about 160°C the fractional change in 
conductivity rose to steady values of one or two percent. 

Gorelik has given an “explanation” of these effects 
in terms of the relative importance of the ionic and 
electronic contributions to the conductivity. 


9. MAGNETIC SUSCEPTIBILITY 
9.1. Introduction 


In a semiconductor, such as rutile, one must consider the following
four contributions to the magnetic susceptibility⁶⁹: (a) the
lattice susceptibility; (b) the carrier susceptibility; (c) the
contribution of the impurities, including the effect of departures
from stoichiometry, which may be differentiated into two cases,
(i) the effect of noninteracting sites and (ii) the effect of donor
(or acceptor) bands; and (d) the effect of surface states,
including adsorbed gases.


67 E. Conwell and V. F. Weisskopf, Phys. Rev. 77, 388 (1950).


9.2.(a) Lattice Susceptibility 


Van Vleck⁷⁰ gives the following expression for the lattice
susceptibility, and points out that there is no sharp dividing line
between diamagnetic molecules and feebly paramagnetic ones:


    \chi_{mole} = -\frac{L e^2}{6 m c^2} \sum \overline{r^2}
                  + \frac{2}{3} L \sum_{n' \neq n}
                    \frac{|m^0(n'; n)|^2}{h \nu(n'; n)}.    (9-1)


In Eq. (9-1), m^0(n'; n) is a nondiagonal element of the vector
matrix for the magnetic moment of the system, evaluated in the
absence of a field, and ν(n'; n) is the frequency corresponding to
the n' → n transition. L is Avogadro’s number, e the electronic
charge in esu, m the electronic mass in grams, and c is the speed
of light in cm/sec. The sum over \overline{r^2} is the mean value
of the square of the radius of an electron orbit, summed over all
the electrons.

When the second term of Eq. (9-1) predominates, the contribution
represents a small, almost temperature-independent paramagnetism.
Using the simple ionic model, the ions in the oxides Sc2O3, TiO2,
V2O5, and CeO2 possess noble gas configurations, hence would be
expected to have a negative (diamagnetic) lattice susceptibility.
However, there appears to be some evidence, especially in the case
of V2O5, for a contribution due to temperature-independent
paramagnetism larger than the negative (diamagnetic) term.⁷¹⁻⁷³
Freed and Kasper⁷⁵ discuss this effect in ions of the type VO[…],
WO4²⁻, MnO4⁻, VO3⁻, and Cr2O7²⁻. Bates⁷⁴ considers that the feeble
paramagnetism believed to exist in Sc2O3, TiO2, and CeO2 may be
caused by the “distortion due to interatomic forces,” and adds that
it is difficult to determine in a given case whether the effect is
due to a negative exchange integral, to electron migration (in
conductors), or to ionic deformation.

At least in the case of rutile, the wide range of experimental
values reported in the literature, which include both positive and
negative values, may result from departures from stoichiometry.
The formation of oxygen vacancies (and Ti³⁺ sites) which result
when rutile is prepared at moderately high temperatures in a
reducing atmosphere is discussed in Sec. 14. The various values of
the magnetic susceptibility of “pure, stoichiometric” rutile,
appearing in the literature, are listed in Table V.


69 Stevens, Cleland, Crawford, and Schweinler, Phys. Rev. 100,
1084 (1955).

70 J. H. Van Vleck, The Theory of Electric and Magnetic
Susceptibilities (Oxford University Press, New York, 1932),
pp. 275, 279.

71 T. Ishiwara, Sci. Rept. Tôhoku Univ. 3, 303 (1914).

72 P. Weiss and P. Collet, Compt. rend. 178, 2146 (1924); 181,
1057 (1925).

73 R. Ladenburg, Z. physik. Chem. 126, 133 (1927).

74 L. F. Bates, Modern Magnetism (Cambridge University Press,
New York, 1951), pp. 272, 44.

75 S. Freed and C. Kasper, J. Am. Chem. Soc. 52, 4671 (1930).


9.3.(b) Carrier Susceptibility 


Assuming that the carrier electrons behave as a perfect electron
gas, and using the appropriate density-of-states effective mass
ratio, the following expression for the carrier susceptibility has
been derived⁶⁹,⁷⁶:


    \chi_c = \frac{\beta^2 n_c}{kT}
             \left[ 1 - \tfrac{1}{3} (f^2)_{av} \right],    (9-2)


where n_c is the concentration of electrons in the conduction band,
β is the Bohr magneton, and (f^2)_{av} depends on the curvature of
the energy surfaces in k space. This term is related to the
effective mass ratio for orbital motion M by the relation


    M = [(f^2)_{av}]^{-1/2}.    (9-3)


Since M is expected to be considerably larger than unity in the
case of rutile,⁷⁷ Eq. (9-2) may be approximated by the relation


    \chi_c = \beta^2 n_c / kT,    (9-4)


which is of the same form as the contribution of unpaired electrons
in noninteracting hydrogen-like donors to be discussed in the next
paragraph.


9.4.(c) Impurity Contribution 


Mooser⁷⁸ has discussed (i) the contribution of electrons in
noninteracting hydrogen-like impurity sites, and (ii) the case in
which the overlapping of the impurity wave functions is
sufficiently great for the formation of impurity bands. Considering
first the case in which the donor sites are noninteracting and are
in s-like ground states, he obtains


    \chi = n_d \beta^2 / kT,    (9-5)


where n_d is the concentration of occupied donors, which is related
to N_d, the concentration of donor sites, by the following
relation:


    n_d = N_d / [1 + \tfrac{1}{2} \exp((E_d - \eta)/kT)].    (9-6)


Here, E_d - η is the energy of the donor levels with respect to the
Fermi level η.

A diamagnetic term must be added to Eq. (9-5) so that the total
contribution of the electrons in isolated, hydrogen-like donor
sites is given by the following expression:


    \chi = n_d (\beta^2/kT - e^2 r^2 / 6 m_e c^2)    (9-7)


in which r is the radius of the electron orbit in cm, e is the
electronic charge in esu, c is the velocity of light in cm sec⁻¹,
and m_e is the effective mass of the electrons in grams.


76 H. Fröhlich, Theorie der Metalle (Verlag Julius Springer,
Berlin, 1937), pp. 144-156.

77 R. G. Breckenridge and W. R. Hosler, Phys. Rev. 91, 793
(1953).

78 E. Mooser, Phys. Rev. 100, 1589 (1955); G. Busch and E.
Mooser, Helv. Phys. Acta 24, 329 (1951); 26, 611 (1953); Z.
physik. Chem. 198, 23 (1951).


TABLE V. Magnetic susceptibility of rutile.

                                                  Magnetic
                                                  susceptibility
Source                          Temperature       (emu per gram)

Meyerᵃ                          288°K             0.37×10⁻⁶
Wedekind and Hausknechtᵇ        room              0.066
Berkman and Zocherᶜ             room              -0.20
Hüttigᵈ                         room              -0.30
Raychaudhuri and Senguptaᵉ                        0.073
Ehrlichᶠ                        90 and 293°K      0.08
Zimens and Hedvallᵍ                               0.134
Hill and Selwoodʰ               i                 -0.3
Reyerson and Honigʲ             301°K             0.061ᵏ

a S. Meyer, Ann. Physik 69, 236 (1899).
b E. Wedekind and P. Hausknecht, Ber. deut. chem. Ges. 46, 3763 (1913).
c S. Berkman and H. Zocher, Z. physik. Chem. 124, 322 (1926).
d G. F. Hüttig, Z. anorg. u. allgem. Chem. 224, 225 (1935).
e D. P. Raychaudhuri and P. N. Sengupta, Indian J. Phys. 10, 253 (1936).
f See reference 80.
g K. Zimens and J. Hedvall, Svensk Kem. Tidsk. 52, 12 (1941).
h F. N. Hill and P. W. Selwood, J. Am. Chem. Soc. 71, 2522 (1949).
i “[…] independent of temperature,” presumably down to -190°C.
j L. H. Reyerson and J. M. Honig, J. Am. Chem. Soc. 75, 3920 (1953).
k See discussion of adsorption of NO2-N2O4 on rutile.
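
To give a sense of the relative size of the two terms in the donor expression, the occupancy relation and the susceptibility per donor can be evaluated in cgs units. This is a minimal sketch: the factor 1/2 in the occupancy is the usual spin-degeneracy factor and is assumed here, the free-electron mass and the Bohr radius are used only as placeholder values, and all names are illustrative.

```python
import math

BOHR_MAGNETON = 9.2740e-21     # erg per gauss
BOLTZMANN_ERG = 1.380649e-16   # erg per degree
BOLTZMANN_EV = 8.617333e-5     # ev per degree
E_ESU = 4.80320e-10            # electronic charge, esu
LIGHT_SPEED = 2.99792458e10    # cm per sec
ELECTRON_MASS = 9.10938e-28    # grams (free-electron value, placeholder)


def occupied_donors(n_sites, e_d_minus_eta_ev, temperature_k):
    """n_d = N_d / (1 + (1/2) exp((E_d - eta)/kT)); the 1/2 is assumed."""
    return n_sites / (1.0 + 0.5 * math.exp(e_d_minus_eta_ev /
                                           (BOLTZMANN_EV * temperature_k)))


def donor_susceptibility(n_d, temperature_k, radius_cm, mass_g=ELECTRON_MASS):
    """chi = n_d (beta^2/kT - e^2 r^2 / 6 m_e c^2), paramagnetic minus diamagnetic."""
    para = BOHR_MAGNETON ** 2 / (BOLTZMANN_ERG * temperature_k)
    dia = E_ESU ** 2 * radius_cm ** 2 / (6.0 * mass_g * LIGHT_SPEED ** 2)
    return n_d * (para - dia)


# Per occupied donor at 300 K, with a hydrogen-like orbit of one Bohr radius.
chi_per_donor = donor_susceptibility(1.0, 300.0, 0.529e-8)
```

For an orbit of atomic dimensions the diamagnetic term is smaller than the Curie term by roughly three orders of magnitude at room temperature; it grows as r², so it matters more for the extended orbits expected in a medium of high dielectric constant.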

If the assumption of a high effective mass ratio M,
implied in Eq. (9-4), is valid, comparison of this relation
with the expression for the paramagnetic contribution 
of (atomic) hydrogen-like donors shows that they are 
of the same form. In this case, the paramagnetic sus- 
ceptibility is independent of the distribution of the 
unpaired electrons between the donor sites and the 
conduction band, i.e., the thermal ionization of donor 
sites does not affect their contribution to the sus- 
ceptibility. This is in contrast to the case in which the 
wave functions of the impurity sites overlap sufficiently 
for the formation of impurity bands. This point is 
discussed later. 


9.4.1. Electrons in Impurity Bands 


The diamagnetic contribution of electrons in impurity bands is
believed to be small and about the same as that of electrons in
noninteracting donor sites [the second term in Eq. (9-7)].

Band wide compared with kT. At low temperatures the paramagnetic
contribution is given by the following expression


    \chi_{para} = 2 \beta^2 D(\eta)    (9-8)


in which D(η) denotes the density of states in the donor band at
the position of the Fermi level. In this temperature range χ should
be almost independent of temperature.

Band narrow compared with kT. At high temperatures, where the band
width is much less than kT, the paramagnetic contribution of the
electrons in the impurity band is obtained from the result obtained
for the noninteracting donor sites by multiplying by the factor f,
which indicates the fraction of the electron spins that are “free”
(see Sec. 9.4.2)


    f = \frac{\exp((\bar{E}_d - \eta)/kT)}
             {1 + \exp((\bar{E}_d - \eta)/kT)},    (9-9)


where \bar{E}_d is the energy of the midpoint of the donor band.
This factor has the value unity when the Fermi level lies far below
the donor band. In the case of principal interest
(\bar{E}_d - η) is zero or slightly negative, and the multiplying
factor is approximately equal to 1/2. When the Fermi level is more
than 3kT higher than the donor band f is very small, tending to
zero with increased separation.

The numerical value of this factor is therefore highly dependent on
the height of the Fermi level. The dependence of the Fermi level on
temperature has been discussed in Sec. 5.

Once again it is necessary to draw attention to the high dielectric
constant in rutile. It is therefore probable that impurity bands
form at lower impurity concentrations than in most other materials.
This effect may be offset by the possibly high effective electron
mass.


9.4.2. Donor Sites in Nonstoichiometric Rutile


Although the contribution of the Ti⁴⁺ ions to the magnetic
susceptibility is small, and of uncertain sign, the Ti³⁺ ion
definitely makes a large contribution to the paramagnetic
susceptibility of the crystal. Departures from stoichiometry of
rutile are believed to result in the conversion of Ti⁴⁺ sites to
Ti³⁺ sites. These ions of lower valence may in many ways be
considered to constitute an “impurity” in the rutile lattice, and
in this section the terms “impurity site” and “impurity band” are
understood to apply to nonstoichiometric rutile.

Ehrlich⁸⁰ has investigated the magnetic properties of solids of the
composition TiO_x from x=0 to x=2. If it is assumed that the
formation of an oxygen vacancy results in the formation of two Ti³⁺
sites, each with a magnetic moment of 1.73 Bohr magnetons, one may
calculate the resulting paramagnetic susceptibility. For example,
Ehrlich calculated that the paramagnetic susceptibility of rutile
which had been reduced to the extent of two percent weight loss
(x=1.90) should be 253×10⁻⁶ per gram mole at room temperature.
Since the experimental value was 119×10⁻⁶ he concluded that only
47% of the electrons were “free.”


80 Ehrlich, Z. Elektrochem. […].


TABLE VI. Fraction of donor electrons contributing to magnetic
susceptibility of TiO_x (after Ehrlich).

              O_v            n_D                            n_D'
              Oxygen         Donor                          (calculated
              vacancies      electrons      Experimental    from χ_para)
              per mole       per mole       χ_para          donor
Temperature   of TiO2        of TiO2        per moleᵃ       electrons      f = n_D'/n_D
°K                           (2 per O_v)                    per mole

293           1.8×10²²       3.6×10²²       46×10⁻⁶         2.2×10²²       0.61
293           3.0            6.0            72              3.5            0.57
293           6.0            12.0           119             […]            0.47
90            1.8            3.6            104             1.54           0.42
90            3.0            6.0            162             2.39           0.40
90            6.0            12.0           216             3.19           0.26

a After correction for temperature-independent paramagnetic
lattice susceptibility (+6×10⁻⁶ emu per mole).
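
Ehrlich's estimate for x = 1.90 can be retraced with the Curie expression for independent spins. The sketch below assumes two donor electrons per oxygen vacancy, a moment of √3 Bohr magnetons per electron, room temperature taken as 293°K, and present-day constants; the function names are illustrative and not from the original.

```python
AVOGADRO = 6.022e23
BOHR_MAGNETON = 9.274e-21      # erg per gauss
BOLTZMANN = 1.380649e-16       # erg per degree


def donor_electrons_per_mole(x):
    """Two donor electrons per oxygen vacancy in TiO_x (assumed model)."""
    return 2.0 * (2.0 - x) * AVOGADRO


def curie_susceptibility(n_spins, temperature_k, moment=3.0 ** 0.5):
    """chi = n * (moment * beta)^2 / (3 k T), per mole when n is per mole."""
    return n_spins * (moment * BOHR_MAGNETON) ** 2 / (3.0 * BOLTZMANN * temperature_k)


chi_calc = curie_susceptibility(donor_electrons_per_mole(1.90), 293.0)
free_fraction = 119.0e-6 / chi_calc   # experimental value over calculated value
```

With these constants the calculated susceptibility comes out near 256 × 10⁻⁶ per mole, close to the 253 × 10⁻⁶ quoted from Ehrlich, and the ratio to the measured 119 × 10⁻⁶ is about 0.46 to 0.47.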

Ehrlich’s calculations have been extended to his 
other data, and are summarized in Table VI. As 
expected, the fraction f of the donor electrons that 
contribute to the susceptibility (column 7) decreases 
with increasing donor concentration. 

However, we note the higher values of f for the room 
temperature measurements as compared with the 
measurements at 90°K. 

Now subject to the assumptions of a high effective mass ratio M and
the applicability of Maxwell-Boltzmann statistics, the raising of
electrons from isolated hydrogen atom-like donors to the conduction
band should not appreciably alter the paramagnetic susceptibility.
We therefore look for an explanation in terms of overlapping of
donor wave functions and the formation of a donor band.

A very approximate value for the width of this hypothetical band
may be obtained from Ehrlich’s low temperature data by the
application of Mooser’s expression for the susceptibility of
electrons in bands wide in comparison with kT. Here, χ = 2β²D(η),
where D(η) represents the density of states at the Fermi level.

The total number of levels was taken to be twice the 
number of oxygen vacancies. The approximation was 
introduced that the density of states D was uniform 
throughout the band, whence AE was estimated. The 
results of calculations for TiO_x (where x = 1.90, 1.95,
and 1.97) at 90°K appear in Table VII. The estimated
values of AE are somewhat larger than kT (which is 
0.0077 eV at 90°K), so that the requirement regarding 
the width of the band is fulfilled. 
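
The band-width estimate described above can be retraced in a few lines. The sketch below assumes a uniform density of states, reads the 90°K susceptibilities as 104, 162, and 216 × 10⁻⁶ per mole and the numbers of states as 3.6, 6.0, and 12.0 × 10²² per mole (the powers of ten are inferred from the compositions), and uses present-day constants; the names are illustrative only.

```python
BOHR_MAGNETON = 9.274e-21      # erg per gauss
BOLTZMANN = 1.380649e-16       # erg per degree
ERG_PER_EV = 1.602177e-12


def density_of_states_per_ev(chi_para_per_mole):
    """D(eta) from chi = 2 beta^2 D(eta), converted from per erg to per ev."""
    return chi_para_per_mole / (2.0 * BOHR_MAGNETON ** 2) * ERG_PER_EV


def band_width_ev(chi_para_per_mole, states_per_mole):
    """Width of a band of uniform density holding states_per_mole levels."""
    return states_per_mole / density_of_states_per_ev(chi_para_per_mole)


# x: (chi_para per mole at 90 K, number of states per mole), as assumed above.
ehrlich_90k = {1.97: (104.0e-6, 3.6e22),
               1.95: (162.0e-6, 6.0e22),
               1.90: (216.0e-6, 12.0e22)}
widths = {x: band_width_ev(chi, n) for x, (chi, n) in ehrlich_90k.items()}
kt_90_ev = BOLTZMANN * 90.0 / ERG_PER_EV
```

The widths come out at roughly 0.04 to 0.06 ev, within a few percent of the values tabulated in the text, and several times kT at 90°K, which is the condition required for the wide-band expression to apply.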

It is now necessary to account for the observed increase in f with
increasing temperature. One may ask whether this increase could
reflect a transition from the “wide band” case, at 90°K, to the
“narrow band” case. The estimated band widths (last column of
Table VII) are about fifty percent greater than kT at 293°K. It is
not unreasonable to suppose that the 293°K values correspond to an
intermediate case, and that at still higher temperatures the
“narrow band” approximation would be valid. Earlier, it was
concluded that the Fermi level would be expected to lie at, or
somewhat lower than, the middle of the donor band. In the “narrow
band” limit, the expected value of f calculated from Eq. (9-9) is
therefore equal to, or greater than, 1/2. The observed values of f
(given in the last column of Table VI) appear to be consistent with
this interpretation.
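
The limiting values of f referred to in this argument follow from the narrow-band expression, f = exp(d)/(1 + exp(d)) with d equal to the energy of the band midpoint relative to the Fermi level, in units of kT. A minimal sketch, with the function name chosen here for illustration:

```python
import math


def free_spin_fraction(midpoint_minus_fermi_over_kt):
    """f = exp(d) / (1 + exp(d)), written in the equivalent form 1/(1 + exp(-d))."""
    return 1.0 / (1.0 + math.exp(-midpoint_minus_fermi_over_kt))


f_at_midpoint = free_spin_fraction(0.0)    # Fermi level at the band midpoint
f_fermi_below = free_spin_fraction(3.0)    # Fermi level 3kT below the band
f_fermi_above = free_spin_fraction(-3.0)   # Fermi level 3kT above the band
```

The factor is exactly 1/2 with the Fermi level at the middle of the band, about 0.95 with it 3kT below, and about 0.05 with it 3kT above, so a Fermi level at or somewhat below the midpoint gives f equal to or greater than 1/2.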

An alternative explanation may be given if two 
electrons are bound to each oxygen vacancy at suffi- 
ciently low temperatures, and these sites are assumed 
to be noninteracting. One would expect the electron 
spins to be paired in such a “molecule.” Raising of one 
or both of these electrons to higher states by thermal 
excitation would uncouple the spins. This conclusion is 
valid, regardless of the extent of the overlapping of the 
wave functions of the higher states, provided that the 
Fermi level lies farther than about 3kT below the 
bottom of any such higher band [see Eq. (9-9) ]. If 
the Fermi level lay closer to this higher band than about
3kT, the Pauli principle would have to be invoked,
and the value of f would remain below unity.

Ehrlich’s results may also be expressed in terms of
the Curie-Weiss law (discussed further in the next
section), in which case one obtains an effective magnetic
moment of 1.5 Bohr magnetons, instead of the expected
value of 1.73, and Weiss constants of 700, 72, and
160°K, respectively, for the three compositions x = 1.97,
1.95, and 1.90.

Gray⁸¹ has reported an investigation of the magnetic
susceptibility of “titanium dioxide” during reduction 
by hydrogen. Simultaneous measurements were made 
of the decrease in hydrogen pressure, sample weight 
loss, and paramagnetic susceptibility increase, the 
temperature being maintained at 560°C. There appears 
to be an almost linear relation between change in 
susceptibility and the loss in weight. The sample weight 
and the method of measurement were not stated and 
without this information a satisfactory interpretation 
of the data cannot be made. 


9.4.3. Rutile Containing Impurities 


Representation of the paramagnetic susceptibility arising from
donor sites in solids is not often made in terms of the foregoing
model invoking donor bands. More commonly the departures from the
Curie law are represented empirically by replacement of the T
factor in this law by the factor (T - Δ). This procedure is quite
empirical, and may include the effects resulting from the “pairing”
of electrons, and the formation of donor bands, discussed in the
foregoing.⁸²,⁸³ Most of the published experimental results are
expressed in terms of the magnetic moment of the unpaired electron
and the corresponding Weiss constant Δ.


81 T. J. Gray, The Defect Solid State (Interscience Publishers,
Inc., New York), p. 287.

82 Other factors affecting the Weiss constant: the multiplet
separations and the effect of inhomogeneous electric crystalline
fields are discussed on page 150 of reference 84.

83 R. K. Wangsness has discussed the relation between the
exchange coupling between spins, and the Curie-Weiss law.
R. K. Wangsness, J. Chem. Phys. 20, 1656 (1952).


TABLE VII. Width of hypothetical impurity band in TiO_x estimated
from Ehrlich’s 90°K magnetic susceptibility data.

                          Number of       Calculated        ΔE
                          states per      density of        estimated
          χ_para          mole            states at         width of
x         per moleᵃ       (2 per O_v)     Fermi level       donor
                                          D(η) per ev       band, ev

1.97      104×10⁻⁶        3.6×10²²        99×10²²           0.036
1.95      162             6.0             154               0.039
1.90      216             12.0            206               0.058

a After correction for temperature-independent paramagnetic lattice
susceptibility (+6×10⁻⁶ emu per mole).

Selwood⁸⁴ and co-workers have shown the “pure” titanium sesquioxide
to be antiferromagnetic, with a critical point near the specific
heat anomaly at about 248°K. The Weiss constant was unusually
large, and was estimated to be about 2000°K. The corresponding
magnetic moment for the Ti³⁺ ion was 1.2 Bohr magnetons, which may
be compared with the expected value of 1.73 for the free Ti³⁺ ion
with one unpaired electron. These results are contradicted by the
later work of Foëx and Wucher,⁸⁵ and of Pearson.⁸⁶ These workers
found a continuous increase in susceptibility as the temperature
was increased from -200 to about 360°C, the temperature dependence
being most pronounced in the vicinity of 160°C.

When paramagnetic ions are separated by solution in a suitable
diamagnetic crystal, their mutual interaction is reduced. An
example of this effect appears in the work of Adler and
Selwood,⁸⁷,⁸⁸ who have studied a solid solution of 14.3% Ti2O3 in
the diamagnetic oxide, Al2O3. The Ti³⁺ ion was found to have a
magnetic moment of 1.1 Bohr magnetons (expected value √3 Bohr
magnetons), and a value of Δ = 105°K was obtained. This value of
the Weiss constant Δ may be compared with the value of about 2000°
obtained for undiluted Ti2O3.

Selwood and co-workers have investigated the incorporation of ions
of other transition metals into the rutile crystal lattice. The
Weiss constant of the manganese ion tends to zero as expected at
infinite dilution, and is almost proportional to the
concentration.⁸⁹ At a concentration of 19% manganese by weight, Δ
is about 315°. The magnetic moment corresponds to a valence state
of about 4.1 at zero concentration, falling to about 3.9 as the
concentration is increased to 19% manganese (by weight).


84 Pierce W. Selwood, Magnetochemistry (Interscience Publishers,
Inc., New York, 1956), p. 329.

85 M. Foëx and J. Wucher, Compt. rend. 241, 184 (1955).

86 A. D. Pearson, Technical Report 120, Laboratory for Insulation
Research, Massachusetts Institute of Technology (1957).

87 Reference 84, p. 161.

88 S. F. Adler and P. W. Selwood, J. Am. Chem. Soc. 76, […]
(1954).

89 Selwood, Moore, Ellis, and Wethington, J. Am. Chem. Soc.
71, 693 (1949).

The valence state of nickel incorporated into the rutile lattice
decreases somewhat with increasing concentration.⁹⁰ Extrapolation
back from the lowest concentration appearing in the curves (about
four percent nickel by weight) indicates that the magnetic moment
at infinite dilution is probably between 3.8 and 4.0, considerably
lower than expected for six (or four unpaired) electrons. However,
from the color, and for other reasons, Selwood believes that the
oxidation state is indeed +4.

Similar results were obtained for chromium⁹¹ and
iron⁹² incorporated into the rutile lattice. In the case
of iron, the Weiss constant Δ fell from 52° at 12.9% iron,
to 16° at 1.5% iron (by weight). The corresponding values
of the oxidation state were 3.2 and 3.9. The chromium ion in
rutile (2.6% by weight) exhibited a magnetic moment
of 2.8 Bohr magnetons, and a negative Δ (−27°).

Thus with the possible exception of nickel, these 
elements assume a valence state of almost four when 
occupying titanium sites (at sufficient dilution). 
Selwood has applied the term “induced valence” to 
this phenomenon. 


9.5.(d) Surface Contribution 


The magnetic susceptibility will include the contri- 
bution of adsorbed gases, in particular physically 
adsorbed oxygen. The meaning of the term chemisorp- 
tion when applied to oxygen and metallic oxides is not 
clear. However, the nature of the space-charge-barrier 
layer at the surface of a semiconductor (of the order of 
10 cm thick) is influenced by the chemisorption of 
gases. Any factor which influences the height of the 
Fermi level in this region will alter the paramagnetic 
susceptibility. The importance of this effect is expected 
to increase as the particle size is reduced. 

Sandler⁹³ has used the rate of ortho-para conversion
of hydrogen on rutile powder as an indication of the
surface density of unpaired electrons associated with
paramagnetic surface defects. These defects
are believed to be due to a stoichiometric deficiency of
oxygen. This surface paramagnetism is apparently
associated with the blue color of reduced rutile.

A mention of the investigation of the surface
paramagnetism of rutile conducted by Reyerson and Honig,
and of the effect of reoxidation by NO₂-N₂O₄, will be
found in Sec. 12.

90 F. N. Hill and P. W. Selwood, J. Am. Chem. Soc. 71, 2522
(1949).

91 P. W. Selwood and L. Lyon, Discussions Faraday Soc. 8, 222
(1950).

92 Selwood, Ellis, and Wethington, J. Am. Chem. Soc. 71, 2181
(1949).

93 Y. L. Sandler, J. Phys. Chem. 58, 54 (1954).


10. THERMAL CONDUCTIVITY, SPECIFIC HEAT,
AND VARIOUS PHYSICAL PROPERTIES 


10.1. Thermal Conductivity 


Kingery and McQuarrie⁹⁴ have determined the
thermal conductivity of a number of pure ceramic
oxides, including rutile, in the temperature range 100
to 1800°C. The results were corrected for the porosity
of the samples. Values of 0.0119 and 0.0081 (cal sec⁻¹
cm⁻²)(°C⁻¹ cm) at 200 and 800°C, respectively, were
obtained. Kingery⁹⁵ has discussed the variations in
thermal conductivity associated with the effect of
radiant energy transmission, the ratio of the phonon
mean free path to lattice dimension, the value of the
Debye temperature, emissivity, and the relation of the
electronic conductivity to the thermal conductivity of
rutile.
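
For comparison with values tabulated in SI units, the two
conductivities quoted above can be converted from
cal sec⁻¹ cm⁻¹ °C⁻¹ to W m⁻¹ K⁻¹. The short sketch below assumes
the thermochemical calorie (4.184 J); the function name is
illustrative only.

```python
CAL_TO_JOULE = 4.184   # thermochemical calorie, assumed here
CM_TO_M = 0.01

def cal_cgs_to_si(k_cal_per_s_cm_degc: float) -> float:
    """Convert cal/(s cm degC) to W/(m K)."""
    return k_cal_per_s_cm_degc * CAL_TO_JOULE / CM_TO_M

for temp_c, k in ((200, 0.0119), (800, 0.0081)):
    print(temp_c, "degC:", round(cal_cgs_to_si(k), 2), "W/(m K)")
```

On this basis the quoted values correspond to roughly 5.0 and
3.4 W m⁻¹ K⁻¹ at 200 and 800°C.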

The thermal conductivities of sapphire and rutile
single crystals have been measured in the temperature
range −50 to 100°C by McCarthy et al.⁹⁶ The thermal
conductivity is greater parallel to the optic axis than
in the direction perpendicular to it. Both values decrease
with increasing temperature in the temperature range
studied, and their ratio is also temperature dependent.


10.2. Thermal Coefficient of Expansion 


Day⁹⁷ has obtained a value of 80×10⁻⁷ per °C for
the coefficient of expansion of rutile in the range −130
to 50°C. Mauer and Bolz⁹⁸ have determined the thermal
expansion coefficient of stoichiometric and reduced
rutile from x-ray data. In the temperature range 0 to
200°C the average values of the coefficient of expansion of
stoichiometric rutile were 8.0×10⁻⁶ per °C in the a direction
and 9.2×10⁻⁶ per °C in the c direction. In the
temperature range 0 to 1000°C the corresponding average
values are 8.4×10⁻⁶ and 9.2×10⁻⁶ per °C. In the
temperature range 0 to 400°C, the average values for
reduced rutile (TiO₁.₉₇) were 8.4×10⁻⁶ per °C for the
a direction, and 10.5×10⁻⁶ per °C for the c direction.
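
Since rutile is tetragonal (two equivalent a axes and one c
axis), the volume expansion coefficient follows from the linear
values as 2αₐ + α_c to first order. A minimal sketch using the
figures quoted above; the function names are illustrative.

```python
def volume_coeff(alpha_a: float, alpha_c: float) -> float:
    """First-order volume expansion coefficient of a tetragonal crystal."""
    return 2.0 * alpha_a + alpha_c

def mean_linear_coeff(alpha_a: float, alpha_c: float) -> float:
    """Mean linear coefficient, i.e. (1/3V) dV/dT."""
    return volume_coeff(alpha_a, alpha_c) / 3.0

# Stoichiometric rutile, 0 to 200 degC, values from the text (per degC).
print(volume_coeff(8.0e-6, 9.2e-6))
print(mean_linear_coeff(8.0e-6, 9.2e-6))
```

The mean linear coefficient obtained in this way, about
0.84×10⁻⁵ per °C, is of the same size as the ceramic value of
(1/3V)dV/dT used in Sec. 11.2.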


10.3. Specific Heat and Debye Temperature 


At sufficiently low temperatures, only the
low-frequency vibrations are important in determining the
internal energy of a solid. In this region the specific
heat capacity is expected to be proportional to T³, and
one expects a constant value for the Debye temperature
Θ. In rutile, this range extends to above 4°K, and
Keesom and Pearlman⁹⁹ obtained a value of 758°K
for the corresponding Debye temperature. In the
temperature range 10 to 20°K they found that Θ
decreases from 650 to 460°K.


94 W. D. Kingery and M. ... (1954).

95 W. D. Kingery, J. Am. Ceram. Soc. 38, 251 (1955).

96 McCarthy, Ballard, and Doerner, Phys. Rev. 88, 153A

97 Jean Day, Bull. soc. sci. Bretagne 24, 13 (1949).

98 Floyd A. Mauer and Leonard H. Bolz, National Bureau of
Standards, Wright Air Development Center Technical Rept.

99 P. H. Keesom and N. Pearlman, Phys. Rev. 98, 1539A (1955).

Also from specific heat studies, McDonald and
Seltz¹⁰⁰ obtained a value of Θ=670°K in the temperature
range 68 to 298°K.

Basing their calculations on earlier measurements of
infrared absorption bands, Breckenridge and Hosler¹⁰¹
obtained values of Θ=936°K and Θ=1161°K. However,
their interpretation of their Hall mobility data was
more consistent with McDonald and Seltz's value of
Θ=670°K.

After slight reduction in H₂ or Ar at 1000°C, the heat
capacity in the helium region increases greatly.⁹⁹ As a
tentative explanation, the authors suggested that the
additional heat capacity was possibly due to the
contribution of the “free” classical electrons. In which
band these “free” electrons are to be found is an open
question. From the estimated degeneracy temperature
these workers obtained a carrier mass of 350 times the
free electron mass.

The heat capacity of rutile powder has been measured
in the low temperature region by Shomate.¹⁰² Arthur¹⁰³
has made measurements in the range 293 to 1073°K
and Naylor¹⁰⁴ in the range 500 to 1800°K. For the heat
capacity in calories per gram per degree, valid in the
range 500 to 1800°K, Naylor gives the formula


$$C_p = 0.2145 + 12.3\times10^{-6}\,T - 4380/T^{2}.$$

Arthur expresses his results in the form

$$C = 0.1512 + 0.0002457\,T - 0.222\times10^{-7}\,T^{2}$$


calories per gram per degree, plus or minus two percent,
valid in the range 293 to 1073°K. For purpose of
comparison, these formulas yield, respectively, values
of 0.217 and 0.333 cal/g/deg, at 800°K.
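
The comparison at 800°K can be checked by direct evaluation.
The sketch below assumes coefficients read from the damaged
typesetting so as to reproduce the two quoted values (12.3×10⁻⁶
for Naylor's linear term and 0.222×10⁻⁷ for Arthur's quadratic
term); the function names are illustrative.

```python
def naylor_cp(temp_k: float) -> float:
    """Naylor's fit, cal/(g deg), stated as valid for 500 to 1800 K."""
    return 0.2145 + 12.3e-6 * temp_k - 4380.0 / temp_k**2

def arthur_c(temp_k: float) -> float:
    """Arthur's fit, cal/(g deg), stated as valid for 293 to 1073 K."""
    return 0.1512 + 0.0002457 * temp_k - 0.222e-7 * temp_k**2

print(naylor_cp(800.0), arthur_c(800.0))
```

Both expressions agree with the values quoted in the text to
within about 0.001 cal/g/deg.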

Dugdale et al.¹⁰⁵ expected to find a particle-size effect
at low temperatures. Instead, however, they found that
the specific heat of rutile was independent of particle
size below 50°K. Above this temperature the observed
dependence was attributed qualitatively to
the optical modes of vibration. Because of the relatively
complicated crystal structure of rutile, no attempt was
made to calculate the magnitude
of the effect theoretically. These workers obtained a value of
Cp=0.115 cal/mole°K at 20°K and 1.407 cal/mole°K
at 50°K.


100 H. J. McDonald and H. Seltz, J. Am. Chem. Soc. 61, 2405
(1939).

101 R. G. Breckenridge and W. R. Hosler, Phys. Rev. 91, 793
(1953).

102 C. H. Shomate, J. Am. Chem. Soc. 69, 218 (1947).

103 J. S. Arthur, J. Appl. Phys. 21, 732 (1950).

104 B. F. Naylor, J. Am. Chem. Soc. 68, 1077 (1946).

105 Dugdale, Morrison, and Patterson, Proc. Roy. Soc. (London)
A224, 228 (1954).


10.4. Miscellaneous Physical Properties
10.4.1. Density 


The density of rutile,” calculated from x-ray data 
is 4.28; the pycnometer density is 4.22 g/cc. 


10.4.2. Melting Point 


St. Pierre¹⁰⁸ has determined the melting point to be
1840±10°C.


10.4.3. Compressibility 


The compressibility of natural rutile single crystals
has been investigated by Bridgman.¹⁰⁹ His results could
be expressed up to p=12 000 kg/cm² by the formula


$$\Delta L/L_0 = -a_1 p + b_1 p^{2}.$$


From 30° to 75°C the value of a₁ for the c direction was
found to vary from 1.038 to 1.090×10⁻⁷, and in the a
direction was found to vary from 1.871 to 1.954×10⁻⁷.
For all cases investigated, the value of bı was 7X10~*. 
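
As a rough consistency check, the linear coefficients can be
combined into a volume compressibility, 2a(a direction) +
a(c direction), for the tetragonal crystal. The sketch below
assumes the 30°C values in units of (kg/cm²)⁻¹, neglects the
quadratic term (low-pressure limit), and uses the standard
conversion 1 kg/cm² = 98 066.5 Pa; the names are illustrative.

```python
KG_PER_CM2_TO_PA = 98066.5  # 1 kgf/cm^2 in pascals

def volume_compressibility(a_a: float, a_c: float) -> float:
    """Low-pressure volume compressibility of a tetragonal crystal."""
    return 2.0 * a_a + a_c

def bulk_modulus_gpa(a_a: float, a_c: float) -> float:
    """Bulk modulus in GPa from linear compressibilities per kg/cm^2."""
    modulus_kg_cm2 = 1.0 / volume_compressibility(a_a, a_c)
    return modulus_kg_cm2 * KG_PER_CM2_TO_PA / 1.0e9

print(volume_compressibility(1.871e-7, 1.038e-7))  # per kg/cm^2
print(round(bulk_modulus_gpa(1.871e-7, 1.038e-7), 1))
```

With these assumptions the 30°C data correspond to a bulk
modulus of the order of 2×10⁶ kg/cm².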


10.4.4. Hardness 


Moore¹¹⁰ gives a value of 7 to 7½ (Mohs scale) for
the hardness of synthetic rutile crystals. This value is
slightly higher than that of quartz. Measurements were
made on specimens exhibiting various degrees of
reduction, but the variation in hardness in these different
specimens was about the same as the variation with
crystallographic direction.


10.5. Miscellaneous 


A list of references for these and other properties of 
rutile may be found in “Properties of Titanium Com- 
pounds and Related Substances,” ONR Report 
ACR-17, October 1956. 


11. DIELECTRIC AND OPTICAL PROPERTIES 
11.1. Dielectric Constant and Loss Angle 


(a) Stoichiometric rutile. The frequency and
temperature dependence of the dielectric constant of
stoichiometric rutile have been investigated by Eucken
and Büchner,¹¹¹ Schusterius,¹¹² Büttner and Engl,¹¹³
Berberich and Bell,¹¹⁴ and by von Hippel and his
co-workers.¹¹⁵ The dielectric constant shows little variation
with frequency until the region of reststrahl vibrations
is reached (wavelengths shorter than about 300 μ).


106 Skinner et al., “Titanium and its compounds,” Herrick L.
Johnston Enterprises (1954).

107 A. Schroeder, Z. Krist. 67, 485 (1928).

108 P. D. S. St. Pierre, J. Am. Ceram. Soc. 35, 188 (1952).

109 P. W. Bridgman, Am. J. Sci. 15, 287 (1928).

110 C. H. Moore, Jr., Trans. Am. Inst. Mining Met.
Engrs. 184, 194 (1949).

111 A. Eucken and A. Büchner, Z. physik. Chem. 27B

112 C. Schusterius, Z. tech. Phys. 16, 640 (1935).

113 H. Büttner and J. Engl, Z. tech. Phys. 18, 113 (1937).

114 L. J. Berberich and M. E. Bell, J. Appl. Phys. 11,

Thus the measurements of Schmidt¹¹⁶,¹¹⁷ on single
crystals, at a frequency of several hundred megacycles
per second, are the same as the low-frequency values.
This worker obtained a value of 173 for the c direction
and 89 for the a direction. These results are in
agreement with the values calculated from the reflection
measurements of Liebisch and Rubens.¹¹⁸ The value
for densely fired ceramics¹¹⁵ is approximately 100. If
care is taken to prevent the loss of oxygen during firing
of the ceramic, the dielectric loss is very low; tan δ
= 0.0003 at 1000 cy/sec and 24°C.

At a frequency of 1000 kcps, the measurements of
Bunting et al. yielded a value of 106 for the dielectric
constant of pure rutile ceramic at −60°C and a value
of 93 at 85°C. From these results one obtains a value
of −11×10⁻⁴ for the temperature coefficient of the
dielectric constant at 85°C and a value of −7.5×10⁻⁴
at −60°C.


11.2. Static Dielectric Constant and Polarizability 


It is usually assumed that the local field $F_i$, effective
in polarizing ions in certain types of diagonal cubic
crystals, is given by the relation


$$F_i = E + \gamma_i P/\epsilon_0 \quad \text{(in mks units)}. \tag{11-1}$$


Following Lorentz, $\gamma_i$ is usually taken to be 1/3. The
static dielectric constant $K$ and the polarization $P$
are defined in terms of the electric field $E$ and the
electric displacement $D$ as follows:


$$D = K\epsilon_0 E = \epsilon_0 E + P. \tag{11-2}$$


The symbols $N$, $V$, $n_i$, $F_i$, $P_i$, and $\alpha_i$ are defined as
follows: $N$ = the number of unit cells per cubic meter,
$V = 1/N$ = the volume of the unit cell in m³, $n_i$ = the
number of ions of type $i$ per unit cell, $F_i$ = the local
field at ions of type $i$, $P_i$ = the contribution of ions of
type $i$ to the total polarization, and $\alpha_i$ = the
polarizability of ions of type $i$, which is defined by the
following equation:


$$P_i = (N n_i)\,\alpha_i\,(\epsilon_0 F_i) \tag{11-3}$$

$$n_i \alpha_i = V P_i/\epsilon_0 F_i. \tag{11-4}$$

The success of the Clausius-Mosotti relation

$$\alpha = \sum_i n_i \alpha_i \tag{11-5}$$

$$\alpha = 3V(K-1)/(K+2) \tag{11-6}$$


is usually taken as evidence for the validity of the
Lorentz equation. To the extent that the assumptions
in deriving these results are justified, it may be
shown that the polarization $P$ in crystals having
“diagonal cubic” symmetry may be expressed in terms
of the sum of the polarizabilities; thus


$$P = \sum_i P_i = \epsilon_0 \sum_i F_i \alpha_i n_i / V. \tag{11-7}$$

Assuming the same $F$ for all ions, then

$$P = (\epsilon_0 F/V) \sum_i n_i \alpha_i = \epsilon_0 F \alpha / V. \tag{11-8}$$


116 W. Schmidt, Ann. Physik 9, 919 (1902).

117 W. Schmidt, Ann. Physik 11, 114 (1903).

118 T. Liebisch and H. Rubens, Preuss. Akad. Wiss. Ber. 8, 211


Making use of the additivity of $\alpha$ implied in Eq.
(11-8), Roberts¹²⁰ has used the experimental values of
$\alpha$ of a number of ionic crystals to prepare a table of the
polarizabilities of their constituent ions. The
polarizability of oxygen was arbitrarily taken to be 30 Å³, so
that the resulting polarizabilities of the halide ions
would be nearly proportional to their volumes.

No attempt was made to separate the total
polarizabilities into their electronic and ionic components.
Nevertheless, it is possible to predict within a few
percent the polarizabilities of a large number of ionic
cubic crystals using Roberts' values of $\alpha_i$.
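
The step from the local-field relation to Eq. (11-6) is short
and may be written out as follows. This is a sketch under the
assumptions already stated in the text: the Lorentz factor is
taken as 1/3 and the same local field $F$ is assumed for all
ions.

```latex
% From (11-2): the polarization in terms of the applied field
P = (K - 1)\,\epsilon_0 E
% From (11-1) with the Lorentz factor 1/3:
F = E + \frac{P}{3\epsilon_0} = \frac{K + 2}{3}\,E
% Insert both results in (11-8), P = \epsilon_0 F \alpha / V:
(K - 1)\,\epsilon_0 E = \frac{\epsilon_0 \alpha}{V}\,\frac{K + 2}{3}\,E
% Solve for the total polarizability per unit cell:
\alpha = \frac{3V\,(K - 1)}{K + 2}
```

The result shows why $\alpha/3V$ approaches unity, and becomes
insensitive to $K$, when $K$ is large, as it is for rutile.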

(a) Temperature dependence. Differentiation of Eq.
(11-6) with respect to temperature leads to


$$\frac{1}{\alpha}\frac{d\alpha}{dT} = 3\cdot\frac{1}{3V}\frac{dV}{dT}
+ \frac{3K}{(K+2)(K-1)}\cdot\frac{1}{K}\frac{dK}{dT}. \tag{11-9}$$


Thus the temperature coefficient of polarizability is the
sum of the temperature coefficient of volume expansion
and a term proportional to the temperature coefficient
of the dielectric constant. Roberts¹²⁰ quotes the values
(1/K)dK/dT = −70×10⁻⁵, (1/3V)dV/dT = 0.91×10⁻⁵,
and K=114 at 300°K for rutile ceramic. Substitution
of these values in Eq. (11-9) yields the calculated value
of (1/α)dα/dT = 0.90×10⁻⁵. This result is smaller by a
factor of about 30 than the corresponding value for the
alkali halides. To the extent that the Clausius-Mosotti
equation can be said to apply to rutile, the large
negative temperature coefficient of the dielectric constant
results from the thermal expansion of rutile and the
small temperature coefficient of polarizability.
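
The substitution in Eq. (11-9) is easily repeated. The sketch
below uses the three values quoted from Roberts, with the
exponents read as 10⁻⁵ (an assumption that reproduces the
quoted result); the function name is illustrative.

```python
def polarizability_temp_coeff(k: float, dk_over_k: float,
                              linear_expansion: float) -> float:
    """(1/alpha) d(alpha)/dT from Eq. (11-9).

    k                -- dielectric constant K
    dk_over_k        -- (1/K) dK/dT per degree
    linear_expansion -- (1/3V) dV/dT per degree
    """
    dielectric_factor = 3.0 * k / ((k + 2.0) * (k - 1.0))
    return 3.0 * linear_expansion + dielectric_factor * dk_over_k

result = polarizability_temp_coeff(114.0, -70.0e-5, 0.91e-5)
print(result)  # close to 0.90e-5 per degree
```

The two terms are 2.73×10⁻⁵ and about −1.83×10⁻⁵, so the small
net coefficient arises from a near cancellation.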

(b) Defect structure. The frequency dependence of
the dielectric constant and loss angle is used to
investigate the defect structure of dielectric materials.
Thus Srivastava¹²¹ has studied the defect structure in
slightly impure rutile crystals. The defect structure was
enhanced or modified by quenching of the crystals in
liquid nitrogen after heating at 820°C for three hours
in dry oxygen. The resulting absorption centers were
quite anisotropic, and apparently could be observed
only with the electric field in the c direction.

Below 10 kc/sec, both the real part of the dielectric
constant, K′, and the tangent of the loss angle, tan δ,
appeared to be increasing indefinitely with decreasing
frequency, and exceeded the values K′=700 and tan δ
=0.7 at 100 cy/sec. Above about 1 Mc/sec, K′ appeared
to be asymptotic to the value for the defect-free
material, and tan δ fell off to low values.


120 Shepard Roberts, Phys. Rev. 76, 1215 (1949).

121 K. G. Srivastava, Progr. Rept. No. XXI (June, 1957),

Relaxation effects occurred in the frequency range
10 kc/sec to several megacycles per second. The frequency
f_max at which tan δ reached a maximum value (of
approximately 0.75) increased from about 8000 cy/sec
at 27°C, to about 800 kcps at 278°C. These effects
could be annealed out by heating in atmospheric air at
moderate temperatures (about 150°C). Measurements
made below about 40°C indicated that f_max was an
exponential function of reciprocal temperature, and the
data were used to yield an estimated value of 0.18 ev
for the activation energy involved.

These effects were associated with the presence of 
traces of silver and copper (about 0.01%), and were 
absent in crystals free from these impurities. 

Similar phenomena have been observed by
Freymann,¹²² who ascribed the effect to “reorientation of
positive and negative lattice defects.”

(c) Effect of other impurities. Skanavi¹²³,¹²⁴ has
found that small additions of Group II oxides to rutile
cause a large increase in the dielectric constant, of the 
order of 3X 10t. Zerfoss and co-workers™® believed that 
this might be due to the formation of an “irregular 
sandwich of a conducting surface region and a dielectric 
interior with indefinite boundaries,” which would 
produce a large space charge polarization. 

The phenomenon may be explained by the
overlapping of the wave functions of the more closely spaced
impurity sites. Regions of closer than average
impurity separation would be expected to occur at
random. In these regions there would be
chains of atoms forming high-conductivity
paths of limited length. Such a mechanism would
constitute a slight extension of the hypothesis of Zerfoss.

Nicolini¹²⁶ has prepared a form of ceramic “rutile”
with unusual dielectric properties. The material was
prepared by heating “pure” titanium dioxide to 1400°C.
The dielectric constant, which has a value exceeding
10 000 at 10 cy/sec, drops to its normal value of about 
100 at approximately 10° cy/sec. Tané has a maximum 
value of 3.2 at 20 000 cy/sec. 


11.3. Optical Absorption 


At 4°K, “pure” rutile is transparent in the visible
region at wavelengths longer than the sharp absorption
edge at 4100 Å. (See Figs. 10 and 11.) This edge becomes
less sharp at higher temperatures, and extends to
slightly longer wavelengths. The absorption coefficient
rises rapidly at wavelengths longer than 5 μ.¹²⁷,¹²⁸

After heating in hydrogen for 2.5 min at 600°C there
is an increase in optical density, with a very broad
maximum centered at 1.2 μ.¹²⁸


Fig. 10. Optical density of clear rutile (after
Breckenridge and Hosler¹²⁸).


122 M. Freymann and R. Freymann, Compt. rend. 235, 1125
(1952).

123 G. I. Skanavi, J. Exptl. Theoret. Phys. U.S.S.R. 17, 399 (1947).

124 G. I. Skanavi, Doklady Akad. Nauk S.S.S.R. 59, 231 (1948).

125 Zerfoss, Stokes, and Moore, J. Chem. Phys. 16, 1166 (1948).

126 L. Nicolini, Nature 170, 938 (1952).

In the far infrared, Parodi¹²⁹ found transmission lines
at 30.5, 41, and 50 μ.

Breckenridge and Hosler¹⁰¹ attempt to interpret the
4100-Å optical edge (3.02 ev) in terms of the transition
of an electron from a helium-like, neutral oxygen
vacancy (Oᵥ·2Ti³⁺)⁰ or from a (Oᵥ·Ti³⁺)⁺ center to the
Ti conduction band. Their assignment of the energy
levels is based on evidence obtained from
thermodynamics and other grounds. At higher temperatures
some of these vacancies and their associated electrons
are in excited states. As a result, the activation energy
is supposed to be reduced and rendered less sharp.

Following Mott and Gurney¹³⁰ the difference between
the optical and thermal activation energies (3.02 and
0.62 ev, respectively) was explained on the basis of the
difference between the static and optical dielectric
constants.


Fig. 11. Optical density of slightly reduced rutile


127 D. C. Cronemeyer, Phys. Rev. 87, 876 (1952).
Breckenridge and Hosler associate the absorption
band at 1.2 μ in reduced rutile (about 1 ev) with the
value of 0.2 ev for the ionization potential (thermal)
of the postulated donor sites (Oᵥ·Ti³⁺)⁺. Thus the
optical ionization potential of the singly ionized donor
sites was believed to be five times the thermal value,
consistent with the corresponding values assigned to
the ionization potentials of the neutral donor (previous
paragraph).

Support is claimed for this assignment of energy 
levels by experiments on the bleaching of colored rutile 
on cooling to liquid air temperatures mentioned briefly 
in the same paper. 
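
The wavelengths and energies quoted in this section are related
by E = hc/λ. A minimal sketch of the conversion, and of the two
optical-to-thermal ratios discussed above, is given here; the
constant hc ≈ 12 398.4 ev·Å is the standard value, and the
function name is illustrative.

```python
HC_EV_ANGSTROM = 12398.4  # h*c in electron volts times angstroms

def photon_energy_ev(wavelength_angstrom: float) -> float:
    """Photon energy in ev for a wavelength given in angstroms."""
    return HC_EV_ANGSTROM / wavelength_angstrom

edge_ev = photon_energy_ev(4100.0)      # absorption edge
band_ev = photon_energy_ev(12000.0)     # 1.2 micron band
print(round(edge_ev, 2), round(band_ev, 2))

# Ratios of optical to thermal energies quoted in the text.
print(round(3.02 / 0.62, 2), round(1.0 / 0.2, 2))
```

Both ratios come out close to five, which is the consistency
referred to in the preceding paragraph.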


11.4. Color


Weyl and Forland¹³¹ have discussed (i) the color
changes introduced by traces of ions such as Nb⁵⁺,
Ta⁵⁺, Sb⁵⁺, W⁶⁺, which are usually considered colorless,
and by very small traces of Fe³⁺ in rutile; (ii) the
phototropy of rutile (reversible darkening when exposed
to light) which may be obtained by incorporating
certain impurities into rutile, as for example, a
combination of the oxides of iron and niobium; and (iii) the
oxidizing effect which titanium dioxide may exert on
other materials when it is irradiated by ultraviolet
light.

The presence of the Fe³⁺ ion introduces a rose color,
that of the Nb⁵⁺ ion a blue-gray color into rutile. The
blue-gray color is also present in reduced rutile, but
unlike the effect of the Nb⁵⁺, the discoloration of
reduced rutile may be bleached by reoxidation.

Sandler's experiments on the rate of ortho-para
conversion of hydrogen on the surface of rutile indicate
that there is a connection between the surface
paramagnetism and the blue color of lattice defects in
rutile.¹³² The blue color may be due entirely to a
stoichiometric deficiency of oxygen in the rutile. However,
Sandler points out certain inconsistencies which may
indicate that the blue color is associated also with
the presence of impurities.

According to Gebhardt and Herrington,¹³³ the
discoloration of slightly reduced rutile is due to organic
contamination from stopcock grease, etc. Degassing
of a “clean” sample in a “clean” vacuum system
produced no discoloration. However, if either the vacuum
system or the specimen were contaminated, a
discoloration resulted. At temperatures of 300 to 550°C
the loss of oxygen was estimated to be about 10⁻⁸ mole/g
of TiO₂.

When heated in air to high temperatures, powdered
rutile becomes lemon-yellow but returns to its original
whiteness on cooling again to room temperature.¹³⁴ At
900°C there is a maximum in the emission spectrum at
about 4700 Å.


11.5. Refractive Indices


Measurements of the index of refraction (normal ray)
in the visible region were made by Bärwald¹³⁵ and
Schroeder.¹³⁶ Cronemeyer¹²⁷ has obtained good
agreement with the earlier work, and has given the results
of his measurements together with values calculated
from reflection measurements (Liebisch and Rubens¹¹⁸)
and from transmission measurements.

Using prisms cut from single crystals, DeVore¹³⁷ has
measured the refractive indices of rutile in the
wavelength range 4250 to 15 000 Å. DeVore has expressed
his results by means of the formulas


$$n^{2} = 7.197 + 3.322\times10^{7}/(\lambda^{2} - 0.843\times10^{7})$$

for the extraordinary ray and

$$n^{2} = 5.913 + 2.441\times10^{7}/(\lambda^{2} - 0.803\times10^{7})$$


for the ordinary ray. The results are said to be in good
agreement with those of Bärwald.¹³⁵
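
DeVore's two formulas are simple to evaluate. The sketch below
assumes that λ is expressed in angstroms, as the stated
wavelength range and the size of the constants suggest; the
function names are illustrative, and the formulas should not be
used outside the measured range of 4250 to 15 000 Å.

```python
import math

def n_extraordinary(wavelength_angstrom: float) -> float:
    """Extraordinary-ray index of rutile from DeVore's dispersion formula."""
    lam2 = wavelength_angstrom ** 2
    return math.sqrt(7.197 + 3.322e7 / (lam2 - 0.843e7))

def n_ordinary(wavelength_angstrom: float) -> float:
    """Ordinary-ray index of rutile from DeVore's dispersion formula."""
    lam2 = wavelength_angstrom ** 2
    return math.sqrt(5.913 + 2.441e7 / (lam2 - 0.803e7))

# Sodium D line, chosen here only as a familiar reference wavelength.
lam = 5893.0
print(round(n_ordinary(lam), 3), round(n_extraordinary(lam), 3))
print("birefringence:", round(n_extraordinary(lam) - n_ordinary(lam), 3))
```

At this wavelength the formulas give indices of about 2.61 and
2.91, so that the birefringence is close to 0.3.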

Radhakrishnan¹³⁸ has used the method of Bragg to
calculate the refractive indices of rutile. This worker
has also used the measurements of Schroeder¹³⁶ to
derive the values of the constants a, b, c, and λ₀ in the
dispersion formula for rutile


n?—1=a—cro?— br?/ (A2—)o?) 


in which n is the index of refraction at wavelength λ.
Values calculated using this relation agreed with 
observation within the experimental error. 


11.6. Reflection Coefficient


The reflection coefficient of natural rutile crystals
has been measured by Liebisch and Rubens¹¹⁸ in the
range 1.4 to 300 μ. Their results appear in Fig. 12.


12. ADSORPTION AND SURFACE LAYERS


In this section no reference is made to the extensive 
literature concerning adsorption of various gases and 
vapors on rutile powder, with the exception of certain 
experiments which yield direct information concerning 
the defect structure of the surface of rutile. 

From the measured rate of ortho-para conversion of
hydrogen at the surface of rutile (discussed also in
Sec. 11.4), Sandler¹³⁹ has obtained information
concerning the state of adsorbed oxygen. The oxygen
adsorbed on rutile at 90°K is apparently in molecular
form, and the correspondingly high conversion rate is
ascribed to the paramagnetism of this form of oxygen.
On heating the specimen to room temperature,
dissociation of some of the oxygen evidently occurred,
leading to re-oxidation of the surface and a
disappearance of the blue color. Reyerson and Honig¹⁴⁰ have
investigated the diamagnetic susceptibility of NO₂-N₂O₄
adsorbed on rutile. To obtain a constant value for the
gram susceptibility of the adsorbate these workers
found it necessary to assume that the (positive)
susceptibility of the rutile decreased during adsorption.
Thus it seemed reasonable to suppose that the rutile
was initially partially reduced, but became re-oxidized
by the NO₂-N₂O₄. The true value (after re-oxidation)
of the susceptibility of rutile estimated in this way was
0.061×10⁻⁶ cgs units per gram. This interpretation is
strengthened by the observed permanent increase in
weight of the specimen, and by spectroscopic analysis
of the equilibrium gases.


Fig. 12. Reflection coefficient in percent for natural single
crystals of rutile (after Liebisch and Rubens¹¹⁸).


134 G. I. Sinyapkina, Zhur. Eksptl. i Teoret. Fiz. 19, 581 (1949).

135 W. Bärwald, Z. Krist. 7, 168 (1883).

136 A. Schroeder, Z. Krist. 67, 485 (1928).


12.1. Photoconductivity in Surface Layers 


The curves of spectral distribution of
photoconductivity of most semiconductors exhibit a sharp rise in
the vicinity of the absorption edge. As the wavelength
of the incident light decreases, the response reaches a
maximum value, and begins to decrease while the
absorption coefficient is still rising. At small intensities
the photocurrent is usually proportional to the light
intensity at all wavelengths.


139 Y. L. Sandler, J. Phys. Chem. 58, 54 (1954).

On the long wavelength side of the response
maximum, the absorption coefficient is small, and much of
the incident radiation traverses the crystal without
being absorbed. On the short wavelength side, the
penetration depth is small, and most of the absorption
takes place in the surface region, where the electron
recombination velocity†† is high. The wavelength of
maximum response is determined by the relative
importance of these mechanisms. Thus the response
maximum will depend on the thickness of the crystal,
and on the temperature, if the absorption coefficient
is temperature-dependent.

Cronemeyer¹²⁷ has measured the photocurrent of
thin crystals of rutile as a function of the wavelength
of the incident radiation. The response increased
rapidly with the energy of the incident photons in the
region of 2.9 ev (4300 Å). The edge of the
photoconductivity curve is shifted to slightly lower energies as
the temperature is increased from 103 to 295°K.
Because of the high absorption coefficient above 3 ev
(4100 Å), and because of the important role of surface
effects in this region, the interpretation of the results
is difficult for higher photon energies.

At 1.2 u, the response is smaller than the maximum: 
response by a factor of 10°. 


13. BAND STRUCTURE OF RUTILE—CONCLUSIONS 


Estimates of the width of the energy gap from 
measurements of electrical conductivity in the intrinsic 
range involve an assumption that the density-of-states 
effective masses of the valence and conduction bands 
are identical. 

The measurements of electrical conductivity made
in the high temperature range (700 to 1100°C), where
presumably the intrinsic conductivity is measured,
indicate that the forbidden energy gap of rutile is
between 3 and 4 ev. The experimental values obtained
by Earle (3.4 ev), Hauffe (3.9 ev), and Cronemeyer
(3.06 ev) are discussed in another section. Cronemeyer's
investigation of optical absorption and
photoconductivity yields a value of about 2.8 to 3.1 ev.

The sign of the charge carriers in semiconductors is
usually obtained from Hall or thermopower
measurements. Because of the difficulty of measuring the
potential at a point on the surface of a high resistance
specimen, Hall measurements have not been made on
specimens of rutile in the intrinsic range. Measurements
of thermopower on rutile specimens of high purity
indicate that the charge carriers are probably electrons.

The measurements of Breckenridge and Hosler
indicate that the mobility of reduced specimens is
about 0.1 cm² volt⁻¹ sec⁻¹ at 500°K. The value for
intrinsic rutile may be approximately the same. By
combining this value of mobility with the electrical
conductivity measurements of Cronemeyer, one obtains
an effective electron mass ratio of the order of 1000.
However, values of the electron concentration evaluated
from the combined Hall and conductivity data of
Breckenridge and Hosler (see Sec. 8) yield effective
mass ratios of the order of unity.


†† ... at which electrons disappear from th...

An attempt has also been made to discuss the band 
structure of reduced rutile with the inclusion of a donor 
band. The measurements of the magnetic susceptibility 
of rutile containing a stoichiometric deficiency of oxygen 
made by Ehrlich could be explained by the presence of 
such a donor band approximately 0.04 ev in width. 
Assuming the existence of this band (in which the 
mobility is expected to be low, and the effective electron 
mass high), then the donor band is probably 0.05 to 
0.20 ev below a “conduction band” (in which the effec- 
tive mass is considerably lower). The values of the 
energy separation of these bands calculated from the
conductivity, Hall effect, and thermopower
measurements discussed in earlier sections are given after the
authors' names. Electrical conductivity of reduced
rutile: Kataoka and Suzuki (0.12 to 0.15 ev). Thermo- 
power: Kataoka and Suzuki (0 to 0.13 ev). Hall data: 
Breckenridge and Hosler (0.025 to 0.05 ev). Optical 
absorption: An absorption band characteristic of 
reduced rutile was observed by Cronemeyer in the range 
1 to 3 u (0.4 to 1.2 ev). In order to compare these 
energies with the other data, they may have to be 
reduced by a factor of about 5.1 to allow for the 
difference between the optical and statical dielectric 
constants, as discussed in Sec. 11.3. This correction would 
place the donor band about 0.08 to 0.24 ev below the 
conduction band. 
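
The correction described above is a simple division. The sketch
below applies the factor of 5.1 quoted in the text to the
optical range of 0.4 to 1.2 ev; the variable names are
illustrative.

```python
OPTICAL_TO_THERMAL_FACTOR = 5.1  # factor quoted in the text

def thermal_equivalent_ev(optical_ev: float) -> float:
    """Optical energy reduced to a thermal energy by the quoted factor."""
    return optical_ev / OPTICAL_TO_THERMAL_FACTOR

low = thermal_equivalent_ev(0.4)
high = thermal_equivalent_ev(1.2)
print(round(low, 2), round(high, 2))
```

The reduced range overlaps the separations of 0.05 to 0.20 ev
inferred from the electrical measurements.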

The evidence for the presence of a donor band is not 
conclusive. Consequently, the data have also been 
discussed below in terms of noninteracting sites. On 
the basis of the simple model and the assumption of 
noninteracting donor sites, the electrical conductivity 
measurements lead to the values of donor ionization 
potentials: Kataoka and Suzuki (0.29 to 0.35 ev), 
Boltaks (less than 0.2 ev). Thermopower: Kataoka and 
Suzuki (0.02 to 0.13 ev); Henisch (0.09 ev). Hall 
measurements: Breckenridge and Hosler (0.10 to 0.15 ev).


14. THERMODYNAMIC PROPERTIES
AND REDUCTION 


14.1. Thermodynamic Properties 


rmodynamic properties, free energy of for- 

Ber ee o e various oxides (and many other 
unds of titanium) are fully listed in the publi- 

aed comp A The Properties of Titanium Compounds and 
So Tacha Substances, ONR Report ACR-17 (October, 
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Humphrey,“ Mixter,"? Sieverts and Gotta," Roth 
and Becker,“ Neumann ef al., Roth and Wolf," 
Foëx, " Kelley,“8 Shomate,"® Naylor,!° and others. 
Brewer’! has prepared a review of the thermo- 
dynamic properties of metallic oxides, including the 
various oxides of titanium. These properties enable one 
to estimate an upper limit for the free energy of for- 
mation of an oxygen vacancy in rutile. We are chiefly 
interested in departures from stoichiometry so small 
that the rutile structure is maintained and lower oxides 
such as Ti₃O₅ are not present as separate phases. 


14.2. The Reduction of Rutile 


X-ray studies show that the rutile structure is 
maintained even for a weight loss as great as two 
percent (composition TiO₁.₉₀). When the weight loss 
is somewhat greater than this value, the lower oxide 
Ti₃O₅ (or Ti₂O₃) forms as a separate phase, depending 
on the temperature. 

The pressure of oxygen in equilibrium with titanium 
oxide decreases with the degree of reduction, the 
decrease being most rapid when the composition is 
close to the relative oxygen content represented by the 
formulas Ti₂O₃, Ti₃O₅, and TiO₂. For values of weight 
loss up to two percent, the degree of reduction may 
apparently be controlled by heating the specimen of 
rutile in oxygen at reduced pressure, in a vacuum 
system free from contamination by stopcock grease. 
For high degrees of reduction, the equilibrium oxygen 
pressure is inconveniently low for the use of this simple 
method, and it becomes necessary to reduce the speci- 
men by chemical means. 

The oxygen pressure may be established at the desired 
low value by heating the specimen in a mixture of 
hydrogen and water vapor of known composition. The 
equilibrium constant of this reaction is known with 
accuracy, and the equilibrium oxygen pressure may 
be calculated from the following equation: 


$$\log K_p = \log\left[p_{\mathrm{O_2}}\,(p_{\mathrm{H_2}}/p_{\mathrm{H_2O}})^2\right] = 5.9 - 25\,900/T.$$
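
As a worked illustration of the relation above, the following minimal sketch solves it for the equilibrium oxygen pressure at a given temperature and H₂O/H₂ ratio. The function name is hypothetical, and the assumption that pressures are expressed in atmospheres is ours rather than the text's; the constants 5.9 and 25 900 are taken from the equation as printed.

```python
import math


def log10_p_o2(temp_k: float, ratio_h2o_to_h2: float) -> float:
    """Return log10 of the equilibrium O2 pressure (assumed to be in atm).

    Uses log Kp = log[p_O2 (p_H2/p_H2O)^2] = 5.9 - 25900/T, so that
    log p_O2 = log Kp + 2 log(p_H2O/p_H2).
    """
    log_kp = 5.9 - 25900.0 / temp_k
    return log_kp + 2.0 * math.log10(ratio_h2o_to_h2)


if __name__ == "__main__":
    # Illustrative inputs only: 1200 K and a gas mixture with p_H2O/p_H2 = 0.01.
    print(log10_p_o2(1200.0, 0.01))
```

A tenfold change in the H₂O/H₂ ratio shifts the oxygen pressure by two decades, which is why the gas composition must be known accurately.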


Alternatively, the CO-CO₂ reaction may be used to 
establish the desired oxygen pressure around the 
specimen. A third method involves the heating of 
the titanium oxide in a sealed container with some 
pieces of a metal which forms a very stable oxide. 
Experiments in which these methods have been used 
are discussed in the following. 


14.2.1. Heating in Oxygen at Reduced Pressure 


Assayag et al. have measured the equilibrium 
oxygen pressure and the corresponding weight loss 
when rutile is heated in oxygen to temperatures above 
1000°C.†† Little information was given concerning the 
vacuum system used. Because of the possibility of 
contamination by stopcock grease, and the presence of 
CO in the vacuum system, the results should be 
accepted with caution. 

Gebhardt and Herrington obtained the usual discoloration 
when rutile was degassed for sixteen hours 
at 400 to 600°C in a conventional vacuum system, using 
greased stopcocks. However, when a sample was 
cleaned by heating in oxygen before degassing in a 
clean system, this discoloration was not obtained. 
These workers believe that the discoloration results 
from contamination of the rutile by organic impurities 
originating in the stopcock grease. The “clean” vacuum 
system made use of a mercury diffusion pump and 
mercury cutoffs instead of stopcocks. The conclusions 
were substantiated by mass spectrometric analysis and 
the detection of CO and CO₂ in the vacuum system. 


14.2.2. Use of the CO-CO₂ Equilibrium 


Assayag et al. have used the equilibrium between 
CO and CO₂ to control the degree of reduction of rutile 
corresponding to compositions in the range TiO₁.₉₀₅ to 
TiO₁.₉₇₇ (i.e., 0.4% to 2% loss in weight). They have 
also shown by x-ray methods that this material retained 
the rutile structure. 


14.2.3. Use of the Hydrogen-Water Vapor Equilibrium 


Nasu has used the hydrogen-water vapor 
equilibrium in the reduction of rutile in the tempera- 
ture range 1022 to 1282°K. A specimen of powdered 
rutile was placed in a platinum crucible and suspended 
in an iron reaction chamber at the end of a quartz 
spring. The approach to equilibrium could be deter- 
mined by the change in weight. A mixture of hydrogen 
and water vapor of unstated composition was intro- 
duced to the specimen. Later the composition of a 
sample of the equilibrium mixture was determined by 
measuring its pressure with a McLeod gauge both 
before and after the removal of the water vapor by 
condensation with liquid nitrogen, and the ratio of the 
partial pressures in the equilibrium mixture was then 
calculated. 

†† In the temperature range T=1318 to 1531°K, and p=10⁻⁷ 
to 10⁻⁴ atmos, the results of these workers may be expressed very 
roughly by the following expression for the fractional weight loss f. 
log f = − log p(O₂) − 5000/T − 0.10. 

Nasu found that the equilibrium oxygen pressure 
remained constant from one percent to about 
9.7% reduction, at which point its value fell abruptly. 
The absence of variation in the equilibrium constant in 
this range of reduction seems to indicate that the lower 
oxide existed as a separate phase, and the value of 
9.7% corresponds closely to the loss in weight corresponding 
to the formation of Ti₂O₃. These results seem 
to set an upper limit of one percent to the weight loss 
which rutile can sustain at these temperatures without 
leading to the presence of aggregates of a different 
phase, i.e., when speaking of slightly reduced rutile, a 
weight loss of less than one percent is implied. 
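
The weight-loss figures quoted in this section can be related to composition by simple stoichiometry. The sketch below is an illustration only: the function names are hypothetical, modern atomic weights are assumed, and it is assumed that the weight loss is due entirely to removal of oxygen from TiO₂.

```python
M_TI = 47.867  # atomic weight of titanium (modern value, an assumption here)
M_O = 15.999   # atomic weight of oxygen (modern value, an assumption here)


def weight_loss_fraction(x: float) -> float:
    """Fractional weight loss when TiO2 is reduced to TiOx by oxygen removal."""
    return (2.0 - x) * M_O / (M_TI + 2.0 * M_O)


def composition_from_weight_loss(f: float) -> float:
    """Oxygen index x in TiOx for a fractional weight loss f of TiO2."""
    return 2.0 - f * (M_TI + 2.0 * M_O) / M_O


if __name__ == "__main__":
    print(composition_from_weight_loss(0.02))  # two percent weight loss
    print(weight_loss_fraction(1.5))           # full conversion to Ti2O3
```

With these assumptions a two percent loss corresponds to about TiO₁.₉₀, and complete conversion to Ti₂O₃ to a loss of about ten percent, close to the 9.7% break reported by Nasu.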

Unfortunately, the equilibrium ratios calculated from 
more recent calorimetric and free energy data are con- 
siderably higher than Nasu’s values. Moreover, as 
Kelley has pointed out, Nasu’s high-temperature data 
correspond exactly to the p(H₂O)/p(H₂) ratio for the FeO-Fe 
system, which fact may be related to Nasu’s use of an 
iron reaction chamber. 


14.2.4. Use of the Alkaline Earth Metals 


Kubaschewski and Dench have investigated the 
partial free energy of the titanium-oxygen system at 
1000 and 1200°C between the limits of zero and 38% 
oxygen by weight. A quantity of barium, calcium, or 
magnesium was mixed with a quantity of one or other 
of the oxides of titanium. The mixture was sealed in a 
steel container and fired at the desired temperature. 
After cooling, the alkaline earth oxide was dissolved 
with hydrochloric acid and the remaining powder was 
analyzed for O, N, H, Fe, Ba, Ca, or Mg, and Ti. The 
equilibrium oxygen pressure was assumed to be that 
of the alkaline earth oxide at the temperature of the 
reaction. From these data a plot was made of partial 
free energy against composition of the oxide. In this 
paper, the effect of the formation of alkaline earth 
titanate was not discussed. The pressure of oxygen in 
equilibrium with a quantity of alkaline earth titanate 
is probably lower than that in equilibrium with the 
alkaline earth oxide. 


14.2.5. Other Methods 


Ehrlich prepared specimens of TiOₓ with x greater 
than 1.97 by heating appropriate mixtures of rutile and 
titanium metal or TiO to 1500-1600°C. Wyss investigated 
the reduction of rutile by hydrogen and 
reported that at 1575°K, TiO; is formed. The absence 
of further reduction under these conditions seems to 
indicate, as pointed out by Brewer," that TiO; is 
s by hydrogen, and by alcohols. Freundlich 
and Bichara have used CaH₂ to reduce rutile. 


E. Welch, 120th Meeting of the American Chemical 
Society, New York (September, 1951). 
A. Komarov, Uchenye Zapiski Leningrad Univ. No. 169, 
Khim. Nauk No. 13, 29-35 (1953). 

. Freundlich and M. Bichara, Compt. rend. 238, 1324 
In his paper “Reduction of metallic oxides by carbon 
in boric oxide and borax melts,” Khundkar reports 
that rutile is not reduced by active carbon at 1000°C. 
REVIEWS OF MODERN PHYSICS 


Comparison of the Debye θ Determined from 
Elastic Constants and Calorimetry 


INTRODUCTION 


As the temperature of a solid approaches 0°K the 
simple Debye continuum model should be 
applicable and the lattice specific heat calculated from 
the elastic constants using this model should agree with 
that measured by calorimetric techniques. Now that 
accurate low-temperature specific heat and elastic 
constant data are available, it is possible to compare 
critically the values of the Debye characteristic temperature 
θ, which these two methods yield. Usually, the 
calorimetrically determined θ is obtained from the slope 
of the straight line in a plot of C/T versus T² where C 
is the measured specific heat and T the absolute 
temperature. Such a plot separates the contribution 
of the electrons and is a straight line only if the lattice 
specific heat varies with T³ as predicted by the Debye 
model. To determine θ from the elastic constants 
requires the calculation of the three sound wave 
velocities as a function of direction in the crystal lattice 
and then the averaging of the reciprocal of each velocity 
cubed over all directions. It is possible to do this by 
hand for hexagonal materials, but cubic materials 
require a long and elaborate calculation which is 
reasonable only if an electronic computer is available. 
Without a computer, the tables prepared by de Launay¹ 
or the approximate methods developed by Quimby 
and Sutton,² and by Betts et al.³ may be used. 

Various models have been proposed to determine the 
lattice specific heat in the temperature range where 
the Debye model is expected to become inapplicable. 
These models consider a discrete lattice rather than a 
continuum and use measured elastic constants to define 
the effective force constants which act upon any given 
lattice point. With these force constants, the dynamics 
of the entire lattice can be worked out and the specific 
heat calculated. 

This paper evaluates the different methods of calcu- 
lating @ from the elastic constants and compares the 
resulting values with the results of calorimetric 
measurements. 


DETERMINATION OF θ₀ 


The necessary data for calculating the value of 
Debye θ at 0°K, hereafter denoted by θ₀, are shown in 
Table I. In this table, the low-temperature elastic 


1 J. de Launay, Solid State Physics (Academic Press, Inc., New 
York, 1956), Vol. 2, p. 286. 

2 S. L. Quimby and P. M. Sutton, Phys. Rev. 91, 1122 (1953); 
P. M. Sutton, Phys. Rev. 99, 1826 (1955). 

3 Betts, Bhatia, and Wyman, Phys. Rev. 104, 37 (1956). 
constants of copper,⁴ silver,⁵ gold,⁵ zinc,⁶ magnesium,⁷ 
and lithium fluoride⁸ are extrapolated from measurements 
performed at 4.2°K while those of aluminum,⁹ 
lithium,¹⁰ sodium chloride,¹¹ and potassium bromide¹² 
are extrapolated from measurements near 77°K. The 
0°K atomic volumes and densities are calculated from 
the lattice constants appearing in the ASTM card 
catalog and the following thermal expansion measurements: 
Nix and McNair¹³ for the noble metals and 
aluminum, Pearson¹⁴ for lithium, Grüneisen and Goens¹⁵ 
for zinc, and Henglein¹⁶ for potassium bromide and 
sodium chloride. The data for magnesium and lithium 
fluoride are taken from the papers cited. 

Table II lists values of θ₀ as calculated from the data 
in Table I by various methods along with the results of 
calorimetric measurements. The first column shows the 
θ₀ obtained from de Launay’s tables¹ using case 2 and 
the Stirling central difference interpolation formula¹⁷ 
in two dimensions. A graphical interpolation may cause 
the results to differ by about 0.3%. These tables are 

TABLE I. Constants of certain materials extrapolated to absolute 
zero. The elastic constants are in units of 10¹² dyne/cm², Ω is the 
atomic volume and ρ the density at 0°K. 

                                                    Ω⁻¹           ρ
            C₁₁     C₁₂     C₄₄     C₁₃     C₃₃     (10²² cm⁻³)   (g cm⁻³)
Elastic constants extrapolated from 4.2°K 
Copper      1.7620  1.2494  0.8177                  8.553         9.018
Silver      1.3149  0.9733  0.5109                  5.938         10.635
Gold        2.0163  1.6967  0.4544                  5.953         19.488
LiF         1.246   0.424   0.649                   2.292         2.646
Magnesium   0.6348  0.2594  0.1842  0.2170  0.6645  4.407         1.779
Zinc        1.7909  0.375   0.4595  0.5537  0.6880  6.708         7.2719
            1.7696  0.3480  0.4589  0.528   0.6848
Elastic constants extrapolated from the 77°K region 
Aluminum    1.230   0.708   0.3090
Lithium     0.1574  0.1333  0.1158
NaCl        0.5750  0.0986  0.1327
KBr         0.418   0.056   0.052


4 W. C. Overton and J. Gaffney, Phys. Rev. 98, 969 (1955). 
5 J. R. Neighbours and G. A. Alers, Phys. Rev. 111, 707 (1958). 
6 The first set of constants are from G. A. Alers and J. R. 
Neighbours, J. Phys. Chem. Solids 7, 58 (1958). The second 
are the recent measurements by C. W. Garland and R. 
Phys. Rev. 111, 1232 (1958). 
7 L. J. Slutsky and C. W. Garland, Phys. Rev. 107, 9 
8 C. V. Briscoe and C. F. Squire, Phys. Rev. 106, 117 
9 P. M. Sutton, Phys. Rev. 91, 816 (1953). 
10 H. C. Nash and C. S. Smith, Bull. Am. Phys. Soc. Ser. II, 
11 W. C. Overton and R. T. Swim, Phys. Rev. 84, 758 (1951). 
12 J. K. Galt, Phys. Rev. 73, 1460 (1948). 
13 F. C. Nix and D. McNair, Phys. Rev. 60, 597 (1941); Phys. 
Rev. 61, 74 (1942). 
14 W. B. Pearson, Can. J. Phys. 32, 708 (1954). 
15 E. Grüneisen and E. Goens, Z. Physik 29, 
16 A. Henglein, Z. physik. Chem. 11 
17 J. de Launay (private communication). 
TABLE II. Debye temperature at zero degrees Kelvin. 

                        Quimby               Numerical
                        and        Betts     integra-    Calori-   Refer-
            de Launay   Sutton     et al.    tion        metric    ence
Elastic constants extrapolated from 4.2°K 
                                                         343.8     a
Copper      344.5       344.4      345.4     344.4       346.7     b
                                                         345.1     c
                                                         259       a
Silver      226.4       226.6      227.1     226.4       226.5     b
                                                         226.2     d
                                                         164.6     a
Gold        161.7       162.1      162.2     161.6       164.8     b
                                                         164.6     e
LiF         734.4       734.6      734       734.1       737       f
                                                         743       g
Magnesium 385.4 385.8 406 h 
403.3 i 
Zinc ifs Hye 302) j 
328.8 309) 
Elastic constants extrapolated from 77°K 375 
75 l 
Aluminum 427.4 428.5* 428.7 428.2 408} 
419 m 
Lithium 334.6 338.4 335.9 369 n 
KBr 171.7 172.8 174 o 
NaCl 321.6 321 321.9 320 p 


a See reference 19. 
b See reference 30. 
c J. A. Rayne, Australian J. Phys. 9, 189 (1956). 
d J. A. Rayne, Proc. Phys. Soc. (London) 69, 482 (1956). 
e See reference 29. 
f See reference 20. 
g T. H. K. Barron and J. A. Morrison, Can. J. Phys. 35, 799 (1957). 
h See reference 21. 
i See reference 28. 
j See reference 22. 
k The calculated value listed in reference 9 is incorrect. The corrected 
value calculated by Dr. Sutton is listed above. 
l See reference 23. 
m See reference 24. 
n See reference 25. 
o See reference 26. 
p See reference 27. 


based on a computer calculation and the values calcu- 
lated should be exact except for errors introduced by 
interpolation between the tabulated values. Since the 
published tables apply only to cubic materials with 
(C₁₁−C₁₂) < 2C₄₄ there are no entries for potassium 
bromide and sodium chloride nor for the hexagonal 
metals. The second column gives the results obtained 
from approximating the velocity function with a six-term 
power series² and the third column shows the 
results obtained if a six-term expansion in cubic 
harmonics³ is used. The latter expansion technique has 
been applied to hexagonal materials by using spherical 
harmonics¹⁸ and was used to complete column three for 
zinc and magnesium. 
The IBM 650 digital computer of the Ford Motor 
Company Research Department was programed to 
find the average velocity function using 120 points 
spread uniformly over the unit triangle for cubic 
materials and using a one degree interval for the single 
variable in hexagonal materials. The resulting 




values of θ₀ are given in the column headed “Numerical 
integration.” The two entries for zinc are calculated 
from the two available sets of 4.2°K elastic constants⁶ 
in Table I. Increasing the number of points considered 
in the unit triangle to 933 increased the value of θ₀ for 
copper by 0.17°K or 0.06% and decreased the θ₀ value 
for lithium by 0.49°K or 0.15%. Since the values from 
a 120-point calculation differ so slightly from the results 
of the longer calculation, and since these differences are 
much less than the errors expected from errors in the 
elastic constants, it is felt that the 120-point calculation 
with its consequent saving in computer time is sufficiently 
accurate. 
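
The numerical integration described above can be sketched for a cubic crystal as follows. This is not the IBM 650 program: it is a minimal illustration that solves the Christoffel equation for the three sound velocities in each direction and averages the inverse cubed velocities over the whole sphere by Gauss-Legendre quadrature, rather than over 120 points of the unit triangle. The function name, the quadrature orders, and the use of modern values of the Planck and Boltzmann constants are assumptions of this sketch, as is the reading of the Ω⁻¹ column of Table I as atoms per cm³ in units of 10²².

```python
import numpy as np

H_PLANCK = 6.62607015e-27  # erg s (CGS)
K_BOLTZ = 1.380649e-16     # erg/K (CGS)


def debye_theta_cubic(c11, c12, c44, rho, atoms_per_cm3, n_mu=48, n_phi=96):
    """Elastic Debye temperature (deg K) of a cubic crystal.

    c11, c12, c44 are in dyne/cm^2, rho in g/cm^3, atoms_per_cm3 in cm^-3.
    """
    # Gauss-Legendre nodes in cos(polar angle), midpoint rule in azimuth.
    mu, weights = np.polynomial.legendre.leggauss(n_mu)
    phi = (np.arange(n_phi) + 0.5) * (2.0 * np.pi / n_phi)
    mu_grid, phi_grid = np.meshgrid(mu, phi, indexing="ij")
    sin_grid = np.sqrt(1.0 - mu_grid**2)
    # Unit propagation vectors, shape (n_mu, n_phi, 3).
    n = np.stack([sin_grid * np.cos(phi_grid),
                  sin_grid * np.sin(phi_grid),
                  mu_grid], axis=-1)

    # Christoffel matrix for cubic symmetry:
    #   off-diagonal (c12 + c44) n_i n_j, diagonal c11 n_i^2 + c44 (1 - n_i^2).
    gamma = (c12 + c44) * n[..., :, None] * n[..., None, :]
    i = np.arange(3)
    gamma[..., i, i] = c11 * n**2 + c44 * (1.0 - n**2)

    # Eigenvalues are rho * v^2 for the three acoustic branches.
    rho_v2 = np.linalg.eigvalsh(gamma)
    inv_v3 = (rho_v2 / rho) ** -1.5

    # Directional average of 1/v^3, summed over the three branches.
    branch_sum = np.sum(weights[:, None, None] * inv_v3) / (2.0 * n_phi)
    v_mean = (branch_sum / 3.0) ** (-1.0 / 3.0)
    prefactor = (3.0 * atoms_per_cm3 / (4.0 * np.pi)) ** (1.0 / 3.0)
    return (H_PLANCK / K_BOLTZ) * prefactor * v_mean


if __name__ == "__main__":
    # Copper entries of Table I (elastic constants in 10^12 dyne/cm^2).
    print(debye_theta_cubic(1.7620e12, 1.2494e12, 0.8177e12, 9.018, 8.553e22))
```

For the copper entries of Table I the sketch should land close to the 344.4°K listed under “Numerical integration” in Table II; small differences from the choice of physical constants and quadrature are to be expected.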

Table II shows that all the foregoing methods for 
calculating θ₀ from elastic constants are essentially 
in agreement. In every case the differences are less than 
one percent. The columns headed “de Launay” and 
“Numerical integration” are results of essentially the 
same calculation and should agree closely, as they do. 
The differences between these columns are indicative 
of the errors introduced by interpolation and by the 
method of numerical integration. The large difference 
apparent for aluminum is presumably due to our slight 
extrapolation of de Launay’s tables to cover the 
particular values of elastic constants for this metal. For 
lithium, the large difference (1.3°K) is not due to the 
inadequacy of the 120-point division of the unit triangle, 
but to some other unknown cause. 

Roughly, the ease of calculation of θ₀ by any one 
method is in the same order as the columns in Table II. 
If one uses graphical interpolation, the method of 
de Launay is outstanding in its simplicity. Although the 
two series methods are about equally involved, it is 
our experience that the value of θ₀ resulting from the 
method of Betts and co-workers is apt to be more 
nearly mistake-free, probably because the individual 
steps can be checked separately. 

Since some cubic materials and all hexagonal ma- 
terials fall outside the range of de Launay’s published 
tables and since the series methods are probably 
poorer approximations, we take the column headed 
“Numerical integration” to be the best values of θ₀ 
calculated from elastic constants. 

The uncertainty of these values is determined by the 
absolute accuracy of the elastic constants. With the 
computer program set up, it is a simple matter to determine 
how sensitive the final θ₀ value is to errors in 
each individual constant. Table III shows the values of 
θ₀ for aluminum (low anisotropy) and for gold (average 


TABLE III. Changes in Debye temperature with 
changes in elastic constants. 

            Numerical
            integration  1.005 C₁₁  1.005 C₁₂  1.005 C₄₄  1.01 C₁₃
Aluminum    428.2        429.2      427.7      428.8
Gold        161.6        162.7      160.7      161.8
Zinc        327.1                                         327.3




anisotropy) which are obtained by numerical inte- 
gration if only one elastic constant is increased by 0.5%. 
A 0.5% change in elastic constants causes θ₀ to change 
by less than 1%. Also listed is the value obtained for 
zinc when the elastic constant C₁₃ is increased by 1%. 
This particular constant is difficult to measure accurately, 
so it is encouraging to observe that θ₀ is not 
very sensitive to its uncertainties. From pulsed ultrasonic 
measurements, the absolute values of the measured 
elastic constants are estimated to be in error by about 
one-half percent. Since the errors in θ₀ are about the 
same size as the errors in the elastic constants, the 
calculated Debye temperature may be expected to be 
accurate to about one percent. This conclusion is 
supported by the fact that the θ₀’s calculated from the 
two sets of low-temperature elastic constant data 
available on zinc differ by only 0.7%. 
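
A simple consistency check on the cubic entries of Table III is possible. At fixed density every sound velocity scales as the square root of the elastic constants, so θ₀ is homogeneous of degree one-half in them; to first order, the three single-constant shifts for a 0.5% increase should therefore add up to about 0.25% of θ₀. The sketch below applies this check, which is ours and is not made in the text; the function names are hypothetical and the numbers are those of Table III.

```python
def summed_shift(theta0, perturbed_values):
    """Sum of the changes in theta0 when each constant is raised in turn."""
    return sum(value - theta0 for value in perturbed_values)


def expected_uniform_shift(theta0, fractional_change):
    """Change in theta0 if all elastic constants are raised together."""
    return theta0 * ((1.0 + fractional_change) ** 0.5 - 1.0)


# (numerical integration value, values for 1.005 C11, 1.005 C12, 1.005 C44)
TABLE_III_CUBIC = {
    "Aluminum": (428.2, (429.2, 427.7, 428.8)),
    "Gold": (161.6, (162.7, 160.7, 161.8)),
}

if __name__ == "__main__":
    for name, (theta0, perturbed) in TABLE_III_CUBIC.items():
        print(name, summed_shift(theta0, perturbed),
              expected_uniform_shift(theta0, 0.005))
```

Within the rounding of the tabulated values, both metals satisfy the check.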


COMPARISON OF θ₀ WITH CALORIMETRIC RESULTS 


The last column in Table II gives the values of θ₀ 
determined from specific heat measurements¹⁹⁻²⁷ in the 
1 to 4°K temperature range. Included are several as yet 
unpublished results²⁸,²⁹ as well as a new calculation,³⁰ 
using the 1955 temperature scale, of data already in the 
literature. These values have been obtained by fitting 
the data on a C/T versus T² plot with a straight line 
by the method of least squares. The slope of this line 
defines the calorimetric θ₀. Small differences in the 
numerical value of the calorimetric and elastic θ₀’s may 
arise simply because the elastic constant line is not the 
best least squares fit to the data. Thus the best way 
to compare the two results is to plot the calorimetric 
data and superimpose on it the line defined by the 
elastic constants. Since only the slope of the line is 
defined, it must be forced to pass through the calori- 
metric data at some point and disagreement is then 
apparent if the line and the data diverge. 
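
The least squares procedure described above can be sketched as follows. The sketch assumes the standard Debye low-temperature law, in which the lattice term is (12π⁴R/5)(T/θ₀)³ per mole of atoms, and a heat capacity expressed in joules per mole per degree; the function name and the synthetic data in the usage lines are illustrative only and are not taken from any of the cited measurements.

```python
import numpy as np

R_GAS = 8.314462618  # J mol^-1 K^-1


def calorimetric_theta0(temps_k, heat_capacity):
    """Fit C/T = gamma + beta*T^2 and convert the slope beta to theta0.

    heat_capacity is in J mol^-1 K^-1 per mole of atoms.
    Returns (theta0 in deg K, gamma in J mol^-1 K^-2).
    """
    t = np.asarray(temps_k, dtype=float)
    c = np.asarray(heat_capacity, dtype=float)
    # np.polyfit returns the highest power first: [slope, intercept].
    beta, gamma = np.polyfit(t**2, c / t, 1)
    theta0 = (12.0 * np.pi**4 * R_GAS / (5.0 * beta)) ** (1.0 / 3.0)
    return theta0, gamma


if __name__ == "__main__":
    # Synthetic, noise-free data generated from assumed values.
    t = np.linspace(1.0, 4.0, 20)
    c = 7.0e-4 * t + (12.0 * np.pi**4 * R_GAS / 5.0) * (t / 345.0) ** 3
    print(calorimetric_theta0(t, c))
```

Because θ₀ enters through the cube root of the slope, a given fractional error in the fitted slope produces only one-third of that fractional error in θ₀.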

Rather than plot a separate C/T versus T? graph for 
each material listed in Table II, the calorimetric data 
have been normalized to allow several sets of data to be 
presented in one graph. This is shown in Figs. 1 and 2 

19 Corak, Garfunkel, Satterthwaite, and Wexler, Phys. Rev. 98, 
1699 (1955). 

20 D. L. Martin, Phil. Mag. 46, 751 (1955). 

21 P. L. Smith, Phil. Mag. 46, 744 (1955). 

22 G. Seidel and P. H. Keesom, Bull. Am. Phys. Soc. Ser. II, 3, 
17 (1958). By assuming that θ is temperature dependent between 
1 and 4°, Seidel finds that a least squares fit to his data would 
give θ₀=309°K. We are indebted to Mr. Seidel for communicating 
this result to us. See also Phys. Rev. 112, 1083 (1958). 

23 Howling, Mendoza, and Zimmerman, Proc. Roy. Soc. 
(London) A229, 86 (1955). 

24 J. A. Kok and W. H. Keesom, Physica 4, 835 (1937). 

25 L. M. Roberts, Proc. Phys. Soc. (London) B70, 744 (1957). 

26 Barron, Berg, and Morrison, Proc. Roy. Soc. (London) 242, 
472 (1957). 

27 Morrison, Patterson, and Dugdale, Can. J. Chem. 33, 375 

28 J. Rayne, J. Phys. Chem. Solids (to be published). 

29 J. E. Zimmerman (unpublished data, 1958). 

30 J. Skalyo and A. Arrott (to be published). These results are 
based on the data of reference 19 but are recalculated using the 
1955 temperature scale and a least squares fit to the data. 

where Cθ/T is plotted against (T/θ)². On such a 
plot vertical lines are lines of constant T/θ. If the 
specific heat data are described by the sum of a 
linear and a cubic temperature term then, in a plot like 
Fig. 1, the data will define a straight line with an 
intercept of γθ (where γ is the coefficient of the linear 
term) and a slope determined by the ratio of the 
normalizing θ to the θ which best fits the data. In 
constructing these figures, the values of normalizing θ 
were taken from the numerical integration column of 
Table II. The lines are drawn with the slope predicted 
by the Debye theory and so as to fit best the lower end 
of the data. We have chosen not to plot data in which 
there are good reasons for the large and clear differences 
between the calorimetric and elastic θ₀’s. These are 
sodium chloride,²⁷ aluminum,²⁴ lithium,²⁵ and potassium 
bromide.²⁶ For sodium chloride and aluminum 
the specific heat data are quite scattered and can 
easily be described with the elastic θ₀. For lithium the 
lines defined by the elastic data and the calorimetric 
data diverge completely, but this is probably owing to 
the presence of a crystal structure change³¹ between 
77°K where the elastic constants were measured and 
4°K where the specific heat was measured. In the case 
of potassium bromide the calorimetric measurements 
extend only to temperatures of the order of θ/50 which 
is not low enough for a valid comparison. 
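
The statement about the intercept and slope of the normalized plot follows in one line. Here the normalizing value is written θ_n and the best-fit calorimetric value θ_c, symbols introduced only for this illustration, and the standard Debye low-temperature coefficient 12π⁴R/5 per mole of atoms is assumed:

```latex
C = \gamma T + \frac{12\pi^{4}R}{5}\left(\frac{T}{\theta_c}\right)^{3}
\quad\Longrightarrow\quad
\frac{C\,\theta_n}{T}
  = \gamma\,\theta_n
  + \frac{12\pi^{4}R}{5}\left(\frac{\theta_n}{\theta_c}\right)^{3}
    \left(\frac{T}{\theta_n}\right)^{2}.
```

When θ_c equals θ_n the slope reduces to 12π⁴R/5 for every material, which is what allows several sets of data to share one graph.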

Figure 1 shows the plot of Cθ/T versus (T/θ)² for the 
specific heat measurements of the noble metals¹⁹ and 
lithium fluoride.²⁰ It is apparent that the elastic θ₀ 
defines a straight line which describes the calorimetric 
data for copper, silver, and lithium fluoride to within 


FIG. 1. Comparison of the specific heat calculated from Debye 
theory (solid lines) with the measured specific heats of cubic 
materials. 


31 C. S. Barrett, Acta Cryst. 9, 671 (1956). 
the experimental scatter of the data. For gold, the 
elastic line is clearly not the best straight line fit to the 
data. This difference is just outside the experimental 
errors arising from elastic constant errors. It may be 
that simple Debye theory is not appropriate over the 
whole temperature range given in the figure in which 
case the elastic constant line should fit the data only 
below a certain temperature. From this point of view, 
it may be concluded that the elastic θ₀ and Debye 
theory successfully describe the calorimetric data 
below temperatures of approximately θ/60. 

The disagreement between calorimetric and elastic 
θ₀’s is more explicit for the hexagonal metals since both 
cases examined show large deviations. Figure 2 shows 
a Cθ/T versus (T/θ)² plot of the specific heat measurements 
on magnesium²¹ and zinc.²² Superimposed on 
these data are the straight lines predicted by the elastic 
constant measurements with the lines chosen to pass 
through the data at the low-temperature end. It is 
obvious that the line based on elastic data is not the 
best straight-line fit to the data and that a good 
description of the data occurs only below temperatures 
of about θ/150 where we have required it to fit. In 
these cases, the elastic constants would have to be in 
error by about 4% in order to account for the difference 
in results. Such a difference is far outside experimental 
error for all the constants except C₁₃. Table III shows 
that if this one constant is changed by 1%, the value 
of θ₀ is only changed by 0.1% which makes it unlikely 
that the inaccuracy of C₁₃ could account for the discrepancy 
in θ₀ values. It may be argued that the true 
Debye region has not been reached for these hexagonal 
metals even though the temperature is below θ/100. 
However, the magnesium specific heat data do fall 
on a straight line which indicates that they can be well 
described with a T³ dependence. The zinc data deviate 
slightly from a T³ law and Seidel²² has made a least 
squares fit to his heat capacity data assuming a temperature 
dependent θ. From this he extrapolates to T=0°K 
to obtain a value of θ₀=309°K which is still 5% below 
the elastic constant value. 


TEMPERATURE DEPENDENCE OF θ 


The previous section has shown the degree to which 
Debye theory describes experimental specific heat 
measurements near absolute zero. In order to calculate 
the lattice specific heat above T=0 using elastic 
constant data, it is necessary to modify Debye theory 
to include effects which arise because the solid is not 
a continuum but a lattice of discrete atoms. This 
requires considering a model in which the solid consists 
of mass points held in equilibrium by forces which are 
expressible in terms of the elastic constants. For 
convenience, the results of these modifications are 
usually expressed in terms of an effective θ which can 
be compared with the Debye θ. This modified θ is 
expected to vary with temperature and its temperature 
dependence can be calculated assuming a particular 
force constant model. From this, the lattice specific 
heat can be determined and compared to the calori- 
metric measurements in order to judge the validity of 
the model chosen. It is also possible to calculate a 
calorimetric θ from specific heat measurements and 
compare it with the modified θ predicted by the lattice 
theory. But unless very precise calorimetric measurements 
are used, the resulting values of θ may show 
considerable scatter and make a judgment of the 
theoretical model very difficult. We have chosen to 
compare the specific heat values themselves since these 
are of more technical interest and since it is then 
obvious how much error can be expected from using 
elastic constant models to estimate lattice specific 
heats. 

Figure 3 shows the experimental data for copper,³² 
silver,³³ and gold³⁴ plotted as C/T versus T². The solid 
curves labeled D show the variation of the specific 
heat predicted by the unmodified Debye theory which 
has the constant value of θ=θ₀. Gold, which shows 
deviations from the elastic constant θ₀ at temperatures 
of θ/60, follows this theory to remarkably high temperatures 
(T≈θ/7) even though silver and copper 
deviate considerably from it at temperatures of the 
order of θ/20. The curves marked L are based on the 
model of de Launay¹ who assumes a central force 
connecting nearest and next-nearest neighbor atoms 
32 W. F. Giauque and P. F. Meads, J. Am. Chem. Soc. 63, 1897 
33 Meads, Forsythe, and Giauque, J. Am. Chem. Soc. 63, 1902 


34 T. H. Geballe and W. F. Giauque, J. Am. Chem. Soc. 74, 
2368 (1952). 
and a third force arising from the compressibility of 
the electron gas. This model does not describe the 
lattice specific heat for gold and copper over the tem- 
perature range in which data are available but seems 
to describe the silver data up to temperatures of approximately 
θ/10. 

The curves marked H are the predictions of the model 
of Horton and Schiff³⁵ which are based on assuming 
a three-force constant noncentral interaction between 
nearest neighbor atoms only. This model is in better 
agreement with the measurements for copper and gold, 
but not enough to be significant. None of the models 
gives a really satisfactory description of the specific 
heat in the range where measurements are available. 
However, this apparent failure may not be serious 
since these theories are expected to apply only up to 
temperatures of θ/20 and the available data do not 
cover this range. A critical evaluation of the models 
must wait for accurate specific heat data between 4 
and 15°K. 

A model applicable to hexagonal metals has been 
calculated by Garland and Slutsky.³⁶ Using central 
forces between the first three nearest neighbors and a 
force due to the compressibility of the electron gas they 
find that θ decreases with increasing temperature but 
not as rapidly as is observed. They also find that zinc 
with its high c/a ratio cannot be described well with 
their model and do not attempt a comparison with the 
specific heat values. 


CONCLUSIONS 


For the cubic materials considered, the value of θ₀ 
calculated from elastic constants measured at 4°K 
by direct numerical integration is in very good agreement 
(except possibly for gold) with the low-temperature 
specific heat measurements. The tables of 
de Launay yield results which agree with the integration 
to within the accuracy of reading the table and are by 
far the easiest to apply. The two expansion methods of 
Betts et al. and of Quimby and Sutton appear to give 
θ₀ accurately to 1% even for the most anisotropic 
case but the method of Quimby and Sutton seems 
laborious and more susceptible to mistakes. Considering 
the differences in the results of the series approximation, 
and the errors in interpolation and extending de 
Launay’s tables, we advance the values listed under the 
Numerical integration column in Table II to be the 
best available values calculated from elastic constants. 

Since the lattice specific heat predicted by Debye 
theory from elastic constants is consistent with the 
calorimetric measurements to the extent shown in 
Fig. 1, it seems possible to conclude that the simple 
Debye model of lattice heat capacity is a very good 
approximation at low temperatures in cubic substances. 
In materials where the lattice specific heat cannot 

35 G. K. Horton and H. Schiff, Can. J. Phys. 36, 1127 (1958). 


36 C. W. Garland and L. J. Slutsky, J. Chem. Phys. 28, 331 
(1958). 

FIG. 3. Comparison of the specific heat calculated from several 
different models with the measurements in the 15 to 30°K range. 
The curves marked D are based on Debye’s continuum model 
while those marked L and H refer to the lattice models proposed 
by de Launay and by Horton and Schiff, respectively. 


clearly be separated from other contributions (i.e., 
when there are large electronic specific heats, low- 
temperature phase transitions, or magnetic interactions) 
the elastic constants can be used to define the lattice 
contribution without introducing appreciable error. 

For the hexagonal metals and gold there seems to be a 
definite difference between the calorimetric θ₀ and the 
elastic constant value. The reason for this is not certain. 
It is not likely that errors in elastic constants will 
account for the difference. Furthermore, for zinc two 
independently measured sets of elastic constants result 
in nearly the same ĝo value. It is possible that small 
systematic errors in the specific heat measurements or 
uncertainties in the temperature scale would account 
for the difference. 

It may be that temperatures of $\theta/60$ are not sufficiently 
low for the simple Debye model to apply to these metals. 
This may imply that dispersion of the elastic waves is an 
important effect at these temperatures. Simple reasoning 
indicates that for an isotropic medium deviations down from 
the straight line in a $C/T$ vs $T^2$ plot are a result of an 
upward bending of the $\omega(k)$ curve and conversely for an 
upward deviation. Although not strictly applicable to 
anisotropic media, such reasoning would indicate that the 
$\omega(k)$ curve for gold and magnesium tends to bend upward 
away from the Debye approximation while for zinc it bends 
downward. However, examination of Figs. 1 and 2 indicates 
that the data need not be considered as “curving away” from 
the theoretical straight line since they can be fitted by a 
straight line of different slope. Hence the dispersion must 
be of a rather special nature. 
elastic constants and could therefore cause the observed 
differences. If dislocations contribute to the measured 
elastic constants, the resulting value of $\theta_0$ would be 
lower than the calorimetric value. This might be the 
case for gold and magnesium, but the deviation for 
zinc is in the other direction. Furthermore, the excellent 
agreement for copper and silver must then be considered 
as somewhat fortuitous. 

It has been predicted that the calorimetric $\theta_0$ should 
be larger than the elastic $\theta_0$ because the measured elastic 
constants include a contribution due to a relaxation 
mechanism of the electrons. The available low-temperature 
elastic constant data do not indicate any such 
relaxation mechanism of the required magnitude and 
the values of $\theta_0$ which are predicted are in good agreement 
with the calorimetric $\theta_0$’s for metals with presumably 
the same free electron structure as gold. 

The attempts to describe the specific heat at tem- 
peratures where Debye theory fails by using a lattice 
model are satisfying only in a qualitative way, but are 
unsatisfactory in predicting the detailed temperature 
dependence of $\theta$. However, before a critical comparison 
can be attempted, accurate specific heat measurements 
must be made in the 4° to 20°K range. 
ADDENDUM 


Since submitting this article, more elastic constant 
and calorimetric data have become available. Calorimetric 
measurements by Phillips³⁸ down to $\theta/1000$ 


TABLE I. Additions. 

            c₁₁      c₁₂      c₄₄      Ω⁻¹      ρ        Reference 
Elastic constants extrapolated from 4.2°K 
KCl         0.483    0.054    0.0663   3.294    2.038    a 
Nickel      2.612    1.507    1.317    9.206    8.968    b 
Elastic constants extrapolated from 77°K 
KI          0.338    0.022    0.0368   2.321    3.197    a 
Thorium     0.780    0.482    0.513    3.058    11.787   c 

a M. H. Norwood and C. V. Briscoe, Phys. Rev. 112, 45 (1958). 
b Alers, Neighbours, and Sato (to be published). 
c Armstrong, Carlson, and Smith, J. Appl. Phys. 30, 36 (1959). 
TABLE II. Additions. 

            de Launay    Numerical integration    Calorimetric    Reference 
Elastic constants extrapolated from 4.2°K 
KCl         236.3ᶜ       237.1                    235.1           a 
Nickel      476.3        476.2                    441             b 
Elastic constants extrapolated from 77°K 
KI          130.9ᶜ       131.5                    132.3           a 
                                                  128             d 
Thorium 164.2 170 e 


a See reference 40. 

b J. A. Rayne and W. G. Kemp, Phil. Mag. 1, 918 (1956). 

c These values were calculated using tables prepared by J. de Launay 
which are to be published in J. Chem. Phys. 

d See reference 39. 

e P. L. Smith, Proc. Conf. de Phys. des Basses Temp., Paris, p. 283 (1955). 


(0.3°K) gives $\theta_0=426$°K for aluminum, which compares 
favorably with the calculated value of $\theta_0=428.2$°K shown 
in Table II. New calorimetric measurements on LiF³⁹ 
down to $\theta/80$ give $\theta_0=722$°K, which is in disagreement 
with the calculated value in Table II. However, these 
latter calorimetric data are somewhat scattered and, 
in fact, can be reasonably well described by a line 
corresponding to $\theta_0=734.1$°K. 

Other new data are shown in the added tables. 
Additions to Table I show the recent elastic constant 
data. The values of atomic volume and density are 
taken from the reference cited. Additions to Table II 
show the resulting calculated values of $\theta_0$. 

As before, the calculated values of $\theta_0$ are in good 
agreement with each other. The results of calorimetric 
measurements are also listed. For thorium, although 
the calorimetric $\theta_0$ seems to be in disagreement, the 
calorimetric data up to $\theta/50$ are well described by 
choosing $\theta_0=164.2$°K. The calorimetric and elastic 
constant values for nickel are apparently in disagreement 
but the calorimetric data are quite scattered so 
that critical comparison is not possible. 

For the two alkali halides listed, the calculated and 
calorimetric values are in apparent agreement. However, 
on a plot like Fig. 1 the elastic constant line 
through the origin does not fit the calorimetric data, 
which extend down to almost $\theta/100$. Dispersion effects 
can explain this failure as shown by Barron, Berg, and 
Morrison⁴⁰ who obtained the values of $\theta_0$ listed above 
by extrapolating calorimetric data to $T=0$. This result 
indicates that deviations from the Debye approximation 
occur at relatively lower reduced temperatures for the 
alkali halides than for the cubic metals. If dispersion 
is to account for the discrepancies in the hexagonal 
metals, it must be concluded that the Debye approximation 
is inapplicable at reduced temperatures much 
lower than those which are suitable even for the alkali 
halides. 


ference on Low Temperature Physics and Chemistry (University of 
Wisc. Press, Madison, Wisconsin, 1958), p. 414. 
39 W. W. Scales, Phys. Rev. 112, 45 (1958). 
40 Barron, Berg, and Morrison, Proc. Roy. Soc. (London) 24 
“It is strange that practical electronics remained untouched by these fundamental facts (electron spin) and 
could get along with the notion of the charged mass point or the minute charged sphere” [Arnold Sommerfeld, 
Electrodynamics (Academic Press, Inc., New York, 1948)]. 


I. INTRODUCTION 


THE development of electronics during the first 
half of this century was based almost completely 
on the quantum states of translational energy of the free 
electron. Only recently did physicists recognize the fact 
that the spin states and the bound states of electrons in 
atoms and molecules could be employed for amplification. 
We give here an account of these technological 
advances together with a summary of the physics which 
made them possible. 

The physical principles and experimental techniques 
required for development of paramagnetic amplifiers 
and molecular amplifiers were well established in the 
period 1945-1950. Such microwave amplifiers have 
much lower noise than those employing thermionic 
vacuum tubes. This low noise property has been the 
principal motivation for the large amount of work in 
this field. 

The earliest work on molecular amplification started 
independently at three different laboratories.¹⁻³ It had 
a different motivation, namely, the hope that millimeter 
wave oscillators and amplifiers would result. 

The word “maser” was coined by the Columbia group² 
as an acronym for “microwave amplification by the 
stimulated emission of radiation.” This word appears in 
Webster’s dictionary with the spellings “mazer” and 
“maser” and has the meaning “a large drinking cup, 
originally of a hard wood.” The free electron vacuum 
tube amplifier may also be regarded as operating through 
the mechanism of the stimulated emission of radiation. 
In what follows, however, the word maser is taken to 

Fig. 1. Absorption of microwaves by a gas. 


* This work was supported in part by the Office of Naval 
Rev. 99, 1264 (1955). 

3 N. G. Basov and A. M. Prokhorov, J. Exptl. Theoret. Phys. 
(U.S.S.R.) 27, 431 (1954); Proc. Acad. Sci. (U.S.S.R.) 101, 47 
(1955); J. Exptl. Theoret. Phys. (U.S.S.R.) 28, 249 (1955). 


mean either a molecular amplifier or a paramagnetic 
solid state amplifier. 


II. AMPLIFICATION BY SYSTEMS HAVING A 
HIGHER ENERGY STATE MORE DENSELY 
POPULATED THAN A LOWER STATE 


A. General Principles 


We follow the original¹ discussion of this principle of 
maser type amplification. In the experimental arrange- 
ment of Fig. 1, there is a microwave system which 
consists of a wave guide with gas inside, in thermal 
equilibrium. Such a gas may absorb microwaves, and 
this absorption can be described in the following terms. 

Suppose the gas has a pair of energy levels (Fig. 2) 
with values $E_1$ and $E_2$, with $E_2>E_1$. Let $n_1$ and $n_2$ be the 
numbers of particles with energies $E_1$ and $E_2$. Let the 
absolute temperature be denoted by $T$ and Boltzmann’s 
constant by $k$. Then we can write 

$$n_2 = n_1 e^{-(E_2-E_1)/kT}. \tag{1}$$

We suppose that electromagnetic radiation is present 
with frequency $\nu$ given by 

$$\nu = (E_2-E_1)/h. \tag{2}$$

The power absorbed by the gas can be written 

$$P_A = W_{12} n_1 h\nu. \tag{3}$$

Here $W_{12}$ is the transition probability for transitions 
from state 1 to 2 induced by radiation. Similarly the 
power emitted by the gas due to the stimulated emission 
of radiation is given by 

$$P_E = W_{21} n_2 h\nu. \tag{4}$$


Fig. 2. [Two-level diagram: upper level with population $n_2$, lower level with population $n_1$.] 

Neglecting spontaneous emission for the moment we 
can write $W_{12}=W_{21}$ and for the net power absorbed 

$$P = W_{12}(n_1-n_2)h\nu. \tag{5}$$

$P$ will be positive and the gas will absorb power, if $n_1$ 
exceeds $n_2$. This will be the case in consequence of 
expression (1), if the gas is in thermal equilibrium. However, 
the net power absorbed can be made negative, and 
amplification will result, if $n_2>n_1$. For the present we 
consider amplification by a one-quantum process, resulting 
from the nonequilibrium situation $n_2>n_1$. The 
same considerations which lead to coherent absorption 
for $n_1>n_2$ suggest that for⁴ $n_2>n_1$ the amplified signal 
will be coherent with the driving signal. The energy 
levels which are employed may be bound states of 
molecules, or spin states in a magnetic field. The low-temperature 
solid state masers use paramagnetic ions in 
an externally applied magnetic field. 
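
As an illustration of expressions (1) and (5), the following minimal sketch evaluates the equilibrium population ratio for a microwave transition and the sign of the net absorbed power for an equilibrium and an inverted pair of populations. The frequency, temperature, population, and transition probability are arbitrary example values chosen here, not figures from the text.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J / K


def equilibrium_ratio(nu_hz, temp_k):
    """n2/n1 from expression (1), using E2 - E1 = h * nu."""
    return math.exp(-H * nu_hz / (K_B * temp_k))


def net_absorbed_power(w, n1, n2, nu_hz):
    """Expression (5): positive means net absorption, negative means amplification."""
    return w * (n1 - n2) * H * nu_hz


if __name__ == "__main__":
    nu = 9.0e9      # assumed signal frequency, cycles/sec
    temp = 4.2      # assumed bath temperature, deg K
    n1 = 1.0e18     # assumed lower-level population
    ratio = equilibrium_ratio(nu, temp)
    print("n2/n1 in equilibrium:", ratio)
    print("P, equilibrium      :", net_absorbed_power(1.0, n1, ratio * n1, nu))
    print("P, inverted         :", net_absorbed_power(1.0, ratio * n1, n1, nu))
```

Even at liquid helium temperature the two populations differ by only about ten percent at this frequency, which is why the excess population available for amplification is small.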


B. Three-Level Method 


Many methods have been proposed for obtaining a 
higher population density in an upper energy level than 
in a lower one. Most of these are not generally useful, 
but may nonetheless have special applications. A fairly 
complete survey is given in Sec. IV. First we present a 
summary of the three-level method, first proposed 
by Prokhorov, and developed independently by 
Bloembergen,® and also by Javan; this appears to be by 
far the most generally useful one. We follow Bloem- 
bergen’s treatment. 

In a system with three energy levels, $E_1$, $E_2$, and $E_3$ 
(Fig. 3), let the selection rules be such that transitions 
are allowed between each level and either of the other 
two. In thermal equilibrium, the numbers of particles in 
the different states satisfy the relations, 

$$n_1 > n_2 > n_3. \tag{6}$$

The frequencies $\nu_{32}$, $\nu_{21}$, $\nu_{31}$ are defined by the 
relations 

$$\nu_{mn} = (E_m - E_n)/h. \tag{7}$$

Fig. 3. [Three-level diagram: levels $E_1$, $E_2$, $E_3$ with populations $n_1$, $n_2$, $n_3$.] 


4 The term negative temperature [E. M. Purcell and R. V. 
Pound, Phys. Rev. 81, 279 (1951)] is often used to describe a 
system in which higher energy levels are more densely populated 
than lower ones. This is consistent with expression (1). The idea 
of a negative temperature implies that while part of a system is not 
in equilibrium with its environment, it is sufficiently isolated from 
its positive temperature bath that it can be considered as a separate 
thermodynamic system described by a temperature. The concept 
of spin temperature has been carefully considered by A. 
Abragam and W. G. Proctor [Phys. Rev. 109, 1441 (1958)]. 
Negative temperature is only possible for a system whose energy 
levels have an upper bound. Thus when such a system is heated 
from 0°K to a temperature $+\infty$ the levels become equally populated. 
If it is heated further the temperature changes discontinuously 
to $-\infty$, and on further heating it eventually tends to minus 
0°K, at which point all particles are in the state of highest energy. 
Thus defined, negative temperature is hotter than positive temperature. 
Bloembergen has pointed out that the discontinuity 
would have been avoided if the reciprocal of temperature had been used. 
At negative temperatures resistances become nega- 
An intense rf field of frequency $\nu_{31}$ will induce transitions 
between states 1 and 3. Since initially there are more 
particles in state 1, the particles will initially leave state 
1 at a greater rate than return to it from state 3. 
Saturation phenomena will result in which $n_3 \to n_1$. 
Under these nonequilibrium conditions we may expect 
either that $n_2>n_1$ or $n_3>n_2$. In the first case amplification 
is possible at frequency $\nu_{21}$, and in the second case at 
$\nu_{32}$. In order to arrive at quantitative values relaxation§ 


§ An important issue in the operation of a paramagnetic ion low- 
temperature maser is the mechanism of paramagnetic relaxation. 
We discuss briefly some pertinent aspects of this problem. It 
appears more complex than the corresponding nuclear magnetic 
resonance problem. One experimental fact is that it is possible to 
couple the saturation radiation to one set of spin levels and 
simultaneously saturate a second pair of levels at approximately 
the same spacing which is weakly coupled to the radiation field 
[Strandberg, Davis, Faughnan, Kyhl, and Wolga, Phys. Rev. 109, 
1988 (1958) ]. Strandberg [Phys. Rev. 110, 65 (1958)] has in- 
terpreted these results as evidence for the idea that there are 
anomalies in the phonon excitation, with the lattice modes with 
frequency near the spin resonance in equilibrium with the high- 
temperature saturated spin system. He considers this to be in part 
a consequence of the fact that the specific heat of the lattice 
vibrations at 4°K (calculated from say the Debye model) is 
several orders smaller than that of the paramagnetic spin system. 
Giordmaine, Alsop, Nash, and Townes [Phys. Rev. 109, 302 
(1958)] have described a series of experiments using paramagnetic 
resonance at 9000 Mc, at 1-4°K; Gd$_2$Mg$_3$(NO$_3$)$_{12}$·24H$_2$O, 
K$_3$Cr(CN)$_6$, and Cu(NH$_4$)$_2$(SO$_4$)$_2$·6H$_2$O were used. They found 
that it was not possible to “burn a hole” in their resonances by 
saturation, indicating that their lines were homogeneously 
broadened. In the case of Cu, saturation of one line immediately 
saturated seven neighboring resonances. They were not able to 
observe the spin reversal by adiabatic rapid passage. No difference 
in absorption was noted in times as short as one msec after 
adiabatic rapid passage. They also observed that excitation due 
to a rapid sweep through a single resonance decayed very rapidly 
while that due to a slower sweep decayed at the normal rate, 
T,— 10 sec. They interpreted these results as evidence for the 
idea that for some salts at low temperatures it is the lattice bath 
relaxation which limits the total relaxation rate. [Gorter, Van der 
Marel, and Bolger, Physica 21, 103 (1955) ]. They also remarked 
that for these salts the spin lattice relaxation time is several orders 
smaller than the observed values of Ti, and is of order 1075 sec 
for Gd and 10™ sec for Cu. In addition, they concluded that the 
breadth of the lattice modes is much larger than the width of the 
resonances in the diluted crystals (several hundred Mc for 1% 
paramagnetic concentration of the Cu salt), that the relaxation 
time $T_1$ is dependent on crystal size, and that the breadth of the 
lattice modes increases with increasing concentration of paramag- 
netic centers. They also suggested that for those salts described 
by the foregoing conditions, operation of an adiabatic rapid pas- 
sage two-level maser would be impractical while three-level maser 
operation would be possible with reduced band width. 

Bloembergen [Phys. Rev. 109, 2209 (1958)] and Strandberg 
independently [Phys. Rev. 110, 65 (1958)] have reached a different 
conclusion. Bloembergen points out that three-level maser 
action is not possible for a salt whose paramagnetic relaxation rate 
is determined by interaction between lattice vibrations and the 
helium bath and for which the heat conduction between spin sys- 
tem and lattice is 1000 times better than between lattice and heat 
bath. Let v3; be the saturation frequency and v3. the amplification 
frequency and let these frequencies be widely spaced, with no 
overlapping of phonon bands. Then we have a negative spin 
temperature corresponding to levels 2 and 3. If the thermal con- 
tact with the phonons is very good then a band centered at fre- 
quency v2 will start to gain energy, in order to attempt to reach 
equilibrium with the negative temperature spin system. Since the 
lattice vibration levels are those of harmonic oscillators, they have 
no upper bound. They cannot be saturated [R. Karplus and J. 
Schwinger Phys. Rev. 73, 1020 (1948)] and the temperature as- 
sociated with the v32 band can never become negative. The steady 
state (nonequilibrium) situation would be one in which the spin 
levels have a high negative temperature and those 
effects must be included. Let $w_{12}$ be the heat bath induced 
transition probability from state 1 to state 2, 
with corresponding meanings for $w_{21}$, $w_{32}$, $w_{23}$, $w_{13}$, and 
$w_{31}$. First suppose the system is in thermal equilibrium 
with no microwave fields applied. The number of 
particles leaving the state $E_1$ per sec must equal the 
number returning to it, so 

$$n_1 w_{12} = n_2 w_{21}, \qquad n_1 w_{13} = n_3 w_{31}. \tag{8}$$

Employing the Boltzmann factor enables us to write 

$$w_{12}/w_{21} = n_2/n_1 = e^{-h\nu_{21}/kT}. \tag{9}$$

The $w$’s are the reciprocals of the spin lattice relaxation 
times. Now suppose that a strong microwave field of 
frequency $\nu_{31}$, and a weak microwave signal of frequency 
$\nu_{32}$ are present. Let $W_{31}$ and $W_{32}$ be the transition 
probabilities induced by these rf fields. The numbers of 
particles $n_1$, $n_2$, and $n_3$ occupying the three levels satisfy 
the relation, 

$$n_1+n_2+n_3=N. \tag{10}$$
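For $h\nu_{32}/kT \ll 1$, the populations satisfy the 
equations 

$$\frac{dn_3}{dt} = w_{13}\left(n_1-n_3-\frac{Nh\nu_{31}}{3kT}\right) + w_{23}\left(n_2-n_3-\frac{Nh\nu_{32}}{3kT}\right) + W_{31}(n_1-n_3) + W_{32}(n_2-n_3),$$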

lattice vibrations with which the spins are on “speaking terms” 
will be at a very high positive temperature. A high negative tem- 
perature means that the level populations associated with states 2 
and 3 are almost equal, and therefore the maser would not operate 
in the observed manner. Bloembergen has solved the problem of 
three-level maser and phonon steady states in terms of the spin 
populations 21, %2, n3 and the average lattice oscillator excitation 
quantum numbers in the three phonon bands, npa(va32), npa(vz1), 
nph(v31). His solution confirms that n3—nə is too small for maser 
operation if the contact between spins and phonons is better than 
between phonons and heat bath. 

A possibility which has briefly been explored by the author is 
that the temperature associated with all three maser spin levels is 
positive, amplification occurring as a result of a two-quantum 
process involving absorption of a saturation frequency photon and 
emission of an amplification frequency photon. This is a kind of 
Raman process in which energy is conserved in both transitions. 
Direct calculation shows that this effect is too small, and that a 
low negative spin temperature needs to be assumed, in conjunction 
with single photon processes, in order to explain the operation of a 
maser, 

It can be concluded that the paramagnetic relaxation mechanisms 
suggested by Giordmaine, Nash, and Townes do not apply 
to those salts which have been successfully used in a three-level 
maser, 

Shapiro, Bloembergen, and Artman [Bull. Am. Phys. Soc. Ser. 
II, 3, 317 (1958)] have reported additional indirect saturation 
experiments which they interpret as caused by higher order spin- 
spin interactions rather than by “hot phonons.” This appears very 
reasonable because it is difficult to see how the lattice vibration 
modes could be tightly coupled to the saturated spin system with- 
out being similarly coupled to the negative temperature spin 
system. Further work by Bloembergen, Shapiro, Pershan, and 
Artman (Cruft Laboratory Technical Report No. 285, Harvard 
University, Cambridge, Massachusetts, October 15, 1958) has 

increased the evidence that the indirect saturation phenomena are 
indeed due to spin-spin interactions. This follows the mechanism 
proposed by R. Kronig and C. J. Bouwkamp, Physica 5, 521 
(1938) and Physica 6, 290 (1939). 
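$$\frac{dn_2}{dt} = w_{23}\left(n_3-n_2+\frac{Nh\nu_{32}}{3kT}\right) + w_{21}\left(n_1-n_2-\frac{Nh\nu_{21}}{3kT}\right) + W_{32}(n_3-n_2),$$

$$\frac{dn_1}{dt} = w_{13}\left(n_3-n_1+\frac{Nh\nu_{31}}{3kT}\right) + w_{21}\left(n_2-n_1+\frac{Nh\nu_{21}}{3kT}\right) - W_{31}(n_1-n_3). \tag{11}$$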


These equations have the following approximate 
steady-state solution for the case in mind, $W_{31} \gg W_{32}$. 

$$n_1 - n_2 = n_3 - n_2 = \frac{hN}{3kT}\left[\frac{-w_{23}\nu_{32}+w_{21}\nu_{21}}{w_{23}+w_{21}+W_{32}}\right]. \tag{12}$$

In case the numerator of (12) is positive, amplification 
is possible at frequency $\nu_{32}$. In case it is negative, 
amplification is possible at the frequency $\nu_{21}$ and the $W_{32}$ 
in the denominator of (12) must be replaced by $W_{21}$. 
The signal power emitted by the material is obtained 
by multiplying (12) by the signal induced transition 
probability $W_{32}$ and the energy of one quantum. 

$$P_e = \frac{Nh^2\nu_{32}(w_{21}\nu_{21}-w_{32}\nu_{32})W_{32}}{3kT(w_{23}+w_{21}+W_{32})}. \tag{13}$$
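
The approximation behind (12) can be checked numerically. The sketch below is only an illustration under stated assumptions: it sets the time derivatives of $n_3$ and $n_2$ in the rate equations to zero, adds the constraint (10), and solves the resulting three linear equations by Cramer's rule. The thermal terms are entered as $a_{mn} = Nh\nu_{mn}/3kT$, and all numerical values are arbitrary test inputs rather than data for any material.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    a, b, c = m
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def steady_state(n_total, a21, a32, w21, w23, w13, big_w31, big_w32):
    """Steady-state (n1, n2, n3) of the three-level rate equations.

    a21 and a32 stand for N h nu_21 / 3kT and N h nu_32 / 3kT; a31 is their sum.
    """
    a31 = a21 + a32
    # dn3/dt = 0, collected as coefficients of n1, n2, n3
    row3 = [w13 + big_w31, w23 + big_w32, -(w13 + w23 + big_w31 + big_w32)]
    rhs3 = w13 * a31 + w23 * a32
    # dn2/dt = 0, collected as coefficients of n1, n2, n3
    row2 = [w21, -(w23 + w21 + big_w32), w23 + big_w32]
    rhs2 = w21 * a21 - w23 * a32
    # n1 + n2 + n3 = N, expression (10)
    return solve3([row3, row2, [1.0, 1.0, 1.0]], [rhs3, rhs2, n_total])


def approximate_difference(a21, a32, w21, w23, big_w32):
    """n3 - n2 from expression (12), written in terms of a21 and a32."""
    return (w21 * a21 - w23 * a32) / (w23 + w21 + big_w32)


if __name__ == "__main__":
    n1, n2, n3 = steady_state(1.0, 0.02, 0.01, 1.0, 1.0, 1.0, 1.0e6, 0.01)
    print("exact  n3 - n2:", n3 - n2)
    print("approx n3 - n2:", approximate_difference(0.02, 0.01, 1.0, 1.0, 0.01))
    print("saturation n1 - n3:", n1 - n3)
```

With the saturating probability made very large compared with the other rates, the exact and approximate population differences agree closely and $n_1 - n_3$ becomes negligible, as the derivation assumes.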


The situation described by (12) and (13) is qualitatively 
the same, but more complex, if $(h\nu/kT)$ is not 
small compared to 1. The procedures for calculating the 
transition probability $W_{32}$ require a knowledge of the 
matrix elements of the interaction with the Maxwell 
field, and appropriate relaxation times. For the remainder 
of the section we consider a paramagnetic ion 
type of solid state amplifier. In this case the signal 
induced transition probability is given by 

$$W_{32} = \left(\frac{2\pi}{h}\right)^2 |(2|M_w|3)|^2 \langle H_w^2(\nu_{32})\rangle_{\mathrm{Av}} T_2. \tag{14}$$

Here $M_w$ is the ($w$ component) magnetic dipole 
operator, $\langle H_w^2(\nu_{32})\rangle_{\mathrm{Av}}$ is the volume average squared 
magnetic field at the signal microwave frequency $\nu_{32}$, 
and $T_2$ is the spin-spin relaxation time. 

Expression (13) shows that the temperature $T$ of the 
heat bath must be low. The three-level solid state 
masers which have so far been successfully operated 
have employed liquid helium cooling with temperatures 
in the range 1.25-4°K. In addition to increasing 
amplification, the low temperatures partially improve 
the noise performance. 

The power which is absorbed from the saturation 
field of frequency $\nu_{31}$ is given by that needed to balance 
the tendency of collisions to restore equilibrium, and is 

$$P_s = (h\nu_{31})^2 n/2T_1kT. \tag{15}$$

Here $T_1$ is the spin-lattice relaxation time. 

Expressions (13), (14), and (15) lead to the following 
general criteria for selection of materials for a paramagnetic 
ion three-level maser. If the heat bath transition 
probabilities $w_{21}$ and $w_{32}$ are nearly equal, the 
frequencies $\nu_{21}$ and $\nu_{32}$ should be very unequal. However, 
if the frequencies $\nu_{21}$ and $\nu_{32}$ are approximately equal, 
then $w_{32}$ and $w_{21}$ should be very unequal. If $\nu_{21}$ is 
exactly equal to $\nu_{32}$ the device cannot operate since the 
same signal which induces transitions between states 2 
and 1 will be effective in inducing transitions between 3 
and 2. Amplification between one pair of states would 
be annulled by absorption associated with the other pair 
since (12) requires that $n_1-n_2=n_3-n_2$. In order to 
obtain large gain the temperature $T$ has to be small. The 
spin should be at least 1, but preferably not more than 
2, otherwise the particles will be distributed among too 
many states. (Gd with a spin of 7/2 is a notable exception.) 
It is also desirable that the nuclear spin be zero. 
Expression (15) requires that for small $T$, the spin-lattice 
relaxation time $T_1$ be as large as possible, otherwise 
excessive saturation power will be needed with 
consequent difficulty in maintaining low temperatures. 
Expressions (13) and (14) show that the product of 
total number of spins, $N$, and the spin-spin relaxation 
time $T_2$ should be as large as possible. Inasmuch as an 
increase in concentration of spins tends, in general, to 
decrease $T_2$ it is necessary to attain an optimum value 
of the product $NT_2$ by suitably diluting the paramagnetic 
ions. The zero field splitting ought to be of the 
same order of magnitude as the energy level difference 
associated with the signal frequency. This follows because 
mixing of the spin states is essential in order to 
have transitions allowed between all three levels. This 
is most favorable when the Zeeman and crystalline field 
terms are comparable. 

It is possible, and in some cases desirable, to use a 
four-level system. A particularly elegant solution of the 
problem of four-level maser design employing ruby has 
been given by Kikuchi, Makhov, Lambe, and Terhune 
with $E_4-E_2=E_3-E_1$. In the latter case a pumping 


DuBois [Phys. Rev. Letters 2, 262 (1959)]. They consider a hot 
reservoir with temperature T, and filter allowing a band of 
frequencies in the vicinity of $\nu_{31}$ to pass, in thermal contact with a 
maser. Levels 2 and 3 are in thermal contact with a cold reservoir 


Ki rt, be, and Terhune, Phys. Rev. 109, 1349 
o Ree Lambe, Makhov, and Terhune (to be 
frequency $\nu_{42}=\nu_{31}$ populates the third level and simultaneously 
depopulates the second level. 
A generalization of (13), given by Kikuchi, is 


[ee Wvg F Wa1V41-+ Wash a3 


=— [rors (16) 
wat wat wit w-t Ws 


4kT 


The active material of a maser can be placed in a 
wave-guide transmission system or in a resonant cavity. 
In the former case we have a traveling wave amplifier 
with power output given by 

$$P = P_0 e^{\beta l}, \tag{17}$$

where $l$ is the distance along the amplifier. 

Differentiating this with respect to $l$ leads to the gain 
coefficient $\beta$ given by 

$$\beta = dP/P\,dl. \tag{18}$$

Here $(dP/dl)$ is the power emitted per unit length and 
$P$ is the power at the point where $(dP/dl)$ is calculated, 
in accordance with expressions (13) or (16). Employing 
(14), this can be written 

$$\beta = \frac{32\pi^3 (n_3-n_2)|M_{32}|^2 T_2 \nu f}{h v_g}, \tag{19}$$

where $f$ is a filling factor which may approach 1 and $v_g$ 
is the group velocity. 
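
A short numerical sketch of (17) and (19) follows, written in Gaussian (cgs) units so that the matrix element is in erg/gauss and the population difference is per cm³. The inputs in the example (a matrix element of one Bohr magneton, the spin density, the relaxation time, and the slowing factor) are hypothetical values chosen only to show the dependence on group velocity.

```python
import math

H_ERG_S = 6.62607015e-27  # Planck constant, erg s


def gain_coefficient(dn, m32, t2, nu, v_g, fill=1.0):
    """Gain coefficient beta of expression (19), in cm^-1.

    dn   : n3 - n2 per cm^3
    m32  : magnetic dipole matrix element, erg/gauss
    t2   : spin-spin relaxation time, sec
    nu   : signal frequency, cycles/sec
    v_g  : group velocity, cm/sec
    fill : filling factor f
    """
    return 32.0 * math.pi ** 3 * dn * m32 ** 2 * t2 * nu * fill / (H_ERG_S * v_g)


def gain_db(beta, length_cm):
    """Power gain of expression (17), P/P0 = exp(beta * l), expressed in decibels."""
    return 10.0 * beta * length_cm / math.log(10.0)


if __name__ == "__main__":
    bohr_magneton = 9.274e-21          # erg/gauss
    c_light = 2.998e10                 # cm/sec
    for slowing in (1.0, 10.0, 100.0):  # assumed slowing factors c / v_g
        beta = gain_coefficient(1.0e17, bohr_magneton, 1.0e-8, 9.0e9, c_light / slowing)
        print("slowing", slowing, "beta (1/cm)", beta, "gain over 10 cm (dB)", gain_db(beta, 10.0))
```

Because $\beta$ varies inversely with $v_g$, the gain in decibels for a fixed length scales directly with the slowing factor.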

Here the power $P$ has been set equal to the energy 
density times the group velocity $v_g$. This equation shows 
that a slow wave structure (small group velocity) is a 
desirable method of obtaining a higher gain-band-width 
product. Slow wave structures proposed thus far are a 
ruby rod with a helix wrapped around it, considered by 
W. W. Anderson of Stanford University, and a rectangu- 
lar wave guide with conducting fingers giving circular 
polarization, considered by DeGrasse, Schulz-DuBois, 
and Seidel of the Bell Telephone Laboratories. These 
structures could be employed either as transmission or 
reflection devices. Means must be provided to prevent 
output power from being fed back to the input, causing 
the amplifier to oscillate, and for preventing noise from 
the load from getting to the maser and being amplified. 
Transmission devices utilizing nonreciprocity would ac- 
complish this. 

The reflection cavity type of maser has undergone 
most development and appears capable of giving enough 
gain-band-width product for most purposes. A block 
diagram is sketched in Fig. 4. Here a nonreciprocal 
device, the microwave circulator,⁸ prevents power from 
being fed back and causing oscillation, and keeps noise 
from the load out of the maser. Resonant cavity 
expressions for gain and noise performance are usually 
expressed in terms of the $Q$’s (quality factors) of a 
cavity. The $Q$ is defined by 

$$Q = 2\pi\nu E/P_a, \tag{20}$$

where $E$ is the energy in the principal cavity mode and 
$P_a$ is the rate of absorption of energy. The unloaded $Q$, 
usually denoted by $Q_0$, is defined by 

$$Q_0 = 2\pi\nu E/P_{aw}, \tag{21}$$

where $P_{aw}$ is the part of the absorbed power that is 
absorbed in the cavity walls. The magnetic $Q$, denoted 
by $Q_m$, is defined by 


$$Q_m = -\frac{\nu \langle H^2\rangle_{\mathrm{Av}} V_c}{4P_e}, \tag{22}$$

where $P_e$ is the net power emitted by the active material. 

Here $V_c$ is the volume of the cavity and $\nu$ is the 
frequency being amplified. The denominator is given by 
expressions (13) or (16). The external $Q$, denoted by 
$Q_e$, is defined by 


$$Q_e = 2\pi\nu E/P_{en}, \tag{23}$$

where $P_{en}$ is the net power output. 

The cavity mode may be represented by the equivalent 
circuit in Fig. 5. $G_m$, $G_0$, and $G_e$ are conductances 
associated with the spin system, the walls of the cavity, 
and the output load, respectively. From Fig. 5, a loaded 
$Q$ may be defined as 

$$Q_L^{-1} = Q_m^{-1} + Q_0^{-1} + Q_e^{-1}. \tag{24}$$

The voltage standing wave ratio in the transmission 
system which drives the cavity, $B$, is given in terms of 

8 Microwave circulators are not available below about 1400 
Mc/sec. Autler has proposed an ingenious arrangement [Lincoln 
Laboratory, MIT, Rept. M 37-27; Proc. Inst. Radio Engrs. 46, 
1880 (1958)] using two masers in a “balanced” arrangement, 
which does not require use of a circulator. The two masers are at 
opposite ends of a coaxial or wave-guide “magic” tee. One of the 
maser arms is a quarter wavelength longer than the other. The 
antenna and receiver are connected to the other two arms. Thus 
the two maser outputs combine at the load. Noise from the load is 
amplified by the masers but is then radiated back out through the 
antenna. 
Fig. 5. Cavity equivalent circuit. 


the incident voltage $V_i$ and the reflected voltage $V_r$, as 

$$B = \frac{|V_i|+|V_r|}{|V_i|-|V_r|} = \frac{1+g}{1-g}. \tag{25}$$

Here $g$ is the “voltage” gain defined by $g=|V_r/V_i|$. 
The power gain is $g^2$. 

After solving (25) for $g^2$, we obtain 

$$g^2 = [(1-B)/(1+B)]^2. \tag{26}$$

The maser will oscillate if $-Q_m^{-1} > Q_0^{-1}+Q_e^{-1}$, and 
will amplify if $Q_0^{-1}+Q_e^{-1} > -Q_m^{-1} > Q_0^{-1}$. Let us 
consider some characteristics of an amplifier. 

From transmission line theory, the voltage standing 
wave ratio $B$ (at resonance), in terms of the conductances 
shown in Fig. 5, is 

$$B = (G_m+G_0)/G_e = Q_e[Q_0^{-1}+Q_m^{-1}]. \tag{27}$$

This assumes that the input transmission line is 
matched to a conductance $(G_m+G_0)$. 

Use of (26) gives for the power gain, 

$$g^2 = \left[\frac{Q_e^{-1}-(Q_0^{-1}+Q_m^{-1})}{Q_e^{-1}+(Q_0^{-1}+Q_m^{-1})}\right]^2. \tag{28}$$

In terms of the loaded $Q$ this takes the form 

$$g^2 = [2Q_LQ_e^{-1}-1]^2. \tag{29}$$

The band width is obtained by dividing the operating 
frequency by the loaded $Q$. For a low-temperature 
maser $Q_0$ is so large that it may be neglected in (28). 
The product of the square root of power gain and band 
width is 

$$g\Delta\nu = \nu\frac{Q_e+|Q_m|}{Q_e|Q_m|}. \tag{30}$$

For a high-gain device it is customary to adjust $Q_e$ 
so that it equals the magnitude $Q_m$. Under these 
conditions (30) is approximately a constant, given by 

$$g\Delta\nu = 2\nu/|Q_m|. \tag{31}$$
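
The cavity relations (24) and (28) to (31) can be exercised with a small sketch. The $Q$ values and frequency below are arbitrary illustrative numbers; the function names and the label "absorb" for the case of net loss are introduced here for convenience and do not come from the text.

```python
import math


def loaded_q(q_m, q_e, q_0=math.inf):
    """Loaded Q from expression (24); q_m is negative for an emitting spin system."""
    return 1.0 / (1.0 / q_m + 1.0 / q_0 + 1.0 / q_e)


def voltage_gain(q_m, q_e, q_0=math.inf):
    """Magnitude of the voltage gain g from expression (28)."""
    external = 1.0 / q_e
    internal = 1.0 / q_0 + 1.0 / q_m
    return abs((external - internal) / (external + internal))


def regime(q_m, q_e, q_0=math.inf):
    """Classify operation using the oscillation and amplification conditions."""
    emission = -1.0 / q_m
    if emission > 1.0 / q_0 + 1.0 / q_e:
        return "oscillate"
    if emission > 1.0 / q_0:
        return "amplify"
    return "absorb"


def gain_bandwidth(nu, q_m, q_e, q_0=math.inf):
    """Product of voltage gain and band width, with band width = nu / Q_L."""
    return voltage_gain(q_m, q_e, q_0) * nu / loaded_q(q_m, q_e, q_0)


if __name__ == "__main__":
    nu = 9.0e9                 # assumed signal frequency, cycles/sec
    q_m, q_e = -1000.0, 900.0  # assumed magnetic and external Q
    print("regime        :", regime(q_m, q_e))
    print("voltage gain g:", voltage_gain(q_m, q_e))
    print("g * band width:", gain_bandwidth(nu, q_m, q_e))
    print("expression 30 :", nu * (q_e + abs(q_m)) / (q_e * abs(q_m)))
    print("expression 31 :", 2.0 * nu / abs(q_m))
```

As $Q_e$ approaches $|Q_m|$ from below, the gain rises without limit while the product of gain and band width stays close to the constant in (31).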
Fig. 6. Energy levels of the ground state of Gd+++ in ethyl 
sulfate, including spin-lattice relaxation times, as employed by 
Scovil, Feher, and Seidel. 


III. SOLID STATE MASER MATERIALS 
AND DEVICES 


The materials which have so far been successfully 
employed in three-level masers are”! gadolinium ethyl 
sulfate, potassium chromicyanide,"” and ruby.” We use 
the roman type g for the magnitude of the Zeeman 
tensor. This should not be confused with the italic g 

used for voltage gain. 

Gadolinium ethyl sulfate was used in the first successful 
solid state maser.¹⁰ The gadolinium ion is in an 
$^8S$ ground state having 7 electrons in a half-filled $4f$ 
shell. There is fine structure splitting into 7 lines with 
spacing which varies approximately as $3\cos^2\theta-1$, 
where $\theta$ is the angle between the constant magnetic 
field, $H_0$, and the crystalline field axis. If the steady 
mately as
$$\mathcal{H}=g\beta_mH_0S_z-\tfrac12D\left[S_z^2-\tfrac13S(S+1)\right]+\tfrac14D\left[S_+^2+S_-^2\right], \tag{32}$$
where $\beta_m$ is the Bohr magneton, $S$ is the spin operator, g=1.99, $D=0.02$ cm$^{-1}$, and the axis of quantization is parallel to that of the constant magnetic field, $H_0$. The first term of (32) represents the interaction with the field $H_0$ which brings the transition to the required frequency. The second term makes the level spacings unequal, and the third term mixes the states, giving rise to transitions with $\Delta S_z=\pm2$. The angle between $H_0$ and the microwave magnetic field should be zero for the $\Delta S_z=\pm2$ transitions and 90° for the $\Delta S_z=\pm1$ transitions. Scovil, Feher, and Seidel used an angle of 45°.
The energy levels and relaxation times of the ground state of Gd$^{+++}$ in ethyl sulfate are shown in Fig. 6. Because the energy level separations are almost equal, expression (13) requires that the spin-lattice relaxation times $w_{21}^{-1}$ and $w_{32}^{-1}$ be very unequal. A value for $w_{32}$ ten times that of $w_{21}$ is accomplished by introduction of cerium into the crystal. An optimum value of $NT_2$ was obtained by employing a 90-mg lanthanum ethyl sulfate crystal (filling 8% of the effective volume of the resonant cavity) containing 0.5% Gd$^{+++}$ and 0.2% Ce$^{+++}$. The device was operated at 1.2°K with a saturation frequency of 17.5 kMc/sec. Oscillations were obtained at 9 kMc/sec.

FIG. 8. Energy levels of K$_3$Cr(CN)$_6$ in K$_3$Co(CN)$_6$ vs applied magnetic field for $H_0$ parallel to x, y, and z axes.

The second material to be successfully used in a solid state maser was potassium chromicyanide diluted in a cobalticyanide crystal [K$_3$Co(CN)$_6$]. In this case 0.5% Cr was used as the paramagnetic salt. This material appears to have the long spin-lattice relaxation time of 0.2 sec at 1.25°K. The paramagnetic resonance spectrum of K$_3$Cr(CN)$_6$ arises from two differently oriented complexes per unit cell. The spin Hamiltonian
For cobalt as the diluent $D=0.083$ cm$^{-1}$, $E=0.011$ cm$^{-1}$, and the Zeeman tensor g is approximately isotropic and equal to 1.99. The direction cosines between the magnetic axes (x, y, and z) and the pseudo-orthorhombic crystalline axes (a, b, and c) are

          x         y       z
a       0.104       0     0.994
b      +0.994       0     0.104
c       0           1     0


Energy level diagrams computed by McWhorter and Meyer are shown in Figs. 7 and 8. They used a dual mode coaxial cavity one half wavelength long at 2800 Mc/sec operating in the TEM mode. For pumping, the cavity operated in a TE mode. The constant magnetic field $H_0$ was applied approximately parallel to the crystalline c axis, with operating conditions shown in Fig. 7. The upper two levels, labeled $+\tfrac32$ and $+\tfrac12$, were used for amplification and the second and fourth levels (labeled $+\tfrac32$ and $-\tfrac12$) were saturated by the rf field. Figure 7 shows that the energy level separation of the first and second levels (labeled $-\tfrac32$ and $-\tfrac12$) is approximately the same as that of the second and fourth levels. It has been shown¹³ that for the Meyer-McWhorter arrangement (constant magnetic field parallel to the c axis and rf magnetic field parallel to the a axis) the transition probability for the first- to second-level transition is in fact 100 times greater than for the desired transition between the second and fourth levels. This is because the matrix elements are 4 times greater and the saturation radio-frequency magnetic field normal to the constant magnetic field favors the $\Delta m=1$ transitions. This implies that indirect saturation was accomplished by McWhorter and Meyer. They coupled strongly to levels 1 and 2 and succeeded in saturating levels 2 and 4 also.

FIG. 9. Energy levels of Cr$^{+++}$ in ruby vs applied magnetic field for $H_0$ parallel to c axis of the crystal. (Ruby, 0°; energy in kilomegacycles/second vs magnetic field $H_0$ in kilo-oersteds.)


¹³ Strandberg, Davis, Faughnan, Kyhl, and Wolga, Phys. Rev. 109, 1988 (1958). See also Strandberg, Davis, and Kyhl, Fifth International Symposium on Low-Temperature Physics and Chemistry, Madison, Wisconsin (August 30, 1957).

... JETP 3, 171 (1956).

¹⁵ J. E. Geusic, Phys. Rev. 102, 12...


This possibility of indirect saturation of one pair of 
levels by coupling to another pair which have the same 
energy separation is a useful aid in maser design. The 
total number of spins in the upper quartet of the 
chromium ion is approximately 10% A 10% filling 
factor was used. The calculated gain-band-width product $g\Delta\nu$ was $2.6\times10^{6}$ sec$^{-1}$. The measured value was $1.8\times10^{6}$ sec$^{-1}$. The small filling factor was chosen in order to be able to study the device without undue distortion of the cavity electromagnetic mode configurations. Practical amplifiers would therefore use a much larger filling factor, approaching unity. This gives a large increase in $g\Delta\nu$, of the order of 50. This is partly because there are more spins and partly because the increased dielectric constant enables a smaller cavity to be used.
The third material to be successfully used was ruby. Paramagnetic resonance in this material was investigated in the Soviet Union in 1955 and 1956,¹⁴ and in the United States by Geusic.¹⁵
Again the Cr$^{+++}$ ion is used, but the host crystal is Al$_2$O$_3$. The four nonequivalent sites are indistinguishable since the spin is less than two, and the crystal has trigonal symmetry. Maser action in ruby was first demonstrated by Makhov, Kikuchi, Lambe, and Terhune at the University of Michigan. The spin Hamiltonian for Cr$^{+++}$ in the Al$_2$O$_3$ lattice is given by


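$$\mathcal{H}=\beta_m\,\mathbf{S}\cdot\mathbf{g}\cdot\mathbf{H}+D\left[S_z^2-\tfrac13S(S+1)\right]. \tag{34}$$

$\mathbf{g}$ is the Zeeman tensor with components $g_{zz}=g_\parallel$, $g_{xx}=g_{yy}=g_\perp$. This Hamiltonian leads to the eigenvalue equation
$$(\epsilon^2-1)^2-\tfrac52x^2\epsilon^2+\tfrac{9}{16}x^4-\frac{x^2\,(5g_\parallel^2\cos^2\theta-g_\perp^2\sin^2\theta)}{2g^2}-\frac{2x^2\epsilon\,(g_\perp^2\sin^2\theta-2g_\parallel^2\cos^2\theta)}{g^2}=0. \tag{35}$$
Here
$$\epsilon=\frac{\text{Energy}}{|D|},\qquad x=\frac{g\beta_mH}{|D|},$$
$D=-0.1913$ cm$^{-1}$ $=-3.798\times10^{-17}$ ergs, g=1.986, $g^2=g_\parallel^2\cos^2\theta+g_\perp^2\sin^2\theta$.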


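This equation is readily solved for the situations $\theta=0$, $\theta=\pi/2$, and $g_\perp^2\sin^2\theta=2g_\parallel^2\cos^2\theta$. In the latter case $\theta=\cos^{-1}(1/\sqrt3)$ gives (assuming isotropic g)
$$\epsilon(\pm\tfrac32)=\pm\left[1+\tfrac54x^2+x(3+x^2)^{1/2}\right]^{1/2},$$
$$\epsilon(\pm\tfrac12)=\pm\left[1+\tfrac54x^2-x(3+x^2)^{1/2}\right]^{1/2}.$$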
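The closed-form levels quoted for $\theta=\cos^{-1}(1/\sqrt3)$ can be checked by substituting them back into the eigenvalue equation (35). The sketch below does this, and also checks the field-parallel case $\theta=0$, where the Hamiltonian (34) is diagonal. It assumes the form of (35) appropriate to negative $D$ with energies in units of $|D|$ and $x=g\beta_mH/|D|$; the function names and the sample values of $x$ are illustrative only.

```python
import math


def secular_lhs(eps, x, theta, g_par=1.0, g_perp=1.0):
    """Left-hand side of the quartic eigenvalue equation (35) for S = 3/2, D < 0."""
    c2 = math.cos(theta) ** 2
    s2 = math.sin(theta) ** 2
    g2 = g_par ** 2 * c2 + g_perp ** 2 * s2
    return ((eps ** 2 - 1.0) ** 2
            - 2.5 * x ** 2 * eps ** 2
            + (9.0 / 16.0) * x ** 4
            - x ** 2 * (5.0 * g_par ** 2 * c2 - g_perp ** 2 * s2) / (2.0 * g2)
            - 2.0 * x ** 2 * eps * (g_perp ** 2 * s2 - 2.0 * g_par ** 2 * c2) / g2)


def special_angle_levels(x):
    """Closed-form levels at cos(theta) = 1/sqrt(3), isotropic g."""
    root = x * math.sqrt(3.0 + x * x)
    outer = math.sqrt(1.0 + 1.25 * x * x + root)   # the +/- 3/2 pair
    inner = math.sqrt(1.0 + 1.25 * x * x - root)   # the +/- 1/2 pair
    return [-outer, -inner, inner, outer]


def parallel_levels(x):
    """Levels for theta = 0, read off the diagonal Hamiltonian with D < 0."""
    return [-1.0 - 1.5 * x, -1.0 + 1.5 * x, 1.0 - 0.5 * x, 1.0 + 0.5 * x]


SPECIAL_ANGLE = math.acos(1.0 / math.sqrt(3.0))

for x in (0.3, 1.0, 2.5):
    residual_special = max(abs(secular_lhs(e, x, SPECIAL_ANGLE)) for e in special_angle_levels(x))
    residual_parallel = max(abs(secular_lhs(e, x, 0.0)) for e in parallel_levels(x))
    print(x, residual_special, residual_parallel)
```

Both sets of levels leave a residual at the level of floating-point rounding, which is a useful consistency check on the signs in (35).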


¹⁴ A. A. Manenkov and A. M. Prokhorov, Soviet Phys. ... 611 (1955); M. M. Zaripov and Iv. Ia. Shamo...

FIG. 10. Energy levels of Cr$^{+++}$ in ruby vs applied magnetic field for $H_0$ at 20° to c axis of the crystal.


Curves of energy levels as calculated with assistance of Dr. Schulz-DuBois are shown in Figs. 9-13. The four-level device using ruby has already been discussed. We are indebted to Professor Townes for the name "push-pull maser" for the four-level device of Makhov, Kikuchi, Lambe, and Terhune.


B. Characteristics of Some Ruby Masers 


A ruby three-level maser has been developed by Alsop, Giordmaine, and Townes.¹⁶ A large filling factor, approaching 0.9, is used. The "voltage gain" band-width product approaches 100 Mc, and the band width is approximately 5 Mc. The magnetic $Q$ is about 400. The volume of the active material is about 4 cm$^3$. A rectangular cavity is used, operating in the TE$_{011}$ mode

FIG. 11. Energy levels of Cr$^{+++}$ in ruby vs applied magnetic field for $H_0$ making an angle of 40° with the c axis of the crystal.


in proof.—The paramagnetic resonance absorption 
for A ERA eei has en AGA by Geusic, Peter, and 
Schulz-DuBois [Bull. Am. Phys. Soc. Ser. I, 4, 21 (1959)]. The 
spin Hamiltonian is
$$\mathcal{H}=\beta_m\left[g_\parallel H_zS_z+g_\perp(H_xS_x+H_yS_y)\right]+D\left[S_z^2-\tfrac13S(S+1)\right],$$
$D=53.6$ kMc, $g_\parallel=1.973\pm0.002$, $g_\perp=1.97\pm0.01$.

¹⁶ Alsop, Giordmaine, Mayer, and Townes, Astron. J. 63, 301
for the signal frequency and the TE$_{012}$ mode for the saturation microwave field. The small cavity volume results from the high dielectric constant of ruby. About 30 mw of pumping power are required. The pump power is coupled to the wave guide by a remotely gear-controlled probe, and the signal power is coupled in by means of an iris. Liquid helium cooling with helium maintained at low pressure provided a bath temperature of 1.4°K.

Morris, Kyhl, and Strandberg¹⁷ have described the ruby maser illustrated in Fig. 14. All four levels are employed. The chromium concentration is about 0.01%. Levels 1-3 and 2-4 are saturated at 23 kMc/sec. Levels 2-3 are employed for amplification. The device is tunable over the range 8400-9700 Mc/sec, and has the new feature that cavity resonance is not needed at the saturation frequency if a saturation power of 100 mw is employed. The fringing fields near the coupling hole allow enough coupling to saturate the crystal at 23 kMc/sec.

FIG. 12. Energy levels of Cr$^{+++}$ in ruby vs applied magnetic field for $H_0$ making an angle of 55° with the c axis of the crystal.

Design of masers is facilitated by the extensive tables of energy levels and transition probabilities now being prepared by W. S. Chang and A. E. Siegman of the Stanford Electronics Laboratories.¹⁸


IV. SUMMARY OF OTHER METHODS FOR 
OBTAINING MASER ACTION 


Bloch¹⁹ showed, in 1946, that inversion of level populations can be obtained by "adiabatic rapid passage" through resonance. We have again a system of spins in a "constant" magnetic field $H_0$. If a microwave magnetic field is applied at right angles to the "constant" magnetic field the spin system will precess about $H_0$. Suppose that $H_0$ is smaller than the value required for resonance at the microwave angular frequency $\omega$. If $H_0$ is steady for a long time, and is then suddenly increased through resonance and beyond, the spin system is turned over. The magnetization vector will be antiparallel to $H_0$ until equilibrium is restored by the spin-lattice relaxation mechanism. The passage through resonance must be adiabatic, but rapid enough so that the sweep occurs in a time short compared with the spin-lattice relaxation time. While the magnetization is antiparallel to the field there are more moments antiparallel than parallel. This means more particles in excited states than in the ground state. Such a system will therefore amplify.

¹⁷ Morris, Kyhl, and Strandberg, Proc. Inst. Radio Engrs. 47, 80 (1959).

¹⁸ W. S. Chang and A. E. Siegman, Stanford Electronics Lab. Tech. Rept. 156-1 (May 16, 1958). Their machine calculations are presented in the Appendix to this paper. Note carefully that their calculations were done using positive $D$. To use their curves and data for the correct negative sign of $D$, change the sign of their energies and regard their level labeled number four as having the lowest energy. Note their comments at the end of the Appendix.

¹⁹ F. Bloch, Phys. Rev. 70, 460 (1946). See also R. K. Wangs... Rev. 89, 728 (1953).

FIG. 13. Energy levels of Cr$^{+++}$ in ruby vs applied magnetic field for $H_0$ making an angle of 90° with the c axis of the crystal.


We assume the magnetic field $H_0$ to be in the z direction. It is simpler to discuss the case where a circularly polarized rf field is applied, with its magnetic field in the x-y plane. The rf field is given by
$$H_x=H_1\cos\omega t,\qquad H_y=\mp H_1\sin\omega t. \tag{36}$$
The $-$ sign refers to a positive gyromagnetic ratio and the $+$ sign to a negative one. The magnetization vector then satisfies the following equations.
$$\dot M_x-\gamma(M_yH_z-M_zH_y)+(M_x/T_2)=0, \tag{37}$$
$$\dot M_y-\gamma(M_zH_x-M_xH_z)+(M_y/T_2)=0, \tag{38}$$
$$\dot M_z-\gamma(M_xH_y-M_yH_x)+(M_z/T_1)=M_0/T_1. \tag{39}$$
$\gamma$ is the gyromagnetic ratio, $T_1$ is the longitudinal (spin-lattice) relaxation time, $T_2$ is the transverse (spin-spin) relaxation time, and $M_0$ is the value of the

FIG. 14. Ruby maser of Morris, Kyhl, and Strandberg.


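A quantity $\delta(t)$ is defined by the relation
$$\delta(t)=\frac{H_0(t)-\omega/|\gamma|}{H_1}. \tag{40}$$
$\delta(t)$ is zero at the resonant field $H_0=\omega/|\gamma|$. The magnetic field $H_0$ is assumed to vary at a slow enough rate so that
$$|d\delta/dt|\ll|\gamma H_1|. \tag{41}$$
This is a statement of the adiabatic criterion. Subject to this condition Bloch gives the following solution of (37), (38), and (39):
$$M_x=\frac{M\cos\omega t}{(1+\delta^2)^{1/2}}, \tag{42}$$
$$M_y=\mp\frac{M\sin\omega t}{(1+\delta^2)^{1/2}}, \tag{43}$$
$$M_z=M\delta/(1+\delta^2)^{1/2}, \tag{44}$$
$$M(t)=\frac{1}{T_1}\int_{-\infty}^{t}\frac{M_0\,\delta(t')\,e^{-[\theta(t)-\theta(t')]}}{[1+\delta^2(t')]^{1/2}}\,dt', \tag{45}$$
$$\theta(t)-\theta(t')=\frac{1}{T_1}\int_{t'}^{t}\frac{\delta^2(t'')+(T_1/T_2)}{1+\delta^2(t'')}\,dt''. \tag{46}$$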


In order to discuss adiabatic rapid passage we assume that $\delta(t')$ has been constant for a long time and then at time $t_0$ is quickly increased through resonance, without violating (41). We have, for $T_1=T_2$,
$$\theta(t)-\theta(t')=(t-t')/T_1. \tag{47}$$
The part of (45) resulting from the rapid change of $\delta$ is negligible. For the rest, since $\delta$ is large we have, from (45),
$$M(t)=\pm M_0. \tag{48}$$
Here the $+$ sign refers to the situation where $\delta$ was positive up to $t_0$ ($H_0$ larger than the resonant value, initially), and the $-$ sign refers to the situation where $\delta$ was negative up to $t_0$ ($H_0$ smaller than the resonant value, initially, then increased through resonance). Thus the sign of $M(t)$ does not change, but the sign of $\delta$ changes as we go rapidly but adiabatically through resonance. According to (44) this means we have changed the sign of $M_z$, which is now antiparallel to $H_0$.
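The turning over of the magnetization can be illustrated numerically. The sketch below integrates the equation of motion in the frame rotating with the rf field, with relaxation neglected (an assumption that corresponds to a sweep much faster than $T_1$ and $T_2$). Time is measured in units of $1/|\gamma H_1|$, so the effective field is proportional to $(1,0,\delta)$ and the adiabatic criterion (41) reads $|d\delta/d\tau|\ll1$. The sweep range, sweep rate, and step size are illustrative choices, not values from the text.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rapid_passage(delta_start=-10.0, delta_end=10.0, sweep_rate=0.05, step=0.01):
    """Integrate dM/dtau = M x (1, 0, delta(tau)) with a fixed-step RK4 scheme.

    M starts along +z (parallel to H_0, far below resonance) and delta is
    swept linearly through zero. Relaxation terms are omitted.
    """
    def deriv(tau, m):
        h_eff = (1.0, 0.0, delta_start + sweep_rate * tau)
        return cross(m, h_eff)

    m = (0.0, 0.0, 1.0)
    tau = 0.0
    n_steps = int(round((delta_end - delta_start) / (sweep_rate * step)))
    for _ in range(n_steps):
        k1 = deriv(tau, m)
        k2 = deriv(tau + 0.5 * step, tuple(m[i] + 0.5 * step * k1[i] for i in range(3)))
        k3 = deriv(tau + 0.5 * step, tuple(m[i] + 0.5 * step * k2[i] for i in range(3)))
        k4 = deriv(tau + step, tuple(m[i] + step * k3[i] for i in range(3)))
        m = tuple(m[i] + step * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
                  for i in range(3))
        tau += step
    return m


final_m = rapid_passage()
final_norm = math.sqrt(sum(c * c for c in final_m))
print(final_m, final_norm)
```

In this sketch the z component of the magnetization ends close to $-1$ while its magnitude is conserved, in line with the sign change of $M_z$ described by (44) and (48).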
An early unsuccessful attempt to develop an adiabatic fast passage solid-state maser was reported by Combrisson, Honig, and Townes. Maser action due to adiabatic rapid passage using the paramagnetic electrons associated with the phosphorus donors in silicon was observed by Feher, Gordon, Buehler, Gere, and Thurmond. Somewhat similar experiments using neutron irradiated quartz and magnesium oxide were done by Chester, Wagner, and Castle. With a quartz sample containing ~$10^{18}$ spins the inverted state persisted for 2 msec at 4.2°K. A value of $5\times10^{6}$ sec$^{-1}$ was obtained for the "voltage" gain-band-width product for gains between 8 and 21 db. This work was done at 9 kMc/sec with microwave powers for inversion of about 0.5 w, in 50-100 μsec pulses. A modulation structure of the pulsed power emitted from the cavity as the magnetic field is swept through resonance is observed. Senitzky has suggested that this amplitude modulation of the power from the resonant cavity is due to the periodic transfer of energy between the cavity electromagnetic field and the spin system.
Purcell and Pound were able to obtain inverted level populations in a magnetic resonance experiment using a single crystal of LiF which had a very long relaxation time. They removed their crystal from the strong field and inverted its spin system by means of other rapidly varying magnetic fields. When the crystal was reinserted in the strong field they were able to observe the return of the magnetization to its equilibrium value, from its negative temperature state.
Weber studied the following method in 1951. Consider a symmetric top molecule in an externally applied electrostatic field. The Stark effect has linear and quadratic terms. The linear effect is the dominant one if the field is not too strong. The energy levels are given by
$$E_{JKM_J}=\frac{-\mu EKM_J}{J(J+1)}. \tag{49}$$
A $\Delta M=\pm1$ transition will be allowed, and the frequency of such a transition will be
$$\nu=\mu EK/J(J+1)h. \tag{50}$$
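To give a feeling for the magnitudes in (49) and (50), the following sketch evaluates the linear Stark levels and the transition frequency in Gaussian units. The dipole moment of 1 debye, the field of 1 statvolt/cm, and the quantum numbers $J=K=1$ are hypothetical values chosen only for illustration.

```python
import math

PLANCK_ERG_SEC = 6.62607015e-27  # Planck's constant in erg sec


def stark_energy(mu, field, j, k, m_j):
    """Linear Stark energy of a symmetric top, form of (49), in ergs."""
    return -mu * field * k * m_j / (j * (j + 1))


def stark_transition_frequency(mu, field, j, k):
    """Frequency of a Delta M = +/-1 transition, form of (50), in cycles/sec."""
    return mu * field * k / (j * (j + 1) * PLANCK_ERG_SEC)


# Hypothetical example: 1 debye (1e-18 esu cm) in a field of 1 statvolt/cm.
MU_ESU_CM = 1.0e-18
FIELD_STATVOLT_PER_CM = 1.0

frequency = stark_transition_frequency(MU_ESU_CM, FIELD_STATVOLT_PER_CM, j=1, k=1)
level_gap = (stark_energy(MU_ESU_CM, FIELD_STATVOLT_PER_CM, 1, 1, 0)
             - stark_energy(MU_ESU_CM, FIELD_STATVOLT_PER_CM, 1, 1, 1))
print(frequency, level_gap / PLANCK_ERG_SEC)
```

For these assumed values the transition falls near 75 Mc/sec, and the frequency agrees with the spacing of adjacent $M_J$ levels divided by $h$.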


Combrisson, Honig, and Townes, Compt. rend. 242, 2451
If the gas is in equilibrium it will absorb microwaves at this frequency. There are more electric dipole moments parallel to the field than antiparallel to it. If the electrostatic field is suddenly reversed we have a negative temperature, and more dipole moments antiparallel than parallel to the field, and the device will amplify during roughly one relaxation time. A pulsed oscillator may be constructed as shown in Fig. 15. Here we have a resonant cavity with Stark electrode. If a square wave is applied there will be a microwave pulse emitted each time the field reverses and the TE mode (microwave electric field parallel to the Stark electrode) will be excited.

No experiments of this type were done because calcu- 
lations showed that it would be difficult to achieve a 
useful gain-band width product in any gas-type maser 
amplifier, and the use of a solid was suggested. 

FIG. 15. Pulsed symmetric top maser oscillator. (Labels: Stark electrode, resonator, square wave input.)

A maser oscillator using a gas is, however, a very useful frequency standard, and compares favorably in many respects with a cesium beam clock. Work on an ammonia maser oscillator started at Columbia University in 1951 and the first maser oscillator and amplifier of this type was operated successfully by Gordon and Zeiger² under the direction of Professor C. H. Townes in 1954. Similar work was done independently by Basov and Prokhorov at the Lebedev Institute at about the same time. In addition to providing an exceptionally stable oscillator, the ammonia maser can also be employed as a very high resolution spectrometer. A sketch of the focuser and beam and cavity details is given in Fig. 16. The maser action is accomplished in the following way. The nitrogen atom in ammonia is in a potential which is symmetric on either side of the plane of the hydrogen atoms. In this kind of potential each harmonic oscillator level is split, and the ground state splitting provides microwave absorption lines of different frequencies for different rotational levels. In a Stark field the energy of the upper doublet level increases and that of the lower level decreases. The focuser of Fig. 16 provides an electric field with intensity approximately proportional to the displacement from the axis. If a molecular beam is now sent through the focuser the molecules in the lower doublet state drift outward to regions of lower energy
enter the resonant cavity and undergo stimulated emission. In this case, then, a higher population in excited states is achieved by the simple expedient of removing ground state particles from the beam.

The condition for sustained oscillation of a resonant cavity with volume $V$ which contains $n$ excited atoms or molecules may be obtained by setting the emitted power equal to the power lost in the cavity walls.
$$nW_{21}h\nu\geq P_w.$$
$W_{21}$ is the transition probability. We write $W_{21}$ in terms of the appropriate squared matrix elements $|\mu|^2$ and the line width $\Delta\nu$, and $P_w$ in terms of the quality factor $Q$ to obtain (Gordon, Zeiger, and Townes, see reference 2)
$$n\geq hV\Delta\nu/4\pi|\mu|^2Q.$$


FIG. 16. Columbia ammonia maser.

The frequency range over which appreciable energy is distributed in a maser oscillator is given in terms of the total power output $P$, Boltzmann's constant $k$, and the cavity wall temperature $T$ as
$$\delta\nu=4\pi kT(\Delta\nu)^2/P.$$
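The spectral width formula is easy to evaluate once numbers are chosen. The sketch below uses SI units; the output power, wall temperature, and molecular line width are hypothetical values picked only to show the order of magnitude, and the function name is illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann's constant in joules per kelvin


def oscillator_spectral_width(power_w, wall_temp_k, line_width_hz):
    """Spectral width 4 pi k T (line width)^2 / P of the maser oscillation, in cycles/sec."""
    return 4.0 * math.pi * BOLTZMANN_J_PER_K * wall_temp_k * line_width_hz ** 2 / power_w


# Hypothetical example values.
POWER_W = 1.0e-10
WALL_TEMP_K = 300.0
LINE_WIDTH_HZ = 3.0e3

width = oscillator_spectral_width(POWER_W, WALL_TEMP_K, LINE_WIDTH_HZ)
width_at_double_power = oscillator_spectral_width(2.0 * POWER_W, WALL_TEMP_K, LINE_WIDTH_HZ)
print(width, width_at_double_power)
```

With these assumed inputs the oscillation is spread over a few thousandths of a cycle per second, and doubling the output power halves the width.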


An upper state population ten times that of the lower 
state may be achieved in practice. The power output is 
approximately 10~® w. An ammonia maser oscillator has 
been reported to have frequency stability over periods 
of one hour of the order of 1 part in 10%. The stability is 
associated with the low noise property. 

Another method of achieving inverted level populations in ammonia gas has been proposed by Dicke.* This uses a "hot grid" cell and again the quadratic Stark effect. A maser using this method of population inversion is being constructed by Dr. J. P. Wittke at RCA Laboratories.

* J. P. Wittke, Proc. Inst. Radio Engrs. 45, 291 (1957).

Level population inversion can also be achieved by means of intense pulses of microwaves. For simplicity, consider a two-level system with "permanent" electric dipole moment $\mu$, driven at resonance by a microwave field $E=E_0\sin(\omega t-\delta)$. The Hamiltonian is
$$\mathcal{H}=\mathcal{H}_0+\mu\cdot E_0\sin(\omega t-\delta). \tag{51}$$
Let the wave functions with $E_0=0$ be $\psi_1$ and $\psi_2$.

If we use this in the time-dependent Schroedinger equation and neglect sum frequency terms, an exact solution of the resulting equations gives for the wave function,
$$\psi=\cos\left(\frac{|\mu_{12}\cdot E_0|\,t}{2\hbar}\right)\psi_1e^{-iE_1t/\hbar}+\sin\left(\frac{|\mu_{12}\cdot E_0|\,t}{2\hbar}\right)\psi_2e^{-iE_2t/\hbar}. \tag{52}$$
Here $\mu_{12}$ is the dipole matrix element connecting the states 1 and 2. From this it follows that if the system is in the state $\psi_1$ at $t=0$, then at $t=(\pi\hbar)/|\mu_{12}\cdot E_0|$, the microwave pulse has driven the system into state 2. This treatment is valid only for times short compared with the relaxation time since the effect of collisions is ignored.

Thus if we have thermal equilibrium at $t=0$ and apply a strong rf pulse of the correct length we can obtain inversion of level populations. This effect has been demonstrated by Dicke and Romer, and by Kyhl, Strandberg, Collins, and Park.
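The "correct length" of the inverting pulse follows from the sine term in the wave function: the probability of finding the system in state 2 is $\sin^2(|\mu_{12}\cdot E_0|t/2\hbar)$, which first reaches unity at $t=\pi\hbar/|\mu_{12}\cdot E_0|$. The sketch below evaluates this in SI units for a hypothetical dipole moment of 1 debye and a field amplitude of 1000 V/m; relaxation is ignored, as in the text.

```python
import math

HBAR_J_SEC = 1.054571817e-34  # reduced Planck constant in joule sec


def upper_state_population(interaction_j, t_sec):
    """Probability of state 2 after driving at resonance for time t (no relaxation)."""
    return math.sin(interaction_j * t_sec / (2.0 * HBAR_J_SEC)) ** 2


def inversion_pulse_length(interaction_j):
    """Pulse length pi hbar / |mu_12 . E_0| that carries state 1 fully into state 2."""
    return math.pi * HBAR_J_SEC / interaction_j


# Hypothetical example: 1 debye (3.33564e-30 C m) in a field amplitude of 1000 V/m.
INTERACTION_J = 3.33564e-30 * 1.0e3

t_pi = inversion_pulse_length(INTERACTION_J)
print(t_pi, upper_state_population(INTERACTION_J, t_pi),
      upper_state_population(INTERACTION_J, 0.5 * t_pi))
```

For these assumed values the inverting pulse lasts about a tenth of a microsecond, and a pulse of half that length leaves equal populations in the two states.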

The methods of optical pumping can also be employed to achieve maser action. In Fig. 17 we have a system with a $^2S_{1/2}$ ground state and a $^2P_{1/2}$ excited state in a magnetic field. In equilibrium almost all of the particles will be in the split $S$ state. The $m=-\tfrac12$ state will have slightly more particles in it than the $m=+\tfrac12$ state. Right circularly polarized optical resonance radiation will induce transitions with the selection rule $\Delta m=+1$. The excited particles can now spontaneously emit, either with $\Delta m=-1$, or 0. However this process only removes particles from the $m=-\tfrac12$ $S$ state and returns them to both the $m=-\tfrac12$ and $m=+\tfrac12$ states. Therefore the population of the $^2S_{+1/2}$ state is increased over that of the $^2S_{-1/2}$ state. A review of the work on optical pumping has been given by Kastler.

R. H. Dicke and R. H. Romer, Rev. Sci. Instr. 2...

Kyhl, Strandberg, Collins, and Park, Signal ... on ... State Masers, Fort Monmouth, Ne...

W. H. Culver, ...

All of the methods discussed so far use single quantum processes in which amplification results from more particles in excited states. A. Javan has shown that two quantum processes can be used for amplification, with more particles in the lower state (Fig. 18).

Consider a two photon process in which a particle in a lower state absorbs a photon, then emits a photon, ending up in an excited state. Suppose the frequency of absorption corresponds to that of an intense (pump) oscillator and the emission is stimulated by a signal which it is desired to amplify. In this case we are obtaining emission in a particle transition from a state of lower energy to a state of higher energy. If all particles are in the ground state, amplification will result. Let the transition probability be $W_{12}$ and let $n_1$ particles be in the ground state. The emitted power is $W_{12}n_1h\nu$. If the excited state has $n_2$ particles the reverse process may occur in which a photon of amplification frequency $\nu$ is absorbed and a pump frequency photon is emitted. The net power emitted will be $W_{12}h\nu(n_1-n_2)$, which is positive at positive temperatures.

For a one quantum process $W_{12}$ may be written as
$$W_{12}=4\pi^2\tau|H'_{12}|^2/h^2. \tag{53}$$
Here $\tau$ is the relaxation time and $H'$ is the interaction matrix element. For a two quantum process $H'$ is replaced by
$$\sum_I H'_{1I}H'_{I2}/(E_1-E_I). \tag{54}$$


The summation is over all intermediate states of the particle and the quantized electromagnetic field. In Raman spectroscopy the intermediate states are usually quantum states which are different from the initial and final states, because the diagonal interaction matrix elements ordinarily vanish. If the diagonal matrix elements do not vanish in either the initial or final state, no intermediate state is necessary. For example we may have a diagonal magnetic dipole matrix element in a product of the following type
$$\langle-\tfrac12,N_1,N_2|H'|-\tfrac12,N_1+1,N_2\rangle\times\langle-\tfrac12,N_1+1,N_2|H'|N_2-1,N_1+1,+\tfrac12\rangle. \tag{55}$$


Here we use the nomenclature of quantum electrodynamics. The first bracket involves a matrix element between states in which the particle spin is $-\tfrac12$, and the electromagnetic field oscillators of frequency $\nu_1$ and $\nu_2$ have $N_1$ and $N_2$ quanta, respectively, to a state with particle spin unchanged, $N_1+1$ quanta for the oscillator with frequency $\nu_1$, and $N_2$ quanta for the oscillator with frequency $\nu_2$. The second bracket then involves a final state in which the spin becomes $+\tfrac12$ and the field oscillator of frequency $\nu_2$ has lost a quantum. Three additional matrix elements, similar to (55), may be written. One of these corresponds to an intermediate state of the particle the same as the initial state but an absorption first of frequency $\nu_2$ followed by emission of frequency $\nu_1$. The other two matrix elements involve an intermediate state of the particle which is the same as the final state, with both possibilities for the order of absorption and emission. Some of these matrix elements may be small in comparison with the others, depending on circumstances of the experimental arrangement. The summation (54) may be written in terms of the (magnetic) dipole matrix elements $M_{11}$, $M_{12}$, and $M_{22}$ as
$$-\frac{[M_{11}\cdot H(\nu_1)][M_{12}\cdot H(\nu_2)]}{h\nu_1}+\frac{[M_{12}\cdot H(\nu_2)][M_{22}\cdot H(\nu_1)]}{h\nu_1}+\frac{[M_{11}\cdot H(\nu_2)][M_{12}\cdot H(\nu_1)]}{h\nu_2}-\frac{[M_{12}\cdot H(\nu_1)][M_{22}\cdot H(\nu_2)]}{h\nu_2}. \tag{56}$$

... Opt. Soc. Am. 47, 460 (1957).


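$H(\nu_1)$ is the magnetic field at frequency $\nu_1$, and $H(\nu_2)$ is the magnetic field at frequency $\nu_2$. A corresponding expression would contain the electric field vectors for electric dipole transitions. The transition probability can be written as
$$W=\frac{4\pi^2\tau}{h^2}\left|-\frac{[M_{11}\cdot H(\nu_1)][M_{12}\cdot H(\nu_2)]}{h\nu_1}+\frac{[M_{12}\cdot H(\nu_2)][M_{22}\cdot H(\nu_1)]}{h\nu_1}+\frac{[M_{11}\cdot H(\nu_2)][M_{12}\cdot H(\nu_1)]}{h\nu_2}-\frac{[M_{12}\cdot H(\nu_1)][M_{22}\cdot H(\nu_2)]}{h\nu_2}\right|^2.$$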
The net power emitted by a two-level system which has $n_1$ particles in the lower state and $n_2$ particles in the upper state may then be written as
$$\frac{4\pi^2\tau\nu_1(n_1-n_2)}{h}\left|-\frac{[M_{11}\cdot H(\nu_1)][M_{12}\cdot H(\nu_2)]}{h\nu_1}+\frac{[M_{12}\cdot H(\nu_2)][M_{22}\cdot H(\nu_1)]}{h\nu_1}+\frac{[M_{11}\cdot H(\nu_2)][M_{12}\cdot H(\nu_1)]}{h\nu_2}-\frac{[M_{12}\cdot H(\nu_1)][M_{22}\cdot H(\nu_2)]}{h\nu_2}\right|^2.$$

V. NOISE PERFORMANCE


Prior to 1950, methods had been developed for generation, amplification, and detection of signals in the electromagnetic spectrum from zero frequency to the mm wave region. At frequencies below 10 Mc/sec the noise performance is already better than is ordinarily required for communications in the presence of atmospheric and other external noise. Indeed, a well designed vacuum tube amplifier in the vicinity of 1 Mc/sec is capable of noise performance corresponding to an internal temperature of about 25°K. Microwave amplifiers, on the other hand, did not have good low-noise performance, mainly because of electron stream fluctuations originating at a hot cathode. Inasmuch as this type of noise must be absent in a maser, it was suggested that molecular beam maser type amplifiers would have very little noise.³¹ It was very gratifying that the important sources of low-temperature solid state maser noise which were understood in 1956 are sufficiently small to make possible sensitivities which may approach detection of single microwave photons.

We consider first the inherent spontaneous emission noise of a maser, disregarding circuit noise, at spin temperature approaching minus 0°K. The earlier discussion did not consider the effect of spontaneous emission. This is a purely random process³² and therefore contributes noise. In an ensemble of charged particles, each having two energy levels, those which are in the upper state have a transition probability for transitions to the lower state of the form
$$W_{21}=f\,[N+1], \tag{59}$$
in consequence of the interaction with the electromagnetic field.§

Here $f$ is a function of frequency and squared matrix elements and $N$ is the number of quanta per radiation oscillator. The one which adds to $N$ gives the effect of spontaneous emission. Let us calculate an equivalent temperature for spontaneous emission. In thermal equilibrium at temperature $T$, according to the Bose-Einstein statistics, $N$ would be
$$N=\frac{1}{e^{h\nu/kT}-1}. \tag{60}$$

³¹ The Columbia work (Gordon, Zeiger, and Townes, reference 2) showed that good noise performance could be obtained with a molecular beam maser. Spontaneous emission from the molecules was not discussed. For a room temperature device this is small compared with input noise.

³² It has been shown [J. Weber, Phys. Rev. 94, 215 (1954)] that this type of noise is also present in conventional free electron amplifiers.

§ Note added in proof: Expression (59) is given in Chapter V, The Quantum Theory of Radiation, by W. Heitler (Oxford University Press, London, 1954), third edition. It follows from the quantization of the Maxwell field. Each degree of freedom is like a harmonic oscillator and the $N+1$ arises from the squared harmonic oscillator matrix elements for downward transitions.

The positive temperature which black surroundings would need, to emit input noise equivalent to the spontaneous emission, will be denoted by $T_E$. From (59) we see that this (equivalent) $T_E$ corresponds to an equivalent $N=1$.

Expressions (60) and (59) allow us to write
$$1=e^{h\nu/kT_E}-1, \tag{61}$$
giving
$$T_E=h\nu/k\ln2. \tag{62}$$
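As a quick numerical illustration of (60) through (62), the sketch below computes the equivalent temperature for a signal frequency of 9 kMc/sec (chosen here only as an example) and confirms that the Bose-Einstein occupation number at that temperature is one quantum per oscillator. The constant and function names are illustrative.

```python
import math

PLANCK_J_SEC = 6.62607015e-34      # Planck's constant in joule sec
BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann's constant in joules per kelvin


def quanta_per_oscillator(freq_hz, temp_k):
    """Bose-Einstein occupation number N = 1/(exp(h nu / k T) - 1), form of (60)."""
    return 1.0 / math.expm1(PLANCK_J_SEC * freq_hz / (BOLTZMANN_J_PER_K * temp_k))


def equivalent_temperature(freq_hz):
    """Equivalent temperature T_E = h nu / (k ln 2), form of (62)."""
    return PLANCK_J_SEC * freq_hz / (BOLTZMANN_J_PER_K * math.log(2.0))


SIGNAL_FREQ_HZ = 9.0e9  # example frequency only

t_e = equivalent_temperature(SIGNAL_FREQ_HZ)
print(t_e, quanta_per_oscillator(SIGNAL_FREQ_HZ, t_e))
```

At this frequency the equivalent temperature comes out at roughly 0.6°K.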

Suppose we ask the question: how many microwave photons will double the randomly fluctuating output of a maser whose spin temperature approaches $-0$°K? Let the averaging time of the receiver be $\tau$ and let the equivalent input noise energy during this interval be $U_n$. This is given by the product of the Nyquist formula noise power for $T_E$ and the averaging time $\tau$.
$$U_n=\left[\frac{h\nu}{e^{h\nu/kT_E}-1}\right]\Delta\nu\,\tau. \tag{63}$$
Here $\Delta\nu$ is the band width, and $\Delta\nu\,\tau=1$. Employing expression (61) gives
$$U_n=h\nu. \tag{64}$$


The equivalent input (noise) energy over the averaging time is that of one photon. If one microwave photon is incident during this interval it will double the average noise output and therefore we are justified in saying that a maser with no circuit noise and spin temperature approaching $-0$°K may detect single microwave photons, and a flux of $\Delta\nu$ photons per sec, with reasonably large probability.

The foregoing analysis gives exactly what would be predicted by the Nyquist formula for any resistance at a temperature $-0$°K. The Nyquist formula noise energy (delivered to a "matched" load) over the averaging time $\tau$ is given by
$$U_n=\frac{h\nu\Delta\nu\tau}{|e^{h\nu/kT}-1|}. \tag{63a}$$
In the limit as $T\to-0$° (63a) becomes
$$U_n=h\nu\Delta\nu\tau=h\nu, \tag{64a}$$
in agreement with (64). This makes it clear that the residual noise in a low-temperature maser is of the same type as discussed by Johnson and Nyquist³³ long ago. If the spin temperature is not close to minus 0°K (64a) becomes
$$U_n=\frac{h\nu\Delta\nu\tau}{|e^{h\nu/kT_{\rm spin}}-1|}. \tag{64b}$$
For the case $(h\nu)/(kT)\ll1$, the number of photons required to double the output is therefore
$$|kT_{\rm spin}/h\nu|. \tag{65}$$

³³ J. B. Johnson, Phys. Rev. 32, 47 (1928); H. Nyquist, Phys. Rev. 32, 110 (1928).
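The two limits of (64b) can be compared numerically. The sketch below expresses the equivalent input noise energy in units of $h\nu$ (taking $\Delta\nu\,\tau=1$), which is the number of photons needed to double the output. The signal frequency and the two negative spin temperatures are hypothetical values chosen only to show the limiting behavior.

```python
import math

PLANCK_J_SEC = 6.62607015e-34      # Planck's constant in joule sec
BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann's constant in joules per kelvin


def photons_to_double_output(freq_hz, spin_temp_k):
    """Noise energy of (64b) in units of h nu, for a negative spin temperature."""
    exponent = PLANCK_J_SEC * freq_hz / (BOLTZMANN_J_PER_K * spin_temp_k)
    return 1.0 / abs(math.expm1(exponent))


def high_temperature_estimate(freq_hz, spin_temp_k):
    """Approximation |k T_spin / h nu| of (65), valid when |h nu / k T_spin| << 1."""
    return abs(BOLTZMANN_J_PER_K * spin_temp_k / (PLANCK_J_SEC * freq_hz))


SIGNAL_FREQ_HZ = 9.0e9  # example frequency only

near_minus_zero = photons_to_double_output(SIGNAL_FREQ_HZ, -1.0e-6)
warm_spins = photons_to_double_output(SIGNAL_FREQ_HZ, -100.0)
warm_estimate = high_temperature_estimate(SIGNAL_FREQ_HZ, -100.0)
print(near_minus_zero, warm_spins, warm_estimate)
```

For a spin temperature very close to $-0$°K the result is one photon, while for a spin temperature of large magnitude the exact expression and the estimate (65) agree to within a fraction of a percent.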


Javan has remarked that the noise performance of a Raman type maser will be essentially the same as that of devices which employ greater populations of particles in excited states. This follows because the photon of frequency $\nu_1$ (Fig. 18) can be spontaneously emitted in the absence of a signal, provided that intense "pump" radiation of frequency $\nu_2$ is present.

A complete system will also have noise due to emission 
from circuit elements and the transmission system. A 
quantity which has been used to describe the perform- 
ance of radio receivers is the noise figure, denoted by the 
symbol F. F is defined as the ratio of the total noise 
power output of the receiver to that part of the noise 
power output due to the source at the input of the 
receiver. An equivalent definition is that it is the 
quotient of the signal to noise power ratio at the input 
divided by the signal to noise power ratio at the output. 

We now calculate the noise figure of a complete maser receiving system, considering first a traveling wave maser. Our approach is similar to that employed by Strandberg for two-level systems. We consider three-level systems, taking effects of saturation into account. Let R_ν dν be the number of modes per unit volume in a range dν which can propagate, A be the cross-sectional area, and V_G the group velocity. The number of particles per unit volume in the states with energies E₁, E₂, and E₃ will be denoted by n₁, n₂, and n₃. Let β be the power gain per unit length. A quantity K is defined by the following relation

$$\beta = K(n_2 - n_1). \tag{66}$$


Let N be the average number of quanta per mode in the vicinity of the operating frequency ν₂₁ and let p_ν be the energy per mode. The change in power as a consequence of energy exchanged in a differential length leads to

$$V_G A R_\nu\,d\nu\,\frac{dp_\nu}{dx} = V_G A R_\nu\,d\nu\,\bigl[K n_2 h\nu (N+1) - K n_1 h\nu N - \alpha_c N h\nu + \alpha_c p_\nu(T_c) - \alpha_{13} N h\nu + \alpha_{13} p_\nu(T_{13})\bigr]. \tag{67}$$


In expression (67) p_ν(T_c) is the average energy per mode in equilibrium at the temperature of the waveguide walls T_c, and from statistical mechanics p_ν(T) = hν[e^{hν/kT} − 1]⁻¹; p_ν(T₁₃) is the average energy per mode at the temperature T₁₃ associated with particles in the energy levels with energies E₁ and E₃; α_c is the power absorption coefficient of the transmission system when the maser material is absent; and α₁₃(ν₂₁) is the absorption coefficient at frequency ν₂₁ associated with the particles in the energy levels E₁ and E₃. The first term on the right gives the effect of both stimulated and spontaneous emission in inducing transitions out of the state with energy E₂, the second term gives the absorption by particles in the ground state, the third term gives the effect of absorption by conducting walls or loss other than the active solid, the fourth term gives the emission from the walls, and the fifth and sixth terms represent the effects of absorption and emission from particles undergoing transitions between the first and third states. We define the temperatures T₁₂ and T₁₃ by

$$n_1/n_2 = e^{h\nu_{21}/kT_{12}}, \qquad n_1/n_3 = e^{h\nu_{31}/kT_{13}}. \tag{68}$$
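Integrating expression (67) gives

$$(p_\nu)_{\rm out} = (p_\nu)_{\rm in}\,g^2 + \left[\frac{\beta\,p_\nu(T_{12}) - \alpha_c\,p_\nu(T_c) - \alpha_{13}\,p_\nu(T_{13})}{\beta - \alpha_c - \alpha_{13}}\right](1 - g^2). \tag{69}$$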

Here g² is the power gain defined earlier. This tells us that the temperature of the environment of the maser enters in a subtle way, through the quantity α_c p_ν(T_c). If the absorption coefficient of the environment, α_c, is small, then little noise is contributed even if T_c is not close to absolute zero. This is a consequence of Kirchhoff’s law. Let the source temperature be T_s, the load temperature be T_L, and let the transmission line to the maser have a power loss factor t and a temperature T_t. Thermodynamic considerations enable us to express the output noise emission of the transmission line per cycle in terms of the input noise emission (per cycle) of the source, as

$$(p_\nu)_{\rm line} = t\,p_\nu(T_s) + (1 - t)\,p_\nu(T_t). \tag{70}$$

Expression (70) is (p_ν)_in for the maser. We can now use (69), (70), and the previous definition of noise figure to write


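$$F = \frac{(p_\nu)_{\rm out} + p_\nu(T_L)}{g^2 t\,p_\nu(T_s)} = 1 + \frac{(1 - t)\,p_\nu(T_t)}{t\,p_\nu(T_s)} + \frac{1 - g^{-2}}{t\,p_\nu(T_s)}\left[\frac{\alpha_c\,p_\nu(T_c) + \alpha_{13}\,p_\nu(T_{13}) - \beta\,p_\nu(T_{12})}{\beta - \alpha_c - \alpha_{13}}\right] + \frac{p_\nu(T_L)}{g^2 t\,p_\nu(T_s)}. \tag{71}$$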
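The integration of (67) into (69) can be checked numerically. The sketch below integrates the rate equation (67) along the line with a fourth-order Runge-Kutta step and compares the result with the closed form (69), using (70) for the input. All numerical values (frequency, gain and loss coefficients, temperatures, length, line loss factor) are hypothetical and chosen only for illustration, and the noise figure computed here omits any contribution from the load.

```python
import math

H_PLANCK = 6.62607015e-34   # J s
K_BOLTZMANN = 1.380649e-23  # J/K


def mode_energy(nu, temp):
    """p_nu(T) = h nu / (exp(h nu / k T) - 1); negative for negative T."""
    return H_PLANCK * nu / math.expm1(H_PLANCK * nu / (K_BOLTZMANN * temp))


def output_closed_form(p_in, length, nu, beta, a_c, a_13, t12, t_c, t13):
    """Closed form (69) for the energy per mode at the output."""
    net = beta - a_c - a_13
    g2 = math.exp(net * length)  # power gain g^2
    drive = (beta * mode_energy(nu, t12) - a_c * mode_energy(nu, t_c)
             - a_13 * mode_energy(nu, t13))
    return p_in * g2 + drive / net * (1.0 - g2)


def output_numeric(p_in, length, nu, beta, a_c, a_13, t12, t_c, t13, steps=2000):
    """Direct Runge-Kutta integration of the bracket in (67)."""
    h_nu = H_PLANCK * nu
    r = math.exp(h_nu / (K_BOLTZMANN * t12))  # n1/n2 from (68)
    kn2 = beta / (1.0 - r)                    # K n2, using beta = K (n2 - n1)
    kn1 = beta * r / (1.0 - r)                # K n1
    p_c, p_13 = mode_energy(nu, t_c), mode_energy(nu, t13)

    def rhs(p):
        n = p / h_nu  # quanta per mode
        return (kn2 * h_nu * (n + 1.0) - kn1 * h_nu * n
                - a_c * n * h_nu + a_c * p_c
                - a_13 * n * h_nu + a_13 * p_13)

    dx, p = length / steps, p_in
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * dx * k1)
        k3 = rhs(p + 0.5 * dx * k2)
        k4 = rhs(p + dx * k3)
        p += dx * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


# Hypothetical operating point (per-cm coefficients, length in cm, deg K).
PARAMS = dict(length=10.0, nu=9e9, beta=0.5, a_c=0.02, a_13=1e-6,
              t12=-1.0, t_c=4.2, t13=1e4)
LINE_FACTOR, T_SOURCE, T_LINE = 0.9, 10.0, 4.2


def noise_figure_without_load():
    """Total output noise over the part due to the source (load omitted)."""
    p_source = LINE_FACTOR * mode_energy(PARAMS["nu"], T_SOURCE)
    p_in = p_source + (1.0 - LINE_FACTOR) * mode_energy(PARAMS["nu"], T_LINE)  # (70)
    net = PARAMS["beta"] - PARAMS["a_c"] - PARAMS["a_13"]
    g2 = math.exp(net * PARAMS["length"])
    return output_closed_form(p_in, **PARAMS) / (g2 * p_source)


if __name__ == "__main__":
    p_in = mode_energy(PARAMS["nu"], T_SOURCE)
    print(output_closed_form(p_in, **PARAMS), output_numeric(p_in, **PARAMS))
    print(noise_figure_without_load())
```

With a negative T₁₂ the term β p_ν(T₁₂) is negative, so the bracket in (69) multiplied by (1 − g²) adds noise to the amplified input, as expected for spontaneous emission.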


It was once thought that the saturation field would make a significant contribution to the noise because the temperature T₁₃, associated with the (saturated) 1-3 system, tends towards infinity. This is not the case. The quantity α₁₃(ν₂₁) may be written

$$\alpha_{13}(\nu_{21}) = \frac{\chi n_1 |\mu_{13}|^2}{kT_{13}}\,\frac{\tau_{13}^{-1}}{4\pi^2(\nu_{31} - \nu_{21})^2 + \tau_{13}^{-2}}, \tag{72}$$

where χ is a constant, and τ₁₃ is the appropriate relaxation time.

β may be written in terms of χ and the 1-2 system relaxation time τ₁₂, as

$$\beta = \frac{\chi n_2 |\mu_{12}|^2\,\tau_{12}}{|p_\nu(T_{12})|}. \tag{73}$$

The ratio of the terms involving α₁₃ and β in the noise figure formula (71) is

$$\frac{\alpha_{13}\,p_\nu(T_{13})}{|\beta\,p_\nu(T_{12})|} = \frac{n_1 |\mu_{13}|^2}{n_2 |\mu_{12}|^2}\,\frac{(\tau_{13}\tau_{12})^{-1}}{4\pi^2(\nu_{31} - \nu_{21})^2 + \tau_{13}^{-2}}. \tag{74}$$

For a typical amplifier in the microwave region, this 
gives a number of order 10~*. Physically this is because 
a high-saturation temperature means a small absorption 
coefficient for the 1-3 system, consequently small noise 
emission. 

In order to extend the previous results to a cavity type of maser we note that we have one mode with effective power width (π/2)Δν. Δν is the band width over which the power exceeds one half of the maximum (exact resonance) power. The π/2 takes account of those parts of the response outside of the region where the power exceeds half the power at resonance. In order to calculate the noise we again employ detailed balancing on a power per cycle basis, making use of the definitions of the various Q’s employed earlier. The noise power output per cycle is


Pin =(g+1)? |. (T,)+(1—) p(T») 


ah 
Cc) aaah Tiz y Tis 
H PT9 e (Tu)+re(Ts))} 
“ettaianay (75) 
“a 1 
The noise figure is then given by 
(g+1)? 
F=>=— a y T: x I, 
7: aa Dp(T)+tp.(T.) 
+=] p(T.) art OTa +yp(Tis)) | 
Qo TY 
(g— 1) a | g p(T 1) 
Mees A A ————. (76) 
Here the quantity γ is defined by

$$\gamma = \frac{Q_m}{Q_{13}} = \frac{-\alpha_{13}(\nu_{21}, T_{13})}{\beta}, \tag{77}$$

with Q₁₃ the Q associated with energy absorption by the 1-3 system. The somewhat unusual appearance of (77) with the factor (1+g)² results from the fact that g² is the power gain at resonance. Different parts of the cavity response contribute noise, but with different effective gain.

The term noise temperature is often used, rather than 
noise figure. This may be defined in terms of the noise 
figure as 


$$\left[e^{h\nu/kT_n} - 1\right]^{-1} = (F - 1)\left[e^{h\nu/kT_s} - 1\right]^{-1},$$

$$T_n = (F - 1)\,T_s, \quad \text{for } \frac{h\nu}{kT} \ll 1. \tag{78}$$

Here T_n is the noise temperature, and T_s is the source temperature.
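The definition (78) can be inverted for T_n in closed form. The sketch below does this and compares the result with the classical limit (F − 1)T_s; the function names and the sample values (a 3 kMc/s signal, a 300°K source, and a noise figure chosen to give about 20°K) are assumptions for illustration only.

```python
import math

H_PLANCK = 6.62607015e-34   # J s
K_BOLTZMANN = 1.380649e-23  # J/K


def noise_temperature(noise_figure, nu_hz, t_source_k):
    """Solve [exp(h nu/k Tn) - 1]^-1 = (F - 1) [exp(h nu/k Ts) - 1]^-1 for Tn."""
    if noise_figure <= 1.0:
        raise ValueError("noise figure must exceed 1 for a finite noise temperature")
    h_nu = H_PLANCK * nu_hz
    n_source = 1.0 / math.expm1(h_nu / (K_BOLTZMANN * t_source_k))
    n_amplifier = (noise_figure - 1.0) * n_source
    return h_nu / (K_BOLTZMANN * math.log1p(1.0 / n_amplifier))


def noise_temperature_classical(noise_figure, t_source_k):
    """Limiting form of (78) for h nu / k T << 1."""
    return (noise_figure - 1.0) * t_source_k


if __name__ == "__main__":
    f = 1.0 + 20.0 / 300.0  # hypothetical noise figure
    print(noise_temperature(f, 3e9, 300.0), noise_temperature_classical(f, 300.0))
```

At microwave frequencies and these temperatures the two values differ by well under one percent, which is why the simpler form is normally quoted.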

McWhorter and Arams*’ have measured the noise 
temperature of a complete solid state maser system and 
found it to be 20+5°K. 

The use of the concept of noise figure or noise temperature was very meaningful for the older type of microwave amplifiers. We propose a method of describing the noise performance which seems better for very quiet amplifiers. The number of photons received over the receiver averaging time which will double the noise output can be employed as a “noise number” to specify the performance of a low noise receiver. We showed earlier that the minimum noise number of a maser is one. The connection between noise number N_n and noise temperature T_n is


$$N_n = \left[e^{h\nu/kT_n} - 1\right]^{-1} + \left[e^{h\nu/kT_s} - 1\right]^{-1}. \tag{79}$$


VI. MICROWAVE PHOTON COUNTERS


The maser makes use of stimulated emission. If all 
of the particles were in the upper state of a two-level 
maser, it could amplify without absorbing any of the 
incident photons. However, under these conditions the 
noise number is one, so that at least one incident photon is necessary in order that its randomly fluctuating output be doubled over the receiver averaging time. The incidence of one photon could be interpreted as a spontaneous fluctuation. In an earlier paper we remarked that a maser is a voltage amplifier, not a power amplifier, and that all voltage amplifiers have spontaneous emission noise. Detectors such as [...] counters must absorb energy in order to operate. However, unlike a maser, the int[...] counter or power amplifier [...] small. Such a device has essentially zero output until it
detects a photon. It was proposed that power amplifiers and detectors be developed which employ particles initially in their ground states. Methods for doing this are now being studied here. Consider a three-level system, in which the frequency ν₂₁ (Fig. 19) for transitions from the first to the second state is in the optical region. A lamp illuminates with intense light of frequency ν₂₁ which can be linearly polarized. Then transitions from state 1 to state 2 are allowed, since Δm=0. Excited particles in state 2 can spontaneously emit only linearly polarized light. We employ a detector which counts circularly polarized optical frequency photons, which can be emitted only in certain directions. In order to detect microwave photons we arrange to have them circularly polarized. Then a microwave photon can be absorbed in a transition to state 3 or 4, since Δm=1. Excited particles in states 3 or 4 can now spontaneously emit circularly polarized photons which will be counted. Thus if no microwave photons are present we have particles in states 1 and 2 and only linearly polarized photons. The device has no output. As soon as a microwave (or infrared) photon is absorbed we have a particle in state 3 or 4, and when this particle emits, we have a circularly polarized photon, which is counted. This method is similar to that which has been employed in optical pumping experiments. The transition probability for this type of third-order process can be calculated in the following way. Let the wave function for atom and electromagnetic field be Φ(t).


$$\Phi = \sum_J a_J(t)\,\Psi_J.$$

The Ψ_J are unperturbed wave functions of particle and electromagnetic field. Let the ground state be denoted by the subscript M. Let a_M=1 at t=0, and a_J=0 for J≠M, at t=0. Let H′ be the interaction part of the Hamiltonian. We quantize the electromagnetic field; H′ is then not time dependent. Our process is one involving three photons. One is absorbed from the lamp, one is absorbed from the microwave field, and one is then spontaneously emitted. The third-order probability amplitude coefficient is given by


hH yx’ Hry Hya 
The intermediate states are denoted by the subscripts N and K. We are interested in three photon transitions in which energy is conserved or nearly conserved in all three steps. The subscript N refers to the state in which a particular oscillator excited by the lamp has lost one photon. The subscript K refers to the state in which a particular microwave field oscillator has lost a photon. In order to calculate the probability that the particle has returned to the ground state with emission of a circularly polarized photon we must square the probability amplitude and integrate over all lamp photons, microwave photons, and emitted circularly polarized photons. We denote by ρ(N), ρ(K), and ρ(L) the density of intermediate and final states. The transition probability is then


1 
wot f 
hêt» 


Hirr Hgy Hya)? 
Xp(Ex)p(Ex)p(Ey)dE dE xdky. 


We need to evaluate this expression, under conditions 
such that energy is conserved or nearly conserved in all 
steps. All denominators may vanish. An approximate 
solution can be obtained in the following way. When 
integrating over dEy, we are integrating over all lamp 
photons which can be absorbed. Let the lamp have 
intensity which is zero except for a range Av, on each 
side of that required for resonance. Let the lamp in- 
tensity be assumed constant over the range 2A». A 
study of the integrand shows that it is not singular in the 
range where all denominators vanish. The integral over 
Ey is along the real frequency axis from — Av; to +Am. 
We can equally well integrate along a semicircle in the 
lower half complex frequency plane from — Ay, on the 
real axis to +-Av, on the real axis. The second pair of 
terms which have the factor 1/(vxx) may then be 
neglected in comparison with the pair of terms which 
have the factor 1/(vxm). This integration along the 
semicircle is now readily performed. The integration 
over Hx can be performed in the same way if the 
relaxation time for interaction with other particles is 
much smaller than the spontaneous emission lifetime. 


This leaves one integration which may then be handled in the same way as for one photon processes. The result then gives the transition probability.

All quantities are evaluated at resonance. This predicts modified absorption (or emission) at one frequency due to intense radiation at a different frequency, for single particles. ρ(N) is calculated from the average number of field oscillators excited by the lamp. It is reasonable to suppose that each lamp excited field oscillator is unlikely to have more than one photon. The remaining quantities require, for calculation, a detailed knowledge of the transitions and the experimental arrangements. The excited states of the rare earth salts may be suitable for such counters and are now being investigated by Mr. U. E. Hochuli.

At a meeting held at the National Academy of 
Sciences in April, 1958, Bloembergen independently 
proposed a microwave (or infrared) photon counter 
which operates on similar principles (Fig. 20). 

Consider a solid with at least three low-lying levels 
and one or more optical levels. The solid, such as a salt 
of the rare earth elements, is cooled to such a low temperature that only the ground state is occupied, kT ≪ E₂ − E₁. Strong optical illumination takes place at the frequency h⁻¹(E₄ − E₂). This light is normally not absorbed because the level E₂ is empty, unless it is excited by an incident quantum E₂ − E₁. Then the system gets raised to level E₄. The spontaneous emission from level E₄ to E₃ can be detected by a photomultiplier tube. Again, discrimination from the strong incident
background can be made by directional, frequency, or 
polarization filters. 


VII. MASERS AT LOW FREQUENCIES AND 
IN THE INFRARED 


There has not been a great deal of motivation for development of masers at low frequencies. The theory previously given would appear to be applicable. Nuclear spin energy levels and the levels associated with nuclear quadrupole spectra are available in this part of the spectrum. The small nuclear moment gives a weaker interaction with radiation, but this tends to be compensated by longer transverse relaxation times. As noted by Braunstein, a nucleus of spin I>2 which possesses an electric quadrupole moment will yield at least three unequally spaced levels in a crystal of lower than cubic symmetry. If the crystalline field has large deviations from axial symmetry the Δm=2 transitions (necessary for saturation) begin to have significantly large transition probabilities.

R. Braunstein, Phys. Rev. 107, 1195 (1958).

Allais obtained maser action in nuclear resonance experiments in the megacycle region. This followed the work of Abragam, Combrisson, and Solomon in the kilocycle region.

In the infrared and optical parts of the spectrum the sensitivity of a maser will be expected to be less than that of photon counting devices. At 8000 Å the spontaneous emission equivalent temperature is 18 000°K. Achievement of coherent amplification would be desirable. This is a useful concept only when large numbers of photons are involved so that the phase can be well defined. The availability of highly monochromatic sources would extend the resolution of spectroscopy in these regions. One problem is to provide a device at infrared and optical frequencies which has the energy storage capacity ordinarily provided by an electromagnetic cavity resonator, with wide mode separations. Dicke has proposed a number of arrangements which employ a standing wave system between parallel planes. Similar proposals have been made by Schawlow and Townes. They suggest that a multimode cavity could be employed. Single modes might be selected by reducing the cavity to partially transparent end plates, with open sides. The great directivity associated with diffraction from sources which are many wavelengths on a side may make it possible to select particular modes and suppress unwanted ones. They carry out calculations for a system using potassium vapor.
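The figure of 18 000°K quoted for 8000 Å can be reproduced with a short calculation, assuming that the spontaneous emission equivalent temperature is the temperature at which kT equals one photon energy hν (the one-photon noise of (64) expressed as a temperature). The function name is hypothetical.

```python
H_PLANCK = 6.62607015e-34   # J s
K_BOLTZMANN = 1.380649e-23  # J/K
C_LIGHT = 2.99792458e8      # m/s


def equivalent_temperature(wavelength_m):
    """Temperature at which k T equals the photon energy h c / wavelength."""
    return H_PLANCK * C_LIGHT / (wavelength_m * K_BOLTZMANN)


if __name__ == "__main__":
    print(equivalent_temperature(8000e-10))  # 8000 angstroms, optical
    print(equivalent_temperature(0.03))      # 3 cm wavelength (10 kMc/s), microwave
```

The same expression gives about half a degree at 10 kMc/s, which illustrates why spontaneous emission noise is so much more serious in the optical region than in the microwave region.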


VIII. PARAMETRIC AMPLIFIERS 


During the past few years another type of microwave 
amplifier has been developed, which has promise of low 
noise performance approaching that of a maser. We 
describe it in classical terms because nothing seems to be 
gained by a quantum mechanical description. In Fig. 21 
two oscillators are coupled by a time-dependent capacitor. Let the natural frequencies of the oscillators be ν₁ and ν₂, and let the capacity C have the time dependence

$$C = C_0 + C_1 \sin[2\pi(\nu_1 + \nu_2)t].$$

It has been known since at least the time of Lord Rayleigh that two harmonic oscillators with natural frequencies ν₁ and ν₂, with this kind of time-dependent coupling, may be unstable.


Abragam, Combrisson, and Solomon, Compt. rend. 245, 157 (1957); E. Allais, ibid. 246, 2123 (1958).

W. Heitler, The Quantum Theory of Radiation (Oxford University Press, London), third edition, p. 65.

R. H. Dicke, U.S. Patent 2,851,652, issued September 9, 1958.

A. L. Schawlow and C. H. Townes, Phys. Rev. 112, 1940 (1958); see also A. M. Prokhorov, J. Exptl. Theoret. Phys. 34, 1658 (1958). Research activity is increasing along this line. Since low noise is not essential we suggest review of older methods in conjunction with parallel planes.

Fig. 21. Parametric amplifier.


Consider first a capacity alone with

$$C = C_1 \sin[2\pi(\nu_1 + \nu_2)t],$$

driven by a voltage V = V₁ sin(2πν₁t) + V₂ sin(2πν₂t). The principal part of the current is

$$i = -\pi\nu_1 C_1 V_2 \sin(2\pi\nu_1 t) - \pi\nu_2 C_1 V_1 \sin(2\pi\nu_2 t).$$
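The principal part of the current can be checked numerically by forming i = d(CV)/dt for the pumped capacity and projecting it onto sin(2πν₁t) and sin(2πν₂t) over a common period. The sketch below does this; the frequencies of 3 and 5 cps and the amplitudes are hypothetical values chosen so that one second is a common period, and the function names are illustrative.

```python
import math


def capacitor_current(t, nu1, nu2, c1, v1, v2):
    """i = d(C V)/dt for C = c1 sin(2 pi (nu1 + nu2) t) and
    V = v1 sin(2 pi nu1 t) + v2 sin(2 pi nu2 t)."""
    w1, w2 = 2.0 * math.pi * nu1, 2.0 * math.pi * nu2
    wp = w1 + w2  # pump angular frequency
    cap = c1 * math.sin(wp * t)
    dcap = c1 * wp * math.cos(wp * t)
    volt = v1 * math.sin(w1 * t) + v2 * math.sin(w2 * t)
    dvolt = v1 * w1 * math.cos(w1 * t) + v2 * w2 * math.cos(w2 * t)
    return dcap * volt + cap * dvolt


def sine_component(nu, nu1, nu2, c1, v1, v2, period=1.0, samples=4096):
    """Amplitude of sin(2 pi nu t) in the current, by projection over one period."""
    total = 0.0
    for k in range(samples):
        t = period * k / samples
        total += capacitor_current(t, nu1, nu2, c1, v1, v2) * math.sin(2.0 * math.pi * nu * t)
    return 2.0 * total / samples


if __name__ == "__main__":
    nu1, nu2, c1, v1, v2 = 3.0, 5.0, 2e-12, 1.0, 0.5  # hypothetical values
    print(sine_component(nu1, nu1, nu2, c1, v1, v2), -math.pi * nu1 * c1 * v2)
    print(sine_component(nu2, nu1, nu2, c1, v1, v2), -math.pi * nu2 * c1 * v1)
```

The component at ν₁ is proportional to V₂ and the component at ν₂ is proportional to V₁, which is the cross coupling between the two oscillators that makes amplification possible. The remaining components of the current lie at 2ν₁+ν₂ and ν₁+2ν₂ and are not part of the principal part.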


If, in the circuit of Fig. 21, a voltage at frequency ν₁ is applied between terminals α and β, the output voltage of frequency ν₁ can be shown to have increased amplitude


TVC ERIR = 
+ Vie Vel 1—i2eCuRan— | 5 


1—12r12C,Re 
fy, 

In deducing this we assume that the oscillator of frequency ν₁ has impedance R₁ at ν₁ and zero impedance at other frequencies. A corresponding assumption is made concerning the oscillator of frequency ν₂. The time-dependent capacitor or inductance need have very little random fluctuations. The low noise properties of this type of amplifier were pointed out unequivocally by van der Ziel in 1948. This pioneering theoretical work appears to represent the earliest solution to the problem
where G₁ = effective conductance of unloaded amplifying
re following the treatment of H. Heffner and G. Wade,

of Institute of Radio Engineers Professional Group on
the amplifying resonant circuit by the generator, G_T = effective conductance of the loaded amplifying resonant circuit, ν₁ = amplifying frequency, ν₂ = idling frequency, F_s = shot noise, and F_g = gain fluctuation noise.

The revival of interest in this type of amplifier appears to have started with the work of Suhl and Weiss on the ferromagnetic amplifier. Low noise “up frequency conversion” is also possible with parametric devices. Room temperature operation is possible, with moderately good noise performance.


IX. APPLICATIONS OF MASERS TO 
EXPERIMENTAL PHYSICS AND 
RADIO ASTRONOMY 


The very low noise, coupled with adequate gain-bandwidth product, makes the low-temperature solid state maser an obvious tool for radio astronomy and radar. A modified Dicke-type radiometer in which a maser is used for pre-amplification has been described. The
ruby maser of Alsop, Giordmaine, and Townes, which 
was described earlier, was mounted near the focus of the 
Naval Research Laboratory 50-ft reflector in order to 
minimize transmission line losses. Liquid helium cooling 
was provided by a stainless steel Dewar with 3 l capacity 
which was maintained under partial vacuum. This 
allowed about 15 hr of observation before recharging 
was needed. The complete radiometer installation made 
it possible to observe with an rms output fluctuation of 
0.03°K for an output time constant of 5 sec. The inter- 
mediate frequency amplifiers had a band width of 5 Mc. 

The great stability of oscillators such as the cesium beam clock and the molecular beam masers makes possible certain experimental tests of both special and general relativity. These possibilities have been discussed by Møller. A molecular beam maser has been employed to repeat the Michelson-Morley experiment. If there is a fixed ether a difference in frequency would be expected between two molecular beam type masers with beam velocities parallel and antiparallel to the earth’s orbital motion. Let u be the velocity of the molecules relative to the cavity, which is assumed to be moving with velocity v with respect to the fixed ether (u is parallel to v). It is assumed that the photons radiated by the molecules move with a velocity c relative to the fixed ether. The photon velocity relative to the cavity must be normal to u. This requires that c be tilted forward of the normal to u by an angle φ = v/c. The expected Doppler shift would be νuφ/c = νuv/c². For a thermal velocity of 0.6 km/sec and for the earth’s orbital velocity the difference in frequency between two oppositely directed beams is 2uvν/c², which is ~10 cps if ν = 23 870 Mc/sec. The experiment was done by mounting two maser oscillators on a rack which could be rotated about a vertical axis. The oscillators were adjusted so that their frequencies differed by a small amount. This difference was recorded and the apparatus was then rotated through 180°. A slight change of about 1/50 cps was observed under the best conditions of operation. This is smaller by a factor of 1000 than what would be expected on the basis of a fixed ether. Inasmuch as the special theory of relativity is one of the most securely established of all physical theories, such experiments may be regarded as a search for other effects such as perhaps an anisotropy of space in this part of our galaxy. The use of stable oscillators to test the gravitational red shift has also been considered. This is perhaps more a test of whether or not these devices are natural clocks (whose intervals are invariants) than a test of the fundamental postulates of the general theory of relativity.
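The order of magnitude of the expected ether-drift effect can be reproduced directly from the expression 2uvν/c². In the sketch below the earth’s orbital speed is taken as 29.8 km/sec, which is an assumed value not stated in the text, and the function name is illustrative.

```python
C_LIGHT = 2.99792458e8  # m/s


def ether_drift_difference(u_m_s, v_m_s, nu_hz):
    """Frequency difference 2 u v nu / c^2 between two oppositely directed beams."""
    return 2.0 * u_m_s * v_m_s * nu_hz / C_LIGHT ** 2


if __name__ == "__main__":
    u = 0.6e3          # thermal velocity of the molecules, m/s (from the text)
    v = 29.8e3         # earth's orbital velocity, m/s (assumed value)
    nu = 23870e6       # maser frequency, cps (from the text)
    expected = ether_drift_difference(u, v, nu)
    observed = 1.0 / 50.0  # cps, change observed on rotation (from the text)
    print(expected, expected / observed)
```

This gives a difference close to 10 cps. If rotating the apparatus through 180° is assumed to reverse the sign of this difference, the expected change on rotation is about twice this value, roughly 1000 times the observed 1/50 cps.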

A series of experiments is being planned by the author to search for gravitational radiation. The theory of an antenna for such radiation has been discussed. Two masses which are separated will have forces exerted upon them by a gravitational wave. The phase difference at the two masses results in one being driven relative to the other. Also strains can be set up in a material by a gravitational wave. Under some conditions it is desirable to make use of acoustical resonance; under other conditions, where acoustic phase reversal is troublesome, it is better not to employ acoustical resonance. The gravitational wave interacts both with the mass of the piezoelectric crystal and the conducting masses shown. Very low-frequency search is planned. The output voltages are amplified as shown in Fig. 22. Liquid helium temperatures and low noise receivers may make it possible to observe correlation in the outputs, in the presence of electrical noise.

Fig. 22. Masers for detection of gravitational radiation at radio-frequencies.


X. CONCLUSION 


The new microwave amplifiers are the result of 
electronics research to develop millimeter wave tech- 
niques, magnetic resonance research, and research in 
microwave spectroscopy. Maser amplifiers bring to the 
microwave region a detection sensitivity of the order of 
a few microwave photons. Most of these amplifiers 
employ the magnetic moment of the electron, and there- 
fore fill in the gap in modern electronics pointed out by 
Sommerfeld. We may look forward to important ad- 
vances in radio astronomy, spectroscopy and solid state 
physics, in consequence of improved ability to dis- 
tinguish weak signals. 
APPENDIX. MACHINE CALCULATIONS OF MATRIX ELEMENTS AND ENERGY LEVELS FOR RUBY


The following data are entirely the work of W. S. 
Chang and A. E. Siegman of the Stanford University 
Electronics Laboratories. We thank them for their 
kindness and cooperation in allowing this work to be 
published here.

The spin Hamiltonian used was

$$\mathcal{H} = g\beta\,\mathbf{H}_0\cdot\mathbf{S} + D\left(S_z^2 - 5/4\right),$$

with g=1.99 and 2D=+11.46 kmc/s. The anisotropy in the g tensor was ignored.
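To indicate how energy levels of this kind can be obtained, the following sketch builds the 4×4 matrix of a spin-3/2 Hamiltonian of the form gβ H·S + D(S_z² − 5/4), with the field in the x-z plane at an angle θ from the z axis, and diagonalizes it with a cyclic Jacobi method. It is not the program used by Chang and Siegman; the function names, the value taken for β/h, the choice of kMc/s and gauss as units, and the sample field of 3000 gauss are assumptions for illustration.

```python
import math

BETA_OVER_H = 1.39962e-3  # Bohr magneton / h in kMc/s per gauss (assumed units)
G_FACTOR = 1.99           # from the text
D_KMC = 11.46 / 2.0       # from 2D = +11.46 kMc/s in the text


def ruby_hamiltonian(field_gauss, theta_deg):
    """Real symmetric matrix in the S_z basis m = 3/2, 1/2, -1/2, -3/2."""
    theta = math.radians(theta_deg)
    zeeman = G_FACTOR * BETA_OVER_H * field_gauss
    h_z, h_x = zeeman * math.cos(theta), zeeman * math.sin(theta)
    m_values = [1.5, 0.5, -0.5, -1.5]
    s_x_offdiag = [math.sqrt(3.0) / 2.0, 1.0, math.sqrt(3.0) / 2.0]
    ham = [[0.0] * 4 for _ in range(4)]
    for i, m in enumerate(m_values):
        ham[i][i] = h_z * m + D_KMC * (m * m - 1.25)
    for i, s in enumerate(s_x_offdiag):
        ham[i][i + 1] = ham[i + 1][i] = h_x * s
    return ham


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-24):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                angle = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(angle), math.sin(angle)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def energy_levels(field_gauss, theta_deg):
    """Energy levels in kMc/s, in ascending order."""
    return jacobi_eigenvalues(ruby_hamiltonian(field_gauss, theta_deg))


if __name__ == "__main__":
    for theta in (10.0, 20.0, 30.0, 40.0, 50.0):
        print(theta, energy_levels(3000.0, theta))
```

At zero field the levels collapse to two doublets at ±D, separated by the zero-field splitting 2D, and at θ=0 the matrix is already diagonal, which gives two simple checks on the construction.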


4 This material originally appeared as Tech. Rept. No. 
under Air Force Contract AI33(600)-27784 of the S 
Electronics Laboratories, Stanford University, California. : 
lar set of calculations have also been performed for the m: 
potassium chromicyanide, K;Cr(CN).«, and reported 
Tech. Rept. No. 156-1 under the same contract 
lower symmetry of the chromicyanide, the resi 
to present here. Copies of the re 
the Stanford 

Wright 




TABLE I(a). Ruby θ = 10°. (Tabulated values are illegible in this copy.)


TABLE I(b). Ruby θ = 20°. (Tabulated values are illegible in this copy.)


TABLE I(c). Ruby θ = 30°. (Tabulated values are illegible in this copy.)


TABLE I(d). Ruby θ = 40°. (Tabulated values are illegible in this copy.)


TABLE I(e). Ruby 6=50°. 


15000 50 


10912 51 


1982549 


27251 39 


94127 49 


16988 son ás 1907) 49 eos) a9 [for E,] 
[eoor ar [ireo ay fror se f eene ar fior 
1-2 


12103 56 


1-2 


$6199 $9 zose? SOn| 96363 39 © 


Brzac Sy Í 20000 50 


fosis> sos |[sioe7 3% [ersrs a9 > 


12025 31 
25705 49 


79949 Syn 


12225 SON | Sesue 49 2-3 
[noses sem [ereor se | ed 
TiS?) SOn | ctiai 39 © T3 


S535 


so zeo? 49 


| s1ssz son | son 


39060 49 


sos 


11710 
15601 


sox 


l-4 


Toze1 SəN j ooze) sa > 


27334 50 


sassa 


30000 50 


00656 49 


25539 39 


20032 49% 
92909 a% 


[eerie 49 
P3987 <9 


$1709 4% 21923 SOs [31016 50 © (mar =e 


35000 50 


321a» so 17502 53 


$2607 49 


35002 son |? 30 15632 51 
$7276 49 39 0 |35598 <9 
e9127 sow +7 e lerz06 49 
áar? 4? $0020 <? o 


7320ə se 


Pra = 


z1¥64 SoN 


11731 $0 


so 


1333% 


29151 


sow |:5512 


35070 


son “9 


21307 50% 
6077 son 


[i1971 sou 


11736 50 


10198 So” 


19073 son [szeo7 so [zorse ao | za] 
[eer ae [evs E = | 


34648 49" |51200 36 


7266 49 20000 so 


70611 47 yoser 49 


[arer as [renee a 


$0000 50 


65000 50 


[sszzsuscn|neetise: e] 
[tor €,] S5i79 497 T7567 39 © 
S3550 496 |is?es a0 
Si080 49 
Be) Sues 


aon |1196 39 


sox pen so 


49 35663 àm 


zosac SON 


17167 50” 


T3403 49 
29702 39 


$5000 50 


77462 49K 


20762 49 


60000 50 


13750 49 




26001 Sim 79635 som 
5296? a¥ 63200 49 


43021 


pe is a 


’ 15722 <9 


10332 50N 


39391 
57713 49 


ae 


45000 a5x| 57603 49 


73510 <8 79767 49m 


6 


TT] 
996275 37 3727 35 S456 47 


Zo1e4 soOn| p793 49 


11649 50w 


15290 a9 


397930 40% 


z7120 48 e| 


oa sonj waver a7 «| 


14711 $0 T5331 49 


09610 Son] P7031 40 © 


20000 50 


20669 


som] 35960 50 11159 51 


$6100 3? 


75323 49m] 44240 39 


42009 49 


19320 49 
20301 


a9 


21565 Son 


13360 50N 


FIG. 23. Format of each sub-block in tables for ruby.

The energy eigenvectors $|j\rangle$ and the energy eigenvalues $E_j$
are defined by the equations

$$\mathcal{H}|j\rangle = E_j|j\rangle,$$
$$|j\rangle = a_j|\tfrac{3}{2}\rangle + b_j|\tfrac{1}{2}\rangle + c_j|-\tfrac{1}{2}\rangle + d_j|-\tfrac{3}{2}\rangle,$$

where j=1 through 4 denotes the four energy levels.
The axes were chosen so that the c axis of the ruby
crystal is the z axis, the dc magnetic field $H_{dc}$ lies in the
x-z plane at an angle θ from the z axis, and the y axis is
perpendicular to the plane containing $H_{dc}$ and the c axis.

The transition probability matrix elements between
levels k and l are given by

$$\frac{\langle k|\mathbf{H}_{rf}\cdot\mathbf{S}|l\rangle}{H_{rf}} = (\alpha\phi_1+\gamma\phi_3) + j\beta\phi_2,$$

where $H_{rf}$ is the peak amplitude of the linearly-polarized
rf magnetic field, and $\phi_1$, $\phi_2$, $\phi_3$ are the direction cosines
of $\mathbf{H}_{rf}$ with respect to the x, y, z axes.

For each operating point, characterized by $H_{dc}$ and θ,
the four energy eigenvalues $E_j$ in units of kmc/s, the
eigenvector expansion coefficients $a_j \cdots d_j$ for each level
j, and the values of α, β, γ for each possible transition
are tabulated. The tables cover θ in equal increments starting at 10°,
with $H_{dc}$ increasing from 500 gauss to 7500 gauss
by 500 gauss increments in each table.

FIGS. 24, 26, 28, and 29. Abscissa: θ (degrees). One panel is labeled $H_{dc}$ = 0.5 kG.

The format of each subblock in the tables is shown in
Fig. 23. The numbers in the blocks are expressed in


floating-point decimal form, since the blocks were pre- 
pared directly from the printed output of the computer 
to avoid transcription errors. The numbers can be con- 
verted to ordinary form by following the rule 


$$abcde\ gh = a.bcde \times 10^{\,gh-50}.$$


The symbols *, N, or CR following a number indicate 
that the number is negative. For example, 56 423 51CR 
in the tables equals —56.423 in ordinary decimal form. 
The maximum possible transition probability matrix
element for optimum orientation of $H_{rf}$ is either $\beta^2$ or
$(\alpha^2+\gamma^2)$, whichever is larger. The maximum values for
all the transitions have been plotted vs θ for several
different values of $H_{dc}$ in Figs. 24-31. The operation of
the selection rules and the rapid variations in regions
where levels curve strongly are very evident.

FIG. 30. Abscissa: θ (degrees).
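
The conversion rule for the tabulated numbers can be sketched in a few lines of Python. This is only an illustration: the function name `decode_table_number` and the assumption that every entry is a five-digit mantissa, a two-digit exponent field, and an optional sign marker are not part of the original text.

```python
import re


def decode_table_number(token: str) -> float:
    """Convert a tabulated entry such as '56423 51CR' to an ordinary float.

    The five digits 'abcde' are read as a.bcde, and the two-digit field
    'gh' is a power of ten offset by 50. A trailing '*', 'N', or 'CR'
    marks a negative number.
    """
    match = re.fullmatch(r"\s*(\d{5})\s+(\d{2})\s*(\*|N|CR)?\s*", token)
    if match is None:
        raise ValueError(f"unrecognized table entry: {token!r}")
    digits, exponent_field, sign_marker = match.groups()
    mantissa = int(digits) / 10_000            # abcde -> a.bcde
    value = mantissa * 10.0 ** (int(exponent_field) - 50)
    return -value if sign_marker else value
```

Applied to the example in the text, `decode_table_number("56423 51CR")` gives −56.423 to within floating-point rounding, and an entry such as `30000 50` decodes to 3.0.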

Since the calculations were performed, it has been 
found that D is negative, not positive as used in these 
calculations. This does not invalidate the calculations, 
but does require some simple modifications. To obtain 
correct answers for negative D, the signs of the $E_j$'s
should be reversed, and the signs of α and γ should be
reversed. This of course does not change the squares of
the matrix elements. When these corrections are made,
the subscript j=1 in the tables and curves of
this Appendix refers to the highest energy level, while
j=4 refers to the lowest energy level. For
example, the subscript 1-2 in the figures or
tables refers to transitions between the two topmost
levels of the ruby energy level spectrum.
I. THEORETICAL SURVEY
A. Introduction


The investigation of gamma radiation has encompassed
all the important properties of electromagnetic
radiation: energy, intensity as a function of
energy and direction of propagation, multipolarity, and 
polarization. The emission of gamma rays of discrete 
and characteristic energies was associated early with 
the existence of sharp, well-defined energy levels in 
nuclei, whose properties determine the intensities 
(transition probabilities) of the radiations. Of equal 
interest are the radiation patterns (angular distribu- 
tions) of emitting nuclei which have been oriented in 
some prescribed fashion, for the spatial distribution is 
intimately connected to the multipole character of the 
radiation (My 38, Ha 40). By relatively simple meas- 
urements, therefore, it is often possible to establish the 
multipolarity of a gamma ray, at least in the case of a 
pure multipole. 
There still remains an essential ambiguity, since the
radiation may be either electric or magnetic. This
property is associated with the parity of the radiation.
The ambiguity in the parity is immediately resolved if
the linear polarization of the radiation is known (Fa …,
Ha 48). Moreover, a knowledge of the polarization can
be used to untangle the results in the case of mixed
multipoles (Zi 50). It is sometimes possible to resolve
such ambiguities with indirect arguments or
observations of associated phenomena. The measurement
of polarization, however, provides a method
of attack on such problems.
The radiation from nuclei n 
polarized and a measurement oi 
vides valuable additional ii 
Ha 51, To 52, St 53). In 
circular polarization two 


TABLE I. Summary of conditions which produce nuclear align- 
ment or polarization, and the corresponding types of correlations. 
The abbreviations dir, pol, circ, and align stand for direction, 
(linear) polarization, circular polarization, and alignment, re- 
spectively. The symbol LT(B) represents a low temperature 
orientation of the Bleaney type which produces alignment, 
LT (RG) represents a method of the Rose-Gorter type which leads 
to polarization. In each case only the most complex condition or 
correlation is listed. Thus if the circular polarization of the initial 
gamma ray is measured, its direction also is measured and one 
may have a dir-dir or dir-pol correlation as well as a circ-circ
correlation.

Initial particle   Initial condition   Type of orientation   Type of correlation
p, n, etc.         dir                 align                 p-γ dir-pol
p, n, etc.         pol                 pol                   p-γ pol-circ
β                  dir                 pol                   β-γ dir-circ
γ                  dir                 align                 γ-γ dir-pol
γ                  pol                 align                 γ-γ pol-pol
γ                  circ                pol                   γ-γ circ-circ
···                LT(B)               align                 "align"-pol
···                LT(RG)              pol                   "pol"-circ


case of alignment an axis of rotational symmetry is 
established, but the sense of this axis is not determined. 
In the case of polarization an axis is established which 
provides a single preferred direction in space. If the 
spin of the nuclear state is j and its component along
the axis is m and if the (relative) population of the mth
sublevel is w(m), then the condition for alignment is
w(m)=w(−m) and that for polarization is w(m)
≠w(−m). If the w(m) are all equal, the nuclei are not
oriented. 
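
The three cases just distinguished (no orientation, alignment, polarization) can be told apart mechanically from the sublevel populations. The sketch below is illustrative only: the function name, the dictionary representation of w(m), and the numerical tolerance are assumptions, not part of the original treatment.

```python
from fractions import Fraction


def classify_orientation(populations, tol=1e-12):
    """Classify sublevel populations w(m), given as a dict {m: w(m)}.

    Equal populations      -> "unoriented"
    w(m) == w(-m) for all m -> "aligned"
    otherwise               -> "polarized"
    """
    values = list(populations.values())
    if max(values) - min(values) <= tol:
        return "unoriented"
    symmetric = all(
        abs(w - populations.get(-m, 0.0)) <= tol for m, w in populations.items()
    )
    return "aligned" if symmetric else "polarized"


# Example with a spin-1/2 state; half-integer m are written as Fractions.
spin_half = {Fraction(-1, 2): 0.3, Fraction(1, 2): 0.7}
print(classify_orientation(spin_half))
```

A spin-1/2 state can only be unoriented or polarized, since w(m)=w(−m) then forces equal populations; alignment requires j ≥ 1.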

From an assemblage of aligned nuclei the radiation 
can show linear polarization but not circular polariza- 
tion (Ha 48). Only if there is nuclear polarization will 
there be a component of circular polarization. Moreover,
the processes which produce nuclear orientation
can be classified according to the type of orientation 
which is produced, either alignment or polarization. In 
Table I the various processes are summarized and 
classified in this manner and the possible polarization 
correlation for each case is listed (in addition to the 
references already cited, see Si51, Wo49, To 52a, 
Si 53a, Le 56). This tabulation illustrates the formal 
way of designating a correlation. For example, a γ-γ
direction-polarization correlation represents the
measurement made on a γ-γ cascade in which the (linear)
polarization of one gamma ray is determined relative 
to the direction of the other gamma ray. Two general 
methods are available for producing orientation. In one 
a nuclear particle is observed with or without polariza- 
tion, and in the other an external or atomic field is 

employed at very low temperature. 

The literature on the theory of direction-direction
correlations is voluminous and detailed (see the
comprehensive list of references given in De 57a). In the
present survey, the theoretical results are presented
(without derivation) in a manner which stresses the
application to polarization experiments in order to pro…
B. Linear Polarization Distribution


The general expression for the intensity distribution 
of linearly polarized radiation has been given in several 
different forms by many authors (Ha 48, Zi 50, L152, 
St 53, To 53, Sa 53, Si 53a, Bi 53, Sa 55b, Ma 58). The 
various expressions are equivalent and differ only super- 
ficially to cover different experimental situations or 
degrees of complexity. In this review we give three 
forms, two of which are generally appropriate for the 
direction-polarization correlation but differ enough to 
make one or the other preferable in certain problems, 
while the other is designed specifically for the case of 
nuclear orientation at low temperature. It is convenient 
to make such a classification, partly for historical 
reasons, although in many instances the differences are 
hardly more than a matter of nomenclature. 

The direction-direction correlation for two radiations 
in a nuclear process is given in a concise form by 
(Co 53, Bi 53, De 57a) 


$$W(\theta)=\sum_{\nu} A_\nu(1)\,A_\nu(2)\,P_\nu(\cos\theta). \qquad \text{(I-1)}$$


Here $P_\nu(\cos\theta)$ is the ordinary Legendre polynomial, θ is
the angle between the directions of the two radiations,
and 1 and 2 stand for sets of quantum numbers for the
first and second radiations, respectively. The summation
over ν is over all integral values (including zero)
for which the coefficient of $P_\nu(\cos\theta)$ does not vanish.
If we let $1 \to L_1L_1'jj'$ and $2 \to L_2L_2'jj'$ and sum the
expression over all allowed sets of values of these quantum
numbers, we can treat problems in which both the
intermediate state and the radiations are of a mixed
nature. The quantities $L_n$ and $L_n'$ represent possible
values of the total angular momentum (multipolarity)
of the nth radiation, while j and j' are possible total
angular momenta for the intermediate state. Because of
their great complexity, problems of this general type
have not been studied extensively, especially in connection
with polarization studies of gamma rays.

In most of the cases which have been studied experi- 
mentally, the intermediate state is characterized by 
unique spin and parity and the radiations are mixtures 
involving not more than two components. With these 
restrictions Eq. (I-1) is greatly simplified, not so much 
in form as in the complexity of the computation. We 
adhere to the notation which seems to provide the 
greatest ease in calculation through the use of published 
tables. The summation in Eq. (I-1) is now restricted 
to the even integers (including zero) and, after setting 
j=j’ for the intermediate state, the factorization of the 
coefficient of P,(cos#) is complete. Hence, writing out 
the summation over the L’s explicitly, we have (Bi 53, 
Fe 55) 


$$A_\nu(n)=F_\nu(L_nL_nj_nj)+2\delta_nF_\nu(L_nL_n'j_nj)+\delta_n^2F_\nu(L_n'L_n'j_nj), \qquad \text{(I-2)}$$

which is the expression that is applicable to gamma
radiation. The intensity ratio of the $L_n'$ radiation to the $L_n$
radiation is denoted by $\delta_n^2$ ($\delta_n$ is real) and $j_n$ is the total
angular momentum of the nth state, i.e., either the
initial (1) or the final (2) state. Equation (I-2) is
normalized so that $A_0(n)=1+\delta_n^2$. The most complete
compilation of the functions $F_\nu(L_nL_n'j_nj)$ is that given
in Fe 55, which covers all cases of apparent interest
for even ν.

If the nth radiation is a material particle (e.g., an
alpha particle), then it is only necessary to replace each
$F_\nu(L_nL_n'j_nj)$ in Eq. (I-2) by $F_\nu(L_nL_n'j_nj)\,b_\nu(L_nL_n')$
and $\delta_n$ by $\delta_n\cos\xi$ (leaving $\delta_n^2$ unchanged, however),
where the "particle factor" (Bi 53) is

$$b_\nu(LL')=\frac{2\,[L(L+1)L'(L'+1)]^{1/2}}{L(L+1)+L'(L'+1)-\nu(\nu+1)} \qquad \text{(I-3)}$$

and ξ is an appropriate phase angle. If, in addition, the
particle has a spin, it is convenient to employ the channel
spin formalism. The total correlation function is
then an incoherent superposition of the functions
corresponding to each value of the channel spin obtained
by compounding the spins of the two interacting
particles. In beta decay and in Coulomb excitation the
particle factors are more complex in that they involve
specific features of the interactions. These factors are
discussed in Al 56 (Coulomb excitation) and Bi 53
(beta decay). The β-γ circular polarization correlation
is discussed in Sec. C following.

If unobserved intervening radiations are present,
(I-1) is generalized (Sa 54), for transitions between
nuclear states of well-defined spin and parity, to

$$W(\theta)=\sum_\nu A_\nu(1)\,U_\nu(2)\cdots U_\nu(n-1)\,A_\nu(n)\,P_\nu(\cos\theta). \qquad \text{(I-1')}$$

The function

$$U_\nu(m)=(-1)^{\,j_{m+1}-j_m-L_m}\,[(2j_{m+1}+1)(2j_m+1)]^{1/2}\;W(j_mj_mj_{m+1}j_{m+1};\,\nu L_m)$$

is inserted for each unobserved transition $j_m(L_m)j_{m+1}$.
The quantity $W(j_mj_mj_{m+1}j_{m+1};\,\nu L_m)$ is the Racah
coefficient. For mixed radiation $U_\nu(L_m)+\delta_m^2U_\nu(L_m')$
(normalized to $1+\delta_m^2$) is used. The intervening radiation
may be any kind of particle.


TABLE II(a). Numerical values of $P_\nu^{(2)}(\cos\theta)$.

cos θ \ ν     2       3       4        5        6
0           3.00    0      −7.50     0       13.12
0.1         2.97    1.49   −6.90    −5.05    10.70
0.2         2.88    2.88   −5.18    −8.87     4.19
0.3         2.73    4.10   −2.52   −10.46    −4.21
0.4         2.52    5.04    0.76    −9.18   −11.41
0.5         2.25    5.62    4.22    −4.92   −14.15
0.6         1.92    5.76    7.30     1.61   −10.11
0.7         1.53    5.35    9.30     8.81     0.69
0.8         1.08    4.32    9.40    13.90    14.16
0.9         0.57    2.56    6.65    12.82    20.13
1.0         0       0       0        0        0


TABLE II(b). Numerical values of the coefficients $\kappa_\nu(LL')$.

L L' \ ν     2       3       4       5       6
1 1        −1/2
1 2        −1/6    −1/6
1 3        −1/12   −1/12   −1/12
2 2         1/2     0      −1/12
2 3        −1/4     1/4    −1/60   −1/20
3 3         1/3     0       1/3     0      −1/30
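
Entries of Table II(a) can be spot-checked numerically. The sketch below assumes that the tabulated function is the unnormalized associated Legendre function $P_\nu^{(2)}(x)=(1-x^2)\,d^2P_\nu/dx^2$ with $x=\cos\theta$, which reproduces the first column as $3(1-x^2)$; the function names are illustrative.

```python
def legendre_coefficients(nu):
    """Power-series coefficients (lowest order first) of P_nu(x).

    Built with Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x).
    """
    previous, current = [1.0], [0.0, 1.0]          # P_0 and P_1
    if nu == 0:
        return previous
    for n in range(1, nu):
        shifted = [0.0] + [(2 * n + 1) * c for c in current]   # (2n+1) x P_n
        padded = [n * c for c in previous]
        padded += [0.0] * (len(shifted) - len(padded))          # n P_{n-1}
        previous, current = current, [
            (s - p) / (n + 1) for s, p in zip(shifted, padded)
        ]
    return current


def associated_legendre_2(nu, x):
    """Unnormalized P_nu^(2)(x) = (1 - x^2) * d^2 P_nu / dx^2."""
    coeffs = legendre_coefficients(nu)
    second_derivative = sum(
        k * (k - 1) * c * x ** (k - 2) for k, c in enumerate(coeffs) if k >= 2
    )
    return (1.0 - x * x) * second_derivative


# Print one row of the table (cos(theta) = 0.4) for nu = 2..6.
print([round(associated_legendre_2(nu, 0.4), 2) for nu in range(2, 7)])
```

The entries checked this way agree with the printed values to about one unit in the last decimal place.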

The prescription for obtaining the linear polarization
distribution in a direction-polarization experiment can
now be stated simply as follows. In each term of (I-1)
characterized by $(L_2L_2')$ and ν, replace $P_\nu(\cos\theta)$ by the
expression (LI 52, Sa 55b, De 57a),

$$P_\nu(\cos\theta)+(\pm)_{L_2'}\cos2\gamma\;\kappa_\nu(L_2L_2')\,P_\nu^{(2)}(\cos\theta). \qquad \text{(I-4)}$$

Without loss of generality it is assumed that the linear
polarization is observed for the second radiation. The
direction of the linear polarization is specified by γ,
the angle between the polarization vector and the plane
of the reaction. In practice only two values of γ (0 and
90°) are usually considered, so that $\cos2\gamma=\pm1$. In the
discussion of the experiments, the intensities of polarized
radiation for these two directions are designated by
$J_0$ and $J_{90}$ or $J_\parallel$ and $J_\perp$ (other subscripts are also used in the
literature). The quantity often given is the ratio $J_0/J_{90}$,
although theoretically the polarization excess or difference
$(J_0-J_{90})$ is the more natural quantity, as
pointed out by Fano (Fa 49). Thus we write

$$P_1W(\theta)=(J_0-J_{90})=\sum(\pm)_{L_2'}\,A_\nu\,\kappa_\nu(L_2L_2')\,P_\nu^{(2)}(\cos\theta), \qquad \text{(I-5)}$$

which may be normalized to the total unpolarized
intensity

$$W(\theta)=(J_0+J_{90})=\sum A_\nu P_\nu(\cos\theta) \qquad \text{(I-6)}$$

if so desired. In these expressions the coefficient
$A_\nu(1)A_\nu(2)$ or $A_\nu(1)U_\nu(2)\cdots A_\nu(n)$ has been shortened
to $A_\nu$ and the sum is over ν and whatever quantum
numbers are appropriate. The $P_\nu^{(2)}(\cos\theta)$ appearing
in the polarization distribution is the unnormalized
associated Legendre function. Numerical values for
$P_\nu^{(2)}(\cos\theta)$ are given in Table II(a). The other quantity
$\kappa_\nu(LL')$ appearing in (I-4) depends on the multipolarity
of the radiation. It is given by (LI 52)

$$\kappa_\nu(LL')=-\left[\frac{(\nu-2)!}{(\nu+2)!}\right]^{1/2}\frac{C(LL'\nu,11)}{C(LL'\nu,1\,{-1})}, \qquad \text{(I-7)}$$

where $C(LL'\nu,11)$ is the familiar vector addition
coefficient. Numerical values for these coefficients are
presented in Table II(b). Finally, in (I-4) the + or −
sign is to be employed according as the $2^{L'}$-pole radiation
is electric or magnetic in character.


TABLE III(a). Numerical values of the coefficients in the series
$a_\nu(LL)=\sum_n\alpha_nA_n(LL)$. For odd ν, all $\alpha_n=0$.

L    ν     α_2      α_4      α_6
1    0     −1
     2      1
2    0      1      −1/6
     2     −1      −5/6
     4      0       1
3    0      2/3     2/3    −1/15
     2     −2/3    10/3    −1/3
     4      0       4      −9/15
     6      0       0       1
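
As a worked check of (I-7) and the prescription (I-4), consider a pure dipole ($L=L'=1$, $\nu=2$). The reading of $C(LL'\nu,11)$ as the Clebsch-Gordan coefficient $\langle L\,1\,L'\,1\,|\,\nu\,2\rangle$, and of $C(LL'\nu,1\,{-1})$ as $\langle L\,1\,L'\,{-1}\,|\,\nu\,0\rangle$, is an assumption about the notation; with it the tabulated value $\kappa_2(11)=-1/2$ is recovered:

```latex
\begin{aligned}
C(112,11) &= 1, \qquad C(112,1\,{-1}) = \frac{1}{\sqrt{6}},\\[4pt]
\kappa_2(11) &= -\left[\frac{0!}{4!}\right]^{1/2}\frac{1}{1/\sqrt{6}}
             = -\frac{\sqrt{6}}{\sqrt{24}} = -\frac{1}{2},\\[4pt]
P_2^{(2)}(\cos\theta) &= 3\sin^2\theta,\\[4pt]
(\pm)\cos 2\gamma\; A_2\,\kappa_2(11)\,P_2^{(2)}(\cos\theta)
  &= \mp\tfrac{3}{2}\,A_2\cos 2\gamma\,\sin^2\theta .
\end{aligned}
```

The polarization term for a dipole therefore vanishes along the axis ($\theta=0$) and is largest in magnitude at $\theta=90^\circ$, with a sign that distinguishes electric from magnetic radiation.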

For the case of mixed radiation from a single intermediate
level the polarization distribution can be
written more explicitly as

$$W(\theta,\gamma)=W(\theta)+(\pm)_{L_2'}\cos2\gamma\sum_\nu A_\nu(1)\,\mathcal{A}_\nu(2)\,P_\nu^{(2)}(\cos\theta). \qquad \text{(I-8)}$$

The sum is over even ν (excluding zero). The coefficient
$A_\nu(1)$ is given by (I-2) or a suitable modification, if the
first radiation is not a gamma ray or there are intervening
radiations. The new coefficient $\mathcal{A}_\nu(2)$ differs only
in the inclusion of the κ's with appropriate signs [so as
to keep unchanged the sign convention in (I-8)]. Thus

$$\mathcal{A}_\nu(2)=-\kappa_\nu(L_2L_2)F_\nu(L_2L_2j_2j)+2\delta_2\,\kappa_\nu(L_2L_2')F_\nu(L_2L_2'j_2j)+\delta_2^2\,\kappa_\nu(L_2'L_2')F_\nu(L_2'L_2'j_2j). \qquad \text{(I-9)}$$

In this case $L_2'=L_2+1$.

In a number of the problems studied experimentally,
the complexity of the direction-direction correlation is
limited to ν=0 and 2. The direction-polarization correlation
then becomes [with $Q_2=A_2(1)\mathcal{A}_2(2)$]

$$W=A_0+A_2P_2+(\pm)_{L_2'}\cos2\gamma\;Q_2P_2^{(2)}. \qquad \text{(I-10)}$$

Unless the directional correlation is nearly isotropic, a
measurement of the polarization easily determines the
sign in this expression and hence the parity of
the radiation. The measurement can also be decisive in
resolving an ambiguity regarding the mixture of the
radiations, as can be seen by inspecting (I-9) and Table II(b).


a Ped cos2yA 2P,” (I-10) 


in more ican fashion by 
€ e character 


WAGE AW) Ge Ss 


HANNA 
and for 6=90°, we have 

iy i= 
J 90 


(L-10”) 


However, this expression does not hold for a dipole- 
quadrupole mixture. It is a quite general result that the 
polarization measurement can provide vital information 
on mixtures in addition to the determination of parity.
The earliest development of the theory (Ha 48, 
Zi 50) gave a somewhat different but entirely equiva- 
lent form for the direction-polarization correlation 
function, which has been used extensively in experi- 
mental work. If the direction-direction correlation is 
expressed as in (I-6), then the second form of the 
direction-polarization correlation which we treat is obtained
simply by replacing the coefficient $A_\nu(L_2L_2')$ in
each term by

$$A_\nu(L_2L_2')+(\pm)_{L_2'}\,a_\nu(L_2L_2')\cos2\gamma. \qquad \text{(I-11)}$$


The angle γ and the sign convention are the same as
used in the foregoing. In this prescription the new
coefficients $a_\nu$ can be expressed in terms of the old ones
$A_\nu$. Writing the series in $\cos^\nu\theta$ as well as in $P_\nu(\cos\theta)$,
we have

$$W(\theta)=\sum A_\nu(L_2L_2')P_\nu(\cos\theta)=\sum Q_\nu(L_2L_2')\cos^\nu\theta \qquad \text{(I-12)}$$

$$P_1W(\theta)=\sum(\pm)_{L_2'}\,a_\nu(L_2L_2')P_\nu(\cos\theta)=\sum(\pm)_{L_2'}\,q_\nu(L_2L_2')\cos^\nu\theta \qquad \text{(I-13)}$$

$$a_\nu(L_2L_2')=\sum_n\alpha_nA_n(L_2L_2') \qquad \text{(I-14)}$$

$$q_\nu(L_2L_2')=\sum_n\beta_nQ_n(L_2L_2') \qquad \text{(I-14')}$$

The coefficients $\alpha_n$ and $\beta_n$ depend on $L_2$, $L_2'$, and ν.
Numerical values for these coefficients are provided in
Table III for pure multipoles, in Table IV for mixed
radiation of the same parity, and in Table V for mixed
radiation of opposite parity. We note here that
$\alpha_0=\beta_0=\alpha_1=\beta_1=0$. Many examples of the use of this
formulation occur in the literature. We cite here the case of the
pure octupole gamma ray in the reaction F¹⁹(p,α)O¹⁶*(γ)O¹⁶,
$E_\gamma$=6.1 Mev. At $E_p$=0.33 Mev it is known that
the proton is captured in an s-state, so that the α-γ
emission can be treated as an independent two-stage
process. From the measurement of the direction-direction
correlation (Ar 50, Ba 50) one selects the
theoretical result

$$W(\theta)=1+111\cos^2\theta-305\cos^4\theta+225\cos^6\theta. \qquad \text{(I-15)}$$

With the aid of Table III(b), one finds directly for
the direction-polarization correlation for an octupole
radiation

$$P_1W(\theta)=\pm\tfrac{1}{3}\left[(2Q_2-Q_6)-(2Q_2-12Q_4-19Q_6)\cos^2\theta-(12Q_4+21Q_6)\cos^4\theta+3Q_6\cos^6\theta\right]. \qquad \text{(I-16)}$$

Inserting the numerical values of the direction-direction
coefficients, one has

$$P_1W(\theta)=\pm\left[-1+131\cos^2\theta-355\cos^4\theta+225\cos^6\theta\right]. \qquad \text{(I-17)}$$

The linear polarization is complete at θ=118° (or 62°),
i.e., at that angle $J_0/J_{90}=0$ or ∞ depending on whether
the radiation is magnetic or electric. The measurement
of this quantity is discussed in Part III.


TABLE III(b). Numerical values of the coefficients in the series
$q_\nu(LL)=\sum_n\beta_nQ_n(LL)$ (cf. Zi 50, Ha 48). For odd ν, all $\beta_n=0$.

L    ν     β_2     β_4     β_6
1    0     −1
     2
2    0
     2
     4
3    0
     2
     4
     6

A problem frequently encountered is that of the 
dipole-quadrupole mixture in the radiation from a dis- 
crete level. From Tables III (a) and IV (a) we obtain 


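$$a_0(11)=-A_2(11),\qquad a_0(12)=-\tfrac{1}{3}A_2(12),\qquad a_0(22)=A_2(22)-\tfrac{1}{6}A_4(22),$$
$$a_2(11)=A_2(11),\qquad a_2(12)=\tfrac{1}{3}A_2(12),\qquad a_2(22)=-A_2(22)-\tfrac{5}{6}A_4(22),$$
$$a_4(22)=A_4(22). \qquad \text{(I-18)}$$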


TABLE IV(a). Numerical values of the coefficients in the series 
ay(LL’) = ZnanAn(LL’) for even v. 


y ar al 
l D 0 —1/3 
2 1/3 
il g 0 —1/6 —1/6 
2 1/6 —5/6 
4 0 1 
D3 0 —1/2 —1/30 
2 1/2 —1/6 
4 0 1/5 


TABLE IV(b). Numerical values of the coefficients in the series 
qv(LL') => „bnn (LL) for even » (cf. Zi 50). 


L IY x Ba Bc 
i2 0 —1/3 
2 1/3 
1S 0 —1/6 
2 1/6 —1 
5 yy N 
—1/2 —2/5 
2 1/2 1/5 
4 0 1/5 
TABLE V(a). Numerical values of the coefficients in the series
$a_\nu(LL')=\sum_n\alpha_nA_n(LL')$ for odd ν.


Je dy 


y» ai 

il 2 1 —1 

) 3 1 

Ma3 1 —1/2 
3 1/2 

23 1 3/2 —3/10 
3 —3/2 —7/10 
5 0 1 


TABLE V(b). Numerical values of the coefficients in the series = a 
qv(LL’) =2nBnQn(LL’) for odd v. ae 


L’ 


L v a3 
il 22 1 —1 
3 1 
il g) 1 —1/2 
3 1/2 
2 1 3/2 
3 —3/2 
5 (0) 


Inserting these quantities into (I-13), we find (changing 
the sign of the dipole term) 


= (Ao— 4A2(12)4+8A4)PotA Py]. 


In this expression A» and A, are coefficients of the total 
correlation function, while 42(12) is the mixed coeffi- 
cient. From this result we obtain at 6=90° and for a 
even parity (M1-+ £2) 


Jo=Apt+A2t+As—2A0(12) «aon a J 
Jo0= Ap—2A2—4A4+2A2(12). (I-21) 


This example is a generalization of the one given abov 
(I-10”). 

The third form of the direction-polarization correlation
arises because of the convenience of discussing
certain problems in terms of the magnetic sublevels of
the radiating state, as has been recognized by many
authors. Because of the obvious physical insight provided
by the method in the study of nuclear orientation,
it has been the starting point of theoretical developments
in this field (St 52a, To 53). Hence, we write


WOE wI), 2 


$A) > a 
(I-19) 


a 


oy 
th 


where J,,“4(8) is the intensity distribution of í 
radiation from the mth sublevel of the radi 


from —j to j. If the angular momentum 7. 
state is greater than zero, then we have furti 


O eo 40), I-23) 


mea 


iy 
oA is, therefore, [Table IV(b)] 
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TABLE VI(a). The functions Fa2”’ (0) and fxr" (0), both given 
in the form K2C,,P,,(cos@). In each case Fu™*' (0) is tabulated on 
the first line and fyr24’(@) on the second line. The upper sign 
corresponds to M >0; the lower to M <0. 


D TL IMi K Co C: C: Ce 
iL ileal 1 2 1 
—1 1 
0 1 2 —2 
2 —2 
1 77 +315 0 3 
—1 1 
0 500 0 0 
0 0 
2a 2 1/21 S —12 
—28 40 —12 
1 1/21 42 15 48 
7 —55 48 
0 1/21 42 30 = 72 
42 30 —72 
2 3 2 +(350)3/210 0 —30 30 
14 —20 6 
1 +(35)4/210 0 60 150 
—35 5 30 
0 0 0 0 
0 0 0 
3 ss 3 1/66 132 —165 18 15 
—99 165 —81 15 
2 1/66 132 0 —42 —90 
—22 —110 222 —90 
1 1/66 132 99 6 225 
55 —121 —159 225 
0 1/66 132 132 36 © —300 
132 132 36 —300 


and Fy,"4(@) is the intensity distribution of an (L,M)- 
pole. Associated with each Far”*(0) there is of course a 
distribution function fa44(@) for linear polarization. 
Then, 


Eut” (0,7) =F 2” (0) + co0s2y fm” (0) (1-24) 


can be inserted in place of Fm”? (0) in (1-23) to produce 
the direction-polarization distribution (generalized to 
mixed radiation), namely, 


W@6,7y)=> wm) x 2 C(LD)C(L’")ôrôr. 


KP ak” (0) cos2y far’™ (0)]. (I-25) 


(Agreement with previous notation is obtained if 
6,=1). The Fut” (0) and fac%*’(@) are given in 
Table VI. 

There are several empirical examples of angular distributions
which correspond to a single (L,M) pole.
The reaction F¹⁹(p,α)O¹⁶*(γ)O¹⁶ has already been cited.
In Table VI(b) we see that the distributions given in
(I-15, I-17) correspond to a pure (3,1)-pole. A second
example is the capture reaction H²(p,γ)He³. The
measured directional distribution is fitted very closely by
the function Fo'1(6)~sin’@. The polarization dis- 


Fo" (0,y) = (1-cos2y) (1— cos’6). (I-26) 


the polarization ratio $J_0/J_{90}=0$ or ∞ depending on
whether the radiation is magnetic or electric. The
measurement of this quantity is discussed in Part III.

For an angular distribution or correlation it is clearly 
possible to obtain the population numbers w(m) from a
knowledge of the preceding transitions (Li 49). In a 
nuclear orientation problem the population numbers are 
functions of the temperature and of the method used 
to produce orientation. In a given method they can also 
depend on specific parameters which may vary from 
one experiment to the next. The calculation of w(m) 
has been treated very thoroughly in the literature and, 
insofar as possible, general formulas have been provided. 
In Table VII we list the methods of producing orienta- 
tion which have been generally discussed and in the 
last column we give references which deal with the 
computation of w(m). 

For purposes of computation it is advantageous to
retain as much as possible of the formulation which
makes use of the functions $F_\nu(LL'j_nj)$. In an equation
such as (I-1') it is only necessary to replace the
coefficient $A_\nu(1)$ by an orientation parameter $B_\nu(j)$ which
describes the orientation of the initial state. Equation
(I-1') retains its generality. The orientation parameter
is merely a linear combination of the population numbers
(Gr 55a, Bl 57),

$$B_\nu(j)=(2\nu+1)^{1/2}\sum_m C(j\nu j;\,m0)\,w(m). \qquad \text{(I-27)}$$


For the linear polarization distribution we need only 


TABLE VI(b). The functions Fu?” (0) and fau” (6), both given 
in the form kD, cos"@. In each case Fm?” (0) is tabulated on the 
first line and fm?* (0) on the second line (cf. Ar 50, Zi 50). The 
upper sign corresponds to M >0; the lower to M <0. 


Jo 362 AYA] k bo bz bs bs 
flea ies 3/2 1 1 
=i 1 
0 3/2 2 =) 
2 =?) 
IE i]. +4v15 —1 3 
=] 1 
0 0 0 
0 0 
2 2 H 5/2 1 0 =i 
=il SL) =) 
1 5/2 1 —3 4 
1 —5 4 
0 5/2 0 6 —6 
0 6 —6 
2 3 2 (175/32) 1 z : 
1 = 
1 +4435 1 —18 25 
= —4 5 
0 0 0 0 
0 0 0 
39.31 3 7/32 15 —15 —15 15 
—15 45 —45 15 
2 7/32 10 —30 110 —90 
10  —110 190  —90 
1 7/32 1 111  —305 225 
=i 131 —355 225 
the even quantities and of these the following will
generally suffice:

$$B_0=1,$$
$$B_2=\left[\frac{45}{j(j+1)(2j-1)(2j+3)}\right]^{1/2}\left[\sum_m m^2w(m)-\tfrac{1}{3}j(j+1)\right]. \qquad \text{(I-28)}$$


The calculation of these quantities for the various ex- 
perimental techniques is discussed in the references 
given in Table VII. 

The orientation parameters are given in To 52 in a
form which differs from the B's only in normalization.
The expression for ν=2 is

$$f_2=j^{-2}\left[\sum_m m^2w(m)-\tfrac{1}{3}j(j+1)\right]. \qquad \text{(I-29)}$$


When the directional distribution has been obtained by 
any one of the foregoing techniques, the polarization 
distribution can be derived from it in the usual fashion 
with the substitution indicated by (I-4) or (I-11). 
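
The alignment parameter of (I-29) is easy to evaluate for a given set of populations. The sketch below reads (I-29) as $f_2=j^{-2}[\sum_m m^2w(m)-\tfrac13 j(j+1)]$, which is an assumption where the printed formula is damaged; the function name and the dictionary representation of w(m) are likewise illustrative. The populations are normalized to unit sum inside the function.

```python
from fractions import Fraction


def alignment_parameter_f2(j, populations):
    """Evaluate f_2 = j**-2 * [sum_m m^2 w(m) - j(j+1)/3].

    `populations` maps each magnetic quantum number m to its relative
    population w(m); the weights are normalized to unit sum here.
    """
    total = sum(populations.values())
    mean_m_squared = sum(m * m * w for m, w in populations.items()) / total
    return (mean_m_squared - Fraction(1, 3) * j * (j + 1)) / (j * j)


# Equal populations give f_2 = 0 (no orientation) for any spin.
j = Fraction(3, 2)
uniform = {Fraction(k, 2): Fraction(1, 4) for k in (-3, -1, 1, 3)}
print(alignment_parameter_f2(j, uniform))
```

Because only $m^2$ enters, $f_2$ is unchanged when w(m) and w(−m) are interchanged, so it measures alignment and is insensitive to the sense of any polarization.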

The polarization-polarization correlation has been dis- 
cussed by several authors (Fa 48, Ma51, L152, Bi53) and 
a general expression for it can be found in LI 52. This 
type of correlation has not received much attention ex- 
perimentally, since it provides very little additional 
information (Ha 48, Bi53) to compensate for the in- 
creased experimental difficulties. 

In a measurement of the γ-γ direction-polarization
correlation, it has not always been feasible to distinguish
the gamma rays in the detectors. It is then
necessary to mix the $\gamma_1$-$\gamma_2$ direction-polarization function
with the $\gamma_2$-$\gamma_1$ function according to the respective
efficiencies of detection.


C. Circular Polarization Distribution 


The γ-γ circular-circular correlation can be written
in concise form (Go 46, Al 57) as

$$W(\theta p_1p_2)=\sum_\nu(-p_1)^\nu A_\nu(1)\,(-p_2)^\nu A_\nu(2)\,P_\nu(\cos\theta). \qquad \text{(I-30)}$$

The summation is over the odd and even integers


TABLE VII. Some methods of producing nuclear orientation 
at low temperatures. 


Applied 


Type of 
field 


Method orientation References 


External field pol >10®gauss Go 34, Ku 35, Si 51, St 53a, 

(Brute force) Po 54, Kh 55, St 57a, Bl 57 

Magnetic hfs pol ~10? gauss Go 48, Ro 49, Si51, St 53a, 

(Rose-Gorter) Po 54, Gr 55a, Va 56, St 57a, 
B1 57 

Electric hfs align none Po 49, Si 51, St 53a, St 57a, 

(Pound) B157 

Magnetic hfs align none BI 51a, Si 51, St 53a, Po 54, 

(Bleaney) Gr 55a, St 57a, Bl 57 
(including zero), and $p_n=\pm1$ according as the circular
polarization of the nth radiation is right or left handed.||
The coefficient $A_\nu(n)$ can be obtained from (I-2), which
is valid for odd ν. The coefficients $F_\nu(L_nL_n'j_nj)$ for odd
ν have been tabulated in Al 57.

If the expression (I-30) is summed over a $p_n$, the
total intensity of the corresponding gamma ray is obtained.
Thus, we find,

$$W(\theta)=\tfrac{1}{4}\sum_{p_1p_2}W(\theta p_1p_2)=\tfrac{1}{2}\sum_{p_1}W(\theta p_1p_2)=\tfrac{1}{2}\sum_{p_2}W(\theta p_1p_2)=\sum_{\nu\ \mathrm{even}}A_\nu(1)A_\nu(2)P_\nu(\cos\theta), \qquad \text{(I-31)}$$

which is just the direction-direction correlation. Hence
there are no direction-circular (or linear-circular)
correlations. If, on the other hand, the excess of circular
polarization (right handed minus left handed) is measured
for either gamma ray or both, one obtains

$$P_3W(\theta)=\tfrac{1}{4}\sum_{p_1p_2}p_1p_2W(\theta p_1p_2)=\tfrac{1}{2}\sum_{p_1}p_1p_2W(\theta p_1p_2)=\tfrac{1}{2}\sum_{p_2}p_1p_2W(\theta p_1p_2)=\sum_{\nu\ \mathrm{odd}}A_\nu(1)A_\nu(2)P_\nu(\cos\theta). \qquad \text{(I-32)}$$

The quantity $P_3$ is the degree of circular polarization,
as defined by Fano (Fa 49). It is analogous to the quantity
$P_1$ defined for the linear polarization [see (I-5)].
In addition, it follows that

$$\sum_{p_1p_2}p_1W(\theta p_1p_2)=\sum_{p_1p_2}p_2W(\theta p_1p_2)=0,$$

which shows again there is no direction-circular
correlation.
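
The way the helicity sums select even or odd ν can be verified by direct enumeration. The sketch below uses only the factor $(-p_1)^\nu(-p_2)^\nu$ carried by each term of (I-30); the overall normalization by 1/4 and the function name are assumptions made for illustration.

```python
from itertools import product


def helicity_sums(nu):
    """Return (intensity_weight, excess_weight, single_weight) for order nu.

    intensity_weight: average of (-p1)**nu * (-p2)**nu over p1, p2 = +1, -1
    excess_weight:    average of p1 * p2 times that factor
    single_weight:    average of p1 alone times that factor
    """
    intensity = excess = single = 0
    for p1, p2 in product((+1, -1), repeat=2):
        factor = (-p1) ** nu * (-p2) ** nu
        intensity += factor
        excess += p1 * p2 * factor
        single += p1 * factor
    return intensity / 4, excess / 4, single / 4


for nu in range(4):
    print(nu, helicity_sums(nu))
```

Even orders contribute only to the total intensity, odd orders only to the circular-polarization excess, and weighting by a single $p_n$ gives zero in every order, which is the statement that there is no direction-circular correlation.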

In the case of nuclear polarization produced by 
orientation at low temperatures, we merely identify the 
coefficient A,(1) in (I-32) with the orientation param- 
eter B,(j). In terms of the population numbers w(m), 
the most important parameter is (To 52, Bl 57) 


$$B_1=\left[\frac{3}{j(j+1)}\right]^{1/2}\sum_m m\,w(m)$$

or

$$f_1=j^{-1}\sum_m m\,w(m). \qquad \text{(I-33)}$$


References which deal with the computation of these 
parameters are given in Table VII. The most important 
feature of Bı is that its sign depends on the sign of the 
hfs interaction which produces the nuclear polarization. 
In turn the sign of the interaction depends on the sign 
of the nuclear magnetic moment. A measurement of the 
degree of circular polarization gives the sign in (I-32) 
and thus yields the sign of the magnetic moment. 


|| The term “right handed” is used for a photon with its spin 
in the direction of its momentum. This definition accords with 
recent practice, but (unfortunately) not with the “optical” 
TABLE VIII. The functions $g_{LL'}^{M}(\theta) = K\sum_\nu C_\nu P_\nu(\cos\theta)$ which 
give the intensity distribution of circularly polarized radiation for 
an $(LL',M)$ pole. The + or - sign corresponds to $M>0$ or $M<0$, 
respectively (cf. St 53). 


L  L'  |M|  K  C_1  C_3  C_5 
ae 43 1 
0 T: 0 
feo, a (3/5)3 3 2 
0 3/(5)4 2 =) 
M? +1 2 =9 
1 +1 1 4 
0 Son 0 0 
3 2 - YM =i) 7 5 
1 (35)3/105 48 7 50 
0 (280)4/105 —18 =i 25 
Bilan 3 +1/6 Q M 5 
2 +1/6 6 o A 
1 +1/6 3 14 25 
0 200 0 0 0 


The degree of circular polarization can be written in 
terms of the population numbers directly. In analogy 
to (I-25) for the case of linear polarization, we have 


PW (0)=2} wm) Z X C(IL)C(L’)òz ugut (0), (I-34) 


where C(L) stands for C(jLjy; mM). The function 
$g_{LL'}^{M}(\theta)$ gives the degree of circular polarization for an 
$(LL',M)$ pole. A few of these functions are listed in 
Table VIII. 

Circularly polarized radiation may be produced following 
the capture or emission of a polarized particle 
such as a neutron or a proton (Ha 51, Bi 51). The only 
case of importance so far has been the capture of 
polarized thermal neutrons as discussed in Part III. 
Since the orbital angular momentum is zero, there is no 
complication in applying the formalism of this discussion. 
In this case $A_1(1)$ in (I-32) becomes $\tfrac12 F_1(\tfrac12 \tfrac12 j_1 j)$, 
by analogy with an absorbed gamma ray. The functions 
$F_1(\tfrac12 \tfrac12 j_1 j)$ are tabulated in Table IX. As an example, 
consider $j_1=0$, $j=\tfrac12$, $j_2=\tfrac12$, and $L=1$. From Table IX 
we obtain $F_1(\tfrac12 \tfrac12 0 \tfrac12) = -2.00$ and in Al 57 we find 
$F_1(1 1 \tfrac12 \tfrac12) = -1.00$. Thus (I-32) becomes 


$$P_3 = +\cos\theta.$$

At $\theta=0$ (along the direction of the neutron polarization), 
$P_3=+1$ and the gamma rays are completely 

TABLE IX. Numerical values of $F_1(\tfrac12 \tfrac12 j_1 j)$. 


 j_1   j     F_1    |  j_1   j     F_1    |  j_1   j     F_1 
 0     1/2  -2.000  |  2     3/2   0.894  |  7/2   3     1.000 
 1/2   1    -1.633  |  2     5/2  -1.366  |  7/2   4    -1.291 
 1     1/2   0.667  |  5/2   2     0.943  |  4     7/2   1.018 
 1     3/2  -1.491  |  5/2   3    -1.333  |  4     9/2  -1.277 
 3/2   1     0.816  |  3     5/2   0.976  |  9/2   4     1.033 
                    |  3     7/2  -1.309  |  9/2   5    -1.265 
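The worked example above (polarized neutron capture with $j_1=0$, $j=\tfrac12$) can be checked numerically. The sketch below is illustrative only: the dictionary holds the clearly legible entries of Table IX, and `f1_capture` is a closed form that appears to reproduce them; that closed form is an assumption fitted to the table, not a formula given in the text. The function names are hypothetical.

```python
import math

# Clearly legible entries of Table IX, keyed by (j1, j).
TABLE_IX = {
    (0.0, 0.5): -2.000, (0.5, 1.0): -1.633, (1.0, 0.5): 0.667,
    (1.0, 1.5): -1.491, (1.5, 1.0): 0.816, (2.0, 2.5): -1.366,
    (2.5, 2.0): 0.943, (3.0, 2.5): 0.976, (3.0, 3.5): -1.309,
    (3.5, 3.0): 1.000, (3.5, 4.0): -1.291, (4.0, 3.5): 1.018,
    (4.0, 4.5): -1.277, (4.5, 4.0): 1.033,
}


def f1_capture(j1, j):
    """Assumed closed form for F_1(1/2 1/2 j1 j), fitted to Table IX."""
    if abs(j - (j1 + 0.5)) < 1e-9:      # compound spin j = j1 + 1/2
        return -math.sqrt(4.0 * (j + 1.0) / (3.0 * j))
    if abs(j - (j1 - 0.5)) < 1e-9:      # compound spin j = j1 - 1/2
        return math.sqrt(4.0 * j / (3.0 * (j + 1.0)))
    raise ValueError("j must equal j1 + 1/2 or j1 - 1/2")


def p3_after_capture(theta, j1, j, f1_gamma):
    """P_3 from (I-32) with A_1(1) = (1/2) F_1(1/2 1/2 j1 j), A_1(2) = f1_gamma."""
    return 0.5 * f1_capture(j1, j) * f1_gamma * math.cos(theta)


# Example from the text: j1 = 0, j = 1/2, F_1(1 1 1/2 1/2) = -1.00 (from Al 57).
p3_forward = p3_after_capture(0.0, 0.0, 0.5, -1.00)
```

With these inputs `p3_forward` reproduces the quoted result $P_3 = +1$ at $\theta = 0$.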
For the $\beta$-$\gamma$ direction-circular correlation (Le 56, 
Al 57, Ga 57, Mo 57) the formal procedure is the same. 
For the beta particle, we write (Al 57) 

$$A_\nu(\beta) = \sum_{L_1 L_1'} F_\nu(L_1 L_1' j_1 j)\, b_\nu(L_1 L_1'), \qquad \text{(I-35)}$$

where $b_\nu(L_1 L_1')$ is the particle factor for the beta ray. 
The mixing of multipole orders in the beta radiation is 
governed by the particle factors which are characteristic 
of the beta process. The particle factors for allowed beta 
transitions are given in Table X which is an extension 
of the table given by Alder et al. (Al 57). For the degree 
of circular polarization, we obtain from (I-32) 

$$P_3 = -[A_1(\beta)/A_0(\beta)]\,A_1(\gamma)P_1(\cos\theta),$$
where 
$$A_1(\beta) = F_1(1 1 j_1 j)\,|M_{GT}|^2 S_1^{(11)} + F_1(0 1 j_1 j)\,|M_F|\cdot|M_{GT}|\,S_1^{(01)},$$
$$A_0(\beta) = |M_F|^2 S_0^{(00)} + |M_{GT}|^2 S_0^{(11)}.$$


[Figure 1: diagram labeling the polarization direction, detected radiation, gamma-ray direction, reaction plane, and azimuthal plane.] 

FIG. 1. Angles and vectors involved in a reaction sensitive to 
the linear polarization of a gamma ray, presented so as to apply 
to all such reactions discussed. The phrase “detected radiation” 
can represent a Compton scattered gamma ray, a photoproton 
from the photodisintegration of the deuteron, etc. 


Let 


So =21Cr|?, Sot=1|Cer|?, a=|Cer/Cr|’, 
and «=a!Mer/Mr. 


met 


Then, 


A,(8)/A0(B)=3(p/E) 
XEGE (Ojj), 


where 2 (p/E)G=S:1®/So™® and 3(p/E)I=S1%/So™. 
This is the form and notation given in Bo 58. The experimental 
results on the beta-gamma correlation and 
on the electron distribution from polarized nuclei 
(Wu 57) have shown that $G=+1$ for positrons and $-1$ for 
electrons. As an example, consider the Co$^{60}$ decay, 
$5(\beta)4(2)2(2)0$. The beta and gamma radiations are 
pure. For the first gamma ray, with $G=-1$, we find 

$$P_3 = \tfrac23 (v/c)(0.774)(-0.645)\cos\theta = -\tfrac13 (v/c)\cos\theta$$

from the tables in Al 57. For the second gamma ray a 
factor must be inserted for the unobserved gamma ray, 
but the numerical result is the same. 
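The arithmetic of the Co$^{60}$ example is easy to verify. The short sketch below multiplies the factors as they are read here from the damaged scan (the leading fraction is taken to be 2/3, an assumption consistent with the quoted result of $-\tfrac13$); the function name and its default arguments are hypothetical.

```python
import math


def co60_p3_coefficient(prefactor=2.0 / 3.0, f_beta=0.774, f_gamma=-0.645):
    """Coefficient of (v/c) cos(theta) in P_3 for the first Co-60 gamma ray."""
    return prefactor * f_beta * f_gamma


def co60_p3(v_over_c, theta):
    """Degree of circular polarization at angle theta for electron speed v."""
    return co60_p3_coefficient() * v_over_c * math.cos(theta)


coefficient = co60_p3_coefficient()
```

The product of the two tabulated factors is close to one half, so the coefficient comes out very nearly $-\tfrac13$, as stated.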
TABLE X. Particle parameters $b_\nu(LL')$ for allowed beta transitions. The parameter $b_\nu(LL')$ is the product of the matrix element 
(ME) and $S_\nu^{(LL')}$. The coupling constants are denoted by $C_S$, $C_V$, etc. The energy and momentum of the electron are denoted by $E$ 
and $p$, and $\alpha$ is the fine-structure constant. The Fierz interference terms have been put equal to zero (cf. Al 57, Mo 57, Ga 57). 


L  L'  ME  $S_\nu^{(LL')}$ 
0 0 |Mr]? So =1([Cs|?+1Cs‘|?+1Cv|2+1Cv'|?} 
1 1 |Mer|? So = ${|Cr|?-+|Cr’|2+|Cal?+|Ca’|?} 
Sı =} (p/E) {2 Re(CrCr’*—CaCa'*) + (Za/p) Im(CrCa’*+-Cr'Ca*)} 
O 1 |Mr\-|Mer| So=0 


Si) =} (p/E) {Re(CsCr’*+Cs'Cr* —CyCa’* —Cy'Ca*) + (Za/p) Im(CsCa’*+-Cs'Ca* —CyCr’*—Cy'Cr*) } 


II. POLARIZATION-SENSITIVE PROCESSES 


Primary emphasis is given in this part to those 
processes that have proved the most practicable and 
useful in the detection of polarization. They are pre- 
sented in the approximate order of their importance as 
determined chiefly by the frequency of application. 
Section D, Part II, contains a brief discussion of some of 
the processes which have not as yet found use in the 
study of nuclear gamma rays. 


A. Compton Effect 
(a) Sensitivity to Plane Polarization 


The sensitivity of the Compton effect to the polarization 
of an incident gamma ray can be obtained from the 
Klein-Nishina formula (Kl 29, Ni 29) which gives the 
differential cross section for the process. In almost all 
cases encountered experimentally the direction of polarization 
of the scattered photon is not of interest. Thus, 
for the present discussion the most useful form of the 
Klein-Nishina expression is the one in which a summation 
has been made over all directions of polarization 
of the scattered photon with the result¶ 

$$d\sigma_\phi = \frac{r_0^2}{2}\,d\Omega\,\frac{k^2}{k_0^2}\left[\frac{k_0}{k} + \frac{k}{k_0} - 2\sin^2\theta\cos^2\phi\right], \qquad \text{(II-1)}$$


where $r_0 = e^2/m_0c^2$ is the classical radius of the electron, 
$d\Omega$ is the element of solid angle into which the photon 
is scattered, $\theta$ is the angle through which the incident 
photon is scattered, $\phi$ is the angle between the electric 
vector of the incident photon and the plane of scattering, 
and $k_0$ and $k$ are the energies of the incident and 
scattered photons, respectively. These energies are related 
by the expression 

$$k = \frac{k_0}{1 + (k_0/m_0c^2)(1-\cos\theta)}. \qquad \text{(II-2)}$$

The angles and vectors involved in a Compton scattering 
are illustrated in Fig. 1. 

The effect of polarization is contained in the $\cos^2\phi$ 
factor of the last term of (II-1) which shows that the 


¶ A more detailed treatment including the derivation of this 
and other forms of the differential cross section can be found in 
He 44; see also Ev 55. 


scattering is a maximum in the plane normal to the 
electric vector of the incident photon. As the energy of 
the incident gamma ray approaches zero the sum of 
the first two terms in the brackets approaches two. 
Thus in the limit of zero energy at $\theta=90^\circ$ the Compton 
effect has an ideal response to polarization, i.e., the 
differential cross section is zero when $\phi=0^\circ$ and a 
maximum when $\phi=90^\circ$. At any given $\theta$, an increase in 
the energy $k_0$ produces a decrease in the efficiency of 
analysis as expressed by the asymmetry ratio, $R = d\sigma_{90}/d\sigma_0$. 
Of especial interest is $\theta_{\max}$, the value of $\theta$ which 
makes $R$ a maximum for a given energy $k_0$. This $\theta_{\max}$ 
is 90° for $k_0=0$ and decreases with increasing energy as 
shown in Fig. 2, which gives values of $\theta_{\max}$ up to an 
energy of about 6 Mev. A similar curve for energies up 
to about 1.5 Mev is given in Me 50.** 
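The behaviour of $R$ and $\theta_{\max}$ described above can be explored numerically. The following is a minimal sketch, assuming the standard Klein-Nishina bracket $k_0/k + k/k_0 - 2\sin^2\theta\cos^2\phi$ for (II-1) and the kinematic relation (II-2), with energies in units of $m_0c^2$. The function names and the simple grid search are illustrative choices, not the method used to draw Fig. 2.

```python
import math


def scattered_energy(k0, theta):
    """Scattered photon energy from (II-2); energies in units of m0*c^2."""
    return k0 / (1.0 + k0 * (1.0 - math.cos(theta)))


def kn_polarized(k0, theta, phi):
    """Differential cross section of (II-1) in units of r0^2 / 2."""
    k = scattered_energy(k0, theta)
    bracket = k0 / k + k / k0 - 2.0 * math.sin(theta) ** 2 * math.cos(phi) ** 2
    return (k / k0) ** 2 * bracket


def asymmetry_ratio(k0, theta):
    """R = dsigma(phi = 90 deg) / dsigma(phi = 0 deg) at scattering angle theta."""
    return kn_polarized(k0, theta, math.pi / 2.0) / kn_polarized(k0, theta, 0.0)


def theta_max(k0, steps=9000):
    """Grid search for the angle (radians) maximizing R; returns (theta, R)."""
    best_theta, best_r = 0.0, 0.0
    for i in range(1, steps):
        theta = math.pi * i / steps
        r = asymmetry_ratio(k0, theta)
        if r > best_r:
            best_theta, best_r = theta, r
    return best_theta, best_r


# Example: a 0.511-Mev photon (k0 = 1) and a 10-Mev photon.
theta_1, r_1 = theta_max(1.0)
theta_10mev, r_10mev = theta_max(10.0 / 0.511)
```

In this idealized geometry the ratio at 10 Mev still exceeds 1.1, in line with the "at least a 10% effect" mentioned in the discussion of Fig. 2.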

Figure 2 also shows the values of $R$ at $\theta=\theta_{\max}$ plotted 


FIG. 2. The Compton scattering angle $\theta_{\max}$ at which the asymmetry 
ratio $R$ is maximum, and the asymmetry ratio $R$ for $\theta=\theta_{\max}$ 
as a function of incident photon energy up to about 6 Mev, the 
highest energy at which a Compton polarimeter has been used. 


** There is disagreement between the latter curve and the one 
given here, which becomes larger as the energy increases from 
zero. The reason for this discrepancy is not known: perhaps an 
approximation valid only in the region of small energy was made 
in the calculations for the curve in Me 50. In the curve presented 
here no approximations have been made. 
FIG. 3. Total cross section $\sigma$ for the Compton effect (curve from 
Da 52a), and the differential cross section $d\sigma$ at $\theta=\theta_{\max}$, as a 
function of incident photon energy. 


as a function of energy. This figure shows that beyond 
about 2 Mev the value of $R$ decreases very slowly as it 
approaches a value of one at infinite energy. Consequently, 
the Compton effect might in some cases be 
useful at energies as high as 10 Mev since at least a 
10% effect would exist. However, its use at this energy 
might be somewhat marginal since the values of $R$ 
given in Fig. 2 are for an ideal geometry. In any practicable 
experimental arrangement the value of $R$ is 
considerably reduced by the angular spread of the detector 
apertures. Of course, another consideration involved 
in the case of the Compton effect at higher 
energies is the decrease in the cross section with increasing 
energy as is illustrated in Fig. 3. This decrease 
is accentuated in the differential cross section at $\theta=\theta_{\max}$ 
because of the corresponding increase in the forward 
peaking in the angular distribution. A discussion of the 
geometrical considerations associated with the application 
of the Compton effect to particular polarization 
experiments is deferred to Part III, Section A(c). 

The value of the differential cross section for a Compton 
scattering at an angle (65° to 90°) useful for most 
polarization analyses is of the order of $10^{-26}$ cm$^2$/sterad/electron 
for the majority of nuclear gamma rays 
(see Fig. 3). Figure 2 also shows that the Compton 
effect furnishes a large efficiency for detection of polarization 
over quite an extended energy range. The relatively 
high cross section and large analyzing efficiency, 
coupled with the relative ease of experimental application, 
have made the Compton effect the most useful 
of the polarization-sensitive mechanisms. 


(b) Sensitivity to Circular Polarization 


Interest in the sensitivity of the Compton effect to 
circular polarization has increased considerably with the 
recent investigations of nonconservation of parity in 
weak interactions by means of experiments on the 
circular polarization of gamma rays following beta 
decay. Although the scattering of circularly polarized 
gamma rays by polarized electrons had been discussed 
by earlier workers (Bi 51, Ha 51, Cl 52), the first effective 
experimental study of the effect was made by 
Gunst and Page (Gu 53). In this work the total cross 
section is given as $\sigma = \sigma_0 + \sigma_1$ where $\sigma_0$ and $\sigma_1$ are the 
parts unaffected and affected, respectively, by circularly 
polarized gamma rays. Gunst and Page measured 
the cross section for the effect at $k_0 = 2.62$ Mev and 
found a value of $\sigma_1/\pi r_0^2 = 0.089 \pm 0.007$ in good agreement 
with the theoretical value, 0.093, found from expressions 
calculated by Stehle (see Gu 53). The experiment 
was done by measuring the difference in intensity 
of the 2.62-Mev gamma rays from ThC″ which were 
transmitted through an iron bar 30-cm long when magnetized, 
and then demagnetized. Although the gamma 
rays were initially unpolarized, they can be considered 
as consisting of equal components of right and left 
circularly polarized radiation. Consequently, when the 
magnetic field is applied, one of these components is 
preferentially scattered. 

Lipps and Tolhoek (Li 54, Li 54a; see also Fr 38, 
Fa 49, To 56) have derived the Compton cross section 
in a general form with all polarizations taken into 
account, 

$$d\sigma = r_0^2 (k^2/k_0^2)\,\Phi(k_0, k, \xi_0, \xi, \zeta_0, \zeta)\,d\Omega, \qquad \text{(II-3)}$$

where $\xi_0$, $\xi$, $\zeta_0$, and $\zeta$ are the respective polarization 
vectors of the incident and scattered photon and 
of the electron before and after interaction, and 
$\Phi(k_0, k, \xi_0, \xi, \zeta_0, \zeta)$ is a linear function of these polarization 
vectors. The function $\Phi$ can be separated into 16 

FIG. 4. Polarization-sensitive part $d\sigma_1$ of the differential Compton 
cross section for electrons polarized along the axis of incident 
circularly polarized photons. The curves are normalized to the 
polarization-insensitive cross section $d\sigma_0$ and plotted for several 
energies as a function of $\cos\theta$, where $\theta$ is the angle through which 
the photon is scattered (figure from Gu 53). 

FIG. 5. Polarization-sensitive part $\sigma_1$ of the total Compton 
cross section (in units of $2\pi r_0^2$) for the same electron and photon 
polarizations as in Fig. 4. In the dashed curve $\sigma_1$ is 
normalized to $\sigma_0$, the polarization-insensitive part of the 
total cross section. The curves are plotted against incident 
photon energy (figure from Gu 53). 
terms, depending on the 16 different ways of choosing 
sets of polarization vectors. Thus, 

$$\Phi = \Phi_0 + \Phi_1 + \Phi_2 + \Phi_3 + \Phi_4, \qquad \text{(II-4)}$$
where $\Phi_0$ is independent of polarizations, 
$$\Phi_1 = \Phi_1(\xi_0) + \Phi_1(\xi) + \Phi_1(\zeta_0) + \Phi_1(\zeta), \qquad \text{(II-5)}$$
$$\Phi_2 = \Phi_2(\xi_0,\xi) + \Phi_2(\xi_0,\zeta_0) + \Phi_2(\xi_0,\zeta) + \Phi_2(\xi,\zeta_0) + \Phi_2(\xi,\zeta) + \Phi_2(\zeta_0,\zeta), \qquad \text{(II-6)}$$

and $\Phi_3$ and $\Phi_4$ depend on 3 and 4 polarization vectors, 
respectively. Explicit expressions are given in Li 54a 
for all the $\Phi_0, \ldots, \Phi_4$. 
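The count of 16 terms is simply the number of subsets of the four polarization vectors. As a small illustrative sketch (the string labels `xi0`, `xi`, `zeta0`, `zeta` are placeholder names for the incident-photon, scattered-photon, initial-electron, and final-electron polarization vectors), one can enumerate the argument sets of each group of terms:

```python
from itertools import combinations

# Placeholder labels for the four polarization vectors entering (II-3).
VECTORS = ("xi0", "xi", "zeta0", "zeta")


def polarization_terms():
    """Map n -> list of argument sets for the terms depending on n vectors."""
    return {n: list(combinations(VECTORS, n)) for n in range(len(VECTORS) + 1)}


terms = polarization_terms()
counts = [len(terms[n]) for n in range(len(VECTORS) + 1)]
total_terms = sum(counts)
```

The counts 1, 4, 6, 4, 1 correspond to the single polarization-independent term, the four one-vector terms of (II-5), the six pair terms of (II-6), and the three- and four-vector terms.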

Thus far, the only experiments performed have in- 
volved not more than two polarizations. It is necessary 
then to average over unobserved initial polarizations 
and sum over unobserved final polarizations in order to 
obtain the cross section for some definite experiment. 
Since the states represented by $-\xi$ and $-\zeta$ are orthogonal 
to those represented by $\xi$ and $\zeta$, the terms 
which are functions only of the unobserved polarizations 
will vanish in these averages and sums. 

For the case of interest here the initial photon polarization 
$\xi_0$ and the initial electron polarization $\zeta_0$ are 
specified. For this case, (II-3) takes the form, 

$$d\sigma(\xi_0, \zeta_0) = \tfrac12 r_0^2 (k^2/k_0^2)\,[\Phi_0 + \Phi_1(\xi_0) + \Phi_1(\zeta_0) + \Phi_2(\xi_0, \zeta_0)]\,d\Omega. \qquad \text{(II-7)}$$

In this expression $\Phi_0$ gives the usual Klein-Nishina formula 
independent of polarization, $\Phi_1(\zeta_0)=0$, and $\Phi_1(\xi_0)=0$ 
if no linear polarization of the gamma ray is present. 
Hence, with 



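we have (To 56) 

$$P\,d\sigma_1 = \tfrac12 r_0^2 (k^2/k_0^2)\,\Phi_2(\xi_0, \zeta_0)\,d\Omega = -\tfrac12 P r_0^2 \left(\frac{k}{k_0}\right)^2\left(\frac{k_0}{k} - \frac{k}{k_0}\right)\cos\theta\,d\Omega. \qquad \text{(II-9)}$$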


The electron polarization has been taken parallel (or 
antiparallel) to the direction of the incident photon,†† 
and $P$ is the product of the polarizations of the photon 
and of the electron. ($P$ is positive for a right circularly 
polarized gamma ray and an electron spin parallel to 
the direction of the incident photon.) Equation (II-9) 
is given in Gu 53. If an integration is performed over 
$d\Omega$, the part of the total cross section dependent on 
polarization becomes 

$$\sigma_1 = 2\pi r_0^2\left[\frac{1 + 4k_0 + 5k_0^2}{k_0(1+2k_0)^2} - \frac{1+k_0}{2k_0^2}\ln(1+2k_0)\right], \qquad \text{(II-10)}$$

with $k_0$ in units of $m_0c^2$. 

The ratio of the differential cross sections $d\sigma_1/d\sigma_0$ is 
plotted as a function of $\cos\theta$ for several energies in 
Fig. 4. This graph shows the difference in the sign of 
the response to polarization in the forward and backward 
quadrants. In general, also, the response at a 
given angle increases with increasing energy. The 
polarization-dependent total cross section $\sigma_1$ is presented 
as a function of energy in Fig. 5. The change in 
sign of $\sigma_1$ at $E = 1.25\,m_0c^2 = 0.65$ Mev arises as a result 
of the difference in sign of $d\sigma_1$ for forward and backward 
directions together with the fact that the forward 
Compton scattering becomes more dominant at higher 
energies. 
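The numbers quoted in this subsection can be cross-checked with a short calculation. The sketch below assumes the polarization-dependent differential cross section $d\sigma_1 = -\tfrac12 r_0^2 (k/k_0)^2 (k_0/k - k/k_0)\cos\theta\,d\Omega$ for $P = 1$ and compares its integral over solid angle with a closed form for $\sigma_1$ in units of $2\pi r_0^2$. The closed form used here is the one that reproduces both the theoretical value 0.093 at 2.62 Mev and the sign change near $1.25\,m_0c^2$; the function names and the value 0.511 Mev for $m_0c^2$ are illustrative assumptions.

```python
import math

M0C2_MEV = 0.511  # electron rest energy in Mev (assumed value)


def sigma1_closed(k0):
    """Polarization-dependent total cross section in units of 2*pi*r0^2."""
    first = (1.0 + 4.0 * k0 + 5.0 * k0 ** 2) / (k0 * (1.0 + 2.0 * k0) ** 2)
    second = (1.0 + k0) / (2.0 * k0 ** 2) * math.log(1.0 + 2.0 * k0)
    return first - second


def sigma1_numeric(k0, steps=20000):
    """Midpoint-rule integral of d(sigma_1) over cos(theta), same units."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        c = -1.0 + (i + 0.5) * h                  # cos(theta)
        ratio = 1.0 / (1.0 + k0 * (1.0 - c))      # k / k0 from (II-2)
        total += -0.5 * ratio ** 2 * (1.0 / ratio - ratio) * c * h
    return total


def sign_change_energy(lo=0.5, hi=3.0, iterations=60):
    """Bisection for the energy (units of m0*c^2) at which sigma_1 vanishes."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sigma1_closed(lo) * sigma1_closed(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


k0_thc = 2.62 / M0C2_MEV                       # the ThC'' gamma ray
sigma1_over_pi_r0sq = 2.0 * sigma1_closed(k0_thc)
crossing = sign_change_energy()
```

Here `sigma1_over_pi_r0sq` has magnitude close to 0.093 and `crossing` lies near 1.25, matching the values cited from Gu 53.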
FIG. 6. Cosine of the Compton scattering angle which yields 
the maximum relative azimuthal anisotropy due to circular 
polarization, plotted as a function of the incident photon energy. 
In this case the circular polarization is perpendicular to the electron 
polarization (figure from Be 57). 


This discussion has led to expressions applicable to 
the case in which circularly polarized gamma rays are 
scattered from electrons polarized parallel or antiparallel 
to the initial direction of propagation of the 
gamma ray. This is the case most used so far in experiments 
on circular polarization of gamma rays. Virtually 
all of these experiments detect the difference in the 
intensity of gamma rays transmitted through, or scattered 
forward from, magnetized iron, when the magnet is 
turned on and when it is turned off or reversed. 
Recently Beard and Rose (Be 57) have suggested the 
use of the case in which the circularly polarized gamma 
rays are scattered from electrons polarized perpendicular 
to the initial direction of gamma-ray propagation. 
This case offers the advantage of determining an azimuthal 
asymmetry from two simultaneous measurements 
at opposite azimuthal scattering angles in the 
plane of the electron spin and photon propagation 
vector. Beard and Rose show‡‡ that at a certain scattering 
angle $\theta=\theta_m$, the relative azimuthal anisotropy $J_1/J_0$ 
is a maximum. Figure 6 gives a graph of $\cos\theta_m$ as a 
function of photon energy $k_0$. In Fig. 7, $J_1/J_0$ at $\theta=\theta_m$ 
is shown as a function of the energy of the incident 
photon. For completely polarized electrons, $J_1/J_0$ 
reaches a maximum value of 0.33 at 0.511 Mev. 

The most significant practical limitation involved in 
applying the above cases to an experiment is the fact 
that only about 2 electrons of the 26 in an iron atom 
are polarized. Thus in all of the experiments performed 
oss section may be grained e a e 
direc on of the inci 
on forward scattering and transmission of circularly 
polarized photons, the possible effects amount at most 
to only a few percent. In the case treated by Beard and 
Rose, even at 0.511 Mev where $J_1/J_0$ is maximum, 
there will be only a 5% effect (for complete photon 
polarization and ideal geometry). 


B. Photodisintegration of the Deuteron 


At gamma-ray energies below 20 Mev, and particu- 
larly below 10 Mev, the existing theory for the photo- 
disintegration of the deuteron is satisfactory. This is 
the region of interest since we are concerned only with 
the polarization of gamma rays of nuclear origin. The 
threshold energy for the photodisintegration of the 
deuteron, 2.225 Mev (Be 56), places a definite lower 
limit on the energy of a nuclear gamma ray whose polari- 
zation can be studied using this mechanism. In the 
energy region of interest (2.225 to 20 Mev) the disinte- 
gration proceeds almost entirely by an electric dipole 
or magnetic dipole interaction. The former is sensitive 
to polarization, while the latter is not. 

FIG. 7. Maximum relative azimuthal anisotropy $J_1/J_0$ resulting 
from Compton scattering of incident circularly polarized photons 
plotted as a function of incident photon energy. The circular 
polarization is perpendicular to the electron polarization (figure from Be 57). 

The cross section for the photodisintegration of the 
deuteron in the form derived by Bethe and Longmire 
(Be 50 and Be 56), assuming a central force potential, 
is expected to be valid below about 10 Mev. The differential 
cross section for electric dipole disintegration, 
the one that is polarization-sensitive, is given by 

$$d\sigma^e = 2\left(\frac{e^2}{\hbar c}\right)\left[\frac{\gamma k^3}{(\gamma^2+k^2)^3}\right]\left(\frac{1}{1-\gamma r_{0t}}\right)\cos^2\alpha\,d\Omega, \qquad \text{(II-11)}$$

where $\alpha$ is the angle between the direction of polarization 
of the gamma ray and the direction of motion of 
the proton (or neutron), and $r_{0t}$ is the effective neutron-proton 
range in the triplet state. The parentheses containing 
$r_{0t}$ represent a correction to the zero-range 
approximation for the cross section, which is given by 
the rest of the expression. The quantities $\gamma$ and $k$ can 
be obtained from the expressions 

$$E = h\nu - W_1 = \hbar^2k^2/M, \qquad \text{(II-12)}$$
$$W_1 = \hbar^2\gamma^2/M, \qquad \text{(II-13)}$$


where $h\nu$ is the energy of the gamma ray, $W_1$ the binding 
energy of the deuteron, and $M$ the mass of the 
nucleon. That the angular distribution of the photoprotons 
varies as $\cos^2\alpha$ has been experimentally confirmed 
by Wilkinson (Wi 52) using the completely 
polarized gamma rays from the reaction D$^2$(p,$\gamma$)He$^3$. 

Since it is usually the azimuthal distribution of 
photoprotons or photoneutrons that is observed in 
determining the polarization of a gamma ray, the more 
useful form of (II-11) is 

$$d\sigma^e = 2\left(\frac{e^2}{\hbar c}\right)\left[\frac{\gamma k^3}{(\gamma^2+k^2)^3}\right]\left(\frac{1}{1-\gamma r_{0t}}\right)\sin^2\theta\cos^2\phi\,d\Omega, \qquad \text{(II-14)}$$

where $\theta$ and $\phi$ are the usual polar coordinates. The 
direction of propagation of the gamma ray is given by 
$\theta=0$, and its direction of polarization by $\theta=90^\circ$, $\phi=0$. 
The geometrical relationship of $\theta$, $\phi$, and $\alpha$ is illustrated 
in Fig. 1. 
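The step from $\cos^2\alpha$ to $\sin^2\theta\cos^2\phi$ follows from taking the gamma ray along the polar axis and its polarization along $\theta=90^\circ$, $\phi=0$. A minimal sketch of this geometry is given below; the coordinate choice (propagation along z, polarization along x) and the function names are assumptions made for illustration. It also checks that the angular factor integrates to $4\pi/3$ over the sphere, which is what converts the coefficient 2 of the differential form into the $8\pi/3$ of the total electric dipole cross section.

```python
import math


def cos_alpha(theta, phi):
    """Cosine of the angle between the polarization and the emitted nucleon.

    The gamma ray is taken along z and its polarization along x.
    """
    polarization = (1.0, 0.0, 0.0)
    direction = (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )
    return sum(p * d for p, d in zip(polarization, direction))


def angular_integral(n_theta=400, n_phi=400):
    """Midpoint-rule integral of cos^2(alpha) over the full solid angle."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += cos_alpha(theta, phi) ** 2 * math.sin(theta) * d_theta * d_phi
    return total


integral = angular_integral()
```

The emission is strongest along the polarization direction (`cos_alpha` equal to 1) and vanishes at $\theta=90^\circ$, $\phi=90^\circ$.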

If the photodisintegration proceeded entirely by an 
electric dipole transition, the reaction would produce 
an ideal response to polarization in the sense defined 
above, namely that $R = d\sigma_{90}/d\sigma_0 = 0$.§§ Over a large 
part of the energy range of interest this is nearly the 
case. However, as the gamma-ray energy is decreased 
to the threshold value the reaction proceeds increasingly 


[Figure 8: cross section $\sigma$ in millibarns plotted against photon energy in Mev.] 

FIG. 8. Total cross section for the photodisintegration of the 
deuteron as a function of incident photon energy. Solid curves 
show the electric dipole and magnetic dipole cross sections as 
labeled, and dashed curve shows their sum. Points show representative 
experimental data (figure from Ev 55). 


§§ In the Compton effect we have $R = \infty$ for an ideal response. 
FIG. 9. The ratio $A/B$ as a function of energy in Mev. This 
ratio is a measure of the isotropic term in the cross section for the 
photodisintegration of the deuteron. A noticeable effect starts at 
about 10 Mev (figure from Wh 58). 


by the magnetic dipole interaction which yields an 
isotropic distribution of photoprotons. The total cross 
section for this process is given by (Se 53 and Be 50) 

$$\sigma^m = \frac{2\pi}{3}\,\frac{e^2}{\hbar c}\left(\frac{\hbar}{Mc}\right)^2(\mu_n-\mu_p)^2\,\frac{k\gamma(\gamma-1/a_s)^2}{(k^2+\gamma^2)(k^2+1/a_s^2)}\,R(E), \qquad \text{(II-15)}$$

where $\mu_n$ and $\mu_p$ are the magnetic moments of the neutron 
and proton, respectively, $a_s$ is the singlet scattering 
length, and $R(E)$ is a correction for the nonzero range 
of the neutron-proton force, which has been derived by 
Salpeter (Sa 51). 

Since the presence of the magnetic dipole interaction 
diminishes the polarization sensitivity of the reaction 
as a whole, it is of interest to compare the total cross 
sections for the magnetic and electric processes as a 
function of the gamma-ray energy. The total cross section 
for the electric dipole process, obtained by integrating 
(II-11) over $\theta$ and $\phi$, is 

$$\sigma^e = \frac{8\pi}{3}\left(\frac{e^2}{\hbar c}\right)\left[\frac{\gamma k^3}{(\gamma^2+k^2)^3}\right]\left(\frac{1}{1-\gamma r_{0t}}\right). \qquad \text{(II-16)}$$

The two cross sections in the zero-range approximation 
(i.e., without the range correction given by the last 
factor in each expression) are compared in Fig. 8 
(Ev 55). The sum of the two cross sections is also shown 
as are several experimental points which agree fairly 
well with the curve and indicate the extent of validity 
of the zero-range approximation. 
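To give a feeling for the magnitudes involved, the electric dipole cross section can be evaluated numerically. The sketch below assumes the form $\sigma^e = (8\pi/3)(e^2/\hbar c)\,\gamma k^3/(\gamma^2+k^2)^3\,(1-\gamma r_{0t})^{-1}$ with $k$ and $\gamma$ from (II-12) and (II-13). The numerical constants ($\hbar c$, the fine-structure constant, a mean nucleon rest energy) and the sample triplet range of 1.7 fm are assumed values that do not come from the text; only the binding energy of 2.225 Mev is taken from it.

```python
import math

HBAR_C = 197.327           # Mev * fm (assumed)
ALPHA = 1.0 / 137.036      # e^2 / (hbar * c) (assumed)
NUCLEON_MC2 = 938.92       # mean nucleon rest energy in Mev (assumed)
W1 = 2.225                 # deuteron binding energy in Mev (from the text)


def wave_numbers(h_nu):
    """Return (k, gamma) in fm^-1 from (II-12) and (II-13)."""
    k = math.sqrt(NUCLEON_MC2 * (h_nu - W1)) / HBAR_C
    gamma = math.sqrt(NUCLEON_MC2 * W1) / HBAR_C
    return k, gamma


def sigma_e_mb(h_nu, r0t=0.0):
    """Electric dipole cross section in millibarns; r0t in fm (0 = zero range)."""
    k, gamma = wave_numbers(h_nu)
    zero_range = (8.0 * math.pi / 3.0) * ALPHA * gamma * k ** 3 / (gamma ** 2 + k ** 2) ** 3
    return 10.0 * zero_range / (1.0 - gamma * r0t)   # 1 fm^2 = 10 mb


def peak(step=0.01, start=2.3, stop=20.0):
    """Scan photon energies (Mev) for the maximum zero-range cross section."""
    count = int(round((stop - start) / step)) + 1
    energies = [start + step * i for i in range(count)]
    best = max(energies, key=sigma_e_mb)
    return best, sigma_e_mb(best)


peak_energy, peak_sigma = peak()
```

Under these assumptions the zero-range cross section peaks at twice the binding energy with a value of roughly 1.4 mb, and the range correction raises it further, consistent with the millibarn scale of Fig. 8.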
FIG. 10. The asymmetry coefficients $\beta$ and $\beta_1$, as defined by 
(II-17) and (II-18), as a function of incident photon energy. 
Values of $\beta$ and $\beta_1$ are obtained after first determining $A/B$. For 
comparison the theoretical prediction $v/c = [(h\nu - W_1)/Mc^2]^{1/2}$ 
(Ma 49) is plotted (figure from Wh 58). 

Other effects, which generally tend to diminish the 
ideal polarization sensitivity of the electric dipole 
disintegration, arise at energies above 10 Mev, and especially 
above 20 Mev. Since the nature of these effects is 
under current theoretical discussion (Hu 57a), we confine 
attention to the empirical situation. The cross section 
is given in either one of the following forms 
(Wh 58): 

$$d\sigma/d\Omega = A + B\sin^2\theta\,(1 + 2\beta\cos\theta), \qquad \text{(II-17)}$$

$$d\sigma/d\Omega = (A + B\sin^2\theta)(1 + 2\beta_1\cos\theta). \qquad \text{(II-18)}$$


The term in $\beta$ or $\beta_1$ is attributed to the onset of the 
electric quadrupole interaction. In the recent experimental 
work by Whetstone and Halpern (Wh 58), 
values of $A/B$ and $\beta$ or $\beta_1$ have been determined as 
shown in Figs. 9 and 10. These values of $A/B$, along 
with those of Allen (Al 55), are in good agreement with 
the recent calculations of de Swart and Marshak 
(De 58). 

All these expressions are given in the center-of-mass 
system. Thus, in addition to the forward asymmetry 
which appears at higher energies, a forward asymmetry 
arises experimentally in the laboratory system because 
of the forward momentum of the gamma ray. This 
effect is illustrated in Fig. 11 which gives the laboratory 
angle corresponding to 90° in the center-of-mass system. 
In the range of interest the shift is not large. 

The magnetic dipole reaction and the deviations from 
a pure electric dipole distribution above 10 Mev constitute the only 
significant effects that diminish the efficiency for detection 
of polarization in the energy region of interest. 
From about 4 to 12 Mev the asymmetry 
ratio $R = d\sigma_{90}/d\sigma_0$ is less than 0.04. Perhaps the only 
serious drawback to the use of the photodisintegration 
of the deuteron in the detection of polarization is the 
relatively low total cross section ($\sim 10^{-27}$ cm$^2$; see 
Fig. 8). 


C. Photoelectric Effect 


The earliest measurements of polarization in the kev 
range made use of the photoelectric effect. Experiments 
using polarized photons having energies less than 40 
kev (Wi 23, Bu 24, Ki 31) confirmed the basic theoretical 
prediction (Au 27, So 30) that the distribution 
of photoelectrons should vary approximately as $\cos^2\alpha$. 
As in the case of the photodisintegration of the deuteron, 
this distribution in terms of polar angles is $\sin^2\theta\cos^2\phi$, 
where $\alpha$, $\theta$, and $\phi$ are defined in the same way as before 
(Fig. 1). 

The distribution has been calculated for electrons 
ejected from the K shell. Correspondingly, almost all 
the work dealing with the polarization sensitivity of the 
photoelectric effect has been concerned with emission 
from the K shell,|| || since it constitutes about 80% of 
the total emission for the energies of interest here 
(below about 1 Mev). 

Fischer (Fi 31) and Heitler (He 44) have given the 
differential cross section in a nonrelativistic form (good 
to energies of about $m_0c^2$): 

$$d\sigma_\phi = r_0^2\,\frac{Z^5}{137^4}\,(32)^{1/2}\left(\frac{m_0c^2}{k_0}\right)^{7/2}\frac{\sin^2\theta\cos^2\phi}{[(1+k_0/2m_0c^2)-\beta\cos\theta]^4}\,d\Omega, \qquad \text{(II-19)}$$

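where $\beta = v/c$ and $Z$ is the atomic charge. Sauter (Sa 31, 
Sa 55) using relativistic considerations and assuming 
$\beta \sim 1$ and $Z/137 \ll 1$ obtains 


$d\sigma_\phi \sim C(A + B\cos^2\phi)\,d\Omega$, 
$A = (\epsilon^2/4)(1-\beta\cos\theta)$, 
$B = (1-\beta^2)^{1/2} - (\epsilon/2)(1-\beta\cos\theta)$, 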
C= (1/«)*(1—6?) 36? sin®0/ (1—8 cosb)‘, 


where $\epsilon$ is the kinetic energy of the photoelectrons in 
units of $m_0c^2$. 


(II-20) 


FIG. 11. Energy dependence of the angle in the laboratory system 
which corresponds to 90° in the center-of-mass system, in the 
photodisintegration of the deuteron (data from Wi 51a). 
|| || A nonrelativistic calculation by Schur (Sc 30) gives an 
azimuthal distribution for the electrons ejected from the $L_I$ level 
similar to the nonrelativistic K-shell distribution. However, the 
emissions from the $L_I$ shell represent only a small fraction of those 
emissions (20%) which are not from the K shell. 


In these expressions there is a pronounced forward 
asymmetry in the angular distribution which increases 
as the energy of the gamma ray is increased. This effect 
is illustrated in Fig. 12 which gives theoretical curves 
for the photoelectron distribution as a function of $\theta$ for 
several energies. In the nonrelativistic case the response 
to polarization is ideal, i.e., $R = d\sigma_{90}/d\sigma_0 = 0$. In the 
relativistic case, however, the response is decreased by 
the fractional amount $A/B$. Another effect predicted by 
(II-20) is that the favored direction of emission of the 
photoelectron changes by 90° as the gamma-ray energy 
passes through 0.51 Mev. This is shown in Fig. 13.¶¶ 
Accordingly, above 0.51 Mev the photoelectron would 
be emitted preferentially normal to the direction of 
polarization of the gamma ray. 
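The 90° change in the favored direction corresponds to the coefficient of $\cos^2\phi$ in (II-20) changing sign. As a rough sketch, assuming that coefficient reads $B = (1-\beta^2)^{1/2} - (\epsilon/2)(1-\beta\cos\theta)$ with $\epsilon$ the photoelectron kinetic energy in units of $m_0c^2$, and neglecting the K-shell binding energy so that $\epsilon$ is approximately the photon energy, one can locate the crossover at $\theta = 90^\circ$ by bisection. The function names and the 0.511 Mev rest energy are illustrative assumptions.

```python
import math

M0C2_MEV = 0.511  # electron rest energy in Mev (assumed value)


def sauter_b(eps, theta=math.pi / 2.0):
    """Assumed coefficient of cos^2(phi) in (II-20); eps in units of m0*c^2."""
    gamma = 1.0 + eps
    beta = math.sqrt(1.0 - 1.0 / gamma ** 2)
    return math.sqrt(1.0 - beta ** 2) - 0.5 * eps * (1.0 - beta * math.cos(theta))


def crossover_energy(lo=0.1, hi=5.0, iterations=80):
    """Bisection for the kinetic energy at which sauter_b changes sign."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sauter_b(lo) * sauter_b(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


eps_cross = crossover_energy()
crossover_mev = eps_cross * M0C2_MEV
```

At $\theta = 90^\circ$ the condition reduces to $\epsilon(1+\epsilon) = 2$, so the crossover falls at $\epsilon = 1$, that is, at about 0.51 Mev.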
A series of measurements with polarized photons 
(>40 kev) designed to investigate this and other 
features of Sauter’s expression has been carried out by 
Hereford and co-workers. Hereford (He 51, He 51a) 
studied the azimuthal distribution of the photoelectrons 
ejected from lead by the polarized radiation from 
positron-electron annihilation, and found a much greater 
asymmetry in the azimuthal distribution than predicted 
by (II-20). In a revised experiment Hereford 
and Keuper (He 53) obtained a result in rough agreement 
with that predicted by Sauter, if his equation, 
which is valid chiefly for low Z, is applied to the case 
of lead. McMaster and Hereford (Mc 54) continued the 
experiments by measuring the azimuthal asymmetry 
using photons (from Co$^{60}$) which had been polarized 
by Compton scattering. By changing the angle of 
scattering they were able to obtain polarized gamma 
rays of different energies. Their measured values of $R$ 
fell below unity above about 0.55 Mev, showing fair 
agreement with the theoretical curve in Fig. 13. 
Brini et al. (Br 57) recently repeated the experiment 
FIG. 12. Differential cross sections for photoelectrons ejected 
from the K shell, plotted against the angle of emission. The curves 
for the different photon energies are not normalized with respect 
to each other. Solid curves are calculated from Sauter’s relativistic 
formula (Sa 31), the dashed curve from Fischer’s non-relativistic 
formula (Fi 31) (figure from Ev 55). 


¶¶ In this figure it is the reciprocal of $R$ (as previously defined) 
that is plotted. 
FIG. 13. Asymmetry ratio $R = d\sigma_0/d\sigma_{90}$ for photoelectrons 
ejected from the K shell by linearly polarized photons for $\theta=90^\circ$, 
as calculated from Sauter’s formula (Sa 31). The abscissa is the 
incident photon energy in units of $m_0c^2$ (figure from Mc 54). 


of McMaster and Hereford, but did not find the 90° 
shift in the polarization response. Their data agree 
better with a theoretical curve of Archibald as given 
in Br 57. Apparently the knowledge of the polarization 
sensitivity of the photoelectric effect at higher energies 
is still somewhat uncertain. However, it is clear that 
the reaction can be used for gamma-ray energies which 
are well below 0.5 Mev. 

The primary disadvantage of the photoelectric effect 
as an analyzer of polarization is that it requires detection 
of the photoelectron. Low-energy electrons are 
somewhat more difficult to detect than are the gamma 
rays in a typical Compton process. Furthermore, the 
scattering of low-energy electrons will present a serious 
problem. 


D. Other Polarization-Sensitive Processes 
(a) Pair Production 


Berlin and Madansky (Be 50a) have investigated the 
possibility of using pair production as a polarization 
analyzer. Assuming a perfectly plane polarized incident 
photon they have calculated the expected asymmetry 
ratio $R = d\sigma_{90}/d\sigma_0$ for several different experimental 
cases. They prefer the experimental arrangement in 
which all pairs in a particular plane are counted 
regardless of angle or energy. For this case $R = 1.23$; i.e., 
the pair emission is a maximum in a plane perpendicular 
to the polarization. 

However, Wick (Wi 51) pointed out that the condi- 
tions specified by Berlin and Madansky are practically 
impossible to realize experimentally, since they specify 
that the plane of the pair contains exactly the direction 
of the incident quantum. Accordingly, Wick suggests 
removing the restriction on the angle between the 
direction of propagation of the photon and the plane 
of the pair. He then finds that this experimental situa- 
tion yields a polarization sensitivity comparable to 
those obtained by Berlin and Madansky. However, in 
Wick’s case the plane of the pair prefers to be parallel 
to the polarization vector. 

Pair production has not been used as a polarization 
analyzer for gamma rays of nuclear origin. This can be 
attributed primarily to the fact that, at the higher 
energies at which the reaction would prove most useful, 
the small angle between the pairs, the relatively small 
expected asymmetry ratio, and the scattering difficulties 
make it less desirable than the photodisintegration of 
the deuteron as a polarization analyzer. 


(b) Nuclear Photoeffect


Although the general treatment of the angular
distribution and polarization of reaction products (Bl 51,
Si 53) is applicable to photonuclear reactions, Agodi
(Ag 57) has given a specific treatment of the polarization
sensitivity of photonuclear reactions. Without
using a model of the nucleus for the photonuclear
reaction, Agodi shows on quite general grounds
(conservation of parity and angular momentum) that the
azimuthal dependence of the differential cross section
assumes a form proportional to (1 + a cos²φ), where φ
is the azimuthal angle as defined in Fig. 1. This result
holds regardless of whether a single or mixed
multipole transition is involved in the photonuclear
process. In the case in which a single multipole is
responsible for the transition, the value of a for a
completely polarized incoming gamma ray can be determined
from the angular distribution (in θ) coefficient, if it is
known. Such is not generally the case, however, when
two or more multipoles contribute to the transition.

Apart from the photodisintegration of the deuteron,
the most promising of all the known photonuclear
reactions is the reaction Be⁹(γ,n)Be⁸. Since the threshold
energy is about 1.66 Mev and the cross section at 9.5
Mev is about 1.6 mb, this reaction could compete with
the photodisintegration of the deuteron as an effective
analyzer of polarization for gamma rays if the value of
a were large enough.


(c) Photofission


Winhold and Halpern (Wi 56), Katz et al. (Ka 58),
and others have shown experimentally that certain
nuclei subject to photofission exhibit anisotropy in the
angular distribution of the fission fragments. Two such
nuclei, the even-even nuclei Th²³² and U²³⁸, for example,
show angular distributions of the form (a + b sin²θ).
Although the form of the distribution remains the same
over the energy range studied (5 to 20 Mev), the ratio
b/a varies from values of the order of 10 near threshold
(just above 5 Mev) to values of the order of 0.1 in the
vicinity of 20 Mev. The anisotropic distribution is
assumed to result from an electric dipole interaction.
Consequently, although the azimuthal distribution
produced by polarized photons has not yet been studied,
the foregoing angular distribution implies a polarization
sensitivity proportional to b/a. Other nuclei, such as
the odd-A nuclei U²³⁵ and Pu²³⁹, have essentially
isotropic angular distributions and thus are not expected
to manifest any sensitivity to polarization.

Although the photofission cross sections in the energy 
region of interest here are of the order of 10 mb, the fact 
that appreciable anisotropy in the angular distribution
is confined to the region roughly 2 to 3 Mev above
threshold places some limitation on the potential use 
of photofission as a polarization analyzer. In addition, 
if neutron-induced fission or spontaneous fission is 
present, the detection of the photofission might become 
very difficult. 


III. MEASUREMENTS OF POLARIZATION OF
NUCLEAR GAMMA RAYS 


The following discussion is organized primarily on 
the basis of the type of polarization-sensitive mechanism 
involved. Attention has been given to chronological 
order only insofar as it lends itself to a logical develop- 
ment. The development of technique has also been a 
consideration in selecting experiments for discussion. A 
complete listing and outline of the experiments is 
attempted in Table XIII at the end of Part III. Since 
this discussion is primarily devoted to experiments on 
nuclear gamma rays, the polarization experiments on 
annihilation radiation and bremsstrahlung are not dis- 
cussed except for the earliest annihilation experiments. 


A. Measurement of Plane Polarization by 
Means of the Compton Effect 


Not only the earliest, but also the most utilized, of 
the polarization sensitive reactions has been the Comp- 
ton effect. This is due to the fact (Part II) that the 
reaction has a relatively high cross section along with a 
reasonable sensitivity to polarization over an energy 
range which includes many nuclear gamma rays of 
interest. 

In general, a polarimeter that uses the Compton effect
consists of a scatterer and an analyzer which detects
the scattered radiation. The elements of such a
polarimeter are shown in Fig. 14. Here the polarimeter
consists of the elements B and C. Gamma rays from the
source impinge on the Compton scatterer B and the
scattered radiation is detected by the analyzing element
C. As a general rule B and C are crystals of scintillation
counters and coincidences between B and C are measured.
This makes it considerably easier to isolate and detect
the desired Compton events. A polarization measurement
is made by determining N₀/N₉₀ (often called
N∥/N⊥), the ratio of the coincidence counting rate
when C is placed at φ = 0° to that when it is placed at
90°. In principle, B could be merely a piece of scattering
material, and the polarization measurement would
depend only on the single counting rate in C at the 0°
and 90° positions. However, in this case the shielding
(and electronic discrimination) would have to be
exceedingly good to prevent direct radiation from the
source being recorded in C. See Sec. (a).

Except in the earliest experiments, a polarimeter has 
generally consisted of an organic scintillator for the 
scatterer and a sodium-iodide crystal for the detector 
of the scattered gamma ray. An organic scintillator, a 
material of low Z, makes a good scatterer because it 
permits fairly free escape of the scattered photons. 
Naturally, sodium iodide is the best analyzing crystal 
because of its high efficiency and good resolution for the 
detection of gamma rays. The objection to the use of 
sodium iodide for this purpose in the earliest experi- 
ments was the long decay time of its pulses. Develop- 
ment of suitable electronic techniques has greatly 
diminished this difficulty. 

Circular polarization experiments utilizing the Comp- 
ton effect are discussed in the last section of this part. 


(a) Early Experiments on Annihilation Radiation 


Although this paper deals mainly with the polariza- 
tion of gamma rays of nuclear origin, for convenience 
and historical interest we begin with a brief discussion 
of the first measurements of polarization, those per- 
formed in order to detect the mutually perpendicular 
polarizations of the two quanta resulting from the 
annihilation of a positron and an electron. This predic- 
tion of the pair theory was pointed out by Wheeler 
(Wh 46). Assuming that two Compton polarimeters 
would be used to detect the polarizations, Pryce and 
Ward (Pr 47) and Snyder et al. (Sn 48) calculated the 
expected azimuthal distribution of one scattered quan- 
tum relative to the other. This is an example of a 
polarization-polarization correlation with the angle be- 
tween the two polarimeters fixed at 180°. 


FIG. 14. Schematic diagram of experimental arrangement of
scintillating crystals in an apparatus designed to measure a
direction-polarization correlation. Crystals B and C constitute
the polarimeter; crystal A is the directional counter (figure
from Me 50).

FIG. 15. Schematic diagram of typical experimental arrangement
for study of linear polarization of annihilation radiation.
In this illustration the scatterers are made of aluminum. The
counters are in the parallel (∥) position.


The essential features of the experimental arrangement
are shown in Fig. 15. The positron source (Na²² or Cu⁶⁴)
or Cu“) is placed at the center of a hole about $ in. in 
diameter bored through the center of a lead block with 
dimensions approximately 6 × 6 × 6 in. The oppositely
directed annihilation radiation beams emerging from the 
openings then impinge upon cylindrical scatterers (not 
detectors in this case) about 1 in. long. The detectors 
of the scattered radiation are placed so that the mean 
scattering angle is about 82°, the angle at which the 
Compton effect exhibits optimum sensitivity to polarization
for this gamma-ray energy (Fig. 2). The experiment
then consists in determining N⊥/N∥, the ratio of
the coincidence counting rate when the axes of the two 
detectors are at right angles to each other to that when 
they are parallel. 
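
The choice of a mean scattering angle near 82° can be checked numerically. The sketch below is not taken from the papers cited; it assumes ideal (point) geometry and the expression usually quoted from the Pryce-Ward and Snyder et al. calculations, N⊥/N∥ = 1 + 2 sin⁴θ/(g² − 2g sin²θ), with g = k/k′ + k′/k and k/k′ = 2 − cos θ for a photon of energy m₀c². The function names are illustrative only.

```python
import math


def gamma_factor(theta_deg):
    """Return g = k/k' + k'/k for a photon of energy m0c^2 scattered through theta."""
    k_ratio = 2.0 - math.cos(math.radians(theta_deg))  # k/k' when k = m0c^2
    return k_ratio + 1.0 / k_ratio


def annihilation_ratio(theta_deg):
    """Ideal-geometry ratio N_perp/N_par when both quanta scatter through theta.

    Assumed form: 1 + 2 sin^4(theta) / (g^2 - 2 g sin^2(theta)).
    """
    s2 = math.sin(math.radians(theta_deg)) ** 2
    g = gamma_factor(theta_deg)
    return 1.0 + 2.0 * s2 * s2 / (g * g - 2.0 * g * s2)


def best_angle(step=0.1):
    """Scan 0 < theta < 180 deg and return the angle giving the largest ratio."""
    angles = [step * i for i in range(1, int(round(180.0 / step)))]
    return max(angles, key=annihilation_ratio)


if __name__ == "__main__":
    print("ratio at 90 deg:", annihilation_ratio(90.0))
    print("optimum angle (deg):", best_angle())
```

Finite counter apertures average this ratio over a range of angles, which is why the theoretical values quoted for real geometries are lower than the ideal value.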

Bleuler and Bradt (Bl 48), using G-M counters with
end-windows as detectors, confirmed the predicted
correlation, although their results have a relatively large
margin of error. They obtained N⊥/N∥ = 1.9 ± 0.3 as
compared to the theoretically predicted ratio of ~1.7
for their geometry. Shortly thereafter, Hanna (Ha 48a)
performed the experiment using G-M tubes and
obtained values of N⊥/N∥ which were consistently lower
than the theoretical values. Vlasov and Dzhelepov
(Vl 49) obtained 1.7 ± 0.2. However, the approximations
used in obtaining the theoretically expected ratio
for their geometry were sufficiently rough that there
was essentially only qualitative agreement with the
theory. Wu and Shaknov (Wu 50) repeated the experiment
using anthracene scintillation counters and
obtained a value of N⊥/N∥ = 2.04 ± 0.08, which compares
well with the theoretical value of 2.00 for their
geometry.


(b) Direction-Polarization Correlation 


The arrangement of the scintillation crystals in a
direction-polarization correlation experiment is shown
in Fig. 14, in which A is the crystal of the counter
which is insensitive to polarization and records only
the direction of one gamma ray, and B and C are the
crystals of the Compton polarimeter; the three counters
are employed in a triple-coincidence circuit. Thus, a
measurement of the polarization is obtained from the ratio of
the triple-coincidence yields with crystal C at φ = 0° and
at φ = 90°. The direction-polarization correlation is then
obtained by determining this ratio as a function of the
correlation angle θ. The ratio of the counting rates is
N₀/N₉₀, from which one obtains the true polarization
ratio J₀/J₉₀ discussed in Part I.

FIG. 16. Schematic diagram of apparatus of Metzger and
Deutsch. Two additional counters are used to increase the yield
in the measurement of the direction-polarization correlation. In
the position shown, φ = 0 (after Me 50).

Thorough consideration must be given to the possi- 
bility of recording “‘false” coincidences as a result of 
radiation being counted and scattered among the three 
crystals in combinations different from the one desired. 
Usually a good shield is placed in front of counter C to 
diminish the amount of direct radiation from the source 
or scattered radiation from A arriving at C. This shield 
diminishes the chance that one gamma ray goes to A 
while the other one goes to C and is then scattered to B, 
or that one gamma ray goes to B while the other goes 
to A and is scattered to C. 

As mentioned in Part I, there is little to be gained 
(Ha 48) by putting a polarimeter at A as well as at 
B,C. Furthermore, if the corresponding direction- 
direction correlation is known it is only necessary in 
virtually all cases to measure the direction-polarization 
correlation at a single angle. This angle is usually se- 
lected so as to maximize the effect and it is often 90°. 

Metzger and Deutsch (Me 50) first used the technique
of the direction-polarization correlation in the
study of several gamma-gamma cascades. The method 
has been extended by others to the study of correlations 
in which the radiation whose polarization is not analyzed 
is some other type of particle, either emitted or absorbed. 


(c) Experiments of Metzger and Deutsch 


The classic experiments of Metzger and Deutsch 
(Me 50) exemplify the techniques involved in the 
measurement of direction-polarization correlations. Fig- 
ure 16 is a schematic diagram of the experimental 
arrangement. Naphthalene was used for all of the 
crystals in this arrangement, but Metzger and Deutsch 
suggest the use of anthracene would be an improvement. 

Counters A′ and C′ add to the original coincidence
combination ABC three more equivalent (theoretically
and experimentally) combinations ABC′, A′BC, A′BC′,
so that the real coincidence yield is increased by a
factor of four. The addition of C′ also, to a first
approximation, cancels the effect of any small asymmetry
resulting from a possible misalignment of the axis of the
polarimeter.

FIG. 18. Asymmetry ratio R as a function of gamma-ray energy
in Mev for different geometries: (a) θ_c = 80°, ideal geometry;
(b) θ_c = (θ_c)max, ideal geometry; (c) θ_c = 80°, Δθ_c = 55°,
Δφ = 60° (figure from Me 50).

Figure 17 gives a block diagram of the electronics 
associated with the five counters. This system makes it 
possible to count double coincidences separately, which 
it is necessary to do in order to monitor the triple- 
coincidence rate properly and to determine the acci- 
dental rate. An arrangement such as this, or its equiva- 
lent, has been an important feature of all Compton 
polarimeters.
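
The reason for recording the double-coincidence rates can be illustrated with the usual chance-coincidence estimates. The sketch below is not taken from Me 50; it assumes an idealized rectangular resolving time tau, statistically independent counters, and rates in counts per second. All function and argument names are hypothetical.

```python
def accidental_double_rate(n1, n2, tau):
    """Chance rate of two unrelated counters firing within the resolving time.

    Textbook estimate 2 * tau * n1 * n2 (assumed here, not from the source).
    """
    return 2.0 * tau * n1 * n2


def accidental_triple_rate(n_a, n_b, n_c, n_ab, n_bc, n_ac, tau):
    """Chance rate of triple coincidences in counters A, B, C.

    Two contributions are summed:
      * a real double coincidence plus an unrelated pulse in the third counter,
      * three mutually unrelated pulses (textbook estimate 3 * tau^2 * n_a * n_b * n_c).
    """
    double_plus_single = 2.0 * tau * (n_ab * n_c + n_bc * n_a + n_ac * n_b)
    three_unrelated = 3.0 * tau ** 2 * n_a * n_b * n_c
    return double_plus_single + three_unrelated


if __name__ == "__main__":
    # Illustrative rates only, not data from any experiment discussed here.
    print(accidental_double_rate(1000.0, 2000.0, 1e-7))
    print(accidental_triple_rate(1000.0, 2000.0, 500.0, 5.0, 2.0, 1.0, 1e-7))
```

With estimates of this kind, the measured single and double rates give the accidental contribution that must be subtracted from the observed triple-coincidence rate.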

Metzger and Deutsch distinguish three polarization
parameters: p = J₀/J₉₀, which describes the state of
polarization of the radiation entering the
polarimeter; R, the asymmetry ratio, which is a measure
of the polarization sensitivity of the polarimeter
taking into consideration nonideal geometry; and
N₀/N₉₀, the ratio of the triple-coincidence counting
rates with the polarimeter set at φ = 0° and φ = 90°.
These three parameters are related through the
expression


$$\frac{N_0}{N_{90}} = \frac{p + R}{pR + 1}. \tag{III-1}$$
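
Relation (III-1) is used in both directions in what follows: to predict the counting ratio from p and R, and to extract p (or, in a calibration, R) from a measured ratio. A minimal sketch, with function names chosen here for illustration only:

```python
def counting_ratio(p, r):
    """Eq. (III-1): N0/N90 for polarization ratio p and asymmetry ratio R."""
    return (p + r) / (p * r + 1.0)


def polarization_from_ratio(ratio, r):
    """Solve (III-1) for p = J0/J90 given the measured N0/N90 and a known R.

    Undefined when ratio * r == 1, which corresponds to p going to infinity.
    """
    return (r - ratio) / (ratio * r - 1.0)


def asymmetry_from_ratio(ratio, p):
    """Solve (III-1) for R given the measured N0/N90 and a known p.

    (III-1) is symmetric in p and R, so the same algebra applies.
    """
    return (p - ratio) / (ratio * p - 1.0)


if __name__ == "__main__":
    n_ratio = counting_ratio(1.4, 2.0)
    print(n_ratio, polarization_from_ratio(n_ratio, 2.0))
```

Note that an unpolarized beam (p = 1) or an insensitive polarimeter (R = 1) both give N₀/N₉₀ = 1.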


For the case of ideal geometry (Part II) R is simply the
ratio of two differential cross sections for Compton
scattering, i.e., R = dσ₉₀/dσ₀. There is an angle of
scattering*** θ_c = (θ_c)max at which R is a maximum for
ideal geometry. The dependence of R on the energy of
the initial photon, as calculated by Metzger and
Deutsch, is given as curve b in Fig. 18 for θ_c = (θ_c)max.
Curve a shows the dependence of R for θ_c = 80°, which
Metzger and Deutsch chose as a mean value. In order
to obtain statistically significant counting rates, however,
it is necessary to use angular spreads Δθ_c and Δφ,
which are considerably greater than zero, so that R is
appreciably reduced. Metzger and Deutsch use
(p′ − 1)/Δp′ as a figure of merit for a polarimeter, where p′
is the observed polarization ratio, and Δp′ its error.
This quantity is given as a function of the angular
spreads in Figs. 19 to 22 for two gamma-ray energies.
For angular spreads of 60°-70°, the figure of merit is a
maximum over a wide range of energy. Hence, it is
possible to design an efficient polarimeter of considerable
versatility.

FIG. 19. Figure of merit of a Compton polarimeter as a function
of the spread in the angle φ, for various spreads in θ_c. Incident
photon energy equal to 1 Mev, and p = 1.2. The curves do not
differ significantly from those in Me 50.
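
The ideal-geometry asymmetry ratio can be evaluated directly. The sketch below assumes the standard Klein-Nishina cross section for linearly polarized photons, dσ ∝ (k′/k)² (k/k′ + k′/k − 2 sin²θ cos²φ), with φ the angle between the scattering plane and the electric vector; this gives R = dσ₉₀/dσ₀ = g/(g − 2 sin²θ), with g = k/k′ + k′/k. It does not include the angular spreads Δθ_c and Δφ, so it corresponds to curves a and b of Fig. 18 rather than curve c. Names and the rest-energy constant are choices made here.

```python
import math

M0C2_MEV = 0.511  # electron rest energy in Mev (rounded)


def ideal_asymmetry(energy_mev, theta_deg):
    """R = d(sigma_90)/d(sigma_0) for point geometry, from the Klein-Nishina formula."""
    alpha = energy_mev / M0C2_MEV
    theta = math.radians(theta_deg)
    k_ratio = 1.0 + alpha * (1.0 - math.cos(theta))  # k/k', Compton energy shift
    g = k_ratio + 1.0 / k_ratio
    return g / (g - 2.0 * math.sin(theta) ** 2)


def optimum_angle(energy_mev, step=0.1):
    """Scan 0 < theta < 180 deg and return the angle that maximizes R."""
    angles = [step * i for i in range(1, int(round(180.0 / step)))]
    return max(angles, key=lambda t: ideal_asymmetry(energy_mev, t))


if __name__ == "__main__":
    for e in (0.25, 0.511, 1.0, 2.0):
        print(e, ideal_asymmetry(e, 80.0), optimum_angle(e))
```

In this idealization R falls steadily with increasing photon energy at fixed angle, which is the behavior that limits the Compton polarimeter at high energies.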

The actual spreads used by Metzger and Deutsch
were Δθ_c = 55° and Δφ = 60°. That these were the effective
values was determined experimentally in the
following way. The partially polarized radiation resulting
from the Compton scattering of a collimated beam
of gamma rays from a strong Co⁶⁰ source impinged on
the scattering crystal of the polarimeter. The degree of
polarization of this radiation is known through the
Klein-Nishina formula. A measurement of N₀/N₉₀ then
determines R through the use of (III-1). With this
value of R, the angular spreads Δθ_c and Δφ are obtained
for the energy appropriate to the scattered quanta.
With these values of Δθ_c and Δφ, the curve for R as a
function of energy was then constructed and is given as
curve c in Fig. 18.

FIG. 20. Figure of merit of a Compton polarimeter as a function
of the spread in the angle θ_c, for various spreads in φ. Incident
photon energy equal to 1 Mev, and p = 1.2. The curves do not
differ significantly from those in Me 50.

With the equipment and techniques discussed, Metzger
and Deutsch measured the direction-polarization
correlations in the decays of Sc⁴⁶, Co⁶⁰, Rh¹⁰⁶, and Cs¹³⁴.
Essentially as a check on the experimentally known
direction-direction correlations (Br 48) and to elucidate
the method, the polarizations were measured as a
function of θ. For each nucleus, possible direction-polarization
correlations were calculated from the
known coefficients of the direction-direction correlation.
Thus, for the cascade j₁(L₁)j(L₂)j₂ = 4(2)2(2)0, the
direction-direction correlation is

$$W(\theta) = 1 + \tfrac{1}{8}\cos^2\theta + \tfrac{1}{24}\cos^4\theta.$$

FIG. 22. Figure of merit of a Compton polarimeter as a function
of the spread in θ_c, for various spreads in φ. Photon energy equal
to 6 Mev, and p = 1.2.


The direction-polarization correlation can be written
down from Table III(b). At θ = 90°, for example,


$$W(90^\circ,\gamma) = 1 \pm \cos 2\gamma\,\left(\tfrac{1}{8} + \tfrac{1}{24}\right).$$


The sign is + or − according as the radiation is electric
or magnetic. This expression applies if the first gamma
ray goes to the directional counter and the second to
the polarimeter, or if the gamma-ray paths are interchanged.
In the experiment, both processes were observed
indistinguishably with about equal efficiencies.
Hence (at θ = 90°),


$$\frac{J_0}{J_{90}} = \frac{2 + (\pm)_1\,0.17 + (\pm)_2\,0.17}{2 - (\pm)_1\,0.17 - (\pm)_2\,0.17},$$


and p = 1.40, 1, or 0.71 depending on whether the gamma
rays in the cascade are (E2,E2), (E2,M2) or (M2,E2),
or (M2,M2). The complete correlation can be worked
out in similar fashion.
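
The three values of p quoted above follow directly from the expression for J₀/J₉₀ with the coefficient 1/8 + 1/24 ≈ 0.17. A short sketch (the function name and the +1/−1 sign convention for electric/magnetic radiation are chosen here for illustration):

```python
COEFF = 1.0 / 8.0 + 1.0 / 24.0  # sum of the cos^2 and cos^4 coefficients, about 0.17


def polarization_ratio(sign1, sign2, coeff=COEFF):
    """J0/J90 at theta = 90 deg when both cascade orderings are counted equally.

    sign1, sign2: +1 for electric, -1 for magnetic character of each gamma ray.
    """
    numerator = 2.0 + sign1 * coeff + sign2 * coeff
    denominator = 2.0 - sign1 * coeff - sign2 * coeff
    return numerator / denominator


if __name__ == "__main__":
    labels = {(1, 1): "(E2,E2)", (1, -1): "(E2,M2)", (-1, 1): "(M2,E2)", (-1, -1): "(M2,M2)"}
    for signs, label in labels.items():
        print(label, round(polarization_ratio(*signs), 2))
```

The mixed cases give p = 1 because the two contributions cancel, so a measurement at this angle distinguishes (E2,E2) from (M2,M2) but cannot by itself identify which member of a mixed cascade is magnetic.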

Figure 23 gives the three curves modified by the
efficiency of the polarimeter, with the aid of (III-1).
The data are those obtained in the decay of Sc⁴⁶. The
experimental points fit the curve corresponding to an
(E2,E2) cascade in Ti⁴⁶. Thus, the polarization measurement
clearly selects the correct parity assignment. The
situation for the cascade in Ni⁶⁰ resulting from the
decay of Co⁶⁰ is entirely similar. In the Rh¹⁰⁶ decay the
calculated direction-direction coefficients for a 0(2)2(2)0
scheme were twice as large as the experimental ones.
Without an explanation for this discrepancy, Metzger
and Deutsch were only able to show that the
direction-polarization measurement was consistent with the
direction-direction measurement. The experimental re-
side of this angle. In the case of Cs¹³⁴, the possible
participation of other coincident gamma rays in the
correlation made a unique assignment impossible. Again
the direction-polarization correlation was consistent
with the direction-direction correlation, and the result
was not unlike that for Sc⁴⁶ and Co⁶⁰.


(d) Other Measurements with Radioactive Sources 


Since the Metzger-Deutsch experiments, there have 
been many other measurements of direction-polariza- 
tion correlations in a variety of nuclei. The same basic 
techniques are common to all. Naturally, there have 
been modifications and improvements to fill the needs 
of each new experiment. Accordingly, in the review of 
the remaining experiments using the Compton po- 
larimeter only those techniques which represent rela- 
tively distinctive additions to the art are discussed. 

Shortly after the work of Metzger and Deutsch,
Williams and Wiedenbeck (Wi 50), using a similar
experimental arrangement, verified the results on Rh¹⁰⁶
and Co⁶⁰. In the case of Rh¹⁰⁶ the factor of two between
the theoretical and experimental coefficients could be
explained according to Spiers (Sp 50) by assuming the
participation of another state close to the second
excited state in Pd¹⁰⁶. Williams and Wiedenbeck also
studied Cs¹³⁴, but found an isotropic direction-polarization
correlation.

Stump (St 52) measured a beta-gamma direction-polarization
correlation in the decay of Sb¹²⁴. The source
and the beta detector were enclosed in a vacuum
container in order to minimize beta scattering. The beta
detector consisted of a thin crystal of anthracene
mounted on the end of a light pipe leading to a
photomultiplier tube outside the vacuum chamber. The
polarimeter was considerably improved in efficiency by
using an anthracene crystal as the scattering crystal
and two oppositely placed sodium-iodide crystals as
detectors of the scattered gamma rays. Since the angular
correlation coefficients were already known (St 51) and
quadrupole radiation is involved, it was necessary to
take measurements only with the polarimeter at 90°
to the beta counter. Stump found a value of about 1.1
for the ratio N₀/N₉₀. This datum, used in conjunction
with the correlation coefficient, led to the conclusion
that no parity change occurred in the gamma transition.

FIG. 23. Gamma-gamma direction-polarization correlation in
the decay of Sc⁴⁶. The ratio N∥/N⊥ is plotted against θ. The three
curves correspond to the different parity assignments in the spin

Further work on the radioactive nuclei Co⁶⁰, Cs¹³⁴,
and Sb¹²⁴ was done by Kloepper et al. (Kl 52). Plastic
scintillators and type 5819 phototubes were used. The
phototubes were shielded from magnetic fields in order
to reduce the spurious dependence of the coincidence
rate upon counter position. Residual effects were
investigated by comparing the triple-coincidence rate to
the single-channel counting rates at the various positions.
In addition to monitoring the single rates, the various
double-coincidence combinations, and the resolving
time, the instrument was checked with observations at
θ = π, where N₀/N₉₀ must equal 1.0. The gamma-gamma
direction-polarization correlation for Co⁶⁰ was measured
to establish the reliability of the instrument. The result
agreed with the earlier measurements (Me 50, Wi 50).

The polarization measurements on Cs¹³⁴ agreed with
the results of Metzger and Deutsch (Me 50), but not 
with those of Williams and Wiedenbeck (Wi 50). It was 
pointed out that both the direction-polarization and the 
direction-direction correlations could be explained with 
a 4(E2)2(E2)0 decay scheme, provided a 35% isotropic 
component is assumed to be present in both correlations. 

In the case of Sb¹²⁴ the gamma-gamma directional
and polarization correlations were essentially isotropic.
The authors suggested a 3(10% E2; 90% M1)2(E2)0
decay scheme. The beta-gamma direction-polarization
correlation was also studied with this source. The
measurement agreed with the result of Stump (St 52),
confirming the E2 nature of the radiation.

Additional work on the decay of Cs¹³⁴ was done by
Robinson and Madansky (Ro 52), who measured a
polarization-polarization correlation for the gamma
rays. The experiment was quite similar to the investigations
on annihilation radiation (see Fig. 15). In fact,
Robinson and Madansky first performed the experiment
with annihilation radiation in order to determine
experimentally the effective solid angle corrections. The
signals from the two scattering crystals (stilbene) went
to a coincidence circuit (710-8 sec), the output of
which triggered an oscilloscope. The pulses from the two
analyzing crystals (NaI) were delayed, one more than
the other, and displayed on the (triggered) oscilloscope.
A coincidence analysis could be made by visual inspection
of the photographs of the oscilloscope traces. The
ratio of the number of coincidences with analyzing
counters perpendicular to the number with the counters
parallel was 0.91 ± 0.08. This result is not in disagreement
with assignments of spins and parities made from
the other correlation experiments on this nucleus
(Wa 50, Pe 51, Ro 51).

A study of the direction-polarization correlation for
the cascade involving the first two excited states of
Pb²⁰⁸ in the ThC″ decay was made by Kraushaar and
Goldhaber (Kr 53). Again, a polarimeter of the
Metzger-Deutsch type was used and a value for the
efficiency was obtained by observing the polarization of
scattered Co⁶⁰ radiation. However, a more realistic value
for the efficiency of the polarimeter was obtained by
measuring the already known direction-polarization
correlations for Co⁶⁰ and Rh¹⁰⁶ sources. For the measurement
in the Pb²⁰⁸ cascade, a value of N₀/N₉₀ = 0.958 0.3
was obtained. This result, in conjunction with the result
of Petch and Johns (Pe 50) on the direction-direction
correlation, indicates an assignment of 4(E2)2(E2)0 to
the cascade in Pb²⁰⁸.

Direction-polarization correlations in the beta-gamma
decay from K⁴², As⁷⁶, Rb⁸⁶, Sb¹²⁴, and Cs¹³⁴ were studied
by Hamilton et al. (Ha 53a). They used a two-counter
polarimeter having stilbene crystals placed symmetrically
with respect to the polarimeter axis, as shown in
Fig. 24, so that each counter played a dual role as
scatterer and analyzer. The aluminum plate prevented
beta particles from reaching the counters of the
polarimeter. The entire apparatus was surrounded by an
aluminum can which served as a light shield and
contained a helium atmosphere in order to reduce electron
scattering. Hamilton et al. discuss thoroughly the
dependence on angle of the spurious triple-coincidence
rates associated with the various possible scattering
combinations. The single and double rates, as in previous
experiments, were determined quite accurately as
a check on the proper operation of the instrument.

In the case of Sb¹²⁴, Cs¹³⁴, and As⁷⁶ there are additional
beta-gamma cascades involving beta particles
whose energies are lower than in the cascade under
study. In these cases an aluminum absorber was
introduced to absorb all but the desired high-energy beta
particles. The effect of spurious triple coincidences
caused by gamma rays reaching the beta crystal was
determined by making runs with absorbers of various
thickness, but all thick enough to absorb all beta
particles.

TABLE XI. Values of the polarization correlation coefficient
C(obs) and C(pred), as observed by Ha 53a and as predicted (on
basis of no parity change for the gamma) from the directional
correlation coefficient Q₂+Q₄ reported by other workers for
similar beta-energy discrimination conditions. (Table from
Ha 53a).

Nucleus    Q₂+Q₄            C(pred)            C(obs)
Cs¹³⁴       0.00 ± 0.01      0.00 ± 0.003       0.009 ± 0.013
Rb⁸⁶        0.13 ± 0.01     −0.068 ± 0.005     −0.060 ± 0.025
As⁷⁶        0.07 ± 0.02     −0.067 ± 0.018     −0.114 ± 0.035
K⁴²        −0.06 ± 0.02      0.022 ± 0.007     −0.007 ± 0.016
Sb¹²⁴      −0.27 ± 0.04      0.151 ± 0.022      0.142 ± 0.010

In these experiments a quantity C is defined which
is the coefficient characterizing the angular variation of
the intensity of triple coincidences as a function of
azimuthal angle of the polarimeter. Thus


$$N_\phi = 1 + C\cos^2\phi, \tag{III-2}$$


so that N₀/N₉₀ = 1 + C. Table XI compares the values
of C obtained experimentally with values derived from 
the direction-direction coefficients determined by other 
workers.‡‡‡ At θ = 90°, the polarization ratio is [see
Sec. (c)]


$$\frac{J_0}{J_{90}} = \frac{1 \pm (Q_2 + Q_4)}{1 \mp (Q_2 + Q_4)}$$


for pure dipole (Q₄ = 0) or pure quadrupole radiation,
the upper signs holding with no parity change, the lower
ones with parity change. From (III-1), one obtains
N₀/N₉₀ and hence C. In the case of Cs¹³⁴, an isotropic
direction-polarization correlation is observed as
predicted. For Rb⁸⁶, As⁷⁶, and Sb¹²⁴, the observed polarization
agrees with that predicted for no parity change in
the gamma transition from the first excited state. An
essentially isotropic direction-polarization correlation is
observed for K⁴².
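
The coefficient C and its statistical uncertainty follow from the triple-coincidence counts at the two polarimeter settings, and a predicted C follows from (III-1) and (III-2). The sketch below assumes equal counting times, Poisson statistics, and counts already corrected for accidental and spurious coincidences; these assumptions and the function names are introduced here and are not taken from Ha 53a.

```python
import math


def observed_c(n0, n90):
    """Return (C, sigma_C) with C = N0/N90 - 1 from counts at phi = 0 and 90 deg.

    The error assumes independent Poisson counts taken for equal times.
    """
    ratio = n0 / n90
    sigma = ratio * math.sqrt(1.0 / n0 + 1.0 / n90)
    return ratio - 1.0, sigma


def predicted_c(p, r):
    """C expected for polarization ratio p = J0/J90 and asymmetry ratio R.

    Combines (III-1), N0/N90 = (p + R)/(pR + 1), with (III-2), N0/N90 = 1 + C.
    """
    return (p + r) / (p * r + 1.0) - 1.0


if __name__ == "__main__":
    # Illustrative numbers only, not data from the experiments discussed.
    print(observed_c(9400.0, 10000.0))
    print(predicted_c(1.3, 1.7))
```

For R > 1, a value p > 1 gives a negative C and p < 1 a positive C, so the sign of C alone already indicates the sense of the polarization.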

Brazos and Steffen (Br 56) have studied the direction-polarization
correlation of the 0.722-0.566-Mev gamma-ray
cascade resulting from the decay of the 50-day In¹¹⁴
isomer to Cd¹¹⁴. The oppositely placed analyzing crystals
were rotated automatically about an axis through the
scatterer and source once every hour to minimize the
effect of possible electronic fluctuations. The measurement
indicated a 4(E2)2(E2)0 cascade, which is also
favored by the directional correlation measurements on
this cascade (Br 56).

The direction-polarization correlation in the
1.24-0.845-Mev gamma-ray cascade in the decay of Co⁵⁶
has been measured by Wood and Jastram (Wo 55). A
liquid scintillator was used as the scattering counter.
Using the direction-direction coefficients (Hu 55) and
the polarization measurement on the 0.845-Mev gamma
ray, it was established that this gamma ray was electric
quadrupole (no parity change).

An investigation of the spins and parities of the first
three excited levels in Ce¹⁴⁰ and the first two in Sr⁸⁸
was conducted by Bishop and Perez y Jorba (Bi 55)
by means of direction-direction and direction-polarization
correlations. They used scattering crystals (made
of polystyrene activated with diphenyl tetrabutadiene)
of two sizes, one for low-energy and the other for high-energy
gamma rays, since too large a crystal at a given energy
makes double scattering too probable. They limited the
double scattering to 15% in their experiments. Single-channel
pulse-height analyzers were used in the channels
of both the directional counter and the scattering
counter. In the case of the scattering counter, energy
discrimination is desirable since the polarimeter is in
effect a crude Compton spectrometer. Thus, the window
of the pulse-height analyzer selects the pulses from
Compton recoil electrons in the scattering crystal
corresponding to a gamma ray of a given energy scattered
at a given (approximate) angle. Figure 25 shows a
pulse-height spectrum obtained this way for the gamma
rays from Sr⁸⁸ following the beta decay of Y⁸⁸.

The decay scheme in Ce, following beta decay of 
La, is fairly complex as shown by the level diagram 
in Table XIII. Bishop and Perez y Jorba measured the 
direction-polarization correlation with the 1.60-Mev 
gamma ray going to the directional counter and either 
the 0.329-, 0.487-, or 0.815-Mev gamma ray going to 
the polarimeter. The experimental value of No/Noo 
(1.077 +0.022) was consistent with assignments of 
3-—4+—2+ or 3+—4-—2- for the first three excited 
states taken in descending order. However, on the 
basis of the Goldhaber-Sunyar rule, giving 2+ for the 
first excited state, and on considering transition proba- 
bilities in the beta decay to these states, the first 
possibility was naturally chosen. Again, in the case of 
the simple cascade in Sr88, the measurements were most
consistent with the possibilities 3-—2+—0+ or 3-—1*
—0+ for the first two excited states and the ground 
state. The first possibility was chosen since it agrees 
best with internal-conversion data (Pe 48, Me 52a). 


FIG. 25. Pulse-height spectrum of recoil electrons (in the scatterer)
which are in coincidence with scattered gamma rays detected by the
analyzing counters of the polarimeter. The gamma rays are from
Sr88 produced in the decay of Y88 (figure from Bi 55).

The Sr88 gamma rays were also studied by Coleman
(Co 56). His Compton polarimeter was used as a pulse- 
height analyzer by summing the pulse from the scat- 
tered electron in the scattering scintillator with that 
from the scattered gamma in the analyzing counter. 
With this technique the analyzing counter can subtend 
a large solid angle at the scattering scintillator without 
loss of resolution. The pulse-height spectrum obtained 
with this “total absorption” polarimeter is presented in 
Fig. 26. Separate polarization measurements were made 
on each member of the 0.91-1.85-Mev cascade in Sr88.
The results show that the parities of the 1.85- and 2.75- 
Mev levels are even and odd, respectively, in agreement 
with the results of Bishop and Perez y Jorba (Bi 55). 

Later, using the same equipment, Coleman (Co 56a) 
studied the polarization correlation of the 0.99-1.33-
Mev gamma-ray cascade in Ti48 produced in the decay
of V48. He discusses the problem of ambiguity arising
from a multipole mixture when the result of a direction- 
direction correlation is used to determine level assign- 
ments. Brazos and Steffen (Br56) showed in their 
work on the Cd!" levels that the ambiguity could be 
reduced by measuring the direction-polarization cor- 
relation. In addition, the mixing ratio may be deter- 
mined when both correlations are known. This problem 
is illustrated in Part I. For the cascade in Ti48, pure
quadrupole radiation was established, giving the assignment
4(E2)2(E2)0.

The gamma-gamma direction-polarization correlations
in the Co60 and Na24 decays were studied by
Estulin et al. (Es 56). The experimental arrangement
used was similar to that of Metzger and Deutsch. The
measurements on Co60 were made to determine the
reliability of those on Na24. In each case the value of
N_0/N_90 indicated an (E2,E2) transition in the cascade,
leading to assignments of positive parity for the first
two excited states in Ni60 and Mg24.

FIG. 26. Pulse-height spectrum of gamma rays from the same decay
as in Fig. 25, observed using a “total absorption” polarimeter
(figure from Co 56).

Stelson and McGowan (St 57) have measured the
direction-polarization correlation in the 133-482-kev
cascade in Ta181. They used NaI(Tl) crystals, 3 in.
thick and 3 in. in diameter, in the counters of the correlation
arrangement except in the Compton scatterer,
which was an anthracene scintillator, 14 in. thick by
í in. in diameter. The output of a fast-slow triple- 
coincidence circuit gated a 20-channel pulse-height 
analyzer on which was displayed the pulse-height 
spectrum from the analyzing counter of the polarimeter. 
The effectiveness of the polarimeter had been deter- 
mined by measuring the polarization of gamma rays 
resulting from the Coulomb excitation of 2+ levels in 
even-even nuclei (see next section). The direction- 
direction measurements (Mc 54a, Pa 55, He 55) had 
indicated two possible assignments for the 133-482- 
kev decay: 5/2(E2)9/2(E2+M1)7/2 and
1/2(E2)5/2(E2+M1)7/2. The experimental value of N_0/N_90
obtained in the polarization measurement clearly agreed 
with the latter assignment with even parities for the 
levels. The direction-polarization correlation was measured
with both liquid and solid Hf181 sources as a check
on possible systematic errors. It was estimated that the 
4% contribution of 136.82-kev radiation from the 
decay of the 618.9-kev state just above the 615-kev 
state, did not significantly affect the results. 


(e) Measurements on Gamma Rays from Nuclear 
Reactions and Coulomb Excitation 


French and Newton (Fr 52) applied these methods 
to an alpha-gamma direction-polarization correlation 
in the F19(p,α)O16*(γ)O16 reaction. This experiment was
exceptional in that an octupole transition (Ar 50, 
Ba 50) is involved and complete polarization is ex- 
pected at an angle of 118° with respect to the coincident 
alpha particle, as seen in Part I. Furthermore, the gamma-ray
energy was 6.13 Mev. Even though both the crystal 
absorption and the polarization sensitivity of the 
Compton process are low at this gamma-ray energy, a 
Compton polarimeter was still used successfully. Figure 
2 shows that R=1.27 at this energy. After correcting 
for the angular spreads in the apparatus, the authors 
obtain values of 0.84 and 1.19 for the asymmetry ratio 
for magnetic and electric octupole radiation, respectively.
Since the experimental ratio was 1.14 ± 0.06, it
was concluded that the radiation is electric octupole. 
This gave an assignment of odd parity to the 6.13-Mev 
state in O16, and even parity to the compound state in
Ne20 and the ground state of F19, assuming that the
ground state of O16 is even.
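
The strength of this conclusion can be gauged by expressing each predicted asymmetry ratio as a deviation from the measured one in units of the quoted error. This is a minimal sketch using only the three numbers given above; the helper name `pull` is introduced here for illustration, and the quoted error is assumed to be one standard deviation.

```python
def pull(measured, sigma, predicted):
    """Deviation of the measurement from a prediction, in units of the quoted error."""
    return (measured - predicted) / sigma

measured_ratio = 1.14
quoted_error = 0.06  # assumed here to be one standard deviation
predictions = {
    "magnetic octupole": 0.84,
    "electric octupole": 1.19,
}

for label, predicted in predictions.items():
    print(f"{label}: {pull(measured_ratio, quoted_error, predicted):+.1f} standard deviations")
```

The magnetic prediction lies about five quoted errors from the measurement, while the electric prediction lies within one, which is the quantitative content of the assignment.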

McGowan and Stelson (Mc 58) have measured the 
polarization of gamma rays resulting from the Coulomb 
excitation of several odd-A nuclei. In general, these 
gamma rays are from mixed M1+E2 transitions. Sometimes,
a measurement of the direction-direction correlation
between the gamma rays and the incident beam
will determine the ratio E2/M1 in addition to specifying
the spin of the excited state which is involved. However,
the value of E2/M1 or of the spin often remains
ambiguous. Such ambiguities can usually be resolved by a
direction-polarization measurement.
the same as those used by these workers in their study
of the gamma rays from Ta181 discussed in the preceding
section. The only basic change is that the bombarding
beam and target replace the directional counter
and source, since the correlation in this case is between
bombarding particle and gamma ray. With this method
ambiguities in E2/M1 for the gamma-ray transitions or
uncertainties in the spin of the first excited states of the
following nuclei have been resolved: Mo95, Rh103, Ag107,
Ag109, Cd111, Cd113, Au197, Tl203, and Tl205 (see Table
XII).

Litherland and Gove (Li 58) have measured the polarizations
of the 1.64-, 0.94-, and 1.37-Mev gamma rays
from the C12(He3,p)N14, O16(He3,p)F18, and Mg24(p,p'γ)
reactions, respectively. The measurements on the 1.64-
Mev gamma ray showed that the 3.95-Mev level in N14
had the same parity (positive) as the first excited state.
Similarly the 0.94-Mev level in F18 had the same parity
(positive) as the ground state. The measurements on
Mg24 were performed at the 2.01- and 2.40-Mev resonances,
and in both cases very large polarizations were
found for the gamma rays emitted at 90°. The results
also indicate that the first excited state of Mg24 has the
same parity (positive) as the ground state.


(f) Measurements on Gamma Rays from Aligned Nuclei 


Of the several possible methods used for orienting 
nuclei, the two that have been used the most in the
experiments with gamma-ray polarization are the 
Bleaney method (nuclear alignment) and the Rose- 
Gorter method (nuclear polarization). Either one or 
both of these methods was used in the gamma-ray 
polarization experiments now to be described. The 
method of Rose and Gorter (Go 48, Ro 49) takes ad- 
vantage of the nuclear polarization produced by mag- 
netic hyperfine coupling in paramagnetic ions which 
have been polarized by an external magnetic field. In 
Bleaney’s method (Bl 51a) no external field is used.
Instead, the internal crystalline field is used to produce 
the nuclear alignment, through the magnetic hyperfine 
coupling in paramagnetic ions. In any nuclear orienta- 
tion method the very low temperatures which are re- 
quired are produced by adiabatic demagnetization. 

The gamma rays emitted by aligned nuclei are
linearly polarized. In fact, all the phenomena and the
nomenclature which are associated with the direction-polarization
correlation have their counterpart in the
alignment-polarization correlation.

Bishop et al. (Bi 52), using Bleaney’s method, made
polarization measurements on the gamma rays emitted
from aligned sources of Co58 and Co60. After adiabatic
demagnetization, values of N_∥/N_⊥ were measured with
a Compton polarimeter as a function of 1/T* by letting
the source warm up (Fig. 27). The magnetic temperature
T* is obtained by measuring the susceptibility
and applying Curie’s law. The polarization of both
gamma rays together in the 4-2-0 cascade in Ni60 and
of the 805-kev gamma ray from Fe58 showed that all
these quadrupole transitions (established in the case
of Fe58 by Da 52) were electric.

Bishop et al. (Bi 54) in a later experiment measured 
the linear polarization of the gamma rays from aligned 
Mn54 nuclei. This was done both by Bleaney’s method
and by that of Rose and Gorter, a greater gamma-ray 
polarization being obtained with the latter technique. 
Since the Rose-Gorter method uses an external mag- 
netic field, the scintillation crystals of the polarimeter 
were mounted on the ends of light pipes, and the 
photomultipliers were magnetically shielded. With each 
method the values of N_∥/N_⊥ (obtained as a function
of 1/T*) showed that the radiation from the first excited
state of Cr54 at 835 kev had positive parity and
was therefore electric quadrupole (Gr 54). 

Again using Bleaney’s method, Bishop et al. (Bi 55a) 
measured the polarization of the 123-kev radiation from 
the 137-kev level of Fe57 in the decay of Co57. This work
was accompanied by a measurement of the alignment- 
direction correlation. It was known (Al 54) that the 
123-kev transition is predominantly dipole. The measurement
of N_∥/N_⊥ as a function of 1/T* indicated that
the transition is predominantly magnetic, thus confirming
the suggested parity assignments to the 14- and
137-kev levels (Le 55). From the polarization measurements
and the directional correlation a value of
(E2/M1)^(1/2) = +0.19 ± 0.02 was obtained.

Cacho et al. (Ca 55), using Bleaney’s method, found
a value for N_∥/N_⊥ of 1.58 at 1/T* = 40 for radiation
from the 145-kev first excited state of Pr141 in the Ce141
decay. This led to the conclusion that the transition is
predominantly magnetic dipole. The polarization study
in conjunction with the directional correlation (Ca 55,
Am 56) gave (E2/M1)^(1/2) = 0.08 ± 0.02.

The linear polarizations of the quadrupole (Hu 56a) 
gamma rays coming from the 6-4-2-0 cascade in Cr52,
following the decay of Mn52, were investigated by
Huiskamp et al. (Hu 57) using the Rose-Gorter method. 
The Compton polarimeter which they used also served 
as a Compton spectrometer [see Sec. (d)] to analyze the
0.73-, 0.94-, and 1.46-Mev gamma rays in the cascade.

FIG. 27. Linear polarization of the gamma rays from an aligned
Co source as a function of 1/T* (figure from Bi 52).

The energy resolution was sufficient to determine the 
polarization of each of the three gamma rays. In each 
case, the radiation was found to be electric. 

The same apparatus and method also were used by 
Diddens et al. (Di 58) to polarize Co56 nuclei in the
study of the levels in Fe56. The Compton polarimeter
used as a spectrometer displayed sufficient resolution 
(14% at 2.76 Mev, 25% at 1.2 Mev) so that the linear 
polarizations of five gamma rays (0.845, 1.24, 1.75, 
2.61, and 3.25 Mev) could be measured satisfactorily. 
As in the work on Mn52, a plastic scintillator was used
as the scatterer in the polarimeter instead of NaI(Tl)
because it was found that pair production in the NaI
with its resulting annihilation radiation produced a 
serious background for the more energetic gamma rays. 
The authors also investigated the alignment-direction 
correlations for the above and other gamma rays produced
in the decay of Co56. From all their measurements
and with the work of others (Sa 55a) they obtained
spins and parities for the 0.845-, 2.08-, 3.45-,
3.84-, and 4.10-Mev levels (see Table XIII). Also the 
0.845-, 1.24-, and 3.25-Mev gamma rays were found to 
be electric quadrupole, whereas the 1.75- and 2.61-Mev 
gamma rays were magnetic dipole. 


B. Measurement of Plane Polarization by Means 
of the Photodisintegration of the Deuteron 


Next to the Compton effect, photodisintegration of 
the deuteron has proved the most useful polarization- 
sensitive reaction for gamma rays. In Part II, it was 
shown that the electric dipole photodisintegration 
serves as an ideal polarization-sensitive mechanism for 
gamma rays of energies ranging from about 4 Mev to 
about 12 Mev. The main disadvantages of the reaction 
are its low cross section (~10^-27 cm^2) and the fact that
the presence of the isotropic magnetic interaction makes 
it unsuitable for gamma rays whose energies are much 
below 4 Mev. It can however be used well above 12 
Mev without serious loss of sensitivity. 
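
For orientation, the photoproton energies involved can be estimated from energy conservation alone. The sketch below assumes the nonrelativistic approximation in which the photon momentum is neglected and the neutron and proton masses are taken as equal, so that the energy above threshold is shared equally; the deuteron binding energy of 2.2246 Mev is a modern value supplied here, not a number taken from the text.

```python
DEUTERON_BINDING_MEV = 2.2246  # modern value, assumed for this sketch

def photoproton_energy(e_gamma_mev):
    """Approximate photoproton kinetic energy in Mev.

    Neglects the photon momentum and the neutron-proton mass difference,
    so the energy above threshold is split equally between the two nucleons.
    """
    excess = e_gamma_mev - DEUTERON_BINDING_MEV
    if excess <= 0:
        raise ValueError("gamma-ray energy is below the photodisintegration threshold")
    return excess / 2.0

if __name__ == "__main__":
    for e_gamma in (4.0, 6.13, 12.0):
        print(f"E_gamma = {e_gamma} Mev -> E_p of roughly {photoproton_energy(e_gamma):.2f} Mev")
```

Over the 4- to 12-Mev range the protons therefore carry roughly 1 to 5 Mev, which is the range of interest for the track measurements described in the following sections.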
FIG. 28. Figure of merit of a polarimeter using the photodisintegration
of the deuteron, plotted as a function of the spread in

FIG. 29. Figure of merit of a polarimeter using the photodisintegration
of the deuteron, plotted as a function of the spread in
angle θ, for various spreads in φ (computed for p=1.2).


(a) Experimental Considerations 


Two methods of utilizing the polarization sensitivity 
of the deuteron are apparently feasible. One method 
would be very similar to the Compton polarimeter. 
The gamma rays would impinge on a “scatterer” con- 
taining deuterium, and the photoneutrons would be 
detected in one (or two) side counters. Almost all the 
discussion of the Compton polarimeter can be applied 
to this case. Thus, the advantage of making the “scat- 
terer” itself a detector is obvious. In this case the 
photoproton would be detected. A very serious draw- 
back to such a polarimeter would be the very high 
background of events from Compton scattering and 
pair production. This difficulty might be overcome with 
modern techniques such as the use of time-of-flight to 
isolate the neutrons. The advantage of the method over 
the Compton polarimeter would be its increased sensi- 
tivity to polarization at high gamma-ray energy. Its 
advantage over the emulsion technique, described in 
the following, would be its possible use in a direction- 
polarization measurement requiring the detection of co- 
incident radiations. Figures 28 and 29 show the figure 
of merit for such a polarimeter for various angular 
spreads. The conditions are the same as in the corre- 
sponding calculations for the Compton polarimeter, 
Figs. 19 to 22, except that here the energy is not an 
important parameter as long as the electric dipole dis- 
integration is dominant (as assumed in Figs. 28 and 29). 
That the polarization sensitivity is practically independent
of θ is reflected in the continued rise of the
curves as Δθ increases in Fig. 29. The curves in Figs.
28 and 29 can be used in the analysis of data obtained
in the emulsion technique.

The simplest polarimeter using the photodisintegration
of the deuteron, and the one used in all the experiments
performed so far, consists of a nuclear emulsion
impregnated with heavy water (D2O). The emulsions
are mounted so that the plane of the emulsion is normal
to the direction of propagation of the gamma rays. In
this arrangement most of the photoproton tracks
lie in the emulsion since the angular distribution of the
photoprotons goes essentially as sin^2 θ about the direction
of propagation of the gamma ray. The emulsion
can then serve as a polarimeter by comparing the
density of tracks at φ=0° with that at φ=90°.

The technique of loading the emulsion was first used 
by Goldhaber (Go 51) in studies of the angular dis- 
tribution of the protons from the photodisintegration 
of deuterons. In the experiments described below, either 
Ilford C2 or G5 emulsions were soaked in D2O to
saturation at some fixed temperature between 10°C 
and 20°C. Although the grain density of proton tracks 
is considerably diminished under these conditions, 
satisfactory range definition can be obtained in Ilford 
C2 emulsions with appropriate developing techniques. 
Better definition can be achieved using Ilford G5 emul- 
sions but in this case precautions must be taken to 
reduce fogging if low-energy gamma or x-radiation is 
present. 

The amount of water absorbed by a saturated emul- 
sion increases with increasing temperature. However, 
the emulsion generally becomes too soft to use at tem- 
peratures much above 20°C. Saturation of the plates is 
desirable in order to attain a uniform distribution of 
water throughout the emulsion. If range determinations 
are to be made, it is essential for the proton range for a 
given energy to be constant throughout the emulsion. 
A uniform distribution of the water is important also in 
making depth measurements in the emulsion. If the 
expansion of the emulsion on soaking and the subse- 
quent shrinkage on development do not take place 
uniformly, it is not possible to obtain accurate measure- 
ments on the vertical range of a track. A depth measure- 
ment depends on a knowledge of the shrinkage factor 
(ratio of emulsion thickness before and after develop- 
ment). For most purposes an adequate measurement 
of the shrinkage factor can be obtained with a micrometer,
although more elegant techniques are possible.
While the emulsions are being exposed they are kept in 
watertight containers at a controlled temperature. 
Usually one plate in an exposure is saturated with H2O
to provide a background measurement. 
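
The way the shrinkage factor enters a range measurement can be sketched as follows. The function names and the example numbers are hypothetical; the only input taken from the text is the definition of the shrinkage factor as the ratio of emulsion thickness before and after development, and the track is assumed to be straight.

```python
import math

def true_vertical_extent(depth_developed_um, shrinkage_factor):
    """Vertical extent of a track in the soaked emulsion, in microns.

    depth_developed_um: depth difference between the track ends as read on the
        microscope focus scale in the developed (shrunken) emulsion.
    shrinkage_factor: emulsion thickness before development divided by the
        thickness after development, as defined in the text.
    """
    return depth_developed_um * shrinkage_factor

def track_range(horizontal_um, depth_developed_um, shrinkage_factor):
    """Full track length from its horizontal projection and corrected vertical extent."""
    vertical_um = true_vertical_extent(depth_developed_um, shrinkage_factor)
    return math.hypot(horizontal_um, vertical_um)

if __name__ == "__main__":
    # Hypothetical track: 30 microns projected, 20 microns in depth, factor 2.0.
    print(track_range(30.0, 20.0, 2.0))
```

An error in the shrinkage factor changes only the vertical term, so tracks lying nearly in the plane of the emulsion are the least affected by nonuniform shrinkage.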


(b) Experiments Using Nuclear Emulsions 


Emulsions soaked in D2O were first used in a polarization
experiment by Wilkinson (Wi 52) in a study of
the gamma rays from the H2(p,γ)He3 reaction. In this
nonresonant reaction, the energy of the gamma ray is 
greater than 5.5 Mev by an amount which depends on 
the energy of the bombarding protons. Since the directional
distribution is given by sin^2 θ (Fo 49), the gamma
rays are expected to be essentially completely polarized 
50-μa proton beam (1.1 Mev). Wilkinson adopted some
simple range and depth criteria to insure the counting 
of desired events. Since the polarization sensitivity is 
independent of the polar angle, only the azimuthal 
angle is recorded. Figure 30 shows the azimuthal dis- 
tribution of proton tracks projected on the plane of the 
emulsion. It confirms that the gamma rays are electric 
dipole and are essentially perfectly polarized. On the 
other hand, if it is assumed from other evidence (Fo 49) 
that the gamma rays are completely polarized, then the 
experiment serves to confirm the ideal polarization 
sensitivity of the photodisintegration of the deuteron 
for gamma rays of about 6 Mev. 

Fagg and Hanna (Fa 53) used the photodisintegration
process to measure the polarization of the 6.13-, 6.9-,
and 7.1-Mev gamma rays from the F19(p,α)O16*(γ)O16
reaction, at E_p = 0.874 Mev. Ilford C2 emulsions 400 μ
thick were soaked to saturation in D2O and placed 4
cm from a thin target, with the plane of the emulsion
normal to the gamma rays emitted at 90°. Several exposures
of about 250 μa-hours each were made. Complete
resolution of the 6.9- and 7.1-Mev gamma rays
could not be expected, but it was essential to isolate the
strong 6.13-Mev radiation from the other two. This
could be done in a simple manner which permitted
(relatively) rapid counting, with the result shown in
Fig. 31. In essence, a single depth criterion was adopted:
an acceptable proton track must be in sharp focus along
its complete range. The utility of the technique is seen
by writing the photoproton distribution, (II-14), in
terms of projected range ρ = R sin θ:

$$d\sigma \sim \rho^{3}\,(R^{2}-\rho^{2})^{-1/2}\,d\rho.$$

FIG. 30. Linear polarization of gamma rays from H2(p,γ)He3,
demonstrated by photodisintegration of deuterons in impregnated
emulsion. The azimuthal distribution of photoproton tracks projected
onto the plane of the emulsion is shown (figure from Wi 52).

This distribution is plotted as the dotted line in Fig. 31.
Experimentally, it gives a very pronounced peaking at
ρ=R. The cutoff at ρ_c is of course established by the
depth criterion. In this case ρ_c was chosen to resolve
the 6.13-Mev gamma ray. The resolution could be
improved, by increasing ρ_c, without greatly decreasing
the total number of accepted tracks.
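
The form of the projected-range distribution follows from a short change of variables. The sketch below assumes that, after averaging over azimuth, the photoprotons are distributed as sin^2 θ per unit solid angle about the gamma-ray direction, and that all protons from a given gamma ray have the same range R; the symbols θ, ρ, and R are those of the text.

```latex
\begin{align*}
d\sigma &\propto \sin^{2}\theta \, d\Omega \;\propto\; \sin^{3}\theta \, d\theta , \\
\rho &= R\sin\theta , \qquad
d\rho = R\,|\cos\theta|\, d\theta = \sqrt{R^{2}-\rho^{2}}\; d\theta , \\
d\sigma &\propto \left(\frac{\rho}{R}\right)^{3} \frac{d\rho}{\sqrt{R^{2}-\rho^{2}}}
 \;\propto\; \rho^{3}\,(R^{2}-\rho^{2})^{-1/2}\, d\rho .
\end{align*}
```

The factor (R^2 - ρ^2)^(-1/2) is responsible for the sharp rise toward ρ = R, so tracks that remain in focus over their whole length select a narrow band of ranges.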

With this technique enough resolution was obtained 
to give qualitatively the polarizations of the 6.9- and 
7.1-Mev gamma rays. This qualitative information was 
used along with the measured polarization ratio and 
the directional distribution coefficient (Sa 52) for the 
unresolved group to deduce the parities of the 6.91- 
and 7.12-Mev states in O16. These parities (plus and
minus, respectively) agreed with those given by Seed 
and French (Se 52). No assignment could be made for 
the 6.13-Mev level since the corresponding polarization 
distribution was essentially isotropic, in agreement with 
the isotropy observed by Sanders (Sa 52) in his direc- 
tional distribution measurement. 

Nuclear emulsions impregnated with heavy water 
were also used by Hughes and Sinclair (Hu 56) in their 
study of the polarization of the gamma radiation 
from the reactions Al27(p,γ)Si28, Mg26(p,γ)Al27, and
Na23(p,γ)Mg24. Ilford G5 emulsions (300 μ and 400 μ
thick) were used in order to give well-defined tracks 
and to avoid fading of the tracks during the long ex- 
posures. A copper absorber $ in. thick was effective in 
preventing low-energy x-rays from blackening the plates.

FIG. 31. Projected range distribution of the photoprotons produced
by the 6.13-, 6.91-, and 7.12-Mev gamma rays from the
F19(p,α)O16*(γ)O16 reaction. The photoprotons are produced in
emulsions impregnated with D2O. Dashed curve is the theoretical
distribution for the projected range, assuming an empirical cutoff
at ρ_c. R is range of photoproton resulting from the 6.13-Mev
gamma ray. Arrows in the vicinity of 80 microns mark expected
ranges of photoprotons from the 6.91- and 7.12-Mev gamma rays
(figure from Fa 53).

The targets used were thin enough so that in each
bombardment a single resonance was isolated. Both the 
horizontal and vertical projection of each proton track 
was measured in the microscope and the range was 
calculated from these values. The following gamma rays 
were investigated: the 7.5- and 10.4-Mev gamma rays
from Si28, the 7.6- and 8.6-Mev gamma rays from Al27,
and the 7.2-, 8.1-, and 10.8-Mev gamma rays from Mg24.
A definite polarization was observed in the case of the
10.4-, 7.6-, and 10.8-Mev radiations, all of which were
found to be electric in character. The 10.4-, 7.6-, and
10.8-Mev radiations had each been shown to be dipole
in character in Ru 54, Ru 56, and Gr 55, respectively.
The character of the radiation was used to assign even
parity to the ground state and first excited state of Al27
(in agreement with Da 53) and to the ground state of
Na23. For the remainder of the gamma rays no polarization
was observed. Hughes and Sinclair also include in
an appendix a range-energy relation for protons up to 
5 Mev in wet emulsions. 


C. Measurements of Plane Polarization by 
Means of the Photoelectric Effect 


In Part II it was shown that the usefulness of the 
photoelectric effect for detection of gamma-ray polariza- 
tion is rather doubtful. The experiments which have 
been performed have been primarily motivated by a 
desire to elucidate the photoelectric effect rather than to 
measure polarization. With an arrangement very similar 
to Fig. 15, Hereford (He 51) used two photoelectric po- 
larimeters to observe the cross-polarization of annihila- 
tion radiation. In an improved experiment Hereford 
and Keuper (He 53) substituted a Compton polarimeter 
for one of the photoelectric detectors. These investiga- 
tions were followed by the experiments of McMaster 
and Hereford (Mc 54) and of Brini et al. (Br 57) in 
which the polarization produced in Compton scattering 
was used to investigate the photoelectric effect, with the 
discordant results already noted in Part II. 


D. Measurement of Circular Polarization by 
Means of the Compton Effect 


Recently, considerable interest has been exhibited in 
the measurement of circular polarization of gamma rays. 
This interest has been stimulated primarily by the 
fundamental reformulation of the law of conservation 
of parity in weak interactions and the resultant intro- 
duction and confirmation of the two-component neu- 
trino theory (Le 56, Le 57). The developments in the 
measurements of the circular polarization of gamma 
rays have progressed at a rapid rate. It is the aim here 
to bring together the methods and techniques of the 
various investigations and to collect the results which 
have appeared. 
TABLE XII. Sign of the circular polarization effect R for right
circular polarization, for the conditions listed.

Sign of R    Transmission       Scattering
    -        E_γ < 0.63 Mev     θ_c < 90°
    +        E_γ > 0.63 Mev     θ_c > 90°


(a) Experimental Considerations 


A discussion of the Compton scattering of circularly 
polarized gamma rays by polarized electrons has been 
given in Part II. The most convenient source of polarized 
electrons is a ferromagnet. Experimentally the circular 
polarization of gamma rays is detected by observing 
the difference in the scattering from electrons in mag- 
netized iron when the magnetization is reversed or when 
it is removed. This may be done by observing (1) the 
gamma rays transmitted through the magnetized iron, 
(2) those scattered from the iron at some appropriate
angle, or (3) the electrons ejected forward from the 
surface of the iron. Although a considerable effect 
(change in the counting rate with change in magnetic 
field) could be expected for perfectly polarized gamma 
rays and electrons at certain angles and energies (Figs. 
4 and 5), in practice the effect is greatly diminished by 
the fact that only about 2 electrons of the 26 in each 
iron atom can be polarized by magnetization. Therefore, 
the polarization effect is at most only a few percent; and 
quite often it is less than one percent, if the gamma-ray 
polarization is small. 

For any one of the above methods, let N_+ denote the
experimental yield of gamma rays when the magnetization
of the analyzer is directed toward the source, and
N_- the yield when it is directed away from the source.
The experimental effect is given by

$$R = 2\,\frac{N_+ - N_-}{N_+ + N_-}. \tag{III-3}$$

The sign of the effect is of great importance. It is deter- 
mined by the sign of the polarization-sensitive cross 
section (Figs. 4 and 5) and the sense of the photon 
polarization. For ready reference the sign of R is given 
in Table XII for right circular polarization (see definition
in Part I, C) and for four experimental conditions.
The sign of R is reversed for left circular polarization. 
Hence, from the sign of the effect, one may determine 
the sense of the circular polarization and so also of the 
polarized emitting nucleus. This much alone has led to 
very important discoveries. 

In accordance with the theoretical definition (Part I,
C), the degree of circular polarization is

$$P_3 = \frac{J_r - J_l}{J_r + J_l}. \tag{III-4}$$

For an analysis by transmission, let J_r and J_l be the incident
intensities and J_{r±} and J_{l±} the transmitted intensities
corresponding to (+) or (-) magnetization. Then

$$J_{r\pm} = J_r \exp\{-nt(\sigma_0 \pm \sigma_1\nu)\},$$
$$J_{l\pm} = J_l \exp\{-nt(\sigma_0 \mp \sigma_1\nu)\}.$$

With N_± = J_{r±} + J_{l±}, we obtain for (III-3)

$$R = 2P_3 \tanh(-nt\nu\sigma_1), \tag{III-5}$$

where n is the number of iron atoms per unit volume,
t is the effective thickness of the iron analyzer, and ν is
the number of polarized electrons per iron atom, equal
to 2.06 at saturation (Ar 53). In using this formula to
obtain P_3 from a measurement of R, it is necessary to
find appropriate values for t and ν. Alternatively, the
analyzer may be calibrated with a source of known
polarization.
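
The use of (III-5) in both directions can be sketched numerically. This is an illustration under stated assumptions rather than a calibration procedure: the iron atom density is computed from handbook density and atomic weight, the value of the polarization-sensitive cross section σ_1 used in the example is purely hypothetical (in practice it would be read from Fig. 5 for the gamma-ray energy in question), and the function names are introduced here.

```python
import math

IRON_ATOMS_PER_CM3 = 8.49e22         # from 7.87 g/cm^3 and atomic weight 55.85 (assumed inputs)
POLARIZED_ELECTRONS_PER_ATOM = 2.06  # nu at saturation, as quoted in the text (Ar 53)

def transmission_effect(p3, thickness_cm, sigma1_cm2,
                        n=IRON_ATOMS_PER_CM3, nu=POLARIZED_ELECTRONS_PER_ATOM):
    """Effect R of Eq. (III-5) for circular polarization p3 and analyzer thickness t."""
    return 2.0 * p3 * math.tanh(-n * thickness_cm * nu * sigma1_cm2)

def polarization_from_effect(r, thickness_cm, sigma1_cm2,
                             n=IRON_ATOMS_PER_CM3, nu=POLARIZED_ELECTRONS_PER_ATOM):
    """Invert Eq. (III-5) to obtain P_3 from a measured effect R."""
    return r / (2.0 * math.tanh(-n * thickness_cm * nu * sigma1_cm2))

if __name__ == "__main__":
    hypothetical_sigma1 = 1.0e-26  # cm^2 per electron; illustrative value only
    r = transmission_effect(p3=1.0, thickness_cm=5.0, sigma1_cm2=hypothetical_sigma1)
    print(f"R for fully polarized gamma rays: {r:.4f}")
    print(f"P_3 recovered from R: {polarization_from_effect(r, 5.0, hypothetical_sigma1):.3f}")
```

Because the argument of the hyperbolic tangent is small for any practical thickness, R is nearly linear in t, and an error in the effective thickness or in ν carries over almost directly into P_3. This is one reason a calibration with a source of known polarization is attractive.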
For the differential cross section we can write

$$d\sigma = d\sigma_0 + fP_3\,d\sigma_1, \tag{III-6}$$

where dσ_1 is the polarization-sensitive cross section
shown in Fig. 4 for the case in which the electron
polarization is parallel to the photon momentum. The
circular polarization P_3 and the electron polarization
f are displayed explicitly. At saturation f = 2/26. Since,
in an analysis by scattering, the observed intensity is
simply proportional to the differential cross section, we
can write§§§

$$R = 2\,\frac{d\sigma_+ - d\sigma_-}{d\sigma_+ + d\sigma_-} = 2fP_3\,\frac{d\sigma_1}{d\sigma_0}, \tag{III-7}$$

where dσ_1/dσ_0 is the ratio of the two differential cross
sections. In order to use this expression it is important 
to have a value for f that applies to the iron in the 
region of the scattering. When large solid angles are 
used, it is also necessary to integrate the cross sections 
over the angular spreads. In practice the efficiency of 
the analyzer is often checked with radiation of known 
polarization. 
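
The size of the scattering effect can be estimated from the differential cross section (III-6) by forming twice the difference over the sum of the two magnetization states. The sketch below assumes that convention for R and that reversing the magnetization simply reverses the sign of the polarization-sensitive term; the cross-section ratio used in the example is a hypothetical number, and the function names are introduced here.

```python
def electron_polarization(polarized_per_atom=2.0, electrons_per_atom=26):
    """Fraction f of the iron electrons that are polarized (2/26 at saturation, per the text)."""
    return polarized_per_atom / electrons_per_atom

def scattering_effect(p3, cross_section_ratio, f):
    """Closed form: R = 2 f P_3 (dsigma_1 / dsigma_0)."""
    return 2.0 * f * p3 * cross_section_ratio

def scattering_effect_from_cross_sections(dsigma0, dsigma1, p3, f):
    """Same quantity formed directly from the cross sections for the two magnetizations."""
    plus = dsigma0 + f * p3 * dsigma1
    minus = dsigma0 - f * p3 * dsigma1
    return 2.0 * (plus - minus) / (plus + minus)

if __name__ == "__main__":
    f = electron_polarization()
    hypothetical_ratio = 0.5  # dsigma_1/dsigma_0; illustrative only, see Fig. 4 for real values
    print(f"f = {f:.4f}")
    print(f"R for P_3 = 1: {scattering_effect(1.0, hypothetical_ratio, f):.4f}")
```

With f close to 0.08, even a cross-section ratio of one half gives an effect below 8% for completely polarized gamma rays, in line with the earlier remark that observed effects are at most a few percent.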

Since long runs usually are necessary in order to ob- 
tain statistically significant results, the changes in the 
magnetic field are usually made at frequent, periodic 
intervals in order to minimize the effect of possible 
electronic drifts. Changes in the counting rate of the 
detector produced by the changes in the magnetic field 
are avoided by magnetically shielding the counter and 
by removing it from the influence of the field by use of 
a light pipe. If the experimental arrangement is such 
that changes in magnetic field do affect the counting 
rate, the effect can be controlled in the beta-gamma 
coincidence experiments by normalizing the coinci- 
dence rate to the gamma-ray singles rate. 
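
The need for long runs follows from counting statistics. The sketch below assumes Poisson errors on the two yields and the definition of R as twice the difference over the sum; the function names are introduced here, and the target precision in the example is hypothetical.

```python
import math

def effect_with_error(n_plus, n_minus):
    """Effect R = 2 (N+ - N-) / (N+ + N-) and its statistical error.

    The error is obtained by propagating Poisson errors sqrt(N) on each yield:
    sigma_R = 4 sqrt(N+ N- / (N+ + N-)^3).
    """
    total = n_plus + n_minus
    r = 2.0 * (n_plus - n_minus) / total
    sigma = 4.0 * math.sqrt(n_plus * n_minus / total ** 3)
    return r, sigma

def counts_per_state_needed(sigma_target):
    """Counts required in each magnetization state for a given error on R.

    Uses the nearly-equal-yields limit, where sigma_R = sqrt(2 / N).
    """
    return 2.0 / sigma_target ** 2

if __name__ == "__main__":
    # Hypothetical goal: measure an effect of about 1% to one part in ten.
    print(f"counts per state for sigma_R = 0.001: {counts_per_state_needed(0.001):.0f}")
    print(effect_with_error(1_010_000, 990_000))
```

An effect of 1% measured to a tenth of its value thus calls for a few million counts in each field direction, which is why slow electronic drifts must be averaged out by frequent field reversals.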


§§§ If there is a component of linear polarization in the gamma
radiation, the normalization of this expression is changed (Wh 55).

(b) Measurements on Gamma Rays from Polarized
Nuclei (Parity Conserving) 


The first experiment dealing with the circular po- 
larization of gamma rays was performed by Clay and 
Hereford (Cl 52) who attempted to measure the circular 
polarization of annihilation radiation by detecting 
Compton electrons ejected in the forward direction from 
magnetized iron. The effects they observed are now 
believed to be instrumental in origin (To 56). 

Using the apparatus shown in Fig. 32 Wheatley et al. 
(Wh 55) studied circularly polarized gamma rays of 
nuclear origin in their investigation of the gamma rays 
emitted from Co60 nuclei which had been polarized by
the Rose-Gorter method. After adiabatic demagnetiza- 
tion the cryostat containing the source was inserted 
into the position shown in Fig. 32 (which shows one 
half of the symmetrical apparatus). It is recalled that 
the circular polarization is a maximum for gamma rays 
emitted along the axis of the nuclear polarization, and 
that the circular polarization changes sign if the axis 
is reversed. It was convenient therefore to perform the
experiment by reversing the nuclear polarization (instead
of the electron polarization), which is controlled
by the inner magnet Mp (which shares a yoke with the
outer magnet). Except for the features associated
with the nuclear polarization, especially in the design
of the magnet, the polarimeter itself is a forerunner of


FIG. 32. Diagram of apparatus used by Wheatley et al. for
producing and measuring circularly polarized gamma rays. A
change in the differential Compton cross section is measured by
changing the direction of circular polarization of the gamma rays,
emitted from polarized nuclei in the cryostat, relative to the
direction of magnetization in the scattering iron S. Magnet Mp
(with coil B) determines the direction of polarization of the nuclei;
the outer magnet (with coil W) determines the direction of magnetization
of S (figure from Wh 55).
more recent analyzers using the scattering method.
Direct radiation is shielded from the detector, but
gamma rays scattered (at angles between 45° and 70°)
from the magnetized strip S (Armco iron) are recorded.
The magnetization in S is produced by the outer magnet.
The degree of circular polarization achieved was as
high as 75% at the lowest temperature, 0.006°K. The
observed fractional change in counting rate upon changing
the direction of polarization of the nuclei was at
most 3%. In this experiment the circular correlation
function has the form (Part I, C)


$$P_3 = [A_1 P_1(\cos\theta) + A_3 P_3(\cos\theta)]/W(\theta),$$


where A₁, A₃, and the coefficients in the directional
function W(θ) are functions of the degree of orientation
and so of the temperature. For temperatures in the
region 1/T < 60, Wheatley et al. obtain excellent agreement
with the theoretical correlation function. For
lower temperatures the observed deviation may be due
to imperfect knowledge of the hyperfine coupling. As
we have seen, the sign of the effect alone makes possible
a determination of the sign of the magnetic moment of
the emitting nucleus. In this case the sign of the magnetic
moment of the Co⁶⁰ nucleus was found to be
positive.

Trumpy (Tr 57) using the apparatus shown in Fig. 33 
studied the circular polarization of the gamma rays re- 
sulting from the capture of polarized neutrons by the 
following nuclei: S, Ca“, Tits, Cr, Fe5’, NiŝS, Zn, 
and W!®. The emitting compound nucleus receives its 
polarization from the captured neutron. Again, the 
maximum circular polarization occurs for gamma rays 
emitted along the axis of polarization. Trumpy selected 
the transmission method for the analysis of circular
polarization. Thermal neutrons emerge from a neutron
collimator and pass through a neutron polarizer, consisting
of a small iron block mounted in the gap of a
magnet in which a field of 14 900 oe is produced. The
polarized neutrons impinge on a target and the capture
gamma rays, emerging along the directions parallel and
antiparallel to the polarization, reach two sodium iodide
detectors after being transmitted through the two
analyzing magnets. The path through each magnet core
is 8 cm long. The heavy shielding of lead and boron
carbide reduces the intense background of radiation
arising chiefly from neutron capture in the polarizing
iron block. Particular gamma rays could be selected
for study by means of energy discrimination in the detectors.
The experiment was performed by recording
the counts in both detectors as a function of the direction
of magnetization in the analyzers and the direction
of polarization of the neutrons as determined by the
polarizing magnet.

Trumpy uses an explicit formula given by Biedenharn
et al. (Bi 51) for the capture of totally polarized thermal
neutrons followed
right circular polarization, the coefficient A₁ is given by


L(L+1)+9 (+1) —jo(je+1 
q Aye je (111-8) 


where j₁, j, and j₂ are the spins of target nucleus, compound
nucleus and final nucleus, respectively, and L is
the multipole order of the gamma-ray transition. This is
equivalent to the result obtained in Part I, C. Since one
or more of the angular momenta in (III-8) are often
known and j = j₁ ± ½, a measurement of A₁ can yield
assignments of spin (and parity). For example, if
j₁ = 0, j = ½, and L = 1, then A₁ = 1 or −½ for j₂ = ½
or 3/2, respectively.
The theoretical value of the circular polarization,
given by A₁, is reduced in practice because of the in-
nucleus constituted a verification of the direction and
order of magnitude of the expected circular polarization.

In another part of their work, discussed in Part I,
A(f), Huiskamp et al. (Hu 57) measured the circular
polarization of the three unresolved gamma rays of
Cr⁵² (0.73, 0.94, 1.43 Mev) resulting from the decay of
Mn⁵². The Mn⁵² nuclei were polarized by the Rose-Gorter
method. The arrangement for the circular
polarization measurement was identical to that in the
experiment of Wheatley et al. (Wh 55) shown in Fig. 32.
The range of forward scattering angles defined by the
geometry and the energy bias of the detector were such
that all three gamma rays could be counted. The degree
of circular polarization found for the Mn⁵² gamma rays
as a function of 1/T* is shown in Fig. 34. No attempt
was made to determine the intensity contributed by
each gamma ray to the total since all three radiations
were expected to have the same degree of circular
polarization. Thus it was clearly possible to determine
the sign of the effect, which was all that was desired.
As in the case of the experiment of Wheatley et al.
(Wh 55), the sign of the effect indicated that the sign
of the magnetic moment of Mn⁵² is positive.

Discovery of nonconservation of parity in weak interactions
prompted Wilkinson (Wi 58) to re-examine the
extent to which parity is conserved in the strong interactions.
He tested the conservation of parity in strong
interactions by three classes of experiments. One of
these involves a search for circular polarization of
gamma rays from a nuclear reaction in the initial state
of which all particles are unpolarized. Wilkinson looked
for circular polarization of the 2.14- and 7.12-Mev gamma
rays from the B¹¹(p,p')B¹¹*(γ)B¹¹ and F¹⁹(p,α)O¹⁶*(γ)O¹⁶
reactions, respectively. He used
transmission through magnetized iron as the method
of analysis for circular polarization. Since very small
effects, if any, were expected, considerable precautions
were taken to eliminate systematic errors resulting, for
instance, from the influence of magnet reversals on the
photomultipliers or from electronic drifts. The scintillating
crystal was mounted at the end of a light pipe
either two or four feet long. The two-foot pipe was used
with the F¹⁹(p,α)O¹⁶* reaction, for which greater resolution
was needed. The photomultiplier was surrounded
not only with a mu-metal shield but also with two coaxial
iron pipes. To prevent the degraded radiation
emerging from the magnet from masking the effect
under study, the bias of the electronic apparatus was
set to accept only the very high-energy end of the
spectrum. Since the search was for such small effects,
over five hundred runs were made on each reaction.
The results of the runs were carefully averaged in
different ways to eliminate the possible influence of
systematic errors. Average values of N₊/N₋ = 1.00025
± 0.00026 and 1.00003 ± 0.00024 were found for the
boron and fluorine reactions, respectively, where
N₊/N₋ is the ratio of the counting rate with the
magnetic field in one direction to that in the other. Thus,
within the aforementioned limits, there is no circular
polarization and no nonconservation of parity.
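
As a rough check on this conclusion, the quoted ratios can be expressed as deviations from unity in units of their stated errors. The sketch below uses only the two values of N₊/N₋ quoted in the text; the function name is hypothetical, and reading the quoted errors as one standard deviation is an assumption.

```python
def deviation_in_sigma(ratio, error):
    """Deviation of a counting-rate ratio from unity, in units of its quoted error."""
    return (ratio - 1.0) / error


# (ratio, quoted error) as given in the text for the two reactions
results = {
    "boron": (1.00025, 0.00026),
    "fluorine": (1.00003, 0.00024),
}

deviations = {name: deviation_in_sigma(r, e) for name, (r, e) in results.items()}
for name, dev in deviations.items():
    print(f"{name}: {dev:+.2f} quoted errors from unity")
```

Both ratios lie within one quoted error of unity, which is the sense in which no circular polarization was found.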


(c) Beta-Gamma Circular Polarization Correlation


The measurement of circular polarization in beta- 
gamma decay was one of the early experiments per- 
formed after the discovery of nonconservation of parity 
in weak interactions (Le 56) and the revival of the 
two-component theory of the neutrino (Le 57). The 




FIG. 34. Circular polarization of gamma rays emitted from a
polarized Mn⁵² source. Ordinate R is the counting rate normalized
to unity at T = 1°K. Open circles correspond to the polarizing field
and the induction in the scattering iron being parallel, while
closed circles correspond to their being antiparallel (figure from
Hu 57).
FIG. 35. Experimental arrangement used by Schopper for the
beta-gamma circular-polarization correlation. Sc 1 is the beta
counter using a plastic scintillator; Sc 2 is the gamma counter
with a NaI(Tl) crystal; L1 and L2 are light guides; S is the source;
and M is a cylindrical magnet with magnetizing coils C (figure
from Sc 57).


experiments have since been carried forward and have
provided valuable information on the nature of beta
decay. The aim of the measurements is to determine
the asymmetry coefficient A₁ = −(v/c)A in the circular
polarization correlation (Part I, C),


$$P_3 = (v/c)\,A\cos\theta, \qquad \text{(III-9)}$$


which gives the degree of right circular polarization as
a function of the angle between the beta particle and
the gamma ray. Both the scattering method and the
transmission method have been used to good advantage
in this work, the polarizations being obtained by (III-5)
and (III-7) or their equivalent. The beta-gamma circular
polarization measurement provides the same information
as does a knowledge of the electron distribution
from polarized nuclei. The former technique has
the advantage, however, of being feasible for a large
number of different nuclei.

Schopper (Sc 57) first performed such a correlation
experiment in the decay of Co⁶⁰ and of Na²². These
nuclei have allowed β transitions with ΔJ = 1 followed
by pure multipole radiation. The experimental
arrangement (Fig. 35) is designed to study the circular
polarization of the gamma rays emitted at an angle
close to 180° measured with respect to the direction of
the beta particles. The average forward scattering
angle was 55°, which is about optimum for circularly
polarized gamma rays of the energies used in the experiment
(~1.25 Mev). Electronic discrimination in
the circuits was provided in order to avoid counting
backscattered gamma rays, annihilation radiation, and
stray beta particles. Although light pipes were used with
the detectors, possible variations in the coincidence
rate produced by the changes in the magnetic field
were eliminated by using the ratio of the coincidence
rate to the product of the singles rates.

In the decay of Co⁶⁰ two gamma rays are emitted in
a 4-2-0 cascade, but the gamma rays can be treated as
one since in this special case the polarization of the
second gamma ray is the same as that of the first
FIG. 36. Asymmetry coefficient in the beta-gamma circular
polarization for a 4⁺(β)4⁺(γ)2⁺(γ)0⁺ transition. The experimental
value for Sc⁴⁶ is indicated. Theoretical curves for I = I_max = 0.88,
I = 0.5 and I = 0 are shown; I is the interference term that enters
into A. If I > 0, the upper branch of the theoretical curve refers
to the choice of x = α^(1/2) M_GT/M_F > 0 and the lower branch to the
case x < 0. If I < 0 the opposite assignment is true. The figure
shows that a large value for I is required to fit the experimental
data (figure from Bo 58).


(To 53). Schopper found A = +0.39 ± 0.08 and −0.41
± 0.07 for Na²² and Co⁶⁰, respectively. These results
can be compared to theoretical values of +⅓ and −⅓,
respectively. The result on Co⁶⁰ is in accord with the
experiment of Wu et al. (Wu 57), which measured the
electron distribution from polarized Co⁶⁰ nuclei. The
opposite signs for Co⁶⁰ and Na²² indicate opposite circular
polarizations and show that the antineutrino
emitted in negatron decay and the neutrino emitted
in positron decay have opposite helicity. In a note added
in proof in his paper Schopper reports a result leading
to a value of A = −0.07 ± 0.05 for Na²⁴.

Boehm and Wapstra (Bo 57, Bo 58), using a very
similar experimental arrangement, have studied the
circular-polarization correlations in many beta-gamma
decays. Their first experiments were performed on Co⁶⁰,
Au¹⁹⁸ and Hg²⁰³, the latter two of which have first-forbidden
beta transitions to the 0.411- and 0.279-Mev
states in Hg¹⁹⁸ and Tl²⁰³, respectively. A refined expression
for the efficiency of the analyzer was calculated
by Alder,


$$e = 2.90k\,(1 + 0.13k)/(1 + 0.36k + 0.09k^2),$$


where e would be the experimental effect (in percent)
for a completely circularly polarized gamma ray of
energy k (in units of m₀c²). Measurements on the circularly
polarized bremsstrahlung created by the beta
particles from P³² and Tm¹⁷⁰ were made partly to check
this formula and partly to establish the v/c law for the
polarization of electrons emitted in beta decay. The
experimental results found for P³² and Tm¹⁷⁰ were
consistent with both the formula and the v/c law. In
the circular-polarization measurement performed on
Co⁶⁰ a value of A = −0.41 ± 0.08 was found, in agreement
with Schopper's value. The value of A = 0.52
spin of 2 and rules out a spin of 3 for the ground state of
Au¹⁹⁸. With the value of A = −0.06 ± 0.22 for Hg²⁰³
(Bo 58) it was not possible to exclude a spin of § for 
the ground state of Hg? but the measurement is con- 
sistent with a spin of 3. 

Boehm and Wapstra studied further the above J-J
transition in Au¹⁹⁸ along with those in Co⁵⁸ (Bo 57a,
Bo 58) and Sc⁴⁶ (Bo 57b, Bo 58). The J-J transitions
are of great interest because of the mixed nature of the
beta transition. Thus the decay scheme of Sc⁴⁶ is
4⁺(β)4⁺(γ)2⁺(γ)0⁺ and differs from the Co⁶⁰ case only
in that ΔJ = 0 instead of 1. For this decay scheme,
Boehm and Wapstra write the asymmetry coefficient
in compact form (Part I, C) as


$$A = (0.083x^2 + 0.745\,\alpha^{1/2} I x)/(1 + x^2), \qquad \text{(III-10)}$$


where x = α^(1/2) M_GT/M_F, α = (C_GT/C_F)² = 1.3 (Ko 56) and
I is the interference factor whose numerator is equal to
S₁ for LL' = 01 in Table X. If I = 0, A varies from 0 to
0.083 as x goes from 0 to ∞. If Ix > 0, then A > 0, but
if Ix < 0, then A < 0 except when |I| < 0.098|x|. Finally,
if the two-component neutrino theory with invariance
under time reversal is valid, the maximum value of |I|
is given by α^(−1/2) = 0.88. These properties are illustrated
in Fig. 36, which gives curves for three values of |I|,
namely 0, 0.5 and the maximum 0.88. The experimental
result for Sc⁴⁶, A = 0.33 ± 0.04, is shown on the graph.
The experimental value indicates a large value of |I|.
Boehm and Wapstra give |I| > 0.5 with a statistical
uncertainty of 1%. Since the dominant term in I
is Re(C_S C_T'* + C_S' C_T* − C_V C_A'* − C_V' C_A*), this large
value of |I| rules out a pure V, T or a pure S, A interaction
in this beta decay. If the maximum interference
is assumed, a value of 2.2 is found for |M_GT/M_F|. For
Co⁵⁸, Boehm and Wapstra found A = −0.14 ± 0.07, and
since the pure GT value is −0.083, no conclusion could
be drawn concerning the interference term. The assumption
of maximum interference, as suggested by
the Sc⁴⁶ result, leads to |M_GT/M_F| > 8. The value of A
obtained for Au¹⁹⁸ (first forbidden) agreed with the
earlier result and was consistent with the maximum
theoretical value.
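
The behavior of the asymmetry coefficient described in this paragraph can be checked numerically. The sketch below evaluates Eq. (III-10), read here as A = (0.083x² + 0.745 α^(1/2) I x)/(1 + x²) with α = 1.3; the function names are hypothetical, and taking x positive with I at its maximum α^(−1/2) is an assumption made for the Sc⁴⁶ example.

```python
import math

ALPHA = 1.3  # (C_GT/C_F)^2, the value quoted in the text (Ko 56)


def asymmetry(x, interference):
    """A from Eq. (III-10), with x = sqrt(ALPHA) * M_GT/M_F."""
    return (0.083 * x**2 + 0.745 * math.sqrt(ALPHA) * interference * x) / (1.0 + x**2)


def x_from_matrix_ratio(ratio):
    """Convert a matrix-element ratio M_GT/M_F into the variable x."""
    return math.sqrt(ALPHA) * ratio


i_max = 1.0 / math.sqrt(ALPHA)  # maximum interference factor

a_pure_gt = asymmetry(1.0e6, 0.0)  # very large x with I = 0
a_sc46 = asymmetry(x_from_matrix_ratio(2.2), i_max)

print(f"I_max = {i_max:.2f}")
print(f"A (large x, no interference) = {a_pure_gt:.3f}")
print(f"A for M_GT/M_F = 2.2 with maximum interference = {a_sc46:.2f}")
```

With these inputs the sketch gives A close to the measured 0.33 for Sc⁴⁶, consistent with the value of 2.2 quoted for |M_GT/M_F| under the assumption of maximum interference.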

In a complete report (Bo 58), Boehm and Wapstra
add to the treatment of Sc⁴⁶, Au¹⁹⁸, and Co⁵⁸ a discussion
of their work on Sc⁴⁴, V⁴⁸, and Na²⁴. In the case of
Sc⁴⁴ and V⁴⁸, the values of A = −0.02 ± 0.04 and
+0.06 ± 0.05 indicate some interference since they are
greater than the pure GT values of −0.17 and −0.083,
respectively. If, as in the case of Sc⁴⁶, the maximum
interference is assumed, |M_GT/M_F| = 5 for both nuclei.
For Na²⁴ a value of A = +0.07 ± 0.04 is obtained as
compared to the pure GT value of 0.08. If maximum
interference is assumed, |M_GT/M_F| > 25. Interference
between the GT and F interactions was also found by
Boehm (Bo 58a) in the decay of Mn⁵². In this example
A = −0.16 ± 0.05, and the pure GT value is −0.056.
FIG. 37. Experimental arrangement of Steffen utilizing the method in which the electron polarization is
perpendicular to the circular polarization (figure from St 58a).


Among the other early investigators in this field were
Lundby et al. (Lu 57), who also made measurements on
Co⁶⁰. Since it was felt the energy discrimination would
be improved, they used the method of transmission
through magnetized iron instead of the forward-scattering
technique. They found a value of A = −0.32 ± 0.07.
In a note added in proof a value of A = +0.34 ± 0.14
for Na²² also is given.

Debrunner and Kündig (De 57), using the method of
scattering, also measured the anisotropy coefficient for
Co⁶⁰ and found A = −0.34 ± 0.09, in good agreement
with other values and with the theoretical value of −⅓.
Shortly thereafter Berthier et al. (Be 57a), using the
same equipment, studied Au¹⁹⁸, the decay of which is
first forbidden. It was found that A = 0.34 ± 0.09, which
is somewhat smaller than the value above (Bo 58).

Again using the forward scattering from magnetized
iron, Appel and Schopper (Ap 57) measured the circular-polarization
correlations in Co⁶⁰, Zr⁹⁵, and Sb¹²⁴. The
measurement on Co⁶⁰ was primarily to test the reliability
of the apparatus, but they give a value of
A = −0.35 ± 0.05 for this nucleus. In the case of the
decay of Zr⁹⁵, the measurements were made on the unresolved
0.722- and 0.754-Mev radiations from Nb⁹⁵. By
comparing the experimental value of A = −0.46 ± 0.09
with possible theoretical values in the manner discussed
in the foregoing (Al 57), Appel and Schopper conclude
that there is probably present an ST or VA interference
term. The measurements made with Sb¹²⁴, which has a
first-forbidden transition to the 0.605-Mev state in
Te¹²⁴, yielded a value of A = −0.13 ± 0.06. This is
smaller in absolute value than the value of A = −⅓
calculated (Al 57) on the basis of a simple once-forbidden
transition.
In an effort to determine whether or not parity breakdown
is a maximum, Appel et al. (Ap 58) remeasured
more accurately the beta-gamma circular-polarization
correlations in Co⁶⁰ and Na²². The measurement of the
polarization correlation seems favorable for this purpose
since all the necessary corrections can be calculated
accurately and systematic errors can be kept small.
They found A = −0.340 ± 0.035 and A = +0.295 ± 0.054
for Co⁶⁰ and Na²², respectively. These values compare
favorably with the corresponding theoretical values of
−⅓ and +⅓ for the case when nonconservation of
parity and noninvariance under charge conjugation are
maximum.

Steffen (St 58) also studied the circular polarization
of the gamma rays from Co⁶⁰ and Sc⁴⁶. He verified the
dependence of the degree of circular polarization on v/c
[Eq. (III-9)] and that the angular distribution varies
as cos θ. For Co⁶⁰ and Sc⁴⁶, values of A = −0.34 ± 0.02
and A = +0.24 ± 0.02 were found. Although the Sc⁴⁶
value is smaller than that given by Bo 58, it verifies the
presence of a large interference between GT and F
interactions. If maximum interference between V and
A interactions is assumed, |M_GT/M_F| ≈ 4.0. With these
nuclei Steffen (St 58a) has also used the method (Be 57)
in which the electron polarization is perpendicular to
the circular polarization. His experimental arrangement
is shown in Fig. 37.
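
The Co⁶⁰ values of A quoted in this section can be compared with the theoretical value of −⅓ by forming an inverse-variance weighted mean. This combination is not made in the text; it is an illustrative sketch that assumes independent Gaussian errors, which may not hold since some results come from the same group or apparatus. The function name is hypothetical.

```python
import math

# (reference, A, quoted error) for Co-60, as quoted in this section
co60 = [
    ("Sc 57", -0.41, 0.07),
    ("Bo 57", -0.41, 0.08),
    ("Lu 57", -0.32, 0.07),
    ("De 57", -0.34, 0.09),
    ("Ap 57", -0.35, 0.05),
    ("Ap 58", -0.340, 0.035),
    ("St 58", -0.34, 0.02),
]


def weighted_mean(measurements):
    """Inverse-variance weighted mean and its error (assumes independent errors)."""
    weights = [1.0 / err**2 for _, _, err in measurements]
    total = sum(weights)
    mean = sum(w * value for w, (_, value, _) in zip(weights, measurements)) / total
    return mean, 1.0 / math.sqrt(total)


mean, error = weighted_mean(co60)
pull = (mean - (-1.0 / 3.0)) / error  # distance from the theoretical value, in errors
print(f"weighted mean A = {mean:.3f} +/- {error:.3f}, pull from -1/3 = {pull:+.2f}")
```

Under these assumptions the combined value lies within one combined error of −⅓.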

In a beautiful experiment by Goldhaber et al. (Go 58)
on the 80-kev isomer of Eu¹⁵², it was shown that the
neutrino is left handed (negative helicity). Eu¹⁵² decays
by electron capture to the 961-kev state in Sm¹⁵². The
experiment was performed by measuring the circular
polarization of the 961-kev gamma rays which were
subsequently resonant scattered from Sm¹⁵². The ex-
FIG. 38. Experimental arrangement for measurement of circular
polarization of resonant scattered gamma rays. This arrangement
was used by Goldhaber et al. to determine helicity of neutrinos
(figure from Go 58).


perimental arrangement is shown in Fig. 38. Through 
conservation of energy and momentum the requirement 
of resonant scattering selects those gamma rays which 
are emitted opposite to the neutrino emitted in the 
electron capture. The magnetic analyzer determines the 
circular polarization (helicity) of the gamma rays. The 
helicity of the neutrino is then determined since a 
gamma ray emitted in the direction opposite to the 
neutrino will have the same helicity by conservation of 
angular momentum. The measurement gave negative 
helicity for the gamma ray and thus also for the neutrino. 


E. Tabulation of Polarization Experiments 
on Nuclear Gamma Rays 


The methods and results of the experiments discussed in the 
text are summarized in Table XIII. An explanation of the notation 
used and how the entries were made in the various columns of the 
table is given below. 

Column I. Nucleus.—In all cases the nucleus given in this column
is the nucleus emitting the gamma ray whose polarization is
being measured. The experiments are listed in order of increasing
mass number A.

Column II. Source or reaction.—In this column is given a level
diagram of the radioactive decay or nuclear reaction used in each
experiment. Frequently one diagram is applicable to several experiments
and so is not repeated. The level diagrams given in the
table are not complete. Only those levels which directly feed the
gamma-ray transitions under study and those between the contributing
levels and the ground state are shown in the diagram.
Secondly, the level diagrams show only those transitions in the
residual nucleus which give rise to the polarizations which
were measured. Thirdly, only those beta transitions which
lead directly to the gamma ray or cascade of gamma rays under
study are given by solid arrows. Dashed arrows denote the fact
that other beta transitions may be contributing to the intensity of
the gamma ray under study but not through the particular cascade
under study. Decay by positron emission, electron emission, or
electron capture are designated by β⁺, β⁻, or EC, respectively.

In the case of the reactions in this table, the neutron, proton,
or alpha particle is denoted by n, p, or α, respectively. In the case
of resonant reactions, the resonant level is shown by a solid line.
Nonresonant reactions show a dashed line, whereas in “thick-target”
reactions a bracket is shown in place of all the “levels”
reached in the compound nucleus. The bombarding energy is given
as a dashed line in the initial nucleus. Neutron-capture reactions
are shown with the “state” of the compound nucleus at the same
height as that of the initial nucleus since thermal capture is involved
in all cases. Coulomb excitation is denoted by a double-lined
arrow.

Energy levels below 10 Mev are given to three significant figures, 
some of those above 10 Mev are given to four. The energies of the 
levels are given immediately to the right of the level, whereas the 
spins, when known, are given to the left. Gamma-ray energies 
are given between the level values either to the right or left, 
whichever is convenient. When the level scheme is sufficiently 
complicated, arrows are used to indicate the energy of the gamma 
ray. All energies and spins are the most recent given by the 
Nuclear Data Group, National Research Council (Mc 58a). 

Column III. Polarization technique.—The letters given in this
column denote the basic polarization technique used in each of
the experiments. They are as follows:


C = Compton polarimeter detecting linear polarization.

Cf = Compton magnetic polarimeter detecting circular polarization
by forward scattering.

Ct = Compton magnetic polarimeter detecting circular polarization
by transmission.

D = Photodisintegration of the deuteron, detecting linear polarization
by use of nuclear emulsions soaked in D₂O.


Column IV. Experimental conditions.—Three essential experimental
conditions are given in this column. In order from top to
bottom, as given with each experiment, they are as follows. (1) The
physical entity which establishes the direction of quantization
in the polarization correlation. Thus, if the direction is established
by the propagation vector of an alpha particle, the entry α is
made; if by a magnetic field, the entry “mag. field” is made, etc.
In cases where the direction of propagation of a gamma ray in a
cascade establishes the direction, the gamma ray is specified by
energy when it is known. If there has been no clear discrimination
between which gamma ray establishes the direction and which is
accepted by the polarimeter, only the symbol γ is placed in this
position. It should be noted in cases where Bleaney’s method of
alignment is used, that the words “crystal electric field” have been
used primarily for convenience and do not in general mean that
this field direction is the alignment direction. In general the alignment
direction is established as the result of the coupling between
the crystal electric field and the atomic orbital motion which in
turn is coupled to the spins. Thus different alignment directions
may occur depending on the crystal used. (2) Below entry (1) is
given the angle between the axis of quantization and the axis of
the polarimeter. When a complete direction-polarization correlation
function is determined, the range of angles is indicated by
an arrow between the extremes of the angular range. (3) Below
entry (2) is specified the gamma ray whose polarization is being
measured. When there is more than one gamma ray being analyzed
and energy discrimination is made between each, a semicolon is
placed between them. When the gamma rays are not resolved, a
comma is placed between them and a (u) is inserted at the end.
When there is discrimination available, but it is not specified which
of the gamma rays is being analyzed, an “or” is placed between
the gamma-ray entries.

Column V. Summary of results.—In the case of linear polarization
measurements, the gamma rays studied are usually listed
first in vertical order with their multipolarities. The multipolarity
is given even though additional evidence outside of the experiment
listed usually is required to deduce it. In the case of the circular
polarization measurements the anisotropy coefficient A is given
for each gamma ray listed. In two instances N₊/N₋ is used to
indicate the ratio of counting rates before and after reversing the
magnetic field.

The parity of a state is listed when it is of special interest,
usually as designated by the author. The parity is determined, of
course, only relative to other assumed or known parities. Other
information such as spins, the sign of a magnetic moment, etc.,
is frequently given also when it is considered a significant part of
the author’s conclusions. As a general rule, the chief conclusion
of the author is the one given.

A colon is used to connect an item given in the column with the
results of the measurements made on it, e.g., “cascade:
4(E2)2(E2)0.”

A reference is made to other work (usually a measurement of

TABLE XIII. Summary of experiments. An explanation of the various entries is given in Sec. E of Part III. 


Nucleus    Source or Reaction    Polarization Technique    Experimental Conditions    Summary of Results    Reference
He? z D p 6.37: El, wi52 
EiMev = --~6.3 90° ~1004 polarized; agrees 
one “6.37 with Fokg 
Htp 
pt 
/ 
2 
He> 
p12 jo 214 Ct p 2.l4}y: no circ. pol. ; Wi58 
2 i 0° N/N- = 1.00025 + 
2.1hy 0.00026, parity con- 
CP] served 
by Bil 
ni4 c He® 1.647: ML or E2, Li58 
90° 3.95 state parity: + 
1.647 
16 c a 6.14y: E3; Fr52 
g 118° Ne2° exc.state and F19 
- 6.147 gnd.state parities: + 


(Ar50, Ba50) 


D P 6.147: unpolarized, Fa53 
90° 6.917: E2, 
6.147;6.917; 7.127: El; 
7.127 agrees with Se52 (Sa52) 


Send S Ce P 7.127: no circ, pol, wi58 
ea 23 o 0° N+/N- = 1.000053 + 
A il 7.12 7.127 0.00024 parity con- 
i F'9 20 9l served 
c He? 0.94y: ML or E2, Li58 
909 0.94 state parity: + 


Cr B 1.287: A = -0.39 + 0.08; Sc57 
152° additional proof of 
1.287 nonconservation of 


parity. 


1.28y: A = -0.340 + 
0.035; 
accurate indication 
e AK that nonconservation of 
4 parity and noninvari- 
ance under charge conju- 
gation are maximum 


D p 10.8y: El, 
90° Na®° gnd.state parity: + 
10.87 (Gr55) 


1.377: E2, 
2.757: E2; 
/ agrees with B152 




Mg?4 Ce Bowe 1.377, 2.757(u): A = Sc57 
$ 152° -0.068 + 0.047 ADST 
1.377,2.757(u) A 
Cr B 1.377, 2.757(u): A = 
159° +0.07 + 0.04; 
4, 1.377,2.75y(u) if max.interference 
assumed, | Mon/Mp|>25 
c p 1.377: E2 
909 
1.377 
+ 
o Mg¢* 
Í A127 N D p 1. BY:EL, 
339Kev 3, 8.58 90° 0.84 state parity: + 
2: 5 7.137 agrees with Da53 (Ru56) 
Mg? p 
7.73 
28 D 1l10.4y: EL 
$ 300 N gnd.state parity: 
10.47 - (Rud54) 
g33 Ct n polarization 3.22 state: 3/2, 
+ yS 0° 180° 3.227: EL (Ho53,Br5+) 
o \—— 8.65 5 iy 
S32+n 
5.44 
37 
ho 3.22 
3 + 
6 


n polarization 1. 
o ,1809 al 
6. hay 


Cc B 1.527: unpolarized Had3a 


Be50b, S 
1.527 (Be50b, St51) 


Cr B 1.167: A = -0.02 + 0.0k; Bo58 
159° if max.interference 


assumed, |Mor/Mp] = 5 


y 0.897: E2, 
90° + 180° 1.127: E2. 
0.897,1.127(4) cascade: 4(E2)2(E2)0 


(Br 48) 


B 0.897, 1.12y(u): A = Bo57b 
159° 0.353 + 0.04; Bo58 
0.897,1.12y(u) large interference be- 


tween GT and F cou- 
plings; if max.inter- 
ference assumed, 


Be [Mor/Mp| = 2.2 


Cp B 0.897, 1.12y(u): A = 
559 0.2} + 0.02; 
Da 0.897,1.127(u) verifies large interfer- 
ence between GT and F 
couplings; if max.in- 
terference assumed, 


\Mer/Mp| = 


B 0.997, 1.337(u): A = 
159° 0.06 + 0.05; 
0.997,1.337(u) if max.interference 


assumed, [Mor /Mp | 25 


1.33y;0.99y 0. 99 y:E2 
105 133y:E2 
0.99y; 1.33y cascade: 4(E2)2(E2)0. 


SR a TN EET ARE 


TISS Ct n polarization 1.72 state: Tr57 
0° ,180° 1.35 state: Zo, 
6. hay; 6.757 1.357: E2 (H053 ,Br5}) 
cr52 c mag.field 0.737: E2, Hu57 
o 0.947: E2, 
0.737;0.94y; 1.4357: E2; 
1.4357 agrees with Ke54 
Cr mag.field Mn52 mag. mom:+ Hu57 
0° ,180° 3 
O. 270: 947, ag 
1.437(u) 3 
Cr B 0.737, 0.94y, 1.467(u): Bo58a e 
159° A = -0.16 + 0.05; 
0.737,0.9h47, interference between GT 
1. 43y(u) and F couplings 
Cr54 c mag.field 0.847: E2 B154 
909 
0.847 
= Ct n polarization 8.887: El, Tr57 
1 972 0° ,180° 9.727: El; 4 
Cr53+}n 8.887 59.727 8.88 - 0.84 cascade: 
1(E1)2(E2)0 
8.88 ( (E2) 
2t} | og4 
ot 
cro4 
Fesé c .field 0.8457: E2, 
1.247: E2, 
i 0.8457;1.247; 1.757: ML, 


1.757:2.617: 
5.257 


2.617: Ml, 

J25 E2; 

0. 849 2 a aes 
af 110 

3 oa 

(Gansa) 


—_ — hA 7 


VA c crystal elec- 0.1227: KN = +0.19 + Bi55a 
oor EC tric field 0.02 (A154) : 
5 0.137 90° = 
y 2 0.1227 
f 0.122 
Jo 0.014 
VA Feo" 
Ct n polarization Fe°”’ gnd.state: 4 Tr57 
ot ae 64 0° 180° (B153) 5 3 
Fen 2 7.647 = 
c crystal elec- 0.8007: E2 (Da52) Bi52 
tric field S 
90° 
0.8007 


Ce B 0.8007: A = -0.14 + Bosga 
1592 0.07; Bod 
0.8007 if max.interference 
assumed, |Mop/Mp] = 2.8 
n polarization Ni°° gnd.state: 3/2; Tr57 
0° ,180° confirms Pr54 
9.007 


X 1.177: E2, Me50 
.90 — 1809 W557: E25 ¥ 

1.177,1.337(u) 1.17 - 1.33 cascade: 
4(E2)2(E2)0 


POLARIZATION ON NUCLEAR GAMMA RAYS 


TABLE XIII (continued).


Polarization Experimental 


Nucleus Source or Reaction Technique Conditions Summary of Results 
nis? c 7 : 1.177: E2, 
90° — 180° 1.337: E2; 

ee 1.177,1.337(u) agrees with Me50 
and Wi50 
c crystal elec- 1.177: E2, 
tric field 1.3357: E2; 
90° agrees with Me50 


1.177,1.357(u) 


Cc y 1.177: E2, 
100° 1.3537: E2; 
1.177 or 1.33y agrees with Me50 


Cf roe Come mag. mom: + 


1.177,1.357(u) 


cf B L.1l7y, 1.337(u): A = 
152° -0.41 + 0.07; 
1.177,1.337(u) v in Bt decay in oppo- 
site screw sense from 
v~ in B“ decay 


Cf B 1.177, 1.337(u): A = 
159° -0.41 + 0.08 
1.177,1.337(u) 
Ct B 1.177, 1.3357(u): A = 
180° -0.32 + 0.07 ? 
1.177,1.337(u) $ 
Ce Bia 1.177, 1.337(u): A = 
153 -0.35 + 0.05 
1.177,1.357(u) 


VM the 35r(u): As 
40.295 + 0.054; 

' accurate indication 
that nonconservation of 5 
parity and noninvari- 


c B 1.177, I salu) | A = 4 
£e ~150° “0.34 + 0.09 
s 1.177,1.5337(u) 


Cf B i, 117 1. 33y(u): 
~150° -0.34 + O. o2 ii 
1.177,1.337(u) 
e5 n polarization 0.052 state: eis 
Zi K 0° ,180° : 


7.93 -7.887 


TABLE XIII (continued).


Polarization Experimental 
Source or Reaction Technique Conditions Summary of Results Reference 


Ç BR 0.557: E2 (Ri52, Wa50a) Ha53a 
90 
0.557 


Cc B 1.08y:E2 (St51,Ri52) Had3a 


1.087 


8 
O 


c 1.857: E2, ` Bi55 
1. 85y 0.917: El; 
90 cascade: 3(E1)2(E2)0 
0.9ly agrees with Pe48 and 
Me52a 
z 1. 85y; 0.9ly 0.91y:El Cone 
90 -»180 1. 85y:E2 


0.91ly; 1. 85y agrees with Bi55 


Cr B 0.7227, 0.754y(u): A = Apd>T7 
155° -0.46 + 0.09; 
0.7227, S,T or V,A interference 
0. 754y(u) is present 


l 
0.2037: (E2/M1)2 
-0.58 + 0.20 


0.2987: (E2/M1)2 
On 


0.51357: E2, 
0.7537: E2:
TABLE XIII (continued).


Polarization Experimental 


Nucleus Source or Reaction Technique Conditions Summary of Results Reference 
paios c y j 0.5137: E2; _ Wi50 
90° = 180° cascade: suggest 
0.5137, O(E 2)2(E2)0 mixed 
0. 73 y(u) with cascade from 


close lying 3rd exc. 
state (Sp50, K153, Kr53), 


Ag207 - c 0.32hy: (B2/M1)2 = Mc58 
g 3p- — -0.324 Ro Aa 
CP] 0.3247 
y 
ag” 
a 
Ag?o® = c 0.3097: (E2/M1)2 = Mc58 
3 % 0.309 90° ee 
E iil on 
if 
2 ago? 
1 
2212 + c 0.3427: (E2/ML)@ = Mc58 
rG 3, 0.342 300 oa Ta 
[ J 0.3427 
p 
y + 
2 cd! 
1 
113 3,t c 0.3007: (E2/M1)2 = 0.29 Mc58 
ca %2 0.300 500 
0.3007 
[P] 
| t 
2 
ca!!3 
114 + Cc y 0, 566y:E2 Br56 
Cd 5 0.192 90°-> 270° 0. 722y:E2 
0, 566y, 0. 722y(u)cascade:4(E2)2(2E)0 
Tet24 c B 0.6057: E2 (St51) St52 
90° 
0.6057 
c B 0.6057: E2 (K152) K152 
90° - 180° 
0.6057 
c y 0.6057: E2, K152 
90° — 180° 1.697: E2/M = 0.09, 


1.697,0.6057(u) cascade: 3(E2, M1)2(E2)0 


Cc B a 0.6057: E2 (Be50b) Had53a 
0.6057 
TABLE XIII (continued).


Polarization Experimental 
= Source or Reaction Technique Conditions Summary of Results Reference 


Cr B 0.6057: A = 0.13 Ap57 

o 153° + 0.06 
0.6057 

Cc y polarization consist- Me50 
90° + 180° ent with a composite 
0.5637,0.5707, correlation coeffi- 


0.6057 ,0.796y(u) cient = 


0.6057 ,0.7967(u) 


0.13 
3 
(0 7 a y: unpolarized; Wi50 f 
o 
90° - 180 see K152 which 
BIC en ) agrees with Me50 
-6057 ,0. 7967 (u (St55) 
c 7 0.6057: E2, K152 
90° — 180° 0.7967: E2; 
Oa RDI, 1.402 - 0.605 cascade: 
Samoa”) 4(r:2)2(62)0, (Ro51) 
Cc y polarimeter polarization - polari- Ro52 
180° zation correlation 
0.5637 ,0.5707, observed not in disa- 
0.6057 ,0.7967(u) greement with Wa50, 
N Pe5l and Ro5l 
c B 0.6057, 0.796y(u): un- Had3a ra 
90° polarized (Be50a, = 
0.5637 ,0.5707, St50) E 


1.60y 0.3297: El, Bid) 


90° 0.4877: E2, 
0.3297 ;0.4877; 0.8157: El; 
0.8157 cascade: 3(&1)4(E2)2(E2)0 


crystal electric 0.1457: E2/M1 = 0.08 
field + 0.02 (Ca55) 

90° 
0.1457 


ca55 


0.960y: neg.helicity, 
v: neg.helicity; 
_ GT interaction is A, 


Pa 




Nucleus 


Tał81 


Aut 97 


Hg?98 


m 208 


p 20s 


pb?98
TABLE XIII (continued).


Polarization: Experimental et 
Source or Reaction Technique Conditions Summary of Results Reference 
e yE vgy: (B2aye = 6.3; SeT 
c oO. 8 = 0.5; AO 
Hf 3⁄2 Qil ROAN sD) A 0.482 cascade: x 
(0) 135-2- 0.137 0. 482y 1/2* - 5/2* -7/2*(Pa55, 
s$ 0.482 Me54a, He55) 
+ 
Wo 0.303 
Ve. 0.152 
gà 0.136 
+. 
T2 Tole! 
1 5; 
+ A : 2 = -0. «Mec 8 
5) 0.280 c Des 0.2807: (E2/M1) 0. 41 5 
3g 0.270 0.2807 +0. 04 
[P] 
VE 0.077 
+ 
e797 
Au Cr B O.4l1ly:-A = +0.52 Bo5S7 
159° + 0.09; Bo57a 
0.4117 max.interference between Bo58 
different T (or A) and 
S (or V) matrix ele- 
ments 
Ce B 0.4117: A = 0.34 + 0.05 
~150° 
O.411y 
32 Cf B 0.2797: A = -0.06 + 0.22 
203 \_B- 159° 
Hg > 0.2797 
EN 0.279 
t 
73 
Tiss 
320279 3 
2 : c P 0.2797: (E2/M1)2 = 1.50 
[P] go? ++ 0.08 
n 0.2797 
b 
T1205 
+ @ P 0.2057: (E2/M1) = 1.46 
33-0205 So Poe A aaa 
Ce]. 0.2057 f 
\ 
2 Tilsoe 
Cc y 
90° 
0. 583y, 


2. 6ly(u)
TABLE XIII (addenda). Summary of experiments. An explanation of the various entries is given in Part III, E.


Addenda. In the second column the letters have the same meaning as before. 


Nucleus Method Result and reference 


Nu D 


Measurement of linear polarization of γ ray in C¹³(p,γ)N¹⁴ at E_p = 1.76 Mev indicated no parity change in the
transition to the ground state [Strassenburg, Hubert, Krone, and Prosser, Bull. Am. Phys. Soc. Ser. II, 3, 372
(1958)].

Measurement of β-γ circular-polarization correlation in decay of Na²⁴ gave A = +0.05±0.04 in agreement with
other results [R. M. Steffen and P. Alexander, reported in Proceedings of the Rehovoth Conference on Nuclear
Structure (North-Holland Publishing Company, Amsterdam, 1958)].

Linear polarization measured at three resonances in Si³⁰(p,γ)P³¹. The 8.04-, 8.20-, and 8.24-Mev γ rays are
E1, M1, and E1, respectively (Tu 57).

Measurement of γ-γ direction-polarization correlation in the 1.16-2.10-Mev cascade, with the γ rays resolved,
confirmed a spin sequence of 2⁺(M1,E2)2⁺(E2)0⁺ and gave δ=0.13 [C. G. Shute and P. S. Fisher, Phil. Mag.
3, 726 (1958)].

Measurement of β-γ circular-polarization correlation in the decay of Sc⁴⁶ gave A = +0.29±0.11 in agreement
with other results [Lundby, Patro, and Stroot, quoted by M. A. Grace, Proc. Roy. Soc. (London) A246, 460
(1958)].

The β-γ circular-polarization correlation was measured as a function of v/c. Circular polarizations greater than
(1/3)v/c were observed, especially for small values of v/c [Page, Pettersson, and Lindqvist, Phys. Rev. (to be
published)].

The experiment of Go 58 was repeated to show that the neutrino has negative helicity in the decay of Eu¹⁵²
[I. Marklund and L. A. Page, Nuclear Phys. 9, 88 (1958)].

Measurement of circular polarization of radiation following capture of polarized thermal neutrons indicated
complete right circular polarization in agreement with spin assignments and the expected E1 character of the
radiation. Result used by Tr 57 to establish the reliability of experimental arrangement.

Measurement of β-γ circular-polarization correlation on decay of Au¹⁹⁸ gave A = 0.45±0.06 for an electron
energy range of 520-620 kev and A = 0.38±0.10 for a range of 620-850 kev (R. M. Steffen and P. Alexander,
see Mg²⁴ above).

The 411-kev γ ray from Au¹⁹⁸ was resonant scattered by Hg¹⁹⁸ and the linear polarization of the scattered
radiation was measured. A multipolarity of E2 was confirmed for the radiation [V. Knapp and B. S. Sood,
Proc. Roy. Soc. (London) A247, 369 (1958)].

Linear polarization measurements [G. T. Wood and P. S. Jastram, Phys. Rev. 100, 1237(A) (1955)] along
with directional measurements confirmed the assignments of 0⁺, 3⁻, 5⁻, 4⁻, and 5⁻ to the first 5 states in
Pb²⁰⁸ [Elliott, Graham, Walker, and Wolfson, Phys. Rev. 93, 356 (1954)].


Cy 


RSI D 


Sy C 
Tit® Ct 
Ni® Ct 


Sm! 52 


W183 


Ct 
Cı 
Hgs Cy 
Hg" C 


Pb*s C 


Ambler, Hudson, an 


the direction-direction correlation) if it seems particularly relevant
or related to the polarization measurement. No attempt is made
at completeness in the listing of relevant references.
Column VI. Reference. The reference corresponding to each
experiment listed in the table is given in this column.

ACKNOWLEDGMENTS

We would like to thank Dr. Joachim Ehrman for
many helpful discussions and Dr. Francis E. Throw for
his generous aid and advice in writing the manuscript.
Dr. Phillip Malmberg, Mr. Ronald Fast, and Mr.
Samuel Chappell were of great assistance in the search
of the literature. Mr. Chappell also helped with some
of the calculations. Mr. Ralph Klingler provided many
of the calculations on which the tables are based. The
Graphic Arts Branch of NRL supplied most of the
figures and reproductions. We are grateful also to
Dorothy Bodden for her careful preparation of the
manuscript. One of [illegible] the
Guggenheim Memorial Foundation for a fellowship

Am 57 Ambler, Hayward, Hoppes, and Hudson, Phys. Rev. 105, 1413 (1957).
Ap 57 H. Appel and H. Schopper, Z. Physik 149, 103 (1957).
Ap 58 Appel, Schopper, and Bloom, Phys. Rev. 109, 2211 (1958).
Ar 50 W. R. Arnold, Phys. Rev. 80, 34 (1950).
Ar 53 P. Argyres and C. Kittel, Acta Met. 1, 241 (1953).
Au [illegible] P. Auger and F. Perrin, J. phys. radium [illegible]
Ba [illegible] Barnes, French, and Devons, Nature 166, 14[illegible]
Be 50 H. A. Bethe and C. Longmire, Phys. Rev. 77, 647 (1950).
Be 50a T. H. Berlin and L. Madansky, Phys. Rev. 78, 623 (1950).
Be 50b J. R. Beyster and M. L. Wiedenbeck, Phys. Rev. 79, 728 (1950).
Be 56 H. A. Bethe and P. Morrison, Elementary Nuclear Theory (John Wiley and Sons, Inc., New York, 1956).
Be 57 D. B. Beard and M. E. Rose, Phys. Rev. 108, 164 (1957).
Be 57a Berthier, Debrunner, Kündig, and Zwahlen, Helv. Phys. Acta 30, 483 (1957).
Be 58 [illegible] Vise, and Wu, Phys. Rev. (to be published).
Bi 51 Biedenharn, Rose, and Arfken, Phys. Rev. 83, 683 (1951).
Bi [illegible] Kurti, and Robin[truncated]
Bi 53 L. C. Biedenharn and M. E. Rose, Revs. Modern Phys. [truncated]
Bi 54 Bishop, Daniels, Durand, Johnson, and Perez y Jorba, Phil. Mag. 45, 1197 (1954).
Bi 55a Bishop, Grace, Johnson, Knipper, Lemmer, Perez y [truncated]

I. INTRODUCTION


WHEN Goudsmit and Uhlenbeck proposed the
spinning electron the possibility arose of having 
spin-polarization effects for a free electron. On the basis 
of Dirac’s theory of the electron Mott proposed in 1929 
a double scattering experiment designed to demonstrate 
such (transverse) polarization: the first scattering in 
the Coulomb field of a high-Z nucleus would serve 
partially to polarize an incident beam of unpolarized 
electrons because of the spin-orbit interaction, while 
the second such scattering would serve as the polariza- 
tion analyzer. For a quarter century thereafter this 
quite difficult experiment was pursued in a number of 
laboratories. Approximately midway through this period 
it became evident that the experiment could, in fact,
demonstrate the sought-for polarization, and at present
there remains no doubt whatever that it can be made
to work.* It is perhaps fortunate that the double
scattering experiment was not crucial for Dirac theory, 
and that the other verifications came as early as they did. 

An unfortunate situation prevailed during the 1930’s
and 1940’s in that the whole matter of electron, positron,
and gamma† polarization in general was viewed rather
apathetically by many physicists. Such attention as
there was, was focused almost entirely on the double
scattering experiment itself. Thus the polarization
effects already existing in the Klein-Nishina matrix
element for Compton scattering, in the Møller matrix
element for electron-electron or positron-electron scattering,
in the Bethe-Heitler matrix element for bremsstrahlung,
to name a few of the common electron-photon
processes, continued to lie dormant‡ until quite
recently.

Prior to about 1956 there was no supposition that
“prepolarized” beams of electrons, positrons, or photons
would be available from beta decay of nuclei. Therefore
any experiments then proposed, just as with the Mott
proposal, would have had to be of “second order,”
involving a polarizing unit to be followed by an analyzing
unit. Such experiments seem to have had no strong
appeal to experimenters in general.

When it turned out that parity violation in beta 
decay was in a sense complete, as demonstrated by 
the beautiful cobalt-60 experiment of Wu, Ambler, 
Hayward, Hoppes, and Hudson, it became clear that 
prepolarized beams of particles (and at times pre- 
polarized photons) must exist. This initiated a greatly 
accelerated development of practical techniques for 
handling such polarizations. This paper discusses the 
various methods for analyzing electron and positron 
polarization, most of which were introduced and proved 
during the short period of about half a year following 
the announcement of parity violation. Photon (circular)
polarization§ must be brought into the discussion
because it connects with the problem of measuring free
particle polarization.

* Tolhoek [reference (T)] covers the double scattering experiment
thoroughly.

† Plane polarization was discussed early, but there is no direct
connection between such photon polarization and electron
polarization.

‡ That is, spin polarization was rigorously summed or averaged
at each opportunity in proceeding from matrix element to transition
probability. A rare exception is the paper of Franz (F3...).
The spin direction having been generally taken out, circular
polarization information was automatically suppressed.
Justification for such averaging was considered sound since the
“spin direction is not measured.” We have not far to look for the
reason it was not measured, and the circle is thus closed.

§ Such polarization is dealt with only lightly here. In contrast
to the situation regarding particle polarization, there appears to
have been introduced no new approach to this problem since the
time of the Wu experiment. Reference (T) discusses the Compton
effect as applied to circular polarization measurement.
See the recent review (S58) by Schopper.

The large and decisive effect found in the Wu experi- 
ment was primarily responsible for the apparent ease 
with which the new polarization methods succeeded. 
Knowledge that certain polarizations are quite probably 
present, and in abundance, has made for quick develop- 
ment of suitable experimental techniques. Also this 
knowledge has made for ready acceptance of most 
experimental conclusions relating to beta-decay polari- 
zations, however hastily drawn a few may have been. 

In the beginning of this new polarization era, one 
might have expected that only “first-order” experiments 
would appear. However, with the luxury of generally 
large inherent polarizations to begin with, a considerable 
number of second-order methods have already been 
successfully proved. Here the polarization one wishes 
to know is transferred for strategic reasons from initial 
particle to photon, or from particle to particle, in the 
first stage, and then the polarization analysis itself 
takes place in the second stage. Thus one requires two 
unlikely events in series, which makes the experiment 
second order. 

The methods and the physics on which they are based 
are emphasized here, so the historical order is not 
followed in every case. For a given topic only a sufficient 
number of works is reviewed to illustrate the physics 
involved.|| Closely related ideas (although perhaps not 
proved experimentally) are sometimes mentioned to 
round out the discussion. No pretense is made of review- 
ing all “parity results” obtained by a given method. 
No grand averages are made of the numbers obtained 
with a given radioactive source, and no pronouncements 
are attempted on such matters as a universal beta 
interaction or PC invariance. 

The discussion cannot be divided rigidly into experimental
method versus “theory of the method.” Measurement
of polarization by counter techniques (the only
technique described here) is a field where it seems
particularly safe to predict that a hypothetical ensemble 
of technicians continually trying different arrangements 
of scattering foils, magnets, absorbers, and so forth— 
with no belief in any theory and therefore no real plan— 
would take a long time indeed to happen upon a useful 
method. Each technique had to be planned in advance,
hopefully at times because polarization measurement
is not easy.
II. GENERAL CONSIDERATIONS 
A. Polarization of Free Electrons 
For an electron under no forces, solutions of the Dirac
equation exist which are simultaneous eigenstates of
momentum $p$, of energy [eigenvalues $\gamma$, where
$\gamma = (1+p^2)^{1/2}$], and, if we wish, also of helicity. That is,
letting $\mathbf{p}$ define the $z$ direction, we may have a solution
which is simultaneously an eigenstate of $H = \boldsymbol{\alpha}\cdot\mathbf{p} + \beta$,
of $p$, and of $\sigma_z$ (with eigenvalues $\pm 1$). This is possible
because the operator $\sigma_z$ commutes with all of the
Hamiltonian that matters for this special case, namely,
$p\alpha_z + \beta$.

But (still letting eigenvalue $p$ define the $z$ direction)
we are not able to construct an eigenstate of $\sigma_x$, for
$\sigma_x$ anticommutes with $\alpha_z$. However if the coefficient of
$\alpha_z$, namely $p$, is small, then the effect of this lack of
commutation becomes correspondingly small. Thus at
very low energy any component of $\boldsymbol{\sigma}$ is nearly
completely measurable. For finite laboratory energy the
closest we can come to having a state of precise $\sigma_x$ is to
take the “best” linear combination of the two eigenstates
of $\sigma_z$. Mixing equal amplitudes of the latter two
states with a plus (minus) sign gives $\langle\sigma_x\rangle = \pm 1/\gamma$,
which tends to $\pm 1$ as $p$ tends to zero. Another real
operator, $\beta\sigma_x$, does commute with $\alpha_z$, and with $\beta$.
Therefore, simply because the maximum expectation
value of $\sigma_x$ becomes so nearly zero at high energy, it
does not follow that transverse polarization is somehow
“unmeasurable” at high energies. This means only
that techniques are required that aim at measuring
$\beta\sigma_x$ rather than measuring $\sigma_x$ itself. Otherwise one might
follow the course of first precessing the spin and then
measuring $\sigma_z$.
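The statement that an equal mixture of the two helicity eigenstates gives $\langle\sigma_x\rangle = 1/\gamma$, while $\beta\sigma_x$ remains fully measurable, can be checked numerically. The following is a minimal sketch, assuming the standard Dirac representation and units with $m = c = 1$; the function names are illustrative and are not taken from the review.

```python
import numpy as np

# Pauli matrices and 4x4 Dirac operators (standard Dirac representation, m = c = 1).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

beta = np.block([[I2, Z2], [Z2, -I2]])
alpha_z = np.block([[Z2, sz], [sz, Z2]])
Sigma_x = np.block([[sx, Z2], [Z2, sx]])
Sigma_z = np.block([[sz, Z2], [Z2, sz]])


def helicity_state(p, h):
    """Normalized positive-energy eigenstate of H = p*alpha_z + beta with sigma_z = h (h = +1 or -1)."""
    gamma = np.sqrt(1.0 + p**2)
    chi = np.array([1, 0], dtype=complex) if h == +1 else np.array([0, 1], dtype=complex)
    lower = (h * p / (gamma + 1.0)) * chi  # small components
    u = np.concatenate([chi, lower])
    return u / np.linalg.norm(u)


def transverse_expectations(p):
    """Return (<sigma_x>, <beta sigma_x>) for the equal-amplitude (plus sign) mixture."""
    u = (helicity_state(p, +1) + helicity_state(p, -1)) / np.sqrt(2.0)
    ex = np.vdot(u, Sigma_x @ u).real
    ebx = np.vdot(u, beta @ Sigma_x @ u).real
    return ex, ebx


if __name__ == "__main__":
    for p in (0.1, 1.0, 5.0):
        ex, ebx = transverse_expectations(p)
        print(f"p={p}: <sigma_x>={ex:.4f}, 1/gamma={1/np.sqrt(1+p**2):.4f}, <beta sigma_x>={ebx:.4f}")
```

In this sketch $\langle\beta\sigma_x\rangle$ stays at unity for every momentum, which is the sense in which transverse polarization remains measurable at high energy.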

In speaking of longitudinal polarization of an electron
or positron we use $P_l = \langle\sigma_z\rangle$, where $z$ is parallel to the
velocity. $P_l$ is thus a number which may range from
$-1$ to $+1$. As is well known, a Lorentz transformation
may be made along the $z$ axis, to the rest system of the
particle in particular, and $P_l$ retains the same value.
In the rest system we could for the moment suppose full
polarization in some direction, say at angle $\theta_0$ (polar
angle) and $\phi_0$ (azimuthal angle), referred to $z$ and
$x$ directions. The spinor part of the state can then be
given by $|{+1}\rangle\cos(\theta_0/2) + |{-1}\rangle e^{i\phi_0}\sin(\theta_0/2)$ in the rest
system of the particle, or in the laboratory. For this
situation the value of $P_l$ is just $\cos\theta_0$.

Transverse polarization is denoted by $P_t$ and
naturally a certain azimuth has to be implied. We
shall mean that $P_t(\phi) = \langle\sigma_x\cos\phi + \sigma_y\sin\phi\rangle$ computed
in the rest system of the particle. For the oblique
situation given in the foregoing, $P_t(\phi)$ amounts to
$\sin\theta_0\cos(\phi-\phi_0)$. Often in speaking of how much transverse
polarization exists in a beam, one tacitly maximizes
$|P_t|$ against $\phi$ before quoting the number, since
the geometry of the experiment usually involves a
natural choice for the reference azimuth $\phi_0$ (or $\phi_0+\pi$).
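The longitudinal and transverse polarizations of the obliquely polarized rest-frame spinor can be evaluated directly from the Pauli matrices. The sketch below assumes the usual two-component representation with $|{+1}\rangle = (1, 0)$ and $|{-1}\rangle = (0, 1)$; the function names are hypothetical.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rest_frame_spinor(theta0, phi0):
    """Spinor |+1> cos(theta0/2) + |-1> exp(i phi0) sin(theta0/2)."""
    return np.array([np.cos(theta0 / 2), np.exp(1j * phi0) * np.sin(theta0 / 2)], dtype=complex)


def longitudinal_polarization(chi):
    """<sigma_z> for a normalized two-component spinor."""
    return np.vdot(chi, SIGMA_Z @ chi).real


def transverse_polarization(chi, phi):
    """<sigma_x cos(phi) + sigma_y sin(phi)> for a normalized two-component spinor."""
    op = np.cos(phi) * SIGMA_X + np.sin(phi) * SIGMA_Y
    return np.vdot(chi, op @ chi).real


if __name__ == "__main__":
    theta0, phi0, phi = 0.7, 1.1, 0.3
    chi = rest_frame_spinor(theta0, phi0)
    print(longitudinal_polarization(chi), np.cos(theta0))
    print(transverse_polarization(chi, phi), np.sin(theta0) * np.cos(phi - phi0))
```

Each printed pair should agree, reproducing $\cos\theta_0$ for the longitudinal component and $\sin\theta_0\cos(\phi-\phi_0)$ for the transverse one.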


Suppose we have a partially polarized beam (longitudinally
polarized, for example), with $0 \le |P| \le 1$.
We may consider that the fraction $1-|P|$ of the
particles are unpolarized while the fraction $|P|$ are
completely polarized, with the sign of $P$. Alternatively
the same beam may be regarded as a mixture of $+$ and
$-$ polarized components in the ratio of $(1+P)/(1-P)$,
which holds for either sign of $P$. Certain mathematical
approaches favor the first view. On the other hand, most
calculations proceed just as easily using the second
view. Since no experimental detector can be devised to
differentiate between the two situations, there is no
point in debating which of these two views is the correct
one.

|| The bibliography makes no attempt to list [illegible] many of the
excellent [illegible] which are listed. The total literature on the
[illegible] (experimental and computational) [illegible] in a stage of new
[illegible] it seems too early to attempt complete listing.
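The equivalence of the two views can be made concrete by computing the $+$ and $-$ populations each one implies. This is a small sketch under the assumption that an "unpolarized" fraction means equal populations of the two spin states; the function names are illustrative.

```python
def populations_view_one(P):
    """View 1: fraction |P| fully polarized (with the sign of P), fraction 1-|P| unpolarized.

    Assumption: the unpolarized part contributes equally to the + and - states.
    """
    unpolarized = 1.0 - abs(P)
    plus = 0.5 * unpolarized + (abs(P) if P > 0 else 0.0)
    minus = 0.5 * unpolarized + (abs(P) if P < 0 else 0.0)
    return plus, minus


def populations_view_two(P):
    """View 2: + and - components in the ratio (1+P)/(1-P), normalized to unit total."""
    return 0.5 * (1.0 + P), 0.5 * (1.0 - P)


if __name__ == "__main__":
    for P in (-1.0, -0.3, 0.0, 0.45, 1.0):
        print(P, populations_view_one(P), populations_view_two(P))
```

Both functions return the same pair of populations for every $P$ between $-1$ and $+1$, which is why no detector can distinguish the two descriptions.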

To specify the total polarization in a beam it suffices
to know three components of polarization in mutually
orthogonal directions.¶ Polarization is averaged as
follows: If beam number one of intensity $n_1$ has an
amount of polarization $P(1)$ and a second beam
incoherent with the first, of intensity $n_2$, has $P(2)$, then
as a whole the polarization is

$$P = \frac{n_1 P(1) + n_2 P(2)}{n_1 + n_2}.$$
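The averaging rule extends naturally to any number of incoherent beams as an intensity-weighted mean. A minimal sketch follows; the generalization beyond two beams and the function name are assumptions made for illustration.

```python
def combined_polarization(beams):
    """Intensity-weighted mean polarization of incoherent beams.

    `beams` is an iterable of (intensity, polarization) pairs.
    """
    beams = list(beams)
    total_intensity = sum(n for n, _ in beams)
    if total_intensity <= 0:
        raise ValueError("total intensity must be positive")
    return sum(n * p for n, p in beams) / total_intensity


if __name__ == "__main__":
    # Three parts fully + polarized mixed with one part fully - polarized.
    print(combined_polarization([(3.0, 1.0), (1.0, -1.0)]))
```

For the example shown, the two-beam formula gives $(3 - 1)/4 = 0.5$.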


B. Polarization Efficiencies 


One is interested in the relative response of a polarization
detector for two orthogonal spin orientations (that
is, mutually orthogonal states). For example, for
detecting a nonzero $P_l$ one requires an apparatus that
responds differently to a particle with $P_l=+1$ than to
a particle with $P_l=-1$. All the detectors reviewed here
are reversible.** Suppose for + setting of the detector
that the probability, given a + particle, that it count
is $C_{++}$, and the corresponding probability for a −
particle to count is $C_{+-}$. Let the − setting of the detector
have respective probabilities $C_{-+}$ and $C_{--}$. The order
of the subscripts is important. Then, if $C_{--}=C_{++}$ and
also $C_{-+}=C_{+-}$, the detector is called reversible. We
write $C_{++}/C_{+-} = r = C_{--}/C_{-+}$. (This definition of $r$ is
more useful than using $C_{++}/C_{-+}$ or $C_{--}/C_{+-}$, as appears
in a moment.) The counting-rate ratio, + setting
divided by − setting, will be $R=(1+\epsilon P)/(1-\epsilon P)$.
Here the efficiency $\epsilon$ of the detector is $\epsilon = (r-1)/(r+1)$.
This efficiency applies of course to one component of
polarization, that associated with the + setting of the
detector.††
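The relation between the counting probabilities, the efficiency, and the counting-rate ratio can be verified by building the two rates explicitly from the spin populations. The sketch below assumes a reversible detector and a beam whose + and − populations are $(1+P)/2$ and $(1-P)/2$; the argument names `c_pp` and `c_pm` stand for $C_{++}$ and $C_{+-}$ and are illustrative.

```python
def efficiency(c_pp, c_pm):
    """Detector efficiency (r - 1)/(r + 1), with r = C++/C+-."""
    r = c_pp / c_pm
    return (r - 1.0) / (r + 1.0)


def rate_ratio(c_pp, c_pm, P):
    """Counting-rate ratio (+ setting over - setting) for a reversible detector.

    Reversibility means C-- = C++ and C-+ = C+-.
    """
    n_plus, n_minus = 0.5 * (1.0 + P), 0.5 * (1.0 - P)
    rate_plus_setting = c_pp * n_plus + c_pm * n_minus
    rate_minus_setting = c_pm * n_plus + c_pp * n_minus
    return rate_plus_setting / rate_minus_setting


if __name__ == "__main__":
    c_pp, c_pm, P = 0.3, 0.1, 0.4
    eps = efficiency(c_pp, c_pm)
    print(rate_ratio(c_pp, c_pm, P), (1 + eps * P) / (1 - eps * P))
```

With $r = 3$ the efficiency is 0.5, and for $P = 0.4$ both expressions give a ratio of 1.5.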

In practical circumstances, due to fringing fields, for 
example, a counting efficiency (not a polarization 
efficiency) may change from + setting to — setting, 
violating the condition for reversibility. But since the 
physics upon which the polarization sensitivity is based 
is independent of such systematic effects, it may still 


¶ We thus have a polarization space of three dimensions for
spin-½ particles. For photons (which are not spin-½ particles) a
similar space can be used: The momentum direction $\mathbf{k}$ is special
and pertains to circular polarization; normal to this, plane polarization
may be represented, providing that azimuthal angle in this
polarization space is twice the corresponding azimuthal angle in
the laboratory.

** Actually, experimental situations exist and may play a role
in the future where in practice the detector is not actually reversed.
For example, it may have one state where it responds preferentially
to $P>0$, and the other state where it is insensitive to $P$.

†† We could represent any polarization detector as a vector $\boldsymbol{\epsilon}$ in
a three-dimensional efficiency space. The efficiency for detecting
arbitrary polarization $\mathbf{P}$ in a beam is then $\boldsymbol{\epsilon}\cdot\mathbf{P}/|\mathbf{P}|$. The same
applies for photons.


be that the ratio $r$ as stated above is unambiguous. We
then keep the same value of $\epsilon$ independent of the systematic
effect. In such a case the counting-rate ratio has to
be written more generally as $R = (C_{++}/C_{--})(1+\epsilon P)/(1-\epsilon P)$.
If it should be possible exactly to reverse $P$,
or to render $P=0$, say (and neither of these things may
be easy to do), then the fact that $C_{++}/C_{--} = C_{+-}/C_{-+}$
is not unity need not hurt the final result, since this
ratio can be determined empirically.

To keep the discussion simple, reversible detectors
are assumed from now on. A good reason for generally
thinking in terms of an efficiency $\epsilon$ instead of the
corresponding ratio $r$ is that the law of propagation for a
number of “in series” effects is most clearly expressed
in terms of $\epsilon$’s and not $r$’s. For example (if we may
adhere to the nonpositivistic approach), consider a
particle or a beam which is polarized by an amount $P_0$.
Suppose the polarization were transferred in the first
stage of operations to some intermediate particle or
beam with transfer efficiency $\epsilon_1$, the intermediate
polarization, perhaps of a different kind, being now
$\epsilon_1 P_0$. Let this beam be now analyzed with efficiency $\epsilon_2$.
The over-all efficiency $\epsilon_1\epsilon_2$ is the quantity that defines
the change in counting rate on reversing either $P_0$, $\epsilon_1$,
or $\epsilon_2$. The rate ratio on such reversal is
$R = (1+\epsilon_1\epsilon_2 P_0)/(1-\epsilon_1\epsilon_2 P_0)$.
Depolarization could be taken into account
by using a series of “efficiencies for preservation of
polarization,” numbers $\epsilon_d$ less than +1 which would
then multiply the $\epsilon_1\epsilon_2$ in this example. All of the $\epsilon$’s
mentioned, including that of the detector, can as well
be regarded as transfer efficiencies, since the over-all
product-$\epsilon$ represents numerically the efficiency for
transferring the original polarization information into a
counting-rate asymmetry.
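The law of propagation for stages in series reduces to a product of efficiencies. A short sketch, treating any depolarization factors simply as further entries in the list of stage efficiencies (an assumption consistent with the paragraph above), with illustrative function names:

```python
from math import prod


def overall_efficiency(stage_efficiencies):
    """Product of the transfer, preservation, and analyzing efficiencies in series."""
    return prod(stage_efficiencies)


def rate_ratio_on_reversal(p0, stage_efficiencies):
    """Counting-rate ratio R = (1 + e*P0)/(1 - e*P0), with e the over-all efficiency."""
    e_p = overall_efficiency(stage_efficiencies) * p0
    return (1.0 + e_p) / (1.0 - e_p)


if __name__ == "__main__":
    stages = [0.5, 0.25, 0.9]  # transfer, analyzer, preservation factor (made-up values)
    print(overall_efficiency(stages))
    print(rate_ratio_on_reversal(0.8, stages))
```

Reversing the sign of any single efficiency, or of $P_0$, inverts the ratio, which is why the over-all product alone fixes the observable asymmetry.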

Here we have chosen to carry through a pair of 
probabilities rather than a pair of amplitudes. It has 
been tacitly assumed that physical judgement has been 
used throughout in choosing at each juncture a pair 
of orthogonal states for which any coherence between 
their amplitudes is of no consequence. The point is that 
the relative phase of their amplitudes is thrown away 
each time we force the calculation to tell us probability 
throughout the chain of events. Any blunder in such 
choice of states leads to grief. Often, there is considerable 
latitude here and we are free to choose a reasonably 
simple pair of states for algebraic convenience. If there 
is any doubt as to which pair of states to choose, which 
is tantamount to having doubt as to which kind of 
polarization it will suffice to talk about, then the 
amplitudes themselves could be carried through. 

To return to the counting-rate asymmetry, experimenters
usually prefer to discuss results in terms of the
fractional change in rate between the + and − settings
of the detector. This fractional change is often denoted
by $\delta$. One has

$$\delta = 2\,\frac{R-1}{R+1} = 2\,\frac{R_+ - R_-}{R_+ + R_-} = 2\epsilon P, \tag{1}$$

where $R_+$ and $R_-$ are the counting rates at the two settings, so that $R = R_+/R_-$.

Concerning the factor 2 that appears in expression (1):
given a pair of numbers, 100 and 101, one physicist
may recognize and comment that they differ (to first
order) by one percent. Another may prefer to state that
these two numbers exhibit an asymmetry about their
arithmetic mean of 1/201, thus a nominal one-half
percent effect. We prefer the first method in discussing
experimental results, even though $\delta$ will exceed 100%
if experiments improve to the extent that $\epsilon P$ becomes
larger than 0.5.
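The two conventions in the 100-versus-101 example, and the inversion of expression (1) to recover the polarization from measured rates, can be sketched as follows. The function names are hypothetical, and the efficiency is taken as a known input.

```python
def fractional_change(rate_plus, rate_minus):
    """delta = 2(R - 1)/(R + 1), with R = rate_plus / rate_minus."""
    R = rate_plus / rate_minus
    return 2.0 * (R - 1.0) / (R + 1.0)


def asymmetry_about_mean(rate_plus, rate_minus):
    """Half of delta: the deviation of either rate from the arithmetic mean, as a fraction."""
    return 0.5 * fractional_change(rate_plus, rate_minus)


def polarization_from_rates(rate_plus, rate_minus, efficiency):
    """Invert delta = 2 * efficiency * P for the polarization P."""
    return fractional_change(rate_plus, rate_minus) / (2.0 * efficiency)


if __name__ == "__main__":
    print(fractional_change(101.0, 100.0))     # about one percent
    print(asymmetry_about_mean(101.0, 100.0))  # 1/201, about one-half percent
    print(polarization_from_rates(1.2, 0.8, 0.5))
```

The first two numbers differ by exactly the factor 2 discussed in the text.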

It may be felt that density matrices and Stokes 
parameters are not accorded their fair place in the 
discussion. A number of review articles have treated 
these matters [(T), (F57)]. The basic ideas of the experiments
to be described can be put in quite simple
language. This reviewer can readily believe (but would 
find it difficult to prove) that the majority of these 
experiments were conceived, made to work, and inter- 
preted correctly, without conscious recourse to the 
density matrix. 

Before ending the general discussion of polarization
measurement we might inquire a little into measurability
of more than one component of polarization.
Consider some state $\alpha$ which means definite spin direction
for the particle (in its rest system). Only one
orthogonal state exists, for a spin-½ particle; call it $\beta$.
Let us have $\alpha$ and $\beta$ normalized in the same way, with
the phase between them fixed. For any state of affairs,
the polarization of the kind $\alpha$ is computed from the
probability of finding that the particle is in state $\alpha$
compared to the probability it is in state $\beta$. Thus

$$P_\alpha = \frac{\mathrm{prob}(\alpha) - \mathrm{prob}(\beta)}{\mathrm{prob}(\alpha) + \mathrm{prob}(\beta)}, \tag{2}$$

and $P_\beta$ is $-P_\alpha$.

Instead of states $\alpha$ and $\beta$ we might use a state called
$A$, which is $\alpha + e^{i\phi}\beta$, and its orthogonal state $B$ which is
$\alpha + e^{i(\phi+\pi)}\beta$. Then the polarization of kind $A$, denoted
$P_A$, is given in terms of prob($A$) and prob($B$) by an expression
such as (2). So long as $\phi$ is a real number, polarization
$P_A$ is complementary to polarization $P_\alpha$, since if
$P_\alpha^2=1$, then $P_A^2=0$, and conversely. We may generate
a third polarization $P_{A'}$ which is complementary to $P_\alpha$
and to $P_A$ ($A' = \alpha + ie^{i\phi}\beta$, $B' = \alpha + ie^{i(\phi+\pi)}\beta$). For a particle
polarized by an amount $P$, we have

$$P_\alpha^2 + P_A^2 + P_{A'}^2 = P^2. \tag{3}$$


Of course, expression (3) is a statement of probabilities,
as with expression (2). If one measures, by utterly
[illegible] means, $P_\alpha$ for a particle, one finds either +1
or −1, and expression (3) is then amended to read
$P_A^2 = P_{A'}^2 = 0$.
[illegible] a longitudinal polarization detector gathers all the
longitudinal polarization information from a beam,
whereas a transverse detector leaves unexplored the
[illegible] component of transverse polarization.
[text missing] all the transverse polarization information in a beam of
particles or photons, in which case the detector is not
characterized by a single angle $\phi$ (supposing $\alpha$ to correspond
to the beam direction) but strictly consists of
two or more detectors operating in parallel.
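Expression (3) can be checked numerically for a partially polarized ensemble. The sketch below assumes that a beam of polarization $P$ is modeled as an incoherent mixture of a pure spin state and its orthogonal partner with weights $(1+P)/2$ and $(1-P)/2$, and takes $\alpha = (1, 0)$, $\beta = (0, 1)$; all names are illustrative.

```python
import numpy as np

ALPHA = np.array([1, 0], dtype=complex)
BETA = np.array([0, 1], dtype=complex)


def polarization(ensemble, state_a, state_b):
    """[prob(a) - prob(b)] / [prob(a) + prob(b)] for a list of (weight, spinor) pairs."""
    prob_a = sum(w * abs(np.vdot(state_a, psi)) ** 2 for w, psi in ensemble)
    prob_b = sum(w * abs(np.vdot(state_b, psi)) ** 2 for w, psi in ensemble)
    return (prob_a - prob_b) / (prob_a + prob_b)


def three_polarizations(ensemble, phi):
    """Return (P_alpha, P_A, P_A') for the complementary pairs built with phase phi."""
    phase = np.exp(1j * phi)
    norm = np.sqrt(2.0)
    A, B = (ALPHA + phase * BETA) / norm, (ALPHA - phase * BETA) / norm  # exp(i(phi+pi)) = -exp(i phi)
    A2, B2 = (ALPHA + 1j * phase * BETA) / norm, (ALPHA - 1j * phase * BETA) / norm
    return (polarization(ensemble, ALPHA, BETA),
            polarization(ensemble, A, B),
            polarization(ensemble, A2, B2))


def partially_polarized(theta0, phi0, P):
    """Mixture of a pure state along (theta0, phi0) and its orthogonal state, net polarization P."""
    c, s = np.cos(theta0 / 2), np.sin(theta0 / 2)
    psi = np.array([c, np.exp(1j * phi0) * s], dtype=complex)
    psi_perp = np.array([s, -np.exp(1j * phi0) * c], dtype=complex)
    return [(0.5 * (1 + P), psi), (0.5 * (1 - P), psi_perp)]


if __name__ == "__main__":
    beam = partially_polarized(0.9, 0.4, 0.6)
    p_alpha, p_a, p_a2 = three_polarizations(beam, phi=1.3)
    print(p_alpha**2 + p_a**2 + p_a2**2, 0.6**2)
```

The sum of squares is independent of the phase $\phi$ chosen for the pair $A$, $B$, as expression (3) requires.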


C. Right versus Left Helicity 


A right-handed particle or photon means one that 
“rotates” as it moves in the same manner as a standard 
right-hand screw. This can also be stated: When angular
momentum and velocity are parallel the particle is 
right-handed, when antiparallel, left-handed. 

In classical optics the opposite definition is in use: 
Right circularly polarized light is that in which the 
electromagnetic vectors show clockwise rotation as 
viewed by an observer into whose eyes the light enters, 
that is, an observer facing opposite to the direction of 
propagation of the light. 


III. EXPERIMENTAL METHODS
A. Classification of Methods 


Three different ways of analyzing spin polarization 
have been experimentally proved. One method makes 
use of the spin-orbit interaction when the particle is 
scattered by the Coulomb field of a nucleus, the method 
proposed originally by Mott (M29). Here any transverse 
polarization generally leads to a right-left asymmetry 
in the scattered intensity. This amounts to talking about
a “force” $-\nabla(\boldsymbol{\sigma}\cdot\mathbf{H}')$ for $e^-$, where $\mathbf{H}'$ is the magnetic
field experienced by the particle in moving through the
Coulomb field $\mathbf{E}$, namely $-(\mathbf{v}/c)\times\mathbf{E}$. In the second
method one makes use of the $\boldsymbol{\sigma}\cdot\mathbf{H}$ energy, where $\mathbf{H}$ now
stands for an applied magnetic field. This method finds 
application where positronium is formed from polarized 
positrons. The third method is where one compares 
the polarization of the particle to be analyzed with a 
reference polarization residing in atomic electrons which 
are already aligned. In this category belongs use of 
electron-electron scattering (and positron-electron scat- 
tering), annihilation of very low energy positrons in a 
ferromagnetic material, or use of annihilation-in-flight 
of more energetic positrons in a magnetized foil. Also 
there belong here the second-order techniques where the 
particle polarization information is transferred to 
helicity of a photon, this helicity being then analyzed 
by Compton scattering against the polarized reference 
electrons. 


B. Mott Scattering (and Spin Precessors) 


Properly, the Mott scattering method applies to 
transverse polarization analysis only. For reasons of 
Coulomb repulsion it is not well suited for positron 
analysis. As one suspects from contemplating the fingers, 
an “up” spinning electron approaching “straight ahead” 
a nucleus of (positive) charge Ze should prefer to scatter 
to the right rather than to the left because of the force 
in the steeply graded field B′=E×v. The quantum 
mechanical results, first estimated by Mott and more 
recently computed by Sherman (S56), predict generally 
large right-left asymmetry in the scattered intensity. 
The azimuthal variation of the scattered intensity at 
angle φ is sin(φ−φ₀) for an incident electron polarized 
at azimuth φ₀. Conversely, the asymmetry in intensity 
between azimuth φ+90° and the conjugate azimuth 
φ−90°, defined as 

$$\delta_\theta(\phi) = \frac{I(\phi+90^\circ) - I(\phi-90^\circ)}{I(\phi+90^\circ) + I(\phi-90^\circ)} \qquad (1)$$

is a measure of the transverse polarization at azimuth φ. 
A coefficient of proportionality relates the thing we 
presume to be measuring, the transverse polarization 
P_t(φ), and the observed δ. We call this coefficient S_θ 
following Sherman, and write δ_θ(φ)=S_θ P_t(φ). 

For given energy, there is a best region in polar 
scattering angle θ at which to work, from the point of 
view of size of the asymmetry S_θ. Some of Sherman’s 
results for the right-left asymmetry for an unscreened 
point scatterer (Z=80) are given in Table I. This shows 
at each of several speeds v/c, the maximum S_θ, the 
angle θ_max at which it occurs, and the two adjacent 
angles called θ_{1/2} at which the asymmetry is down to 
one-half maximum value. At a certain angle θ (~60° at 
v/c=0.4 and shifting to ~30° at v/c=0.9) the asymmetry 
S_θ becomes zero and changes sign. To avoid 
ambiguous factors of two, note that S=+0.4, for 
example, means that the intensity to the right (see 
beginning of this subsection for orientation) exceeds 
that to the left for an up-spinning electron incident, 
the ratio of intensities being 1.4/0.6. 
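
As a minimal sketch of the bookkeeping just described, the following code forms the right-left asymmetry from two counting rates, converts it to a transverse polarization through the Sherman coefficient, and reproduces the 1.4/0.6 intensity ratio quoted for S=+0.4. The function and argument names are assumptions made for illustration only.

```python
def asymmetry(i_plus, i_minus):
    """Right-left asymmetry from intensities at the two conjugate azimuths."""
    return (i_plus - i_minus) / (i_plus + i_minus)


def transverse_polarization(delta, sherman_s):
    """Transverse polarization implied by observed asymmetry delta = S * P."""
    if sherman_s == 0:
        raise ValueError("asymmetry coefficient vanishes at this angle")
    return delta / sherman_s


def right_left_ratio(sherman_s, polarization=1.0):
    """Ratio of right to left intensity for a given coefficient S and polarization."""
    a = sherman_s * polarization
    return (1.0 + a) / (1.0 - a)
```

With S=0.4 and a fully polarized beam, `right_left_ratio(0.4)` gives 1.4/0.6; an observed asymmetry of 0.2 at the same angle would correspond to a transverse polarization of 0.5.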

What is not shown in Table I, but which could be 
very important when considering possibilities of systematic 
error in an actual experiment, is the variation 
with θ of the ordinary differential cross section. This 
gradient turns out to be quite steep at angles of experimental 
interest. A representative value, for the speeds 
shown in Table I, is 150% change in differential cross 
section per radian at θ=90°. The gradient at θ_max 
ranges from roughly 70% per radian for v/c=0.4 to 
about 150% per radian for v/c=0.9. 

Also important is the fractional change in ordinary 
differential cross section for given fractional change in 
energy. At θ=90° this amounts to about 1.6, and would 
be somewhat larger if put in terms of fractional change 
in momentum. 


TABLE I. 

v/c    S_max    θ_max    θ_{1/2} (lower)    θ_{1/2} (upper) 

0.9    0.51     150°     101°               170° 
0.8    0.48     137°     90°                167° 
0.7    0.45     130°     85°                165° 
0.6    0.43     126°     83°                162° 
0.5    0.40     123°     82°                161° 
0.4    0.37     121°     85°                159° 

FIG. 1. Electrostatic spin-precessor due to Tolhoek and 
DeGroot (T51). 


In the first successful detection of longitudinal polari- 
zation of beta particles from unpolarized nuclei by 
Frauenfelder and co-workers (Fa57), a means of first 
precessing the spin from longitudinal to transverse had 
to be employed in order to be in a position to use Mott 
scattering as the detector. A device originally proposed 
by Tolhoek and DeGroot (T51) was employed in order 
to “erect” the spins. Such a spin-precessor is shown in 
Fig. 1. It is essentially a nonrelativistic device and so 
the energies at which it works best are well matched 
to a Mott-scattering analyzer. Neglecting relativistic 
considerations, its action is simple and direct: The spin 
has no way to know that the electric field is continually 
changing the particle momentum p, and thus the spin 
tends to retain its original direction in space, inde- 
pendent of sign or magnitude of the g value. (The 
device could more accurately be described as a mo- 
mentum precessor.) 

In the Frauenfelder experiment the Coulomb scattering 
act (taking place in a thin gold foil), in analyzing 
the transverse polarization as the beam emerged from 
the spin erector, was measuring the longitudinal polarization 
that had entered the spin erector. Inasmuch as 
this spin precessor is a reversible device, that is, no 
information is lost through its use, we could classify 
this experiment as first order. 

The angular aperture used was 95°-140°. A pair of 
end-window Geiger counters recorded the right- and 
left-scattered betas. The scattering foil was either 0.05 
or 0.15 mg/cm² gold on a low-Z backing. The results 
of the first measurement, using a Co⁶⁰ source, are well 
known: These negative betas were predominantly left- 
handed, the magnitude of the asymmetry obtained 
being taken to imply longitudinal polarization at 
emission of magnitude v/c, at least at sufficiently high 
beam energy that all the “dirt” effects that had at times 
plagued certain of the double-scattering experiments 
were probably not serious. There was a drop in apparent 
polarization of over a factor two on changing the mean 
v/c from 0.49 to 0.47, the decrease amounting to 
essentially two standard deviations. 

We continue to introduce Coulomb scattering experi- 
ments in terms of the kind of spin precessor used, 
because the Mott scattering approach to polarization is 
scarcely new in the laboratory, and has been reviewed 
(T). Actual use of spin precessors is however quite new. 

FIG. 2. Use of crossed-field spin-precessor. In the apparatus of 
Cavanagh et al. (CR57) original longitudinal spins entering at the 
right of the spin-precessor are precessed to emerge pointing downward. 
Polarimeter unit consists of scattering foil plus scintillation 
counter which can rotate about the beam axis. 

Another spin precessor is the crossed-field device. The 
beam is arranged to travel at right angles to an electric 
field E and a magnetic field B which are themselves at 
right angles. In the type to be described the fields are 
homogeneous over a certain region (it may be clearer 
to say that the geometry is rectangular). The crossed 
fields define a speed β₀=E/B. If one arranges 0<β₀<1, 
then in the moving system characterized by speed β₀ 
in the laboratory, there is no electric field, and the 
magnetic field is B′=B(1−β₀²)^{1/2}. A Dirac particle at rest 
(for example) in this moving system has its spin precessed 
at angular speed ω′=eB′/m₀c, the sense depending 
on the sign of the g value. In the laboratory system 
this means that a particle that has traveled a distance 
z=2πρ₀(1−β₀²)^{−1/2} has its spin precessed through an 
angle θ=2π. Here, Bρ₀ stands for laboratory momentum 
of the central ray. For distances in the laboratory Δz, 
differing from integral multiples of the distance just 
cited, the precession angle of a general ray may deviate 
somewhat from that of the central ray, which remains 
Δθ=(1−β₀²)^{1/2}Δz/ρ₀. But such deviations are readily 
computed. 
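
The central-ray relation can be sketched numerically as follows. This is an illustration under stated assumptions: `rho0` is taken to be the radius of curvature that the central-ray momentum would have in the field B alone (so that B times `rho0` measures the momentum), `beta0` is the speed E/B defined by the crossed fields, and the function names are hypothetical.

```python
import math


def precession_angle(beta0, rho0, dz):
    """Spin precession angle (radians) of the central ray after path length dz."""
    if not 0.0 < beta0 < 1.0:
        raise ValueError("beta0 = E/B must lie between 0 and 1")
    return math.sqrt(1.0 - beta0 ** 2) * dz / rho0


def path_length_for_angle(beta0, rho0, angle):
    """Laboratory path length needed to precess the spin through `angle` radians."""
    if not 0.0 < beta0 < 1.0:
        raise ValueError("beta0 = E/B must lie between 0 and 1")
    return angle * rho0 / math.sqrt(1.0 - beta0 ** 2)
```

A spin erector needs only a quarter turn, so `path_length_for_angle(beta0, rho0, math.pi / 2)` would give the length of field region required for the central ray, in the same units as `rho0`.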

A crossed-field device with suitable slit system can 
be a mass spectrometer,‡‡ and not simply a “velocity 
filter.” It might be possible to make practical use of this 
fact in dealing with polarized particle beams. 

Cavanagh and co-workers (CR57) have used a 
crossed-field device as a spin-erector. The beta energy 
was preselected by means of a magnetic lens (Fig. 2). 

This lens does nothing to alter P_l, since σ·π in a region 
where E=0 is a constant of the motion. Next the 
crossed-field precessor interchanges longitudinal and 
transverse polarization components in the beam. 
Finally, Mott scattering at nominal θ=90° is used to 
analyze for transverse polarization. The scintillation 
detector (of size characterized by 0.3 radian) plus the 
gold scattering foil form a unit, the polarimeter, 
which may be rotated in azimuth φ. The results 
show the expected right-left asymmetry [the observed 
δ as in Eq. (1) being as large as 20%], but in addition 
a curious and hard to explain up-down asymmetry 
presumably of instrumental origin. It was possible by 
reversing fields to split off this latter (orthogonal) 
asymmetry from the desired one. In the first experiments 
Co⁶⁰ and Au¹⁹⁸ negative betas were shown to 
have longitudinal polarization consistent with −v/c. 
Alikhanov et al. have also used a crossed-field spin 
precessor (A58). 

‡‡ Compare the nonrelativistic applications by Bleakney and 

Regarding attenuation of the sought-for asymmetry 
due to the scattering foil having finite thickness, Fig. 3 
shows an extrapolation against foil thickness. This 
gives a graphic demonstration of how finite foil thick- 
ness, presumably through plural scattering and possibly 
depolarization, reduces the observed asymmetry. 

A third device for transferring polarization informa- 
tion from longitudinal to transverse (or vice versa) has 
been used specifically to prepare a beta beam for Mott 
scattering analysis. This we call a “spin reprojector.” 
It consists in scattering the original beam by atoms, 
whether singly or multiply does not matter to a first 
approximation. The transverse polarization of the electrons 
after scattering at θ=90° represents the input 
longitudinal polarization, to the extent that the original 
spin direction is unchanged on scattering.§§ The action 
is similar to that of the macroscopic electrostatic spin 
precessor. 

The resulting transverse polarization lies in the plane 
of the scattering. Even taking into account the fact 
that the spin-orbit effect may not be entirely negligible, 
this cannot lead to any polarization in this plane. The 
spin-orbit effect and any helicity memory act only in 
competition to dilute and dull somewhat the transfer 
efficiency of interest, since the sum of the squares of 
three mutually complementary polarizations cannot 
exceed unity. A fairly low-Z scattering material is 
preferable from this point of view. 

FIG. 3. Effect of scattering foil thickness on the measured 
asymmetry, from Cavanagh et al. (CR57). 

§§ Bernardini et al. (B58) treat this matter quantitatively. 

Spin erection by means of this reprojection method 
has been used by Pond (Po57) on P³² electrons. It has 
been used in a very nice geometry by DeShalit and co-workers 
(De57). Their apparatus is shown in Fig. 4. 
The spin reprojector is in the form of a sheet of aluminum 
(denoted by symbol σ₁) forming an arc of a 
circle. From the geometry of the circle one may then 
have (for point source and point second-scatterer) 
constant “total” scattering angle θ, in this case 90°, 
over a large range of emission angle at the source. This 
affords a gain in intensity which is needed in a second-order 
experiment. The resulting asymmetry for P³² 
electrons of energy exceeding 900 kev, at nominal 
second-scattering angle θ=75°, was 5%. This was 
interpreted as compatible with full v/c polarization. 

Spin erection by means of Coulomb scattering was 
suggested independently also by Tassie (Ta57). Another 
possibility for spin erection is based on Møller scattering 
[see Sec. III D (1)]. 

A high-energy precession effect relating to the 
anomalous magnetic moment has been computed by 
Mendlowitz and Case (M55). A spin precessor based on 
such an effect, used in conjunction with a Mott scatter- 
ing analyzer might suffer from a limitation due to 
energy mismatch. To illustrate with a simple situation 
where a Dirac particle travels normal to a magnetic 
field, if the g value were simply two, this would cause 
the preservation of spin polarization projected onto the 
velocity vector. In terms of “time dilation,” the zero-energy 
value of the cyclotron frequency (the frequency 
of rotation of the velocity vector) and the spin precession 
frequency are each multiplied by the same factor, 
(1−β²)^{1/2}. The two quantities, velocity and spin direction, 
then stay in phase for a hypothetical Dirac 
particle. Otherwise said, σ·π=σ·(p−eA) is a constant 
of the motion because it commutes with the Dirac 
Hamiltonian. The interesting result of the Mendlowitz 
and Case calculation is that the additive “Schwinger 
part” of the magnetic moment (or of the g value) is 
predicted to be immune to this time-dilation factor. 

FIG. 4. Apparatus of DeShalit et al. (De57), illustrating “spin 
reprojection” by a first scattering of the betas from source S in the 
curved aluminum sheet, σ₁. Analysis of transverse polarization 
lying in the plane of S–σ₁–σ₂ is effected by Mott scattering at σ₂. 

If the velocity of the particle is effectively along the 
magnetic field (so that we might transform along the 
field, find the particle moving only normal to the field, 
but with practically nonrelativistic energy) then the 
spin ought to precess at the rate of ω_cyc(1+α/2π). Higher 
terms are not mentioned because free-particle experiments 
are still concerned with the first‖‖ non-Dirac term. 
It would then take the order of 137 cyclotron revolutions 
for the spin vector relative to the velocity vector (in the 
moving system) to change by one radian. However, 
for the case of primary interest here, where the particle 
with substantial energy moves normal to the magnetic 
field, it would require only about 137/E cyclotron 
revolutions to precess the spin one radian further than 
the velocity vector. Here E is the total energy of the 
particle in units of its rest energy. 
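
The two estimates can be checked with a few lines of arithmetic. In this sketch the fine-structure constant is taken as 1/137.036, and the function name is an assumption for illustration. At low energy the anomalous part α/2π of the precession rate adds 2π·(α/2π)=α radians per cyclotron revolution, so about 1/α revolutions are needed per radian; for motion normal to the field at total energy E the text gives a count smaller by the factor E.

```python
ALPHA = 1.0 / 137.036  # fine-structure constant (assumed numerical value)


def revolutions_per_radian(total_energy=1.0):
    """Cyclotron revolutions for the spin to gain one radian on the velocity.

    total_energy is in units of the rest energy; 1.0 is the low-energy limit.
    """
    if total_energy < 1.0:
        raise ValueError("total energy cannot be below the rest energy")
    return 1.0 / (ALPHA * total_energy)
```

For a particle of total energy twice its rest energy, `revolutions_per_radian(2.0)` comes to roughly 69 revolutions per radian of relative precession.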

This computation has recently been done again by 
Carrassi (C58) using a somewhat different approach, 
with the same result obtained by Mendlowitz and Case. 
Such an effect would be of interest for itself, independent 
of any “practical” use. For de-erecting transverse 
polarization of highly relativistic particles it could 
possibly be useful, since P_t becomes increasingly difficult 
to measure directly as kinetic energy increases above 
the rest energy. 

To return to the Mott scattering method itself: it 
would seem that the effect of the screening of the nuclear 
charge by the atomic electrons will have to be understood 
quantitatively, as will plural and multiple 
scattering effects in finite scattering foils, before precision 
polarization numbers can be obtained with 
confidence. These matters are under active scrutiny in 
a number of laboratories, and therefore after a time 
a good understanding of this useful method must 
emerge.¶¶ 


C. Energy Term σ·H 


The Pauli energy term σ·H(eħ/2mc) affords a means 
of determining whether the spin σ is parallel or antiparallel 
to an applied magnetic field H. For measuring σ_z 
of free beta particles it has so far not been employed.*** 
Niels Bohr showed that a Stern-Gerlach experiment on 
bare electrons is not a good way to go about such a 
measurement. This has been discussed thoroughly by 
Mott and Massey (MM). 

One may however capture the particle of interest 
into a bound system (see again the discussion by Mott 
and Massey) and then measure σ·H. Concerning the 
preservation of the spin direction during capture, one 
notices that it is the charge e of the particle and not its 
magnetic moment that makes the process go. Hence, so 
long as the energy transfers involved (usually tens of 
volts only) are small with respect to m₀c² of the particle, 
there is no spin-flipping mechanism during the act of 
capture. 

‖‖ Note added in proof. However, see recent precision results 
for g factor of a free electron obtained by the University of 
Michigan group, Bull. Am. Phys. Soc. Ser. II, 4, 250 (1959). 

¶¶ It will then (by definition) be possible to perform a double-scattering 
experiment to high precision, with no anomalies. 

*** A resonance experiment on free electrons has been performed 
by Dehmelt (D58). 

Let us specialize to formation of ground-state posi- 
tronium since this has been employed for polarization 
measurement. Here an “up” positron allowed to form 
positronium (say by radiative capture with an effec- 
tively free electron) will not leave electron holes more 
“up” than “down.” The fine structure of the resulting 
positronium is too small to cause selective capture to 
singlet rather than to triplet states (the former being 
bound more tightly by ΔE of only about 10⁻³ ev). It is 
necessary to examine a little more closely to find out 
where the original positron polarization information 
goes. 

Consider still a free “up” polarized positron (that 
is with its spin along +z) which eventually is captured 
by one of a group of electrons which as a whole had no 
spin preference with respect to the z direction. Suppose 
there are no applied fields. On a time scale, the triplet-singlet 
energy difference ΔE is by no means inconsequential: 
ΔE corresponds to ~10⁻¹² sec. Thus at a 
time later than capture of the order of 10⁻¹⁰ sec, a 
(reasonably adiabatic) inquiry whether the positronium 
state is singlet or triplet makes physical sense. This 
arises spontaneously in the process of decay by annihilation 
into two or three photons, respectively. At this 
stage, having allowed singlet and triplet state amplitudes 
to have lost their relative phase, the original spin information 
is largely lost: To be sure it could be partially 
retrieved if one looked only at certain kinematically 
special triplet (three-quantum) annihilation events, 
measuring photon helicity, which is rather difficult 
experimentally. For singlet annihilation (into two photons) 
there is no information that can be traced back to the 
original positron polarization, either in the angular 
correlation with respect to the z=0 plane, or in the 
helicities of the photons, the parent state being J=0 
[see, for example, Yang’s paper on dematerialization 
(Y50)]. In the case of the singlet state, the original 
polarization has not actually been lost, but resides in 
the sign of the hole where the captured electron was, 
with complete preservation of information. For the three 
triplet substates as a whole the information resides 67% 
in the helicity of the residual photons, and 33% in the 
sign of the hole.††† The important point is that by 
examining only the two-photon residue, no information 
remains. This is in contrast to higher energy direct 
annihilation where quite large retention of polarization 
information is possible [see Sec. D (5)]. 

If the positronium is formed in the presence of an 
applied magnetic field, along z, say, then in effect some 

TF Tius oo thins of the time, J Mak hand apeina 
the inl positron polarization to hole polarization is thus 
phase coherence between singlet and triplet states (of 
same J_z, thus only M=0 states are affected) can be 
“frozen in” until the annihilation takes place, and not 
only for the 10⁻¹² sec permitted by the triplet-singlet 
energy difference. Under these circumstances some 
polarization information will now be retained. We 
should recall how the Zeeman splitting goes for ground-state 
positronium [see the positronium review articles 
(D53) and (D54)]. Here, J is no longer a good quantum 
number in the presence of the field H, but only its 
projection along the field. Let us use an arrow for up 
or down positron spin, and likewise for the electron spin. 
Let the positron’s spin state stand first. The state ↑↑ and 
the state ↓↓ are each unperturbed by the applied field. 
The M=0 part of the Hamiltonian, normally diagonal, 
now has, in virtue of the σ·H energy, part of which 
belongs to the positron and part to the electron, matrix 
elements connecting triplet and singlet. On rediagonalizing 
the Hamiltonian, the nominal triplet state is 
(↑↓+↓↑)−ε(↑↓−↓↑), and the nominal singlet state is 
ε(↑↓+↓↑)+(↑↓−↓↑). The mixing amplitude ε is 
practically linear with H_z below 20 kilogauss, and 
becomes unity at extremely high field. The signs are 
such that positive H_z means positive ε. 

In the high field limit, the two states of interest 
become (for positive H_z) ↓↑ for triplet and ↑↓ for 
singlet, indicating complete spin memory. Even at 
modest field values, say about 15 kilogauss, the disparity 
in ⟨σ_z⟩ for the positron (or the electron) between 
triplet and singlet is about a factor two. We can think 
of this as a capture efficiency ε_cap=−0.33. This means 
that initial positron polarization (called P_z) would be 
33% preserved in the capture act, residing finally in the 
asymmetry between the population of triplet and singlet 
states. As explained earlier, capture to M=1 states 
becomes generally a three-quantum annihilation prob- 
lem and is not considered. 
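
The connection between the mixing amplitude and the capture efficiency can be sketched as follows. This is one reading of the argument, under stated assumptions: an “up” positron meets a “down” electron, the pair is projected onto the two field-mixed M=0 states (nominal triplet with weights 1−ε and 1+ε on positron-up and positron-down, nominal singlet with weights 1+ε and −(1−ε)), and the capture efficiency is taken as the triplet population minus the singlet population. The function names are illustrative.

```python
import math


def m0_populations(eps):
    """Probabilities (triplet, singlet) for an up-positron, down-electron pair."""
    norm = 2.0 * (1.0 + eps ** 2)        # common normalization of both mixed states
    p_triplet = (1.0 - eps) ** 2 / norm  # overlap with nominal triplet
    p_singlet = (1.0 + eps) ** 2 / norm  # overlap with nominal singlet
    return p_triplet, p_singlet


def capture_efficiency(eps):
    """Triplet minus singlet population; equals -2*eps / (1 + eps**2)."""
    p_triplet, p_singlet = m0_populations(eps)
    return p_triplet - p_singlet
```

A two-to-one population disparity, as quoted for about 15 kilogauss, corresponds in this sketch to ε≈0.17 and to a capture efficiency of −1/3; at ε=1 the efficiency reaches −1, the complete spin memory of the high field limit.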

The discussion now divides according to whether 
the positronium is formed in a gas or in condensed 
material. 

Positronium in gases.—At a field strength of 15 kilo- 
gauss about 0.97 of the triplet (M =0) states annihilate 
via two photons, and for practical purposes, all the 
singlet states do. Except for a peculiarity relating to 
the kinetics of the positronium atom just formed, the 
polarization information (all but some 3% of it) would 
now be lost, since it was agreed earlier that we should 
not attempt to measure helicity of the three quantum 
events. The triplet state has at this field value a lifetime 
~3×10⁻⁹ sec, the singlet state its normal value 
~10⁻¹⁰ sec. Nascent positronium in argon is known 
to have recoil energies of 1 or 2 ev. The magnetically 
quenched triplet states yield two photons observed to 
have such strong angular correlation at 180°, when 
suitable slowing (thermalizing) agents are present in 
the gas, that undoubtedly the positronium has lost 
essentially all its original recoil energy in 3×10⁻⁹ sec. 
This behavior was known some time before polarization 
became of interest (H56, He57). 

One may judge from the observed mutual angular 
correlation for the two-photon events whether the 
parent state was nominal triplet or singlet, the con- 
nection between an observable and the original positron 
spin direction being thereby completed. With reasonably 
narrow slits (about 1×10⁻³ radian of projected angle) 
the former states can be counted about five times as 
efficiently as the latter, at first glance implying the 
large efficiency for this stage of the operations of 0.67. 
However, inasmuch as there are many competing 
annihilations recorded (about five times as many posi- 
trons as we have just discussed are at the same time 
making competing direct annihilations in the gas 
sample) the effective efficiency because of the diluting 
action of such events drops to about 0.20. 

To summarize: Let a hemisphere of positive betas 
emitted in the general z direction come to rest in a 
gaseous annihilator without depolarization. The geometrical 
efficiency (for projecting any longitudinal 
polarization P_l onto the z direction, calling the latter 
polarization P_z) is then 0.5. The capture efficiency ε_cap 
to ground-state positronium as discussed earlier would 
be 0.3 (at 15 kilogauss), the detector efficiency as discussed 
immediately above could be typically 0.2. The expected 
δ (on reversing H) is then 2×0.5×0.3×0.2×P_l 
=0.06P_l. In the initial experiment of Page and Heinberg 
(PH57) it was found that Na²² positrons characterized 
by v/c at emission of 0.75 were polarized with 
P_l>+0.30, the lower limit being considered conservative 
in that no backscattering corrections from the gold 
backing nor depolarization estimates were made (Fig. 5). 
The detector efficiency was purposely overestimated. 
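
The chain of efficiencies in this summary amounts to a simple product, sketched below. The default values are the rough figures quoted in the text; the function names, and the reading of the “five times as efficiently” counting ratio as a contrast (r−1)/(r+1), are assumptions made for illustration.

```python
def contrast(counting_ratio):
    """Analyzing efficiency when one class of events is counted `counting_ratio` times as well."""
    return (counting_ratio - 1.0) / (counting_ratio + 1.0)


def expected_delta(p_l, geometric=0.5, capture=0.3, detector=0.2):
    """Expected change in coincidence rate on reversing H, for longitudinal polarization p_l."""
    return 2.0 * geometric * capture * detector * p_l


def polarization_from_delta(delta, geometric=0.5, capture=0.3, detector=0.2):
    """Longitudinal polarization implied by a measured delta, same efficiencies."""
    return delta / expected_delta(1.0, geometric, capture, detector)
```

With the default figures a fully polarized beam gives an expected change of 0.06, and `contrast(5)` returns the 0.67 quoted for the undiluted detector stage.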
FIG. 5. Use of positronium in a gas as polarization detector 
(PH57). For longitudinal positron polarization at emission of 
+0.75, measured δ would follow oblique line whose slope is 

FIG. 6. Positronium in a gas (PO58). Lower curve: extrapolation 
of measured δ to zero effectively yields a measure of width of 
singlet contribution to the two-photon angular correlation. Upper 
curve: quenching ratio Q at infinite slit width yields fractional 
positronium formation in gas sample. Variation of Q with slit 
width yields width of triplet events. 


Later data (Fig. 6), where effectively this latter effi- 
ciency is measured for the particular gas sample in use, 
by extrapolating measured δ against reciprocal angular 
resolution, gave P_z=0.25. This implies a lower limit on 
P_l of +0.50. The low-Z source-backing in the latter 
experiment (quartz) improved matters somewhat. 
Detailed angular correlation measurements (PO58) 
showed clearly the transfer from “‘broad component” 
to “narrow component” when the H field was changed 
from parallel to the positron momentum to antiparallel. 

In a sense this kind of experiment is a comparison one: 
between the sign of ⟨σ⟩·H and the theoretical sign of the 
triplet-singlet energy difference for ground-state posi- 
tronium. Here it was of practical concern for the first 
time which lay higher in energy, the triplet or singlet 
states. 

Positronium in condensed materials.—Since the time 
when DeBenedetti and Richings (D52) found positrons 
to live longer in certain dense materials (specific gravity 
unity or more) than free positrons should, a number of 
diverse experimental results have been in accord with 
the hypothesis that positronium may exist in a 
class of substances. We mention the detailed measurement 
of these long lifetimes (Bell and Graham B 
the measurement of the abnormally high thre 
the abnormally narrow component in the two-photon 
angular correlation [(P55) (S55)], measurement of the 
magnetic quenching of the triplet state showing that 
this state provides broad-component two-quantum 
events [Page and Heinberg (PH56)], detailed time-delay 
measurements as a function of magnetic field 
showing that the ↑↑ and the ↓↓ states (in Teflon) were 
not spontaneously mixed with others [Deutsch (D56)], 
the observed decrease of three-quantum yield in a strong 
magnetic field giving a consistency with ordinary positronium 
[Telegdi et al. (T56)]. Admittedly, we are 
now on less firm ground than when dealing with bona fide 
positronium as in a gas, but it is interesting that posi- 
tron polarization measurements can be made utilizing 
the concept of positronium in condensed materials. The 
class of materials which exhibits positronium effects 
has been called the “long component class.” Otherwise 
it might be called the narrow component class. It might 
suffice for the remainder of the discussion to call these 
materials “plastics.”‡‡‡ 

We might assume for argument that the triplet-singlet 
energy difference for positronium in the condensed 
materials is the same as for free positronium. The 
transfer efficiency from initial polarization of the slowed 
positrons to the population disparity between triplet 
(M=0) and singlet states would then be the same 
figure at the same magnetic field as for free positronium. 
From this point on, the picture is quite different. 

For our purposes, each positron which comes to rest 
in the annihilator gives two photons. Ignoring polariza- 
tion for the moment, let us discuss a typical “plastic” 
such as fused quartz. The original conjecture was that 
the narrow component occurred in parallel§§§ with the 
delayed component (P55). If we retain this view, and 
also pretend that all measurement is ideally precise, we 
shall have the singlet states giving only narrow com- 
ponent, denoted N, the triplet states in absence of 
applied field giving only broad component, denoted B, 
with nonpositronium events giving also a broad angular 
distribution, B’. In actual measurement, B’ has never 
been distinguished from B. Discussion is now restricted 
to the two M=0 states, singlet and triplet, unless other- 
wise stated. 

Application of a strong field H, strong enough 
to compete successfully with the rather fast (one 
or two by 10⁻⁹ sec lifetime) “pickoff annihilation,” 
causes a detailed transfer of events from B to N 
[(Pa55) (PH56) (W57)]. Fields of 10 to 20 kilogauss are 
typically required. If an angular correlation apparatus 
is set to accept only 180° events, within about 10⁻³ 
radian, the coincidence rate increases with H, as in the 
so-called quenching curves of Fig. 7. Here the detector 
is perhaps four times as efficient for N events as for B 
events. 

Regarding the possibility of positron polarization 
measurement, the situation in one respect is not as 
fortunate as with positronium in gases: “up” positrons 
in an “up” field ought to form more singlet states than 
triplet, the singlet states give only N events, but the 
triplet states instead of giving only B events (as at zero 
applied field) now might divide 0.6 of the time into B and 
0.4 of the time into N, for example. In other words the 
H field which is required to furnish the initial “capture 
efficiency” acts unfortunately to dull the action of the 
detector from distinguishing triplet from singlet events. 

FIG. 7. Quenching curves for positronium in plastics. Fractional 
increase in two-photon coincidence rate within 1.5 milliradian of 
180°, as a function of applied field H, corrected for focusing effects. 
Actual transfer from broad B component to narrow N component 
on applying field is less than one-fourth of Q−1. From Page 

In order to see what response to expect for plastics 
let us supply some rather rough numbers, and assume 
that the field is such that one-third of the triplet 
(M=0) decays appear as N events. For orientation, 
at H=0, one has 0.5 of the positrons of interest forming 
singlet states and 0.5 triplet (M=0) states. The fraction 
f_N of these yielding N events is thus 0.50. For unpolarized 
input positrons, at H=H₁, one has still the 
equal division into singlet versus triplet states, but f_N 
becomes now 0.67. For completely polarized stopped 
positrons, at H=H₁, the capture efficiency for keeping 
the polarization information might be 0.33 (depending 
indirectly on the lifetime of the delayed component). 
Thus, for field H parallel to the spins, a singlet/triplet 
population ratio of 0.67/0.33 is implied, and f_N is computed 
as 0.78. For the opposite field setting one gets 
f_N=0.55. In order to arrive at a final number, suppose 
that 0.35 of all incident positrons go into (M=0) 
positronium, and that the detector response N:B is 4:1. 
Using such numbers one would “predict” a measured δ 
in the coincidence rate on reversing field of 0.14, given 
positron polarization P_z to be unity. Having field H 
and spin parallel should yield the greater coincidence 
rate, which is opposite to the response in the gas experiments 
just described. 
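
The rough numbers of this estimate can be reproduced with the sketch below. Several points are assumptions not spelled out in the text: the positrons that do not form M=0 positronium are counted with the same efficiency as B events, δ is taken as the difference of the two coincidence rates divided by their mean, and the function names are illustrative.

```python
def narrow_fraction(singlet_fraction, triplet_to_narrow):
    """Fraction of M=0 positronium decays that appear as narrow (N) events."""
    return singlet_fraction + (1.0 - singlet_fraction) * triplet_to_narrow


def coincidence_rate(f_n, positronium_fraction=0.35, n_to_b=4.0):
    """Relative coincidence rate; B and non-positronium events carry weight 1."""
    other = 1.0 - positronium_fraction  # assumed counted like B events
    return other + positronium_fraction * (n_to_b * f_n + (1.0 - f_n))


def predicted_delta(capture=1.0 / 3.0, triplet_to_narrow=1.0 / 3.0):
    """Change in coincidence rate on reversing H, for fully polarized stopped positrons."""
    f_par = narrow_fraction(0.5 * (1.0 + capture), triplet_to_narrow)   # H parallel to spin
    f_anti = narrow_fraction(0.5 * (1.0 - capture), triplet_to_narrow)  # H reversed
    r_par, r_anti = coincidence_rate(f_par), coincidence_rate(f_anti)
    return (r_par - r_anti) / (0.5 * (r_par + r_anti))
```

With the default figures, `predicted_delta()` comes out near 0.14, and the intermediate narrow fractions are close to the 0.78 and 0.55 quoted above; halving the result for a hemisphere of emitted positrons gives the order of 7%.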

The magnitude of the response of plastics to polarized 
positrons is next to impossible to estimate a priori 
because so many essentially unknown numerical factors 
enter. But the sign of the response may be traced back 
to one hypothesis, that the normal fate of the triplet 
states is broad annihilation (a half-width like 10×10⁻³ 
radian), and the normal fate of the singlet states is 
narrow annihilation (half-width less than 3×10⁻³ 
radian). 

To come finally to an actual experiment, Page and 
Obenshain (PO57) used a hemisphere of Na²² positrons 
(gold-backed) which impinged on the plastic sample. 
The δ in the coincidence rate at 180° with angular 
resolution 1.5×10⁻³ radian was measured on reversing 
H. Only for the narrow-component materials was a 
nonzero δ found, and its sign agreed always with that 
anticipated in the foregoing discussion. With no absorber 
between source and sample, and running typically at 
H=13 kilogauss, δ was measured for fused quartz, 
polystyrene, DC-200 high viscosity silicone oil, polyethylene, 
and Teflon. Expressed in percent the results 
in order were 1.7±0.5, 1.7±0.7, 1.3±0.6, 1.3±0.4, and 
1.9±0.5. With an interposed absorber of 0.0005-in. tantalum, 
the Teflon response increased to 4.8±1.1%, 
while a (low Z) Mylar foil of like density in mg/cm² increased 
the Teflon response to 4.5±0.9%. The very 
roughly estimated response quoted above would yield 
7% expected δ for a hemisphere of fully (longitudinally) 
polarized positrons, neglecting back reflection. 

Hanna and Preston (Ha58) have checked the sign of 
the response of polyethylene against the “iron detector.” 
Page and Obenshain have checked the sign of the iron 
detector against the plastics [see Sec. III D (2)]. 

The fact that plastics in general give a detectible 
response to positron polarization must be taken as 
evidence that depolarization on slowing is small (even 
in tantalum, Z=73) and as supporting evidence that 
something very much like positronium must exist in 
this class of materials, and finally that the separation|||||| 
into broad and narrow components for triplet and 
singlet states is rather clear-cut. 


D. Comparison Methods 


As discussed earlier (Sec. III A) a comparison method
is one where the spin direction to be analyzed is compared
somehow with the spin direction of atomic
electrons already known. It is sometimes implied that
comparison methods in general are rather inefficient.
However, if one is discussing the particular physical
act which is the heart of a given method, then it often
turns out that efficiencies are closer to unity than to
zero. It is best to keep separate the physics of the
analysis act from the question of what degree of
(reference) polarization might be achieved in practice
for the average electron in a typical “iron” foil, for
example. In the discussion of several comparison
methods, efficiency means efficiency per unit degree
of polarization of the target electron, unless some
statement is made to the contrary.


|||||| Further results by Bell and co-workers (GB57) on the
fraction of positrons in the delayed component for fused quartz
in particular (now reported to be 0.51 instead of the earlier
estimate of about 0.3) have removed the numerical disparity that
existed for some time between fractional narrow component and
fractional delayed component. The recent polarization applications,
however, being based on a balance method, might be argued
to lend strong support to the contention that this separation is
clear-cut.


(1) Møller and Bhabha Scattering 


The most direct and intimate comparison of spin 
directions is afforded by the electron-electron and 
positron-electron scattering processes, which we label 
“Møller” and “Bhabha” scattering respectively. The 
matrix element of Møller (M32), from which have 
stemmed the well-known Møller and Bhabha scattering
formulas, has within it much polarization information 
which, prior to about 1957, was not much considered. 
Bincer (Bi57) seems to have been the first to publish 
computations on the polarization aspects of such 
scattering. 

Speaking first about electron-electron collisions, and 
in the nonrelativistic limit, elementary quantum mechanics
tells us that if a pair of identical spin-½ fermions
are to undergo mutual scattering (letting all spins be 
random), one-half of the time the spins are antiparallel, 
the particles therefore distinguishable after the collision 
(they are always distinguishable before the collision) 
and the scattering cross section has its classical value 
at all angles in that there are now no interference 
(exchange) effects. For the other half of the time the spins 
are parallel, in which case we have indistinguishability 
after the collision, interference effects come into play, 
and, for 90° scattering in the center-of-mass system 
in particular, there can be no scattering. On the whole 
there is a progressive shift from the classical cross 
section at small angles to one-half of the classical value 
in the large angle limit, 90°. 

The alternative view (which is not particularly useful 
for polarization discussions) is that three-fourths of the 
time the pair of spins form a triplet state, nullifying 
the scattering at 90°, while the remaining one-fourth 
of the time they are in a singlet state, now with con- 
structive interference for fermions which doubles the 
cross section at 90°. The same numerical result is 
obtained at any angle as with the previous view. This 
result hinges on there being no spin-flips during the 
collision and again no spin sensitive forces in the 
scattering “force,” which guarantees the first proviso. 
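
The agreement between the two views at 90° can be checked with a few lines of arithmetic. The sketch below is only an illustration of the bookkeeping described above; the function names are hypothetical, and the cross section is expressed in units of the classical (no-exchange) value.

```python
from fractions import Fraction


def ratio_parallel_antiparallel_view():
    """90-degree cross section over classical value, first view.

    Half the time the spins are antiparallel (classical value, factor 1);
    half the time they are parallel (no scattering at 90 degrees, factor 0).
    """
    antiparallel = Fraction(1, 2) * 1
    parallel = Fraction(1, 2) * 0
    return antiparallel + parallel


def ratio_triplet_singlet_view():
    """90-degree cross section over classical value, alternative view.

    Three-fourths triplet (scattering nullified, factor 0);
    one-fourth singlet (constructive interference, factor 2).
    """
    triplet = Fraction(3, 4) * 0
    singlet = Fraction(1, 4) * 2
    return triplet + singlet


if __name__ == "__main__":
    print(ratio_parallel_antiparallel_view(), ratio_triplet_singlet_view())
```

Both views give one-half of the classical value at 90°, as stated in the text.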

Longitudinal polarization analysis—For the relati- 
vistic discussion, let us focus attention on helicities. 
Writing the appropriate free particle spinors for 
definite initial and final helicities, then applying the
[...] that is, writing down the Møller matrix element, the
following facts emerge. We refer mostly to 90° scattering
in the center-of-mass system where those polarization
effects which arise entirely from the subtraction of
diagrams are generally a maximum.

FIG. 8. Differential Møller and Bhabha scattering. For a completely
polarized target electron, the efficiency ε for detecting
longitudinal polarization of e⁻ or e⁺ respectively, is plotted against
fractional energy transfer w. Total energy of incident particle is
γmc². Møller and Bhabha expressions coalesce as γ→∞. Solid
curves, labeled Bhabha*, apply if final e⁺ and e⁻ are not distinguished
by the apparatus. Inasmuch as cross section generally
changes rapidly with w, partial areas under curves may have but
limited practical significance. Curves have been plotted from the
formulas of Bincer (Bi57).

In the first place, single flips of helicity are taken out
by the interference (at 90° scattering angle). In the
second place, particle helicity tends to be better conserved
as the energy increases. We are not referring
to the more trivial effect where at small angle the helicity
naturally is well preserved. The helicity of one of the
input particles would be in fact 80% conserved at
only 0.6-Mev laboratory energy, in the sense that either
of the output particles viewed in the center-of-mass
system preserves 0.8 of the input helicity (for 90°
scattering). This fact is used later. In the third place,
and this is the important point upon which the polarization
experiments have been based, the relative cross
section (at 90°) between input spins parallel and antiparallel,
for longitudinal polarization only, which was
zero in the low-energy limit, climbs only to 1/8 in the
high-energy limit, the corresponding efficiency being
therefore 7/9. The complete expression for this cross-section
ratio as given by Bincer (Bi57) is

[...]

polarized target electron. The abscissa is the conventional
fractional energy transfer, w=½(1−cos θ). The
two dashed curves labeled “Møller” and the limiting
shape at high energy labeled “∞” apply here.
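
The step from a parallel-to-antiparallel cross-section ratio to an efficiency can be made explicit. The sketch below assumes (it is not stated in the text) that the efficiency is the normalized difference (σ_anti − σ_par)/(σ_anti + σ_par); this definition reproduces the quoted limits, namely unity for a ratio of zero and 7/9 for a ratio of 1/8. The function names are hypothetical.

```python
import math
from fractions import Fraction


def fractional_energy_transfer(theta_cm):
    """w = (1/2)(1 - cos(theta)), theta being the c.m. scattering angle in radians."""
    return 0.5 * (1.0 - math.cos(theta_cm))


def efficiency_from_ratio(r):
    """Efficiency for r = sigma(parallel) / sigma(antiparallel).

    Assumed definition: (sigma_anti - sigma_par) / (sigma_anti + sigma_par).
    """
    return (1 - r) / (1 + r)


if __name__ == "__main__":
    print(efficiency_from_ratio(Fraction(0)))     # low-energy limit
    print(efficiency_from_ratio(Fraction(1, 8)))  # high-energy limit
    print(fractional_energy_transfer(math.pi / 2))
```

With this convention, 90° scattering in the center-of-mass system corresponds to w = 0.5, the midpoint of the abscissa in Fig. 8.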

Bincer has also computed the companion expression 
for Bhabha scattering (Bi57). A few curves computed 
from these results are also shown in Fig. 8. These are 
the dotted curves labeled “Bhabha.” At first glance,
the e⁺ and the e⁻ are distinguishable and thus one
expects no polarization sensitivity in the sense of a
correlation between positron and electron helicities for
a given probability of scattering (of course, both initial
and final e⁺ helicity tend to be correlated, and the same
applies for the e⁻, on any simple theory). The point here
is that there exists the virtual annihilation process, 
and this accounts for a good part of the ordinary 
(unpolarized) positron-electron scattering cross section 
(A54). This process is represented by one of the 
Feynman diagrams. It manifests itself now in making 
possible the polarization effect shown in Fig. 8. The 
effect dies out at quite low energy, where the annihila- 
tion amplitude has diminished. 

Inasmuch as the Bhabha polarization efficiency ε as a
function of w gives a rather unrealistic picture of how an
experiment might be designed (because the differential
cross section typically changes by a factor 30 between
w=0.25 and w=0.75), we show also some modified
Bhabha curves. These efficiency curves, labeled
Bhabha*, would apply if the two particle detectors,
operated in coincidence (or a single detector, if such an
experiment were done), were completely insensitive to
whether the counted particle is a final e⁺ or a final e⁻.

To summarize the picture presented in Fig. 8, the 
Møller method for longitudinal polarization measurement
could be classed as a low-energy method, based
on symmetry (Pauli principle)—the fact that a high 
efficiency persists to all energies can be regarded as good 
fortune. As for the Bhabha method, which again rests 
on symmetry, high energy is required to provide 
sufficient annihilation amplitude that the symmetry 
may come into play, so it is essentially a high-energy 
method. The left half of Fig. 8 clearly shows these 
energy trends. Another view is that at high energies the 
two methods are just the same (the infinity curve), but 
that at lower energies different details appear, simply 
because the forbidden zone of width 2mc² is no longer
negligible. 

The Møller scattering method for the purpose of
polarization detection was first employed by Frauenfelder
and co-workers (Fb57) on negative betas from
P³² and from Pr¹⁴⁴ with the result that both beta groups
were left-handed, and by an amount consistent with
v/c at emission. Their apparatus is shown in Fig. 9. For
reasons of impingement of beam, the scattering foil is
inclined to the beam direction, so the efficiency has to
be multiplied by the cosine of the small relevant angle.
What is measured here is not an absolute differential
cross section, or relative cross section against angle of
collision, but only a comparison of a coincidence rate
proportional to cross section between the two opposite
states of magnetization of the scattering foil. For this
reason the apparatus need not be especially elaborate.

FIG. 9. Møller scattering setup of Frauenfelder
and co-workers (Fb57).

The full definition of a true Møller event is that
(i) p₁+p₂=p₀. That is, the momenta of particle No. 1 and
particle No. 2 (labeled by which counter they enter) shall
add up to the incident momentum. (ii) K₁+K₂=K₀.
The kinetic energies likewise shall add up to the
original kinetic energy. Coincidence detection of Møller
events has been effected in the past by requiring for
example that (p₁+p₂)·p₀=p₀² (P51), (A54), which is a
compromise between complete rigidity of definition and
counting rate. At energies of say 0.5 to 2.0 Mev, at the
fairly high fractional energy transfers, say 0.4<w<0.5,
both p₁ and p₂ make sufficiently large angles with p₀
that this simple criterion rules out multiple or plural
scattering in the target foil, particularly after the
Møller event, as well as energy loss on the part
of either particle No. 0, 1, or 2. Thus by setting
(besides p₀) a certain value of p₁·p₀, coincidences are
found only at the expected value of p₂·p₀ if the experiment
is clean. See in this connection “coherence
curves” (A54).
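
The difference between the full definition and the relaxed coincidence criterion can be sketched numerically. The code below is an illustration only: the function names, the tolerance, and the rest energy of 0.511 Mev are assumptions introduced here, and momenta are three-component tuples in Mev/c with the beam along the third axis.

```python
import math

M_E = 0.511  # electron rest energy in Mev (assumed value)


def momentum(kinetic):
    """Momentum in Mev/c of an electron with the given kinetic energy in Mev."""
    return math.sqrt(kinetic * (kinetic + 2.0 * M_E))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def is_true_moller(p0, p1, p2, k0, k1, k2, tol=1e-6):
    """Full definition: (i) p1 + p2 = p0 as vectors, and (ii) K1 + K2 = K0."""
    momentum_ok = all(abs(a + b - c) <= tol for a, b, c in zip(p1, p2, p0))
    energy_ok = abs(k1 + k2 - k0) <= tol
    return momentum_ok and energy_ok


def passes_relaxed(p0, p1, p2, tol=1e-6):
    """Relaxed criterion: (p1 + p2) . p0 = p0^2."""
    total = [a + b for a, b in zip(p1, p2)]
    return abs(dot(total, p0) - dot(p0, p0)) <= tol


def symmetric_event(k0):
    """Momenta for a symmetric event (w = 0.5): each electron carries K0/2."""
    p0 = momentum(k0)
    p_out = momentum(k0 / 2.0)
    pz = p0 / 2.0
    px = math.sqrt(p_out ** 2 - pz ** 2)
    return (0.0, 0.0, p0), (px, 0.0, pz), (-px, 0.0, pz)


if __name__ == "__main__":
    p0, p1, p2 = symmetric_event(1.0)
    print(is_true_moller(p0, p1, p2, 1.0, 0.5, 0.5), passes_relaxed(p0, p1, p2))
```

An event whose transverse momenta do not balance fails the full definition but can still pass the relaxed test, which constrains only the component of p₁+p₂ along p₀; this is the sense in which the relaxed criterion is a compromise.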

Returning to the experimental setup of Fig. 9, where
a considerable spread was accepted in the incident
energy K₀, typically a factor two or three, one would
lose essentially all opportunity to check for “coherence”
between K₁ and K₂. There does still remain the possibility
for an “angular coherence” check, which however
due to the relativistic narrowing is sharpest only for
fixed K₁+K₂.
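
The relativistic narrowing can be illustrated with standard equal-mass kinematics, which is brought in here as background and is not taken from the text: for a symmetric event (90° in the center-of-mass system) each electron emerges at a laboratory angle θ with tan²θ = 2/(γ+1), where γ is the incident total energy in units of mc². The function name and the rest energy of 0.511 Mev are assumptions of this sketch.

```python
import math

M_E = 0.511  # electron rest energy in Mev (assumed value)


def lab_angle_symmetric(kinetic_mev):
    """Laboratory angle (radians) of either electron for 90-degree c.m. scattering."""
    gamma = 1.0 + kinetic_mev / M_E
    return math.atan(math.sqrt(2.0 / (gamma + 1.0)))


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):
        print(k, round(math.degrees(lab_angle_symmetric(k)), 1))
```

The angle starts at 45° in the nonrelativistic limit and shrinks as the incident energy rises, so a spread of a factor two or three in K₀ smears the expected coincidence angle.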

For the isotope Au¹⁹⁸, results obtained with the setup
of Fig. 9 were interpreted (Fc57) as implying negligible
longitudinal polarization for these particular negative
betas (some seven standard deviations less than the
expected v/c value). The Møller method was subsequently
proven, for the case of Au¹⁹⁸, by Benczer-Koller
et al. (BW58). The apparatus used by the latter group
did not differ in essential details from that in Fig. 9.
A magnetic lens system was used to bring the beam into
the apparatus and focus it on the “iron” foil. It has
been pointed out (W58) that precautions were taken
in the latter experiment to discriminate against true
time-coincident Compton electrons (due to the accompanying
Au¹⁹⁸ gammas) which could otherwise produce
false Møller events, and that these precautions could
probably account for the difference between the two
results. The latter result was v/c polarization to seven
standard deviations, and left-handed.

Use of the Bhabha method for longitudinal polariza- 
tion measurement was first attempted by Frauenfelder 
and collaborators (Fd57) on the high-energy positron 
group from Ga⁶⁶. The results were interpreted as
meaning that these positrons were probably not polar- 
ized (a matter of about three standard deviations 
below the expected v/c value). However, three different 
methods [see Sec. III D (2), (3), and (5) ] applied to 
these same betas have since yielded large (right- 
handed) helicity. Since we know there is nothing wrong 
with the ordinary Bhabha differential cross section 
(A54) which itself is highly sensitive to whether one 
takes into account the interference effects, it must be 
presumed that the Bhabha polarization method is 
intrinsically sound. 

Transverse polarization analysis.—It is interesting to 
contrast the possibilities for transverse polarization 
analysis with the longitudinal case just discussed. 
Specific discussion is for the electron-electron case. 
One may examine again the basic Møller matrix element, 
and restrict again to 90° scattering angle in the center- 
of-mass system where the algebra is easiest and the 
physics most meaningful for our purposes. For the 
efficiency of a transversely polarized target electron we 
have the nonrelativistic value unity at sufficiently low 
energy. With increasing energy, the efficiency falls off 
very rapidly compared to the longitudinal case just 
described: if we average over azimuth of the plane of
collision the efficiency ε is down to 0.5 at only 0.65-Mev
laboratory energy, 0.3 at 1.25 Mev, 0.1 at about 3.4 Mev
and ultimately becomes zero. If, instead, the plane of
the collision is restricted to include the initial “spin
vectors” involved (to include the spin vector of the
polarized target electron will suffice) then the efficiency
falls off almost as fast with energy, but does have a
nonzero residue in the high-energy limit of +0.11. Or, if
the plane of collision is required to be at right angles to
this direction, ε saturates at −0.11 in the high-energy
limit, having passed through zero at about 3-Mev
laboratory energy.

For practical purposes, the transverse Møller method
does not compare favorably with the longitudinal Møller
method. The details for the transverse Bhabha method
can be had by inspecting the matrix element again. A
paper by Stehle (St58) lists in convenient form both
Møller and Bhabha matrix elements.

Spin erection.—A further comment on Møller scattering
relates to its hypothetical use as a spin erector
(compare the electrostatic spin-precessor and [...]
projection spin erector discussed in Sec. III [...]). The
helicity memory through the Møller event is
rather large: given an incident longitudinally polarized electron,
let it make a single collision in an unpolarized target.
Then either of the offcoming particles, as viewed in the 
center-of-mass system will have a certain memory of 
the original helicity. Speaking as usual of 90° scattering, 
this efficiency for retention of helicity is +0.6 at 0.25 
Mev, +0.8 at 0.6 Mev, and +0.9 at 2 Mev in the 
laboratory. The high-energy limit is 8/9. If then we 
transform either of the final electrons to the laboratory 
system we find that the over-all efficiency for transfer 
of longitudinal polarization P_z into transverse polarization
P_x(lab) departs from zero ~0.71E, reaches the
value 0.4 at 0.25 Mev and 0.5 at 1.0 Mev. It then falls
with increasing laboratory energy E, approximately as
0.89(E+2 Mev)^(−1/2), decreasing to 0.25 at 10 Mev. The
direction of the resulting P_x is clear from the description—it
points away from the line of collision for right-handed
input. Longitudinal target polarization likewise
could be transferred to P,(lab) using the same numbers, 
given an unpolarized incident beam. Again, the transfer 
of polarization from P, to P,(lab) can be had from the 
transfer efficiencies just quoted by multiplying in each 
case by (E+1 Mev)?. 
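
The quoted high-energy fall-off of the spin-erection efficiency can be tabulated directly. The sketch below takes the exponent in the quoted expression to be −1/2, with E in Mev; this reading is an assumption, checked only against the two values quoted in the text (about 0.5 at 1.0 Mev and 0.25 at 10 Mev). The function name is hypothetical.

```python
import math


def spin_erection_efficiency(e_mev):
    """Approximate transfer efficiency, 0.89 * (E + 2 Mev)**(-1/2), with E in Mev.

    Assumed form; intended only for the falling part of the curve (about 1 Mev and up).
    """
    return 0.89 / math.sqrt(e_mev + 2.0)


if __name__ == "__main__":
    for e in (1.0, 2.0, 5.0, 10.0):
        print(e, round(spin_erection_efficiency(e), 2))
```

The expression is not meant to describe the low-energy rise from zero, where the separately quoted values (0.4 at 0.25 Mev) apply.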

If one asks now for the transfer efficiency from P_x to
P_z(lab) in a 90° Møller scattering act from an unpolarized
target, which is the inverse of spin erection,
one finds quite different numbers from the spin erection
case. Thus in contrast to the spin-erectors already
described (macroscopic electrostatic device, crossed
fields device, or the use of atomic scattering for spin
reprojection), this is an illustration of a nonreversible
device. For P_z→P_x(lab), the spin direction of the
participating target electron and also of the conjugate
recoil electron are (individually) about 90% certain.
For P_x→P_z(lab) these spin directions are about 90%
uncertain, except at very low energy.

As for use of Bhabha scattering for spin erection or 
de-erection, the various transfer efficiencies at high 
energies will be nearly the same as for the Møller case. 
For low energies, even for 90° scattering some memory 
of “which one is the positron” will persist through the 
collision, and at very low energy the various transfer 
efficiencies at any angle will be apparent from the 
geometry of the collision. The discussion has by now 
become quite academic. For positrons, or indeed electrons,
except at quite high energy one would generally
be better off to use single, plural or multiple Coulomb
scattering for de-erection of spin.

Finally, neither Møller nor Bhabha formulas show
any right-left asymmetry as there is in Mott scattering,
even allowing both target and beam to be (transversely)
polarized.


(2) Annihilation of Slow Positrons in a Ferromagnet


Suppose that initially polarized positrons could be
brought to rest (that is, to essentially thermal energies)
in a ferromagnetic material, and still retain some of the
original polarization, so that finally ⟨σ_z⟩≠0. We consider
[...] complete polarization [...]
of over 99% of these stopped positrons is two-quantum
annihilation, and against electrons of a definite spin
polarization, namely opposite to that of the positrons.
The total annihilation gamma momentum represents,
to the extent that the positrons themselves contribute
negligible momentum, the momentum (Fourier transform
of wave function) of those electrons susceptible
to annihilation (for example, La57).

In iron, one believes that the magnetic electrons (the 
“two” out of the 26 electrons per atom) surely must 
have different momentum components than the average 
electron. Thus, quite aside from positron polarization, 
there would have to be some correlation between amount 
of departure from collinearity of the two annihilation 
photons and whether or not a “magnetic electron” had 
been annihilated. Since the deeper bound electrons, 
which are not polarized, also are presumably not very 
accessible to a slow positron for annihilation, this may 
tend to enhance the correlation. What could be measured 
in the two-quantum angular correlation is the relative 
abundance of the higher momentum portion (thus, in 
the wings of the curve, say from 5 to 10×10⁻³ radians
away from 180°) for the two analyzer settings: magnetization
along +z and along −z. Thus if ⟨σ_z⟩ for the
positrons were positive, then + magnetization must
yield more of the higher momentum events than − magnetization,
providing the magnetic electrons tend to
have higher momentum than the average electron with 
which a positron may annihilate. 

This method was devised and used successfully by
Hanna and Preston (Ha57). Their first polarization
results showed (accepting that the magnetic electrons
have generally higher momentum) that Cu⁶⁴ positrons
are right-handed. In later work (Hb57, Ha58) various
ferromagnetic materials and a number of different
positron emitters have been compared. Figure 10 shows
the apparatus used in certain of the experiments of
Hanna and Preston. Instead of using the more standard
momentum measurer as originally introduced by DeBenedetti
and collaborators (D50), which essentially
defines the projected angle between photon momenta −k₁
and k₂, they used an annular detector slit located
opposite to a point detector so that the actual angle θ
between −k₁ and k₂ is defined. The particular choice of
geometry for the gamma detection is not crucial for
making polarization measurements, however. If this
technique eventually makes contact with any anisotropy
of the momentum of the polarized electrons (with
respect to crystal axes) then choice of detector geometry
can be quite important.

FIG. 10. Angular correlation apparatus of Hanna and Preston
(Ha58) for measuring two-photon angular correlation from [...]

The measured δ on reversing the state of magnetization,
as obtained by Hanna and Preston, range from
5% to 10% depending on controllable circumstances, 
such as inserting absorbers between source and iron 
analyzer to “clean up” the positron beam. 

There is no reason to suppose that the positrons on 
slowing have partaken of the polarization of the ambient 
electrons to any extent. To consider how annihilation- 
in-flight competes with the ordinary energy loss proc- 
esses, the fact that a positron has survived so as to 
finally appear within tens of milliradians of 180° in the 
angular correlation predisposes it to be polarized, by a 
fraction of a percent, and parallel to the predominant 
electron spins. However, such added polarization would 
be the same for both states of magnetization (if referred 
to the magnetizing field direction) and therefore could 
not influence the measured δ to first order (and to any
order, no matter how large such added polarization
might be, it could not give a false sign to the measured δ).

As with analysis of positron polarization by means of 
positronium in the “plastics,” absolute polarization 
measurement by the present technique awaits further 
development. But relative polarization measurement is 
feasible now. 

The sign of the response of annihilation-at-rest in
iron, the sign of the response of positronium in plastics,
and the sign of the response of positronium in gases,
are mutually consistent (the iron method has, for
entirely different reasons, the same sign as for gases;
the method using plastics has the opposite sign).
Figure 11 shows a comparison plot between measured
δ for iron (mild steel) and for a plastic (Teflon), run in
the same apparatus at the same angular resolution. The
resolution chosen (4×10⁻³ radian) seemed to be suited
to the iron, but is known to be too small for optimum
response in Teflon (III C). In any case the opposite
response to the same positrons is evident. Hanna and
Preston have checked (using Cu⁶⁴ positrons) the response
of polyethylene with respect to iron and it was
opposite (Ha58).

FIG. 11. Response of iron annihilator to polarized sodium-22
positrons compared to response of a plastic, both run at same
resolution Δθ in same apparatus. Measured δ on reversing magnetic
field is opposite in sign for the two materials. Positive δ means
higher coincidence rate for H parallel to positron spins. (PO58).
Curve labels: iron with absorber; iron, no absorber. Abscissa in
milliradians.

The function of the absorber, which has been applied 
both at Argonne Laboratory and at Pittsburgh, is 
simply to enhance the measured δ, by selecting the
higher energy portion of the actual positron spectrum 
which emerges from the source material plus the source 
backing. Thus, over and above the obvious fact that 
the effective v/c is slightly higher, clearly one is selecting 
particles of better polarization in the z direction. 
Depolarization of a positron beam in flight appears not 
to be a very serious matter. 

A further comment on the three annihilation-at-rest 
methods for positron polarization measurement, is that 
it is generally quite difficult to join on in series a spin 
erector or de-erector, or an energy analyzer, principally 
because of the large ambient magnetic field in the 
immediate vicinity of the active region. For transverse 
polarization measurement, the annihilation-at-rest in a 
ferromagnet would clearly be the most easily adapted, 
since the magnetic field strength just outside the 
active sample need not be objectionably large. 


(3) Annihilation-in-Flight in Polarized Electron Target


A method which is intrinsically nonrelativistic, but 
which can be used at fairly high energy nonetheless, is 
based primarily on the fact that at rather low energies, 
positron and electron helicities must be the same for 
two-quantum annihilation in flight to take place (three- 
quantum annihilation-in-flight is smaller by ~1/137 
and does not concern us here). There is, however, a 
strong departure from the nonrelativistic situation as 
regards efficiency for a longitudinally polarized analyzer 
to detect longitudinal beam polarization, at least 
assuming any and all two-quantum events to be re- 
corded. This behavior is shown in Fig. 12, the solid 
curve. A crossover in sign of response occurs at about
4-Mev beam energy, so that at higher energies annihilation
takes place more readily with spins parallel
(positive efficiency ε).

However, as was realized by Frankel and co-workers
(Fr57), the low-energy behavior (that is, ε=−1) [...]
photons define a line of annihilation) lies [...]
the line of collision. In practice one chooses [...]
narrow cone in the center-of-mass system that the high
relativistic type events (those tending to make ε=+1)
are partially excluded from being counted. Thus, by a
judicious choice between loss of efficiency for too large a
cone and loss of counting rate for too narrow, a first-order
experiment can be done on rather high-energy
positrons. Frankel et al. used Ga⁶⁶ (energies up to
4 Mev) with the result that these positrons are quite
highly polarized, and right-handed, in line with the
result of Deutsch et al. using another method [Sec. III D
(5)]. In this type of experiment one necessarily runs
into the oblique foil situation as in the Møller experiments
(compare Fig. 9).

FIG. 12. Total cross section for annihilation-in-flight. Efficiency
per polarized electron in target for analysis of longitudinal
positron polarization (solid curve). Efficiency for transverse
polarization analysis (dashed curve). Positive ε means parallel
spins are favored to annihilate. From (P57).

For energies much above 4 Mev, say 40 Mev, one 
would probably be better off to accept all annihilation 
events. 

Transverse polarization.—The analog of the longitudinal
case just discussed is mentioned briefly, for
completeness' sake, by examining the cross section for
total annihilation-in-flight for a transversely magnetized
foil. The efficiency per polarized electron is shown in
Fig. 12 (as the dashed curve) as a function of positron
energy in the laboratory. It is down to 0.5 by about 0.25
Mev and is essentially zero by about 3 Mev. Even
restricting to the line of annihilation parallel to the line
of collision, the efficiency falls off equally fast with
increasing beam energy.

If the line of annihilation is restricted to be at right
angles to the line of collision, the azimuthal dependence
of the photon intensity becomes essentially cos²θ or sin²θ
at high energy, depending on whether the input spins
are parallel or antiparallel. This effect is about 50%
efficient at 3 Mev and 90% efficient at 30 Mev (P57).
Regarding counting rate such an approach would be
expensive for two reasons: restricting the azimuth, and
[...] unfortunately are fewer than the others (the ordinary
Dirac differential annihilation cross section varies with
beam energy so that a factor two is lost at 6 Mev, and a
factor three at 20 Mev).

A second-order (hypothetical) method for transverse 
positron polarization analysis using annihilation-in- 
flight is discussed in context in Sec. III D (5). 


(4) Polarized Bremsstrahlung 


Longitudinal.—A spectacular and useful method for 
analyzing longitudinal polarization of energetic e⁻ and
e⁺ was devised largely intuitively by Goldhaber, Grodzins,
and Sunyar. The method is to extract the longitudinal
particle polarization information from the beam
by means of the usual “forward” bremsstrahlung. This 
affords a natural, convenient, and (by polarization 
standards) copious source of circularly polarized 
photons. The secondary beam of photons is analyzed 
by comparison with atomic electrons (see Appendix 
C on the Compton method for photon helicity 
measurements). 

Figure 13 shows the δ obtained by Goldhaber et al.
(G57) on reversing the (transmission type) Compton
analyzer through which the bremsstrahlung spectrum
from Y⁹⁰ betas was counted. These photons were found
to be highly polarized. The Compton analyzer was here
throwing in no spurious δ since the crossover in sign
of the response is observed to come almost exactly 
where it should according to the Compton formulas, 
at 0.64 Mev. 

Early calculations by McVoy (M57) showed that 
the transfer efficiency P, to photon circular polarization 
P, should be ~0.95 at least for the truly forward 
photons, and near the tip of the bremsstrahlung spec- 
trum. One of McVoy’s curves (M57) implied**** that 
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Fic. 13. Polarized bremsstrahlung from Y% betas. Measured è 
on reversing transmission type Compton analyzer as function Q 
photon energy accepted. From Goldhaber et al. (G57). 


**** See corrected curve (Mc58). The main effect in correcting 


the curve is that as k/(#,—1) decreases from unity the Ge 


efficiency begins to fall off considerably sooner than was at fi 
doueauibtadaypically one should expect some, 
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such high efficiency should persist over a rather wide 
range of fractional photon energy, k/(Z:—1). Later, 
bremsstrahlung computations were made over a range 
of photon angle as well as fractional photon energy by 
Fronsdal and Uberall (F58). Computation on this 
problem has also been done by Claesson (C57) and by 
Bébel (B57). The problem of particle depolarization 
during such slowing as takes place before the brems- 
strahlung act seems not to be an all-important problem, 
but must ultimately be faced. 

On the basis of the first curve of McVoy (M57) some 
precise measurement of longitudinal particle 
polarization has been reported. For example, P³² betas were 
measured somewhat in the nature of a by-product in a 
series of experiments by Boehm and Wapstra (Bo58). 
Here the implied longitudinal polarization was 
−0.97 ± 0.06 of v/c. Until such time as (i) the bremsstrahlung 
method has been seriously checked, experiment versus 
theory, and (ii) the number of polarized electrons/cm³ 
in a given ferromagnetic material under given conditions 
is truly known, polarization measurement by this 
method to absolute precision approaching 5% should 
be considered as somewhat on the optimistic side. As 
matters stand, item (i) cannot be divorced from item 
(ii), since the former is currently based upon the latter. 

Aside from their intrinsic interest, bremsstrahlung 
sources are being widely used as calibrators for 
photon-helicity measuring devices. The bremsstrahlung process 
occupies a strategic position among the various 
polarized sources and measuring devices which eventually 
must come to be well understood. 

Fig. 14. Bremsstrahlung calculations. Theoretical transfer 
efficiency from longitudinal electron polarization to circular 
polarization of photon (lower curves) as function of fractional 
photon energy. Transfer from transverse particle polarization to 
photon helicity (upper curves). Photon emission angle θ is shown. 
Incident electron energy is 2.5 Mev. From Fronsdal and Überall 
(F58), with some details omitted. 

less circular polarization, by perhaps 10 to 15%, unless the 
conditions correspond to the tips of both the beta spectrum and the 
bremsstrahlung spectrum (which is scarcely practical). 

Fig. 15. Bremsstrahlung calculations. See also Fig. 14. Incident 
beam energy is 52 Mev. Photon emission angle in radians given 
by U=10³θ. From (F58). 

Circular polarization of bremsstrahlung was the 
method used in the first successful helicity measurement 
of betas from muon decay. The experiment of Culligan 
et al. (Cu57), which yielded right-handed positrons, was 
the first suggestion that came from a helicity 
measurement that either lepton conservation or the tensor 
interaction in certain beta-decay processes would have 
to be given up. Since then, bremsstrahlung 
measurements on the helicity of both the negative and the 
positive betas from decay of the negative and positive 
muon, respectively, have been performed (Ma58). Both 
signs turn out to be the same as for the corresponding 
beta from nuclear beta decay, and the sign obtained by 
Culligan is thereby confirmed.†††† 

Typical results from the computation by Fronsdal 
and Überall (F58) are given in Figs. 14 and 15.‡‡‡‡ To a 
first approximation, all of the forward cone of 
bremsstrahlung has about the same transfer efficiency (from 
longitudinal particle polarization to photon circular 
polarization) at a given fractional photon energy. 

Transverse polarization.—The transfer efficiency from 
transverse particle polarization to photon helicity has 
been computed by Claesson (C57), by Böbel (B57), and by 
Fronsdal and Überall (F58). The efficiency here is 
considerably different from the longitudinal case, as seen 
from the upper curves in both Figs. 14 and 15. 

Fig. 16. Annihilation-in-flight. Computed transfer of helicity 
from positron to the annihilation quantum of higher energy in the 
laboratory system (upper curve). Right-handed positron yields 
right-handed photon (positive P+). If target, instead of beam, is 
polarized, lower curve applies. From (P57). 


(5) Circular Polarization of Annihilation Photons 


As has been stressed before, there is no helicity 
information in the photons from nonrelativistic annihila- 
tion of positrons (nor is there in the photons which lie 
strictly along the line of collision, at any beam energy). 
However, for annihilation in flight at quite modest 
energies, the photon in the “opposite hemisphere” to 
whichever of the particles is longitudinally polarized 
takes on to some extent the same helicity as that particle 
had. This transfer efficiency is implied by the upper 
curve of Fig. 16, which specifically gives the circular 
polarization of the higher energy member of the pair of 
photons, per unit polarization of the beam. The target 
is assumed to be unpolarized. In actual circumstances, 
depending on geometry and energy discrimination in 
the photon detector, one may effectively accept only 
photons at center-of-mass angle 0<θ<θmax, and usually 
with varying efficiency as a function of θ. What is shown 
here is simply the situation where all photons 0<θ<90°, 
and only those, are presumed to be analyzed (all counted 
and analyzed with equal efficiency). The process here 
is typically relativistic, and thus the higher the beam 
energy the more convenient the method becomes (with 
regard to energy discrimination, and geometrically). 
This may be compared with the experimental situation 
in Sec. III D (3), where a balance had to be maintained 
between competing effects. 

The case where the target and not the beam is 
polarized is also indicated in Fig. 16, but is of no 
particular concern to us here. One sees however how 
the crossover in sign at 4 Mev relates to the similar 
crossover in the solid curve of Fig. 12. 

Deutsch and collaborators (DS57) have used 
annihilation-in-flight in an unpolarized converter (Lucite) 
to measure the helicity of the positrons from Ga⁶⁶ (the 
high-energy group has 4-Mev end point). Their 
apparatus is shown in Fig. 17. A selected portion of the beta 
spectrum is focused onto the converter, the high-energy 
(forward) photons being then analyzed for circular 
polarization in the transmission Compton analyzer 
shown. The computed δ, based on fully polarized 
(right-handed) positrons, compared with the measured δ as a 
function of photon laboratory energy (for given input 
beta energy) implied essentially 100% longitudinal 
polarization. The measured values of δ ranged to 8%. 

This second-order method has been used without 
preselecting the energy of the positrons by Boehm 
et al. (Bo57). 

Transverse polarization.—For the process just 
considered (annihilation-in-flight against an unpolarized 
target) there is generally a nonzero transfer from any 
transverse polarization in the particle beam to circular 
polarization of the annihilation radiation. This is a 
maximum at center-of-mass angle 90°, 
that is when the line of annihilation lies crosswise to the 
line of collision (P58). The magnitude of this efficiency 
generally exceeds the analogous transfer term for the 
bremsstrahlung process. At 3 Mev the transfer efficiency 
is a maximum of 0.55, and it is rather symmetric about 
3 Mev on logarithmic energy scale, falling to half of 
this value at 0.3 Mev or at 60 Mev. This hypothetical 
method suffers from intensity limitations for the same 
two reasons mentioned in Sec. III D (3). 


IV. CONCLUDING REMARKS 
A. On Calculations 


So far, all helicity calculations cited have proceeded 
from theory that is already very well believed, and 
reasonably well checked experimentally. For example 
(to include Compton scattering for completeness), the 
“spin effects” were always there, both theoretically and 
experimentally, in the nonhelical details of the differ- 
ential cross section. This includes loss of definition of 
plane polarization in Compton scattering. The same is 
true of the existence of ground-state positronium and 
the details of its fine structure. The spin terms were 
known to be present in Møller and Bhabha differential 
scattering, in the annihilation-in-flight cross section, and 
in the Bethe-Heitler approximation for the bremsstrah- 
lung process. No really new physics is being used here 
in making use of standard processes for polarization 
measurement. 

Fig. 17. Experiment to measure positron helicity by 
transferring polarization information to annihilation (in flight) photon, 
which is analyzed for helicity in Compton analyzer. From Deutsch 
and collaborators (DS57). 

In order to make practical use of these spin effects, 
lists of matrix elements (transition amplitudes) from 
fully specified state to fully specified state are good 
things to have accessible. Then none of the polarization 
implications in the original theory are averaged out until 
the people devising a particular experiment are ready 
to do so. 

For fast particles, transverse polarization is generally 
less amenable to measurement than is longitudinal 
polarization. Perhaps it is not too early to begin looking 
beyond those polarization effects which are already in 
the extant theory. Under this heading would come an 
enquiry into the possibilities of right-left asymmetry in 
bremsstrahlung intensity from transversely polarized 
particles. Here one would have to go beyond the Bethe- 
Heitler approximation, reminiscent of how, in Coulomb 
scattering without radiation, one has to go beyond the 
first Born approximation in order to find any polariza- 
tion effects other than “reprojection” of spin. 


B. On Experiment 


Of great assistance in succeeding with the experiments 
discussed here has been the fact that all the methods are 
really null methods. In these experiments, a large 
number of parameters such as solid angle, energy 
straggle, backgrounds, are arranged to be the same for 
both “states” of the analyzer. By and large it is 
nowadays as easy to demonstrate spin-helicity effects, 
using the large input polarizations now known to be 
available from beta decay, as it is to demonstrate in a 
nonpolarization experiment some relativistic feature of 
a differential cross section, or to measure an absolute 
cross section, to compare with Dirac theory. The 
reason is that none of these latter experiments can be 
made into a null method. In exploring details of a 
differential cross section there is always some important 
parameter that changes: the energy of a photon, an 
angle of obliquity in a scattering foil, signal-to-noise 
ratio. These things that matter will change always in a 
systematic way with the independent variable of the 
particular experiment. Hopefully, these systematic 
changes are computable, measurable or otherwise 
understood. As for absolute cross-section determinations 
(except perhaps for total Compton cross section), these 
are notorious for having many factors which must be 
controlled. 

With the labor and care necessary for say a 10% 
measurement of differential cross section, one now gets 
typically a 1% measurement on a change in counting 
rate which relates linearly to the polarization being 
measured. The intrinsic reliability and power of the 
balancing or null approach makes this possible. Suppose 
the polarization effect causes a 10% change in counting 
rate; then one would have a measurement of 
polarization good to 10%. For smaller output asymmetries, one 
must try to balance more thoroughly. In any case, it 
needs to be stressed, especially in telling of polarization 
measurement to persons outside the immediate field. 
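
The arithmetic of the preceding paragraph can be put in a few lines. What follows is only an illustrative sketch: the function name is hypothetical, and it assumes (as the text does) that the change in counting rate is strictly linear in the polarization.

```python
def relative_polarization_error(asymmetry, rate_precision):
    """Fractional uncertainty in the polarization.

    asymmetry      -- change in counting rate caused by the polarization,
                      as a fraction of the counting rate (e.g. 0.10)
    rate_precision -- precision with which that change is measured,
                      also as a fraction of the counting rate (e.g. 0.01)

    Assumes the change in rate is linear in the polarization.
    """
    if asymmetry <= 0:
        raise ValueError("asymmetry must be positive")
    return rate_precision / asymmetry


# The case quoted in the text: a 10% effect measured to 1% of the rate.
print(relative_polarization_error(0.10, 0.01))
```

With a smaller asymmetry the same 1% balance gives a proportionally poorer polarization measurement, which is why smaller effects call for more thorough balancing.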


C. On Calibration 


We look forward to the time when a number of 
standard (that is, reproducible) beta and bremsstrah- 
lung sources and analyzers are available. Numbers will 
have to be attached, these numbers will have to be 
cross-checked with each other somehow, and of course 
“theory” and the numbers will have to be compatible. 

Aside from precision “numerical” calibration which 
is the course presently being followed, there is a 
different approach which might be mentioned. Just as 
an unpolarized beam has no polarized part, a fully 
polarized beam has no unpolarized part. This is not 
trivial: It should be recalled that if the problem were 
to demonstrate that a certain beam is unpolarized to 
within an infinitesimal, perhaps 0.5%, then a polariza- 
tion analyzer whose efficiency is known only crudely 
(say within a factor two) might suffice—one performs 
a null experiment. In dealing with degrees of polariza- 
tion P which are much closer to unity than to zero, it 
might be worthwhile to consider possibilities for 
measuring 1−|P|, instead of P itself. Thus, if |P| were 
~0.97, a measurement on 1−|P| good to only 30% is 
equivalent to a measurement on |P| good to one 
percent. The sign of P would of course be known before 
one embarked on this hypothetical precision experiment. 
Admittedly, such experiments would have to be of 
second order and therefore quite difficult. Yet, it 
would be aesthetically pleasing to have proof of, say, 
100% polarization presented in the form of a null 
experiment, to complement the more traditional 
approach which would hinge upon a chain of calibration 
numbers. 
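
The numerical example given above for 1−|P| is easily checked. The sketch below is illustrative only; the function name is an assumption and nothing in it comes from an actual experiment.

```python
def error_on_abs_p(abs_p, frac_error_on_deficit):
    """Fractional error on |P| implied by a fractional error on 1 - |P|."""
    if not 0.0 < abs_p <= 1.0:
        raise ValueError("|P| must lie in (0, 1]")
    deficit = 1.0 - abs_p                       # the quantity actually measured
    abs_error = frac_error_on_deficit * deficit  # absolute error on 1 - |P| (and on |P|)
    return abs_error / abs_p


# |P| ~ 0.97 with 1 - |P| known to 30%, the case quoted in the text.
print(f"{error_on_abs_p(0.97, 0.30):.4f}")
```

The result is a little under 0.01, in line with the "one percent" quoted in the text.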
APPENDIX A. BETA DECAY AND POLARIZATION 
Allowed Fermi Decay 


The connection between neutrino polarization and 
beta polarization can be illustrated in simple sketches 
(Fig. 18). We assume a pure V (vector) interaction, and 
that the particle emitted is called a neutrino and not an 
antineutrino. In sketches (a), (b), and (c) the neutrino 
proceeds along the line θ=0, to our right. Suppose it is 
left-handed. The sketches show (i) the angular 
correlation of the beta momentum p, with respect to neutrino 
momentum q (the envelope drawn is the relative 
probability of beta emission for given angle of emission, 
to be multiplied by d³p); (ii) the polarization vector in 
the rest system of the beta particle, where the magnitude 
of polarization is always unity since in this situation 
no degree of freedom remains. 

Fig. 18. Relations between polarization vectors and momentum 
vectors. The emitted neutrino momentum q defines direction θ=0. 
Length of a given radius vector to the dotted envelope is 
proportional to the probability that the beta momentum p shall be in 
that direction. Spin-polarization vectors are shown by short 
arrows. Beta-polarization arrows (each representing degree of 
polarization unity) generally have meaning only in the rest system 
of the beta particle. Neutrino helicity −1 is used for illustration. 
Sketches apply for pure V or pure A interaction. 

The sketches rest on the fact that the amplitude 
for the beta to be at angle θ and to be right-handed is 
(1+b) cos(θ/2), while the (coherent) amplitude for it to 
be left-handed is (1−b) sin(θ/2), where b=p/(γ+1) 
and γ means total energy of the beta. It follows that the 
beta-neutrino correlation (unpolarized) is 

$$W(\theta) = 1 + \frac{v}{c}\cos\theta, \tag{A1}$$

and the longitudinal polarization of the beta emitted at 
angle θ is 

$$P_{L}(\theta) = \frac{(v/c) + \cos\theta}{1 + (v/c)\cos\theta}. \tag{A2}$$


Beginning again with these amplitudes, the longitudinal 
polarization, independent of θ, is +v/c. We could have 
reversed the roles of beta and neutrino by requiring, 
say, a right-handed beta; then the angular correlation 
of the neutrino with this beta is just (1+cosθ). A 
neutrino of zero rest mass is taken here. 

Expression (A2) shows there is a “partition angle” at 
which the beta polarization is exactly transverse. This 
angle, defined by (v/c)+cosθ=0, moves closer to 180° 
as v/c increases. The drawings show how the longitudinal 
polarization tends to unity in the high-energy limit. 
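
The drift of the partition angle toward 180° can be tabulated directly from its defining condition, (v/c)+cosθ=0. The short sketch below is illustrative only; the function name and the sample values of v/c are assumptions chosen for the purpose.

```python
import math


def partition_angle_deg(beta):
    """Angle between beta and neutrino momenta, in degrees, at which the
    beta polarization is exactly transverse: (v/c) + cos(theta) = 0."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta = v/c must lie in [0, 1]")
    return math.degrees(math.acos(-beta))


# Sample values of v/c, chosen only for illustration.
for beta in (0.0, 0.5, 0.9, 0.99):
    angle = partition_angle_deg(beta)
    print(f"v/c = {beta:4.2f}   partition angle = {angle:6.1f} deg")
```

The angle starts at 90° for a slow beta and approaches 180° as v/c approaches unity, so that in the high-energy limit nearly all directions of emission carry predominantly longitudinal polarization.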

To pass from pure V interaction to pure S (scalar) 
interaction, one may reverse q and the neutrino helicity 
and otherwise keep the same patterns as sketched in 
(a), (b), and (c) of Fig. 18. 

Nowhere in this discussion was anything said about 
nonconservation of parity: We could argue that what 
was plotted or given in the foregoing expressions was 
intended to represent only half of the Einstein-Podolsky 
situation. Then, by supplying the recipe for the other 
helicity of the neutrino, which says to reverse all 
polarization vectors, the picture would be completed. 
[Expression (A1) would be unchanged.] 

It seems that only one-half of the picture occurs in 
nature, but this need have nothing to do with our 
trying to illustrate the connection between the momenta 
and helicities. It does supply the motive and explains 
our choice of longitudinally polarized neutrino in these 
figures. 


Allowed Gamow-Teller Decay 


In the foregoing section, although we pretended to be 
discussing only the pure Fermi case, drawings (a), (b), 
and (c) apply equally well to pure GT, to the ΔM=0 
part at least. For the A (axial-vector) interaction we 
may keep the drawings as they stand; for pure T 
(tensor) interaction we can reverse q and the neutrino 
helicity. To depict the situation for an offgoing 
right-handed neutrino, reverse all polarization vectors. To 
complete the picture for GT, a ΔM=−1 transition is 
sketched as part (d) of Fig. 18, only the high-energy 
limit being shown for illustration. A ΔM=+1 transition 
would be represented by reflecting picture (d) in the 
plane θ=90° (including the polarization arrows). 
Throughout we have been taking the axis of 
quantization for J to be along θ=0. 

As regards the unpolarized angular correlation, we 
now have competing parts: the angular correlation for a 
picture like (b) is 1+(v/c)cosθ, and for a picture like 
(d) is 1−(v/c)cosθ. However, transition (d) occurs twice 
as often as (b), at least for unpolarized initial and 
final nucleus. By not forcing a definite helicity on 
[...] ΔM=0 transitions, as required by spectroscopic 
stability. Or, requiring definite helicity for the neutrino, 
picture (b) represents “half” of a triplet M=0 state 
(of neutrino plus beta). Thus, either way we look at it, 
the angular correlation is the same, 

$$W(\theta) = 1 - \frac{1}{3}\,\frac{v}{c}\cos\theta. \tag{A3}$$
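
The weighting argument that leads to (A3) can be verified numerically. The sketch below is illustrative only and rests on two stated assumptions: that (A3) has the form 1−(1/3)(v/c)cosθ, as the later remark that 1−p·q/3E_eE_ν "is the same as (A3)" requires, and that picture (d) is weighted twice as heavily as picture (b). All function names are hypothetical.

```python
import math


def w_picture_b(beta, theta):
    """Angular correlation for a picture like (b): 1 + (v/c) cos(theta)."""
    return 1.0 + beta * math.cos(theta)


def w_picture_d(beta, theta):
    """Angular correlation for a picture like (d): 1 - (v/c) cos(theta)."""
    return 1.0 - beta * math.cos(theta)


def w_gamow_teller(beta, theta):
    """Weighted mean, with (d) occurring twice as often as (b)."""
    return (w_picture_b(beta, theta) + 2.0 * w_picture_d(beta, theta)) / 3.0


def w_a3(beta, theta):
    """Assumed closed form of (A3): 1 - (1/3)(v/c) cos(theta)."""
    return 1.0 - (beta / 3.0) * math.cos(theta)


# Compare the two forms over a grid of velocities and angles.
for beta in (0.3, 0.8, 1.0):
    for deg in range(0, 181, 30):
        theta = math.radians(deg)
        assert math.isclose(w_gamow_teller(beta, theta), w_a3(beta, theta))
print("weighted mean agrees with 1 - (1/3)(v/c) cos(theta)")
```

The factor 1/3 is thus nothing more than the net result of one part (1+(v/c)cosθ) against two parts (1−(v/c)cosθ).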


Picture (a) and a nonrelativistic version of (d) are 
what apply in a pure GT K-capture event. With respect 
to q, the ΔM imparted to the nucleus is either −1 or 
zero (for a left-handed neutrino emitted), the former 
transition occurring with twice the probability of the 
latter. 

Now if we require, for the first time, a given “ΔM” in 
the laboratory, the neutrino will seldom proceed along 
the axis of quantization, and simple sketches such as 
these are no longer feasible. So many important 
experiments are related to J_z in the laboratory that the 
discussion should be pursued a little further. Inspection 
of equation (1) of Jackson et al. (J57) shows, for pure A 
interaction, say, that the ΔM=0 transition has angular 
correlation (z axis fixed in the laboratory) 


(A4) 


Of course, p and q separately are isotropic in the 
laboratory since one cannot “align” a spin-½ particle but 
can only polarize it. From (A4) the correlation between 
p and q can be readily worked out to be 1−p·q/(3E_eE_ν), 
which is the same as (A3). 

For comparison the ΔM=−1 correlation [again from 
(J57)] is 


(A5) 


and, although p and q are now each tied closely to the 
z axis, the correlation between them is the same as for 
the ΔM=0 case just treated. 

If we had the simple case J, M=1, −1, making a 
transition to J=0, the neutrino (taken to be 
left-handed) would be emitted as 1+cosθ, the beta 
(independently) as 1−(v/c)cosθ, the factor ⅓ being now 
absent since only one possibility for ΔM has been 
allowed. On the other hand, if we ask for the 
polarization of daughter nucleus (let us now go from J=0 to 
J=1 for illustration) having found a beta along z, then 
the ΔM=0 contribution is again in competition with 
|ΔM|=1, and the factor ⅓ often associated with GT 
transitions reappears. The beta–circularly polarized 
gamma correlation is treated in detail by Alder et al. 
(A57) for example. 
APPENDIX B. SHORT COMPENDIUM 
OF HELICITY SIGNS 


The experiment of Wu et al. (Wu57) showed that 
the negative beta particles from polarized Co⁶⁰ nuclei 
come off predominantly opposed to the field H. Thus 
these betas are predominantly opposed to the direction 
of the original nuclear spin, because ground-state Co⁶⁰ 
has a positive g value (W55). This decay has a 
decreasing spin change (5 → 4) and therefore the beta and the 
emitted antineutrino jointly carry off angular 
momentum parallel to the original nuclear spin, unless we 
suppose that conservation of angular momentum is in 
question. Since none of this angular momentum 
carried off is orbital (this is an allowed transition) it has 
to be carried entirely by the spins of the two particles. 
Suppose, for argument, that the original nuclear 
polarization were complete. Then the two particles must 
carry off one unit of angular momentum referred to the 
z axis, the field axis. They are both spin-½ particles and 
so must “aid each other.” The Wu result implies then 
that the betas from Co⁶⁰ have to be predominantly 
left-handed. To summarize, granting the sign of g, the sense 
of the spin sequence here, and granting the conservation 
of angular momentum, the “sign” that emerges from 
the Wu experiment is just the helicity of those betas. 

Soon after the Wu experiment, the left-handed 
helicity of the Co⁶⁰ betas was checked by Frauenfelder 
and collaborators (Fa57) by means of Mott scattering. 
Page and Heinberg found that positrons from Na²² were 
predominantly right-handed (PH57). Simultaneously 
Schopper (S57) showed that Na²² and Co⁶⁰ (both having 
the same sense in the spin sequence) emitted photons of 
opposite handedness, when measured at nominal 180° 
to the line of the emitted beta, the sign for Co⁶⁰ being 
in accord with the Wu experiment. Ga⁶⁶, believed to 
decay via a pure Fermi transition, was shown by 
Deutsch and co-workers (DS57) to give highly polarized 
right-handed positrons. [No attempt is made to record 
many of the early polarization experiments which 
quickly extended the list to include, for example, mixed 
(F and GT) transitions, forbidden transitions, etc.] 

Strictly, none of these experiments shows anything 
about neutrino helicity (that is, whether it is left or 
right), since the neutrino direction is not measured. 
What is measured in the three kinds of experiment just 
mentioned, Wu experiment, polarized beta experiment, 
and beta–circularly polarized gamma experiment, is a 
sense of circulation and coincidentally a beta direction, 
nothing more. 

The helicity of the neutrino emitted in K capture 
(presumably the same particle as the one emitted 
together with a positive beta) was measured directly by 
Goldhaber et al. (G58) and found to be left-handed.§§§§ 
One concedes again conservation of angular momentum 
and certain j values of the levels involved. 

§§§§ The sign and order of magnitude of the neutrino helicity 
for the decay used by Goldhaber (Eu¹⁵²ᵐ → Sm¹⁵²) has now 
been checked at Uppsala (M58). 

To review further would lead into the status of 
beta-decay theory. The algebraic signs relating to a small 
and specific set of decays are given, signs which have 
not been in jeopardy and presumably will stand. This is 
done for the benefit of any person who may have had 
the notion that all had been confusion with regard to 
helicity measurement. 


APPENDIX C. COMPTON ANALYSIS OF 
PHOTON HELICITY 


The use of a Compton analyzer for circularly polar- 
ized radiation rests on a spin-helicity correlation present 
in the matrix element of Klein and Nishina. Details‖‖‖‖ 
of the spin-helicity part of differential or total cross 
section are given in (G53), or see the review article (T). 
The Compton analyzers may be divided into two types 
as indicated in the following. 


Transmission Type 


Here the spin-polarization sensitivity of the total 
Compton cross section is utilized. The analyzing 
efficiency, per polarized “iron” electron, has a faint 
unexplored maximum of 0.06 at about 0.2 Mev, and 
is zero at 0.64 Mev. From this energy to 14 Mev, where 
the efficiency is 0.4, it increases almost as the logarithm 
of E/0.64 Mev, and at very high energy becomes unity. 
Before helicity measurement became of general interest, 
this type was used in second order (that is, the two 
sets of spins in an iron analyzer acted as the polarizer- 
analyzer combination to verify that the spin-helicity 
term could be seen experimentally). A transmission 
analyzer was first used in first order, to measure degree 
(and sense) of polarization in a beam, by Trumpy 
(T57). This was a “nonparity” experiment, as it used 
capture gamma rays produced by prepolarized slow 
neutrons. The first application of this analyzer in a 
“parity” experiment was by Goldhaber et al. (G57), to 
demonstrate and measure the polarization of 
bremsstrahlung by Y⁹⁰ beta rays. Lundby and co-workers have 
used it in beta—circularly polarized gamma correlation 
work (L57). 
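
For rough planning, the statement that the efficiency rises "almost as the logarithm of E/0.64 Mev" between the null and 14 Mev can be turned into a crude interpolation. The sketch below is not a Klein-Nishina calculation: it merely joins the two numbers quoted in the text (zero at 0.64 Mev, 0.4 at 14 Mev) by a straight line in log E, and the names used are hypothetical.

```python
import math

E_NULL_MEV = 0.64  # photon energy at which the efficiency is zero (from the text)
E_REF_MEV = 14.0   # reference energy quoted in the text
EFF_REF = 0.4      # efficiency per polarized electron at E_REF_MEV (from the text)


def transmission_efficiency(e_mev):
    """Crude logarithmic interpolation of the analyzing efficiency.

    Intended only for E_NULL_MEV <= e_mev <= E_REF_MEV; outside this
    range the text gives no simple rule and an error is raised.
    """
    if not E_NULL_MEV <= e_mev <= E_REF_MEV:
        raise ValueError("interpolation is only defined from 0.64 to 14 Mev")
    return EFF_REF * math.log(e_mev / E_NULL_MEV) / math.log(E_REF_MEV / E_NULL_MEV)


# Energies chosen only for illustration.
for e_mev in (0.64, 0.9, 2.0, 5.0, 14.0):
    print(f"E = {e_mev:5.2f} Mev   efficiency ~ {transmission_efficiency(e_mev):.3f}")
```

The interpolation makes plain why work near 1 Mev is difficult with this type of analyzer: the efficiency there is only a small fraction of its value at 14 Mev.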

Owing to the inescapable null in the efficiency curve 
at 0.64 Mev, the transmission analyzer is best suited for 
rather high energy work, but has been used successfully 
on a gamma line as low as 0.9 Mev (G58).¶¶¶¶ Normally, 
a gamma ray is not seriously degraded in energy 
in passing through the iron absorber. Finally, this type 
of analyzer has zero sensitivity to any competing plane 
polarization in the photon beam. 


Differential Type 


‖‖‖‖ For reasons which are not clear, expressions applicable for 
experimental design are occasionally attributed to an older [...] 

¶¶¶¶ Compare Fig. 13. 

In this type one counts, at a certain angle, photons 
scattered by polarized electrons. Some years before 
parity work, the differential analyzer was proven at 
Leyden (W55) on the Ni⁶⁰ gamma rays following the 
beta decay of polarized Co⁶⁰. The method has since 
been revived by Schopper (S57) and by Boehm and 
Wapstra (BW57) for use in beta–circularly polarized 
gamma correlation measurements. Scattering at very 
large angles has recently been employed by Bernardini 
et al. (Be58) to measure the helicity of inner-bremsstrah- 
lung. Unless one is prepared to make absolute scattering 
measurements (which are quite out of the question, 
compare Sec. IV B) a differential analyzer is quite 
sensitive to plane polarization,***** this sensitivity 
being independent of spin direction. Therefore, for a 
given geometry of source, scatterer and photon detector, 
a flat statement of “circular polarization efficiency” in 
terms of fractional change in intensity of scattered 
photons on reversing magnetization cannot strictly be 
given, unless some proviso is made regarding ellipticity 
(or planeness). This feature was at once recognized by 
the Leyden group, but appears to have largely escaped 
mention since. 

Owing to counting rate considerations the scattering 
geometry is often such that any lines in the incident 
gamma spectrum are quite spread in energy as seen by 
the detector. 
Invited Papers from the Conference on Weak 
Interactions Held at Gatlinburg, Tennessee, 
October 27-29, 1958 


Introductory Note 


SOMEWHAT more than two years have passed 
since the realization came to physicists that the 
reflection law known as the conservation of parity 
possesses limits in the generality of its application. A 
paradox in the decay of the 7 mesons had led to some 
daring theoretical speculation, promoting direct labo- 
ratory tests which led in crucial experiments of rare, 
illuminating character to a re-direction of thought 
concerning those interactions which we call “weak.” 
For these physicists, the assumption of conservation of 
parity suddenly lost its sacredness. The result was a 
flurry of action; experiments of new types were started, 
and theoretical channels were explored along new lines. 
Last summer, new observations were being reported in 
abundance, and it occurred to members of the Physics 
Division of the Oak Ridge National Laboratory that a 
meeting of those involved and interested in the subject 
might be of profit. A local conference committee was 
established,* the American Physical Society was 
approached and granted its endorsement and help in 
the welcome form of two additional members for the 
Committee,† and the conference took place at 
Gatlinburg, Tennessee, on October 27, 28, and 29, 1958. 
The conference comprised twelve sessions, in which 
the guidance of the following experienced contributors 
to the field, acting as chairmen, was a vitalizing 
influence: M. Goldhaber, M. Deutsch, M. E. Rose, 
J. M. Robson, E. J. Konopinski, E. C. G. Sudarshan, 
S. B. Treiman, H. L. Anderson, R. W. Thompson, 
R. G. Sachs, and E. P. Wigner. Abstracts of the 
contributed papers have been published in the Bulletin 
of the American Physical Society (January, 1959) and 
their position in the total conference can be gauged by 
their numbers in relation to the summary of sessions 
given below. The substance of the invited papers is set 
forth in the pages that follow. 

ARTHUR H. SNELL 

Chairman 


Summary of the Conference Sessions 


Session A. Parity Experiments. Invited paper by C. S. Wu. 

Session B. Polarization following Beta Decay. Contributed papers. 

Session C. Experimental Beta Decay. Invited paper by J. S. ALLEN; contributed papers. 

Session D. Beta Decay of the Free Neutron; Time Reversal. Invited paper by V. L. 
TELEGDI; contributed papers. 

Session E. Theoretical Beta Decay. Invited paper by R. P. FEYNMAN; contributed papers. 

Session F. Theoretical π and μ Decay and μ Capture. Invited papers by M. L. 
GOLDBERGER and H. PRIMAKOFF. 

Session G. π and μ Decay and μ Capture. Contributed papers. 

Session GA. Recent Observations of π-e Decay (unscheduled session). 

Session H. μ Mesons—Experimental. Invited paper by R. L. GARWIN; contributed papers. 

Session J. Strange Particles—Experimental. Invited paper by J. STEINBERGER; con- 
tributed papers. 

Session K. Strange Particles and Weak Interaction—Theoretical. Invited paper by R. 
H. DALITZ; contributed papers. 

Session L. Weak Interaction—Summary. Invited paper by M. GELL-MANN. 


* Members: J. L. Fowler, A. T. Galonsky, M. E. Rose, A. H. Snell, T. A. Welton, and H. B. Willard. 
† C. S. Wu and E. J. Konopinski. 
REVIEWS OF MODERN PHYSICS 


INTRODUCTION 


THE frontier of parity study has now advanced to 
the field of strange particles. The atmosphere in 
the field of beta decay appears unusually calm and 
quiet after the storm. I will try to piece together the 
jigsaw picture and to see what sorts of puzzles in beta 
decay have fallen into shape. Most urgent of all is the 
question, whether there are still any missing pieces, and 
if there are, what are they? 

In principle, parity experiments are simply experi- 
ments designed to study the screw senses of particles, 
decay processes, or nuclear reactions. This all originated 
from our concept of left and right symmetry; that is, 
left and right are indiscernible.¹ To scientific minds, 
from the time of Leibniz to 1957, there was no inner 
difference, no polarity between left and right. The 
inner structure of space does not permit us, except by 
arbitrary choice, to distinguish a left from a right 
screw. However, a right screw or a right spinning 
particle shows up as a left screw or a left spinning par- 
ticle in a mirror reflection. So from the right-left 
symmetry we arrived at the space-reflection symmetry. 
This symmetry implies that if a particle exists, the one 
obtained by reflecting it in a mirror must also exist. If 
a decay process can take place, the one seen in a mirror 
is also a possible one. The symmetry of space-reflection 
clearly requires that events or particles do not exhibit 
definite screw senses. That is why all parity experiments 
are concerned with the measuring of the screw senses. 
This screw sense is now christened with the elegant 
name of helicity and is defined as σ·p/|p|, the spin 
along p. 
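
The definition σ·p/|p| and its behavior under mirror reflection can be illustrated in a few lines. The sketch below is illustrative only: the function name is hypothetical, vectors are plain 3-tuples, and the reflection is modeled as a full space inversion, under which the momentum reverses while the spin, being an axial vector, does not.

```python
import math


def helicity(spin, momentum):
    """Component of the spin vector along the momentum: sigma . p / |p|."""
    norm = math.sqrt(sum(c * c for c in momentum))
    if norm == 0.0:
        raise ValueError("helicity is undefined for a particle at rest")
    return sum(s * c for s, c in zip(spin, momentum)) / norm


spin = (0.0, 0.0, 1.0)      # unit spin vector along +z
momentum = (0.0, 0.0, 2.0)  # momentum along +z, arbitrary units

# Space inversion: p -> -p, while the spin is left unchanged.
inverted_momentum = tuple(-c for c in momentum)

print(helicity(spin, momentum))           # right-handed screw
print(helicity(spin, inverted_momentum))  # left-handed screw
```

A particle of definite helicity is thus carried into one of opposite helicity by the reflection, which is why a definite screw sense in nature signals a failure of reflection symmetry.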

The first parity experiment² using polarized Co⁶⁰ was 
designed to test whether there are any screw senses in 
beta decay. The essence of the experiment was to line 
up the spins of the Co⁶⁰ nuclei along the same axis and 
then to determine whether the beta particles were 
emitted preferentially in one direction or the other 
along the axis. The results showed that the electrons 
were emitted preferentially in the direction opposite 
to that of nuclear spin and therefore conclusively 
proved that the beta decay of Co⁶⁰ behaves like a 
left-handed screw or possesses a negative helicity. So 
parity is not conserved in beta decay. 


* Work partially supported by U. S. Atomic Energy Com- 
mission. 

1 H. Weyl, Symmetry (Princeton University Press, Princeton,
New Jersey, 1952); E. P. Wigner, Proc. Am. Phil. Soc. 93, 521
(1949); C. N. Yang, Science 127, 565 (1958).

2 Wu, Ambler, Hayward, Hoppes, and Hudson, Phys. Rev. 105,
1413 (1957).

Moreover, the asymmetry observed is as large as
possible. In the electron angular distribution

$$I(\theta) = 1 + A\,\frac{\langle J_z\rangle}{J}\,\frac{v}{c}\cos\theta$$

(where θ is the angle between the nuclear spin and the
electron momentum direction), the measured asymmetry
parameter A is nearly equal to −1. This implies that the
parity interference effects are about as large as they
could be. So this first experiment on parity tells us not
only that parity and charge conjugation are not conserved
in beta decay but points to something even more drastic
and significant.
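The size of the effect implied by this angular distribution can be illustrated with a short numerical sketch. It assumes only the form I(θ) = 1 + A (⟨J_z⟩/J)(v/c) cos θ quoted above; the function names and the sample polarization and velocity are hypothetical illustration values, not data from the experiment.

```python
import math

def intensity(theta, asym_a, polarization, beta):
    """I(theta) = 1 + A * (<J_z>/J) * (v/c) * cos(theta)."""
    return 1.0 + asym_a * polarization * beta * math.cos(theta)

def up_down_asymmetry(asym_a, polarization, beta):
    """(I(0) - I(pi)) / (I(0) + I(pi)), which reduces to A * (<J_z>/J) * (v/c)."""
    up = intensity(0.0, asym_a, polarization, beta)
    down = intensity(math.pi, asym_a, polarization, beta)
    return (up - down) / (up + down)

# Hypothetical inputs: A = -1, nuclear polarization 0.6, electron v/c = 0.6.
print(round(up_down_asymmetry(-1.0, 0.6, 0.6), 3))
```

With A = −1 the counting-rate asymmetry is simply the negative of the product of the nuclear polarization and v/c, so more electrons leave opposite to the nuclear spin.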


TWO-COMPONENT THEORY OF THE NEUTRINO 


When the experimental value of the asymmetry
parameter (A = −1) in Co⁶⁰ beta decay was made
known to Lee and Yang, they immediately realized
that here one had to consider an extremely simple and
appealing theory of the neutrino.³ This theory requires
that the spin of a neutrino always be either parallel
or antiparallel to its momentum and that the helicity of an
antineutrino be opposite to that of a neutrino. Incidentally,
this theory also requires the rigorous masslessness
of the particle. If there were any mass associated
with the particle, the particle could be at rest, or have
its momentum p reversed, in a certain frame of
reference. In that case, it is rather meaningless to
impose the requirement of alignment of spin σ and
momentum p for such a particle. From experimental
evidence, the mass of the neutrino is indeed
vanishingly small. The most sensitive method of estimating
the mass of the neutrino is to investigate the
slope of the upper end of a beta spectrum. The low-energy
beta spectrum of H³ has been investigated for
this purpose by many laboratories.⁴ The upper limit of
the mass of the neutrino m_ν is around 250 eV
(~1/2000 m_e), and the evidence is also not inconsistent
with a mass m_ν equal to zero.
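The quoted ratio of about 1/2000 of the electron mass follows directly from the 250 eV limit. The check below is a minimal sketch; the electron rest energy used is the present-day value and is an assumption not stated in the text.

```python
ELECTRON_REST_ENERGY_EV = 510_999.0  # assumed modern value, not taken from the text
NEUTRINO_MASS_LIMIT_EV = 250.0       # upper limit quoted in the text

def mass_ratio(limit_ev, electron_ev=ELECTRON_REST_ENERGY_EV):
    """Upper limit on m_nu expressed as a fraction of the electron mass."""
    return limit_ev / electron_ev

ratio = mass_ratio(NEUTRINO_MASS_LIMIT_EV)
print(f"m_nu / m_e < 1/{1.0 / ratio:.0f}")
```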

All through the years, theoretical physicists had
entertained the idea of associating such unique properties
with a massless neutrino. It had to be abandoned
lest the law of parity be violated. Clearly, the
two-component theory of the neutrino violates P and C
invariance separately. A left-handed neutrino looks
into a mirror and finds a right-handed neutrino, but
according to the two-component theory of the neutrino
there is no such possible state. The charge conjugate
of a neutrino is an antineutrino; one behaves like a left
spinning particle, the other like a right spinning particle.
This is a clear-cut violation of invariance under charge
conjugation. In 1929, Weyl⁵ proposed the mathematical
possibility of such a particle, but it was discarded
because it violated the law of parity and therefore
could not be a physical reality. Landau⁶ and Salam⁷
reinvestigated this possibility shortly before the
discovery of parity nonconservation. With the removal of
both of these obstacles, interest in this theory was
suddenly revived.

3 T. D. Lee and C. N. Yang, Phys. Rev. 105, 1671 (1957).

4 Curran, Angus, and Cockcroft, Phil. Mag. 40, 36, ... 49);
G. C. Hanna and B. Pontecorvo, Phys. Rev. 75, 98...;
Hamilton, Alford, and Gross, Phys. Rev. 83, 215 (1951);
... M. Langer and R. J. D. Moffat, Phys. Rev. 88, 689 (1952).

So in the first parity experiment of beta decay of 
polarized Co⁶⁰, not only was the screw sense in beta
decay found, but even more dramatic was the realiza- 
tion of the possible existence of the two intrinsic 
opposite screws associated with the neutrino and anti- 
neutrino. The success of the two-component theory of 
the neutrino greatly facilitated our understanding of 
many phenomena in weak interactions. 


LAW OF CONSERVATION OF LEPTONS 


At this point, we should review another important 
conservation law known as the law of conservation of 
leptons, which was suggested by Konopinski and
Mahmoud⁸ to explain the nonoccurrence of certain
decay processes. The law states that if a leptonic
number is assigned to each particle, then the sum of
leptonic numbers must be conserved in all reactions.
The assignments generally agreed upon are

    lepton l = same (say +1)   for e⁻, μ⁻, ν
    l = −1                     for e⁺, μ⁺, ν̄
    l = 0                      for π, γ, K and all heavy particles.


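The ν or ν̄ expected in various decays are shown in the
following equations:

$$n \to p + e^- + \bar\nu; \qquad p \to n + e^+ + \nu \tag{1}$$
$$\pi^+ \to \mu^+ + \nu; \qquad \pi^- \to \mu^- + \bar\nu \tag{2}$$
$$\mu^+ \to e^+ + \nu + \bar\nu; \qquad \mu^- \to e^- + \nu + \bar\nu \tag{3}$$
$$\mu^- + p \to n + \nu; \qquad \mu^+ + n \to p + \bar\nu \tag{4}$$
$$K^+ \to \mu^+ + \nu; \qquad K^- \to \mu^- + \bar\nu \tag{5}$$
$$\pi^+ \to e^+ + \nu; \qquad \pi^- \to e^- + \bar\nu. \tag{6}$$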
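The bookkeeping behind Eqs. (1) to (6) can be checked mechanically. The sketch below assumes the leptonic-number assignments given above (+1 for e⁻, μ⁻, ν; −1 for e⁺, μ⁺, ν̄; 0 otherwise); the particle labels and helper names are hypothetical and chosen only for illustration.

```python
# Leptonic numbers as assigned in the text.
LEPTON_NUMBER = {
    "e-": 1, "mu-": 1, "nu": 1,
    "e+": -1, "mu+": -1, "nubar": -1,
    "pi+": 0, "pi-": 0, "K+": 0, "K-": 0, "gamma": 0, "n": 0, "p": 0,
}

def total_lepton_number(particles):
    """Sum of leptonic numbers over a list of particle labels."""
    return sum(LEPTON_NUMBER[name] for name in particles)

def conserves_leptons(initial, final):
    """True when the leptonic number is the same before and after."""
    return total_lepton_number(initial) == total_lepton_number(final)

# Reactions (1) to (6), written as (initial state, final state).
REACTIONS = {
    "(1) beta-": (["n"], ["p", "e-", "nubar"]),
    "(1) beta+": (["p"], ["n", "e+", "nu"]),
    "(2) pi+": (["pi+"], ["mu+", "nu"]),
    "(2) pi-": (["pi-"], ["mu-", "nubar"]),
    "(3) mu+": (["mu+"], ["e+", "nu", "nubar"]),
    "(3) mu-": (["mu-"], ["e-", "nu", "nubar"]),
    "(4) mu- capture": (["mu-", "p"], ["n", "nu"]),
    "(4) mu+ capture": (["mu+", "n"], ["p", "nubar"]),
    "(5) K+": (["K+"], ["mu+", "nu"]),
    "(5) K-": (["K-"], ["mu-", "nubar"]),
    "(6) pi+ to e": (["pi+"], ["e+", "nu"]),
    "(6) pi- to e": (["pi-"], ["e-", "nubar"]),
}

for label, (initial, final) in REACTIONS.items():
    print(label, conserves_leptons(initial, final))

# A counterexample: beta- decay with a neutrino instead of an antineutrino.
print("forbidden:", conserves_leptons(["n"], ["p", "e-", "nu"]))
```

Every listed reaction balances, while replacing the antineutrino in β⁻ decay by a neutrino changes the total leptonic number by two and is therefore excluded.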


The confirmation of the assignments given by these
decays should constitute a strong proof of the law of
conservation of leptons. In particular, if the helicities of
these two neutrinos are opposite, then the predictions
of these decay processes become unique.

5 The possibility of a two-component relativistic theory of a
particle was first discussed by H. Weyl, Z. Physik 56, 330 (1929).
It was rejected on the ground of parity violation. See ...

6 L. Landau, Nuclear Phys. 3, 127 (1957).

7 A. Salam, Nuovo cimento 5, 299 (1957).

8 E. J. Konopinski and H. M. Mahmoud, Phys. Rev. 92, 1045
(1953); T. D. Lee and C. N. Yang, Phys. Rev. 105, 1671 (1957).

EVIDENCE OF TWO-COMPONENT NEUTRINOS
AND LEPTON CONSERVATION OBTAINED
FROM PARITY EXPERIMENTS


I. From Nuclear Beta Decays 


In the nuclear beta decay as shown in (1), an antineutrino
is expected to be associated with β⁻ decay
and a neutrino with β⁺ decay. Furthermore, if the
two-component theory of the neutrino is valid, then there
are several unique phenomena predicted theoretically. 
The validity of the two-component theory and lepton 
conservation depends largely on the closeness of the 
agreement between the experimental evidence and the 
theoretical predictions. Suppose now we examine each 
phenomenon separately. 


(A) Beta Asymmetry from Polarized Nuclei²


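The beta asymmetry of allowed transitions from
polarized nuclei can be expressed in the following form:

$$
A = \frac{2}{\xi(1+b/W)}\Big\{\pm\lambda_{J'J}\Big[\mathrm{Re}(C_T^{*}C_T'-C_A^{*}C_A')
+\frac{Ze^2}{\hbar cp}\,\mathrm{Im}(C_T^{*}C_A'+C_T^{\prime *}C_A)\Big]|M_{GT}|^2
$$
$$
\qquad+\,\delta_{J'J}\Big(\frac{J}{J+1}\Big)^{1/2}\Big[\mathrm{Re}(C_T^{*}C_S'+C_T^{\prime *}C_S-C_A^{*}C_V'-C_A^{\prime *}C_V)
+\frac{Ze^2}{\hbar cp}\,\mathrm{Im}(C_A^{*}C_S'+C_A^{\prime *}C_S-C_T^{*}C_V'-C_T^{\prime *}C_V)\Big]|M_F||M_{GT}|\Big\} \tag{7}
$$
$$
\xi = (|C_S|^2+|C_V|^2+|C_S'|^2+|C_V'|^2)|M_F|^2+(|C_T|^2+|C_A|^2+|C_T'|^2+|C_A'|^2)|M_{GT}|^2
$$
$$
b\xi = \pm 2\gamma\,\mathrm{Re}\big[(C_SC_V^{*}+C_S'C_V^{\prime *})|M_F|^2+(C_TC_A^{*}+C_T'C_A^{\prime *})|M_{GT}|^2\big],
$$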


where C_i and C_i' are the even and odd beta coupling
constants. This measurement gives unique information
about the pure G-T interaction. Neglecting the imaginary
term for the time being and setting the Fierz term
b/W = 0, one obtains a simple relation for A
for the J → J' = J − 1 transition,

$$A(\beta^{\mp}) = \pm\frac{2\,\mathrm{Re}(C_T^{*}C_T'-C_A^{*}C_A')}{|C_T|^2+|C_A|^2+|C_T'|^2+|C_A'|^2}.$$


Furthermore, for the two-component neutrinos the even
and odd beta coupling constants are related by
C_T = −C_T' (left-handed antineutrino) or C_A = C_A'
(right-handed antineutrino). Therefore, A(β∓) = ∓1 for
the above pure G-T transition. As we all know, the β⁻
asymmetry A of polarized Co⁶⁰, where the spin and
parity change is 5⁺ → 4⁺, is indeed nearly −1 to within
a systematic uncertainty not larger than 20%. This is
the first strong evidence of the possible existence of the
two-component neutrino. On the other hand, what
evidence do we have of the β⁺ asymmetry in a pure
G-T transition of β⁺ decay? Unfortunately, with the
present technique of nuclear alignment, the nuclei
which can be substantially polarized belong to a very
exclusive club. In order to obtain the same type of
information which can be extracted from polarized
nuclei, we have to resort to an indirect method which
is easier to apply although far less sensitive. This
method, the β-γ circular polarization correlation, is
described next.
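Returning to the simplified expression for A in a pure G-T, J → J − 1 transition, the following sketch evaluates it for a few choices of coupling constants. It assumes the relation A(β∓) = ±2 Re(C_T* C_T' − C_A* C_A')/(|C_T|² + |C_A|² + |C_T'|² + |C_A'|²) with the Coulomb and Fierz terms neglected, as in the text; the function name and the sample coupling values are hypothetical.

```python
def asymmetry_pure_gt(c_t, c_t_prime, c_a, c_a_prime, electron=True):
    """Asymmetry parameter A for a pure G-T, J -> J - 1 transition.

    Upper sign (electron=True) is for beta-minus, lower sign for beta-plus.
    """
    c_t, c_t_prime = complex(c_t), complex(c_t_prime)
    c_a, c_a_prime = complex(c_a), complex(c_a_prime)
    numerator = 2.0 * (c_t.conjugate() * c_t_prime - c_a.conjugate() * c_a_prime).real
    denominator = abs(c_t) ** 2 + abs(c_a) ** 2 + abs(c_t_prime) ** 2 + abs(c_a_prime) ** 2
    sign = 1.0 if electron else -1.0
    return sign * numerator / denominator

# Two-component choices named in the text (coupling magnitudes are arbitrary here).
print(asymmetry_pure_gt(0, 0, 1, 1))                   # C_A = C_A', beta-minus
print(asymmetry_pure_gt(1, -1, 0, 0))                  # C_T = -C_T', beta-minus
print(asymmetry_pure_gt(0, 0, 1, 1, electron=False))   # C_A = C_A', beta-plus
print(asymmetry_pure_gt(0, 0, 1, 0))                   # parity-conserving coupling only
```

Either two-component assignment gives the maximal value |A| = 1, whereas a coupling with only the even (or only the odd) constant gives no asymmetry at all.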


(B) β-γ (Circular Polarization) Correlation


It is quite obvious from the observed beta-asymmetry
distribution of polarized nuclei that the β decay should
leave the nucleus partly polarized with respect to the
direction in which the β particle is detected. If a γ ray
follows immediately after the β decay, it should have
circular polarization proportional to the cosine of the
angle between the β particle and γ ray. The circular
polarization of the γ ray can be analyzed with a
cylindrical electromagnet which could be magnetized
to saturation either parallel or antiparallel to the
photon direction. The principle of this analysis is based
on the existence of a spin-dependent part of the Compton
cross section of circularly polarized photons and is
treated in detail by Gunst and Page.⁹ The correlation
for the most frequent decay sequence such as


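$$J \xrightarrow{\ \text{allowed } \beta \text{ transition}\ } J' \xrightarrow{\ 2^{L}\text{-pole } \gamma\ } J'' = J' - L$$

can be expressed by

$$W(\theta) = 1 - A\,\frac{v}{c}\cos\theta,$$

where

$$
A = \frac{1}{L+1}\,\frac{1}{\xi(1+b/W)}\Big\{\mu_{J'J}\Big[\mathrm{Re}(C_T^{*}C_T'-C_A^{*}C_A')
+\frac{Ze^2}{\hbar cp}\,\mathrm{Im}(C_T^{*}C_A'+C_T^{\prime *}C_A)\Big]|M_{GT}|^2
$$
$$
\qquad+\,\delta_{J'J}\Big(\frac{J+1}{J}\Big)^{1/2}\Big[\mathrm{Re}(C_T^{*}C_S'+C_T^{\prime *}C_S-C_A^{*}C_V'-C_A^{\prime *}C_V)
+\frac{Ze^2}{\hbar cp}\,\mathrm{Im}(C_A^{*}C_S'+C_A^{\prime *}C_S-C_T^{*}C_V'-C_T^{\prime *}C_V)\Big]|M_F||M_{GT}|\Big\}. \tag{8}
$$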


Schopper¹⁰ first applied this β-γ (circular) correlation
method to Co⁶⁰ and obtained the parameter A = −1/3.
This is in excellent agreement with the conclusion
derived from the polarized Co⁶⁰ experiment, that is,
C_T = −C_T' or C_A = C_A'. When this method was applied
to Na²², which is a pure G-T positron emitter (3⁺ → 2⁺),
the sign of the circular polarization was found to be
opposite to that of the electron emitter Co⁶⁰, as
theoretically predicted. However, none of these
experiments¹¹ has an accuracy of much better than 20%.


(C) Longitudinal Polarization of β Rays¹²


Since parity is not conserved, the pseudoscalar term
(σ·p_e) formed from the measured spin and momentum
vector of electrons may occur. In other words, the decay
electrons from unpolarized nuclei can be longitudinally
polarized. The general expression of the β∓ polarization
is

$$
P(\beta^{\mp}) = \pm\frac{v}{c}\,
\frac{2\,\mathrm{Re}\big[(C_SC_S^{\prime *}-C_VC_V^{\prime *})|M_F|^2+(C_TC_T^{\prime *}-C_AC_A^{\prime *})|M_{GT}|^2\big]}
{(|C_S|^2+|C_V|^2+|C_S'|^2+|C_V'|^2)|M_F|^2+(|C_T|^2+|C_A|^2+|C_T'|^2+|C_A'|^2)|M_{GT}|^2}. \tag{9}
$$

If we assign C_S = −C_S'; C_T = −C_T' (for left-handed
antineutrino) or C_V = C_V'; C_A = C_A' (for right-handed
antineutrino), then P(β∓) = ∓v/c. First to observe this
polarization was the Frauenfelder group. There are, at
present, three major methods being used for the
determination of electron polarization. They are: (1)
Coulomb scattering from heavy nuclei (Mott scattering);
(2) circular polarization of forward bremsstrahlung
or annihilation radiation; and (3) free
electron-electron scattering (Møller or Bhabha scattering).

9 S. B. Gunst and L. A. Page, Phys. Rev. 92, 970 (1953);
Wheatley, Huiskamp, Diddens, Steenland, and Tolhoek, Physica
2, 841 (1955); W. J. Huiskamp, thesis, Leiden, 1958; and H.
Schopper, Nuclear Instr. 3, 158 ...

10 H. Schopper, Phil. Mag. 2, 710 (1957).

11 F. Boehm and A. N. Wapstra, Phys. Rev. 106, 1364 (1957);
P. Debrunner and W. Kündig, Helv. Phys. Acta 30, 261 ...

12 Proceedings of the Rehovoth Conference on Nuclear St...
(North-Holland Publishing Company, Amsterdam, 1...
(a) Mott scattering: H. A. Tolhoek, Revs. Modern Phys. ...
(1956); Frauenfelder, Bobone, von Goeler, Levine, Le...,
Rossi, and De Pasquali, Phys. Rev. 106, 386 (195...;
... Yeliseyev, Liubimov, and Ershler, ...; Cavanagh, T...,
Gard, and Ridley, Phil. Mag. 2, 1105 (19...;
... O. J. Poppema, Physica 23, 597 ...;
... Lipkin, and Rothem, Phys. Rev. 107, 1...;
... Joliot, Marty, and Sergent, Compt. rend. ...
(b) Circular polarization of bremsstrahlung and annihilation
radiation: K. M. McVoy, Phys. Rev. ...

[Fig. 1, β⁻ decay panel (n → p + β⁻ + ν̄). Fermi interaction,
I(0) → I(0): V: 1 + (v/c) cos(β,ν̄); S: 1 − (v/c) cos(β,ν̄).
G-T interaction, ΔI = 1: T: 1 + ⅓(v/c) cos(β,ν̄);
A: 1 − ⅓(v/c) cos(β,ν̄).]

The results are startling and simple, and all
agree that β⁻ particles emitted in radioactive decay
behave like left-handed screws and all β⁺ particles
behave like right-handed screws. For relativistic energy,
v/c ≈ 1, we have practically completely polarized electron
or positron beams. We have been working with the
polarized beam of β particles for the past sixty years,
yet we were completely unaware of it because of the
faultiness of our left-right symmetry conception. However,
many systematic uncertainties in the measurements,
such as the backscattering effect, depolarization
effect, instrumental asymmetry, and the screen
correction factor, etc., are rather difficult to assess or to
estimate. Probably it is fair to say that in the
high-energy region of v/c > 0.6, the polarization is nearly v/c
with an accuracy of not better than 10%. Below
v/c = 0.6 very few results have been reported and the
picture in that region is not clear.

(D) Correlations between the Helicities of Leptons and
Beta Interactions


The helicity of the neutrino can be correlated to the
electron polarization and the form of beta interaction
as shown in Fig. 1.

[FIG. 1. Correlations of neutrino helicities and β interactions.
β⁺ decay panel (p → n + β⁺ + ν). Fermi interaction, I(0) → I(0):
V: 1 + (v/c) cos(β,ν); S: 1 − (v/c) cos(β,ν).]

In a G-T β interaction, the angular momentum
carried away by the leptons is one unit. In the tensor
interaction, both leptons are emitted preferentially in
the same direction. Since the electron is found to possess
negative helicity (left-handed screw), the antineutrino
must have the same helicity. On the other hand, in the
axial-vector interaction, the electron and the antineutrino
are emitted preferentially in opposite directions,
which implies positive helicity for the antineutrino
and negative helicity for the neutrino. The
helicity of the neutrino observed in the electron
capture process of Eu¹⁵²* is indeed negative and therefore
supports the axial-vector interaction. In a similar
manner, one can figure out the helicity of the antineutrino
for scalar and vector interactions. The relations
between the helicities of various leptons in different
beta interactions are shown in Table I. This table also
points out two more significant conclusions: First, the
much used Fierz interference term between S and V,
expressed as (C_S C_V* + C_S' C_V'*), and between A and T,
expressed as (C_T C_A* + C_T' C_A'*), now automatically
vanishes because the neutrino or antineutrino associated
with S and T has helicity opposite to that associated
with V and A. Therefore, no interference occurs.
Secondly, for the same reason, combinations of (V and T)
and (S and A) will not result in interference between
Fermi and Gamow-Teller terms.

Many beautiful β-γ (circular polarization) correlation
studies¹¹ on the mixed (G-T and Fermi) transitions such
as Sc⁴⁶, Au¹⁹⁸, etc., showed a nearly maximum amount of
interference between G-T and Fermi interactions. This
evidence strongly ruled out a pure VT or SA combination.
Incidentally, this conclusion had been reached also
by Mahmoud and Konopinski¹⁶ by the study of the
first forbidden beta spectra alone in the pre-parity era.

12 (continued) ... (1957); Deutsch, Gittelman, Bauer, Grodzins,
and Sunyar, Phys. Rev. 107, 1733 (1957); and Boehm, Novey,
Barnes, and ..., Phys. Rev. 108, 1497 (1957).
(c) ... scattering: C. Møller, Ann. Physik 14, 531 (1932);
... Phys. Rev. 106, 394 (1957); A. M. Bincer, Phys. Rev. ...;
... W. Ford and C. J. Mullin, Phys. Rev. ...;
...anson, Levine, Rossi, and De ..., 910 (1957); and Benczer-...


TABLE I. Helicities of ν, ν̄, β⁻, and β⁺.
Helicity: σ·p/|p|.

Helicities    ν    ν̄    β⁻    β⁺


(E) Inverse Beta Decays and the Double Beta Decays


There are also three experiments, initiated long 
before the parity crisis, which originally had no bearing 
on the parity question. However, it turns out now that 
they have important implications for the two-component
neutrino and the conservation of leptons. As a
matter of fact, a voice of dissension from these
investigations could cause serious trouble.

1. Capture cross section for the antineutrinos in the
Cowan and Reines experiments,¹⁷ ν̄ + p → n + β⁺. For
two-component neutrinos, the outgoing neutrinos from
the reactor have only one spin state instead of the
usual two. By a detailed balancing argument, the
absorption cross section will be twice as great as the old one.
The latest experimental results¹⁸ are no longer in
disagreement with the two-component theory of the
neutrino. The cross section per fission (assuming 6.1 ν̄
per fission) for the inverse beta decay of the proton is

    σ_exp = (11 ± 4) × 10⁻⁴⁴ cm²,

which is comparable to the theoretical cross section
calculated for the two-component theory of the neutrino
and based on the antineutrino spectrum from the
fission fragments:

    σ_th = 9.5 × 10⁻⁴⁴ cm²   Reines et al.¹⁸
         = 12 × 10⁻⁴⁴ cm²    Muehlhause and Oleksa¹⁸
         = 15 × 10⁻⁴⁴ cm²    King and Perkins.¹⁸

2. Capture process for the neutrino. In the Davis
experiment,¹⁹ ν + Cl³⁷ → A³⁷ + β⁻. Here a neutrino, not
an antineutrino, is required for the absorption process.
The intensive flux of antineutrinos pouring out of a
nuclear power reactor does not provide the right kind
of neutrino for this process, and therefore no A³⁷ activity
should be attributed to the reactor neutrinos. The
very small A³⁷ counts (0.3 ± 3.4 counts per day) which
Davis observed in his one thousand gallons of CCl₄ are
equivalent to a cross section for neutrino capture of
(0.1 ± 0.6) × 10⁻⁴⁵ cm². This could be accounted for by
the muon activation from the cosmic radiation, and no
evidence for a positive effect from the reactor neutrinos
exists at the present time. At least, there is no dissension
about the two-component theory of the neutrino from
the capture process of the neutrino.

17 F. Reines and C. L. Cowan, Phys. Rev. 92, 830 (1953);
Cowan, Reines, Harrison, Kruse, and McGuire, Science 124, No.
3212, 103 (1956).

18 F. Reines et al., Second Atoms for Peace Conference, Geneva,
1958, Paper No. 1026.

19 R. Davis, Phys. Rev. 97, 766 (1955); Bull. Am. Phys. Soc.
Ser. II, 1, 219 (1956); and private communication (1957).

20 (a) H. Primakoff and S. P. Rosen, "Double beta decay,"
Washington University, 1958 (unpublished). (b) Ca⁴⁸: M.
Awschalom, Phys. Rev. 101, 1041 (1956); Dobrokhotov,
Lazarenko, and Luk'yanov, CERN High Energy Conference, 1958.
Nd¹⁵⁰: Cowan, Harrison, Langer, and Reines, Nuovo cimento 3,
649 (1956).

3. Double beta decay.²⁰ The theoretically calculated
rates of double beta decay depend on many assumptions
concerning the properties of the neutrinos and
whether one or both kinds of neutrinos are emitted in
β decay. The existence of the conservation law of
leptons, together with the two-component neutrino,
limits the possible alternatives and demands that the
rate of double beta decay be its minimum value.
Experimentally no single electron line of discrete energy
equal to the sum of the two disintegration energies has
ever been observed. The observed upper limit of the
rate of double beta decay also strongly rejects the short
lifetime based on Majorana neutrinos and favors the
longer lifetime of Dirac neutrinos. For example, the
predicted half-life of double beta decay of Ca⁴⁸ (4.3 ± 0.1
Mev) is 4X 10" yr for no neutrino emissions and 4X 10'8 
yr for Dirac neutrinos. The observed lower limit of the
half-life of Ca⁴⁸ is > 2 × 10¹⁸ yr or 6 × 10¹⁸ yr.²⁰ In the
case of Nd¹⁵⁰ (3.7 ± 0.1 MeV), the observed lower limit of
the half-life is > 2 × 10¹⁸ yr, in comparison with that
predicted for Dirac neutrinos of 2 × 10¹⁸ yr. However, it
is desirable to increase the sensitivity of the detector so
that one can actually observe the Dirac-type double beta
decay.

So, all in all, the experimental evidence in favor of 
two-component neutrinos and the conservation law of 
leptons is overwhelming. Not even a faint voice of 
dissension has been raised. On the other hand, neither 
can a great accuracy be claimed. A greatly improved 
accuracy to strengthen its ultimate validity is much to 
be desired. 


II. From π-μ-e Decays


Here we review briefly how successfully the
two-component theory of the neutrino and the conservation
of leptons can interpret the observed phenomena in
π-μ-e decays. In particular, we show that the observed
helicities of the neutrino and the antineutrino in
π-μ-e decays are the same as those found in nuclear
beta decays.


(A) Polarization of Muons and Electrons 


Consider the decay of a positive pion in its own rest
frame [Fig. 2(a)]. Since π⁺ is spinless, the neutrino
angular momentum (its spin 1/2) about its direction of
motion must be balanced by the muon angular momentum
about its direction of motion. Now the neutrino
in nuclear beta decay is found to possess a negative
helicity, so negative helicity is predicted for the μ⁺. So
far no direct determination of μ⁺ helicity has been
reported.

However, the information may be inferred from the
helicity of the decay electrons within the two-component
theory. In the extreme case [Fig. 2(a)] when the
neutrino and the antineutrino go in the same direction,
the electron which goes in the opposite direction has to
carry away the angular momentum of the muon. If μ⁺
has a negative helicity, the decay e⁺ must possess
positive helicity. The experimental results in
muon decays obtained by measuring the circular
polarization of the bremsstrahlung and annihilation
radiation or by the Møller scattering method
conclusively show positive helicity for e⁺ and negative
helicity for e⁻. Here the results on the helicities of the
neutrinos in nuclear beta decays and from π-μ-e
decays are in excellent accord.


(B) Michel Parameter "ρ"


The shape of the energy spectrum of e⁺ from μ⁺ decay
is characterized by a single parameter "ρ" discussed by
Michel. In the two-component theory of the neutrino,
ρ must equal zero if the electron is accompanied by two
neutrinos or two antineutrinos, and equal 3/4 if one
neutrino and one antineutrino are emitted. The present
experimental values of ρ vary from 0.68 to 0.72, which
is close to 0.75, the value for one neutrino and one
antineutrino emission, and confidently reject the
assumption of two neutrinos or two antineutrinos in muon
decay. The ultimate determination of the value of ρ is
highly desirable because of its significant theoretical
implications.
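How strongly ρ shapes the spectrum can be seen from a small numerical sketch. It assumes the standard Michel form dN/dx ∝ x²[3(1 − x) + (2/3)ρ(4x − 3)], with x the electron energy in units of its maximum and the electron mass neglected; this expression is not written out in the text and is supplied here as an assumption, as are the function names.

```python
def michel_spectrum(x, rho):
    """Normalized Michel spectrum 4 x^2 [3(1 - x) + (2/3) rho (4x - 3)], 0 <= x <= 1."""
    return 4.0 * x ** 2 * (3.0 * (1.0 - x) + (2.0 / 3.0) * rho * (4.0 * x - 3.0))

def integral(rho, steps=2000):
    """Midpoint-rule integral of the spectrum over 0 <= x <= 1."""
    h = 1.0 / steps
    return sum(michel_spectrum((i + 0.5) * h, rho) for i in range(steps)) * h

for rho in (0.0, 0.68, 0.72, 0.75):
    print(rho, round(michel_spectrum(1.0, rho), 3), round(integral(rho), 4))
```

In this form the spectrum at the upper end point is (8/3)ρ, so ρ = 0 makes the intensity vanish there while ρ = 3/4 gives the maximum at the end point, which is why the high-energy part of the spectrum discriminates so clearly between the two cases.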


(C) Energy Dependence of the Asymmetry Coefficient


The two-component theory of the neutrino predicts 
closely the energy dependence of the asymmetry coef- 
ficient of electron distribution in muon decay. In the 
high-energy end of the spectrum, the data are in good 
agreement with the theory. At the low-energy region, 
measurements are difficult and the accuracy has been 
poor. However, as a whole, the energy dependence of 
the asymmetry coefficient in μ⁻ decay agrees with the
prediction of the two-component theory. 


III. Muon Capture 


The process of muon capture has been studied only 
in complex nuclei and very little information is available 
in the literature. At present one side of the Puppi 
triangle, which represents the muon capture process, is 
attracting much attention. Before the conference is
over we will hear several interesting papers²⁴ and
communications²⁵ on this topic.


IV. K-μ-e Decays²⁶


The K-μ-e decay is analogous to π-μ-e decay
[Fig. 2(b)]. If the conservation law of leptons holds,
the same kind of neutrinos are expected in these two
parallel cases. This should, therefore, result in the same
asymmetry distribution of electrons with respect to the
direction of μ motion. The observation of this predicted
asymmetry in K-μ-e decay further strengthens the
validity of lepton conservation.


V. π-e Decays²⁷


The existence of π-e decays is finally confirmed by
several laboratories, and the measured ratio R of π → e + ν
to π → μ + ν decay is very close to that predicted by a
simple calculation which gives R = (π → e)/(π → μ) = 1.36
× 10⁻⁴. It is interesting to note that the helicity of the
electron from π-e decay should be opposite to that
from μ-e decays.

So there is overwhelming evidence in favor of the 
two-component theory of the neutrino and the con- 
servation of leptons in all the beta-decay phenomena. 
To establish firmly the validity of these two assump- 
tions, more precision measurements are required. 


INTERACTIONS RESPONSIBLE FOR BETA DECAY 
I. From Classical Beta-Decay Experiments 


Prior to the discovery of parity nonconservation in
beta decay, the (S,T) combination had been the
favorite choice based mainly on He⁶ (β-ν) angular
correlation results.²⁸ In fact, the (β-ν) angular correlation
was the only means used to investigate the type of beta
interaction in those days. But this type of experiment
was known to be difficult. Wu and Schwarzschild²⁹ made
a detailed examination of the old He⁶ experiment and
pointed out that the effective volume of the He⁶ source
in the hole of the pumping diaphragm was not correctly
taken into account. Had this been done properly, the
results of He⁶ would not then have implied the tensor
interaction. However, in spite of its many limitations,
β-ν correlation is still an effective and powerful method
in yielding information on the beta interactions. The
first sign of warning against the (S,T) combination in
beta interaction came in May of 1957 when
Herrmannsfeldt et al.³⁰ published their (β-ν) correlation
results on A³⁵, which decays mainly through Fermi
interaction, and the results strongly supported the
vector interaction instead of scalar as was once believed.

24 Proceedings of the CERN Conference on High Energy Physics,
1958.

25 Proceedings and Bulletin of Conference on Weak Interactions,
1958, Sessions F and G.

26 Coombes, Cork, Galbraith, Lambertson, and Wenzel, Phys.
Rev. 108, 1348 (1957).

27 T. Fazzini, Fidecaro, Merrison, Paul, and Tollestrup, Phys. ...

Recently, this group has remeasured the (β-ν) correlation
in He⁶ with the same apparatus and obtained
the correlation coefficient λ = −0.39 ± 0.02, which
certainly favors axial vector in He⁶.
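The way a measured correlation coefficient selects an interaction can be sketched as follows. The sketch assumes the allowed-decay correlation 1 + λ(v/c) cos θ with the pure-interaction values λ = −1 (S), +1 (V), +1/3 (T), and −1/3 (A), as indicated in Fig. 1; the function names are hypothetical, and the comparison is a simple distance in units of the quoted error rather than a full statistical analysis.

```python
import math

# Correlation coefficients for pure interactions in allowed decays.
PURE_LAMBDA = {"S": -1.0, "V": 1.0, "T": 1.0 / 3.0, "A": -1.0 / 3.0}

def correlation(theta, lam, beta):
    """Electron-neutrino angular correlation 1 + lambda * (v/c) * cos(theta)."""
    return 1.0 + lam * beta * math.cos(theta)

def deviations(lam_measured, sigma, candidates):
    """Distance of the measurement from each pure value, in units of sigma."""
    return {name: (lam_measured - PURE_LAMBDA[name]) / sigma for name in candidates}

def closest_interaction(lam_measured, candidates):
    """Candidate whose pure value lies nearest the measured coefficient."""
    return min(candidates, key=lambda name: abs(lam_measured - PURE_LAMBDA[name]))

# He6 is a pure G-T transition, so only T and A are candidates.
print(closest_interaction(-0.39, ("T", "A")))
print(deviations(-0.39, 0.02, ("T", "A")))
```

For the He⁶ value quoted above, the measurement lies a few error units from the pure axial-vector value but several tens of error units from the tensor value, which is the sense in which it favors A.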

Meantime the β-ν angular correlation in

$$\mathrm{Li}^8 \xrightarrow{\ \beta\ } \mathrm{Be}^{8*} \to 2\alpha$$

was reinvestigated with greatly improved technique at
California Institute of Technology and at Heidelberg
University. They measured either the angular distribution
of the two α particles or the α spectra in coincidence
with the β particles emitted at either 180° or
90° from the α direction. The results from Heidelberg
strongly favor the axial vector, and the California
Institute of Technology results are accurate enough to
put the upper limit of tensor mixture in G-T interaction
at less than 10%.

Thus the (V,A) combination is strongly favored
from (β-ν) correlation experiments. Let us now look
into the information on beta interactions obtained from 
parity experiments. 


II. From Parity Experiments 
(A) Electron Capture Process in Eu¹⁵²*


In an electron capture process, a neutrino and the
recoil nucleus are emitted in opposite directions,
e⁻ + p → n + ν. If the capture process is followed by the
emission of a gamma ray and the spin and parity
changes are favorable as shown in the following decay
process

$$A(0) \xrightarrow{\ e \text{ capture}\ } B^{*}(1) \xrightarrow{\ \gamma \text{ ray}\ } B(0)$$

then, by applying the conservation laws of momentum
and angular momentum, one can deduce a simple
correlation that the helicity of the downward gamma ray
will be the same as that of the upward neutrino. So the
problem of determining the neutrino helicity becomes
that of measuring the circular polarization of the gamma
ray. However, to select only those downward gamma
rays following the emission of the upward neutrinos,
many conditions must be fulfilled. First, the gamma
ray must have an energy comparable to that of the
neutrino, and the lifetime of the excited level B* must
be very short (~10⁻¹⁴ sec) in order to permit the use
of solid material. Even then one has to detect only the
resonantly scattered gamma ray.

28 B. M. Rustad and S. Ruby, Phys. Rev. 97, 991 (1955).

29 C. S. Wu and A. Schwarzschild, Columbia University Report
CU-173.

30 Herrmannsfeldt, Maxson, Stähelin, and Allen, Phys. Rev. 107, ...

The requirements were indeed strict, but the
radioisotope Eu¹⁵²* seemed heaven-sent to do this job.
Goldhaber, Grodzins, and Sunyar knew of this
radioisotope Eu¹⁵²* from their previous investigations, and it
fulfills all the requirements stated above. By measuring
the circular polarization of the gamma rays from Eu¹⁵²*
which are resonantly scattered by Sm, they found that
the helicity of the gamma ray is negative (H = −0.67
± 0.10). From this result one concludes that the
helicity of the neutrino in electron capture is negative
and therefore the Gamow-Teller interaction in electron
capture is dominantly "A" and not "T."


(B) Beta Decay of Polarized Neutrons 


Meanwhile, at the Argonne National Laboratory, a
highly polarized neutron beam had been successfully
completed in 1957. Burgy et al. had been measuring
two asymmetry coefficients from the beta decays of
polarized neutrons. One is the coefficient "A," the β⁻
distribution asymmetry (⟨J⟩·p_e); the other is the
coefficient "B," the (ν̄) distribution asymmetry (⟨J⟩·p_ν̄).
From the values A = −0.11 ± 0.02 and B = 0.88 ± 0.15,
they concluded that the interaction in β decay is
dominantly V and A with opposite phase relations
(V−A).
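The role of the relative phase can be made concrete with a short calculation. The sketch below assumes the standard allowed-decay expressions for the neutron, A = −2(λ² + λ)/(1 + 3λ²) and B = 2(λ² − λ)/(1 + 3λ²), with real couplings, only V and A present, and λ the ratio of the G-T to Fermi coupling constants taken negative for the opposite-phase (V−A) case; these formulas and the function name are assumptions not written out in the text.

```python
import math

def neutron_correlations(ratio_squared, opposite_phase=True):
    """Return (A, B) for polarized-neutron decay given |C_GT / C_F|^2.

    opposite_phase=True corresponds to V-A (lambda negative in this convention),
    opposite_phase=False to V+A.
    """
    lam = math.sqrt(ratio_squared)
    if opposite_phase:
        lam = -lam
    denom = 1.0 + 3.0 * lam ** 2
    coeff_a = -2.0 * (lam ** 2 + lam) / denom
    coeff_b = 2.0 * (lam ** 2 - lam) / denom
    return coeff_a, coeff_b

print([round(v, 3) for v in neutron_correlations(1.56, opposite_phase=True)])
print([round(v, 3) for v in neutron_correlations(1.56, opposite_phase=False)])
```

On these assumptions a squared coupling ratio of 1.56 gives A close to −0.11 and B close to +1 for opposite phases, whereas equal phases would interchange the magnitudes, giving a large electron asymmetry and a small antineutrino asymmetry; the measured pair of values therefore points to V−A.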

Look back at the history of the theory of beta decay.
It has been filled with surprises and excitement. Now,
after a period of nearly sixty years of continuous
investigation, finally along comes the nonconservation
of parity. These two (classical beta theory and
nonconservation of parity) joined forces in ...
conclusion on the beta interaction.

However, we must be cautioned ...
there is any small mixture of ...


III. Universal Fermi Interaction


The great similarity in the strength of the coupling
constants in beta decay, μ decay, and μ capture suggests
that the interaction forms of the three processes
may also be the same. When beta-decay coupling was
concluded to be (S,T) and μ-decay coupling was
deduced to be dominantly V and A from the negative
sign of the asymmetry coefficient, this possibility of a
universal Fermi interaction was naturally ruled out.
Now that the interaction (V,A) has replaced (S,T) in
beta decay, the situation is quite altered. This is
particularly intriguing because this specific form of (V and
A) has been prophesied independently by Marshak and
Sudarshan³⁶ and also by Feynman and Gell-Mann³⁷
by different deductions. The interactions which are
responsible for the muon capture process are now under
intensive investigation.


CONCLUSION 


We have come a long way since the overthrow of the 
law of parity in weak interactions. A great advance has 
been made in the theory of the neutrino. We have
two-component neutrinos and the law of conservation of
leptons. A vast amount of experimental data is
accounted for by the V−A theory, and therefore the
idea of the universal (V−A) Fermi interaction is
again made acceptable.

36 G. Sudarshan and R. Marshak, Padua-Venice International
Conference, 1957; Phys. Rev. 109, 1860 (1958).

37 R. Feynman and M. Gell-Mann, Phys. Rev. 109, 193 (1958).

However, there are still many questions unanswered. 
Is there any connection between the beta interactions
and the gravitational forces? Why should the ratio of
the coupling constants between the Gamow-Teller and Fermi
interactions, |C_GT/C_F|², be what was observed? Are there
any possible theoretical explanations for this? The
ratio |C_GT/C_F|² calculated from the recent neutron
half-life³⁸ of (11.7±0.3) min and the O¹⁴ ft value³⁹ is
1.42±0.08. The ratio |C_GT/C_F|² from the asymmetry
distribution of electrons from polarized neutrons
is 1.56±0.14.⁴⁰ On the other hand, the ratio
|C_GT/C_F|² obtained from the analysis of the ft values
of beta decays of mirror nuclei by the “B-X” diagram
is only 1.16±0.05. The last method relies largely on
the evaluation of the matrix elements. Would this
apparent discrepancy eventually show us a way to
improve our evaluation of the matrix elements? We are
all eagerly anticipating hearing many interesting dis- 
cussions on these subjects at this conference. 
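The size of the "apparent discrepancy" among the three quoted values of |C_GT/C_F|² can be made concrete with a short sketch. It assumes, purely for illustration, that the quoted errors are independent Gaussian standard deviations, which the text does not state; the function name `tension_in_sigma` is hypothetical.

```python
import math
from itertools import combinations

# Values of |C_GT/C_F|^2 quoted in the text: (central value, quoted error).
# Assumption: the errors are treated as independent 1-sigma uncertainties.
RATIOS = {
    "neutron half-life and O14 ft value": (1.42, 0.08),
    "polarized-neutron asymmetry": (1.56, 0.14),
    "mirror-nuclei ft values": (1.16, 0.05),
}


def tension_in_sigma(a, sigma_a, b, sigma_b):
    """Difference of two measurements in units of their combined error."""
    return abs(a - b) / math.hypot(sigma_a, sigma_b)


if __name__ == "__main__":
    for (name_a, (a, sa)), (name_b, (b, sb)) in combinations(RATIOS.items(), 2):
        print(f"{name_a} vs {name_b}: {tension_in_sigma(a, sa, b, sb):.2f} sigma")
```

Under these assumptions the two neutron-based values agree within one combined standard deviation, while each differs from the mirror-nuclei value by somewhat less than three.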


38 Sosnovskij, Spivak, Prokofiev, Kutikov, and Dobrynin, Pro- 
ceedings of the CERN Conference on High Energy Physics, 1958. 

39 J. B. Gerhart, Phys. Rev. 109, 897 (1958).

40 O. C. Kistner and B. M. Rustad, Bulletin of Conference on
Weak Interactions, 1958, Paper ES.
Determination of the Beta-Decay Interaction from
Electron-Neutrino Angular Correlation 
Experiments 
James S. ALLEN 


Department of Physics, University of Illinois, Urbana, Illinois 


INTRODUCTION 


EXPERIMENTAL determination of the specific
form of the interactions responsible for nuclear
beta decay has remained a critical problem ever since
the original formulation of the theory by Fermi.¹
Bloch and Møller² probably were the first to point out
that the electron-neutrino angular correlation in
nuclear beta decay depends on the type of light-
particle-neutron interaction assumed in the Fermi
theory. According to present ideas, a linear combination
of five relativistically invariant expressions can be
chosen for the interaction Hamiltonian. The individual
interactions are the scalar, vector, tensor, axial-vector,
and pseudoscalar interactions, denoted respectively
S, V, T, A, and P. In the case of allowed transitions,
the pseudoscalar interaction usually is small compared
to the other forms and can be omitted.

During the past ten years we have been conducting 
a series of electron-neutrino angular correlation experi- 
ments on allowed beta decays in an attempt to obtain 
an unambiguous determination of the form of the beta- 
decay interaction. In general, these experiments have 
tended toward a procedure which requires only a direct 
measurement of the shape of the energy spectrum of the 
recoiling nuclei. This method has eliminated the diffi- 
culties inherent in the experiments which require an 
explicit measurement of either the angle between the 
electron and recoiling nucleus or the energy of the 
electron. Our present method does not require the
detection of the recoiling nucleus and the decay electron
in coincidence and, consequently, the counting rate is
not limited by a decreasing true-to-chance ratio.

This program finally has been completed with a
series of measurements of the angular correlations in
the allowed beta decay of He⁶, Ne¹⁹, Ne²³, and A³⁵.
Since the entire set of measurements was carried out
in one spectrometer, the experimental results may be
compared directly. The results prove that the dominant
interactions are vector for the Fermi type of decay and
axial-vector for the Gamow-Teller decays.


THEORY 


In the usual theory of allowed beta decay the transi- 
tion probability for the emission of an electron-neutrino 
pair is 

1 E. Fermi, Z. Physik 88, 161 (1934).

2 F. Bloch and C. Møller, Nature 136, 912 (1935).
where W and p, respectively, are the total energy and
momentum of the electron. The angular correlation 
coefficient, expressed in terms of the coupling constants 
and nuclear matrix elements, is 


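\[
\lambda\xi = \left(|C_V|^2+|C_V'|^2-|C_S|^2-|C_S'|^2\right)\left|\int 1\right|^2
 + \tfrac{1}{3}\left(|C_T|^2+|C_T'|^2-|C_A|^2-|C_A'|^2\right)\left|\int \boldsymbol{\sigma}\right|^2, \qquad (2)
\]


where


\[
\xi = \left(|C_V|^2+|C_V'|^2+|C_S|^2+|C_S'|^2\right)\left|\int 1\right|^2
 + \left(|C_T|^2+|C_T'|^2+|C_A|^2+|C_A'|^2\right)\left|\int \boldsymbol{\sigma}\right|^2. \qquad (3)
\]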


There is also a small additional term in λ which vanishes
if time-reversal invariance holds. This term also is identically
zero with the current choice of coupling constants,
C_{V,A} = +C'_{V,A} and C_{T,S} = −C'_{T,S}. It is interesting to
note that the Fierz interference term b/W also is
identically zero with this choice of coupling constants.

It is readily seen from (2) that λ assumes the values
+1, −1, +1/3, and −1/3 for the pure V, S, T, and A
interactions, respectively. Also, λ does not contain
terms due to interference between the parity conserving
and parity nonconserving interactions. Therefore,
electron-neutrino angular correlation experiments of the
type discussed here are not able to separate the primed
and unprimed coupling constants. In other words, these
experiments cannot show the degree of mixing of the
parity conserving and nonconserving interactions.
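The limiting values quoted above follow directly from Eqs. (2) and (3), and a short sketch makes the bookkeeping explicit. The function name `correlation_coefficient` and the dictionary layout are illustrative choices, not part of the original paper; the squared Fermi and Gamow-Teller matrix elements are passed in as plain numbers.

```python
from fractions import Fraction


def correlation_coefficient(couplings, fermi_me2, gt_me2):
    """Angular correlation coefficient lambda from Eqs. (2) and (3).

    couplings maps 'S', 'V', 'T', 'A' to (C, C') pairs; missing keys are zero.
    fermi_me2 and gt_me2 are the squared Fermi and Gamow-Teller matrix elements.
    """
    def norm2(key):
        c, c_primed = couplings.get(key, (0, 0))
        return abs(c) ** 2 + abs(c_primed) ** 2

    # Numerator of Eq. (2): V and T enter with +, S and A with -.
    numerator = ((norm2("V") - norm2("S")) * fermi_me2
                 + Fraction(1, 3) * (norm2("T") - norm2("A")) * gt_me2)
    # Eq. (3): every coupling contributes positively to xi.
    xi = ((norm2("V") + norm2("S")) * fermi_me2
          + (norm2("T") + norm2("A")) * gt_me2)
    if xi == 0:
        raise ValueError("all couplings or matrix elements vanish")
    return numerator / xi


if __name__ == "__main__":
    # Pure interactions: Fermi transitions for V and S, Gamow-Teller for T and A.
    for name, fermi_me2, gt_me2 in [("V", 1, 0), ("S", 1, 0), ("T", 0, 1), ("A", 0, 1)]:
        print(name, correlation_coefficient({name: (1, 1)}, fermi_me2, gt_me2))
```

Because only the sums |C|² + |C'|² appear, replacing the pair (1, 1) by (1, 0) or (0, 1) leaves λ unchanged, which is the statement that these experiments cannot separate the primed and unprimed constants.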

The energy distribution function for the decay
electrons is obtained by integrating (1) over
all values of θ_eν. Since the term in cos θ_eν averages
to zero over all directions, its integral is zero and
consequently the shape of the energy spectrum does not
depend on λ. However, when the distribution in energy
of the recoiling nuclei is measured, the integration is
over values of θ_eν consistent with conservation of energy
and linear momentum in the emission of the electron-neutrino
pair. In this case the term containing λ does not vanish, and
therefore the shape of the energy spectrum depends on
the form of the angular correlation.

In our most recent experiments we have measured
the shape of the energy spectra of the recoiling nuclei
and have deduced values of λ from comparisons of the
experimental data with the theoretical distribution
functions. The experimentally determined angular
correlations have established the form of the dominant
beta-decay interaction.


EARLY RECOIL EXPERIMENTS 
The beta decay of He⁶ can be represented by

He⁶ → Li⁶ + e⁻ + ν̄, log ft = 2.91, (4)


with a spin change of J=0 to J=1, no change of
nuclear parity. The maximum kinetic energy of the
recoiling Li⁶ ion is 1405 ev. The beta-decay process
produces a singly charged positive ion. However, in
about 10% of the decays, doubly charged ions are
present as a result of ionization caused by the sudden
change in the nuclear charge. The small value of ft
indicates that the He⁶ transition is allowed, and the
spin change shows that this is a pure Gamow-Teller
transition. The angular correlation factor will be either
+1/3 for the tensor or −1/3 for the axial-vector interaction.

FIG. 1. The apparatus used for the first He⁶ neutrino experiment.
The recoil ions were counted by the electron multiplier and the
electrons by a Geiger counter at window A or B. The energy spectrum of the recoil ions was measured by the retarding potential method.

FIG. 2. Experimental and theoretical retarding potential curves
for the recoil ions from the decay of He⁶. The angle between the
[...] the electrons and the recoil ions was [...]
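The quoted maximum recoil energy can be checked with elementary kinematics: the recoil momentum is largest when the electron carries the full decay energy and the neutrino momentum vanishes, so that E_max = (W₀² − mₑ²)/2M for a nonrelativistic recoil. The sketch below is only an order-of-magnitude check; the end-point energy (3.508 MeV) and the Li⁶ mass (6.015 u) are assumed values that do not appear in the text.

```python
ELECTRON_MASS_MEV = 0.510999  # electron rest energy
AMU_MEV = 931.494             # energy equivalent of one atomic mass unit


def max_recoil_energy_ev(endpoint_kinetic_mev, recoil_mass_amu):
    """Maximum recoil kinetic energy in ev for an allowed beta decay.

    The recoil momentum equals the electron momentum at the end point;
    the recoil itself is treated nonrelativistically.
    """
    total_energy = endpoint_kinetic_mev + ELECTRON_MASS_MEV
    momentum_squared = total_energy ** 2 - ELECTRON_MASS_MEV ** 2  # in MeV^2
    recoil_mev = momentum_squared / (2.0 * recoil_mass_amu * AMU_MEV)
    return 1.0e6 * recoil_mev


if __name__ == "__main__":
    # Assumed inputs for He6 -> Li6: end point 3.508 MeV, recoil mass 6.015 u.
    print(f"{max_recoil_energy_ev(3.508, 6.015):.0f} ev")
```

With these assumed inputs the result is about 1.4 kev, within roughly one percent of the 1405 ev quoted above; the residual difference reflects the end-point energy adopted.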

The first He⁶ experiment was carried out by Allen,
Paneth, and Morrish.³ An attempt was made to obtain
the recoil energy spectrum of the ions in coincidence
with electrons emitted at angles of 135° and 162°
with respect to the direction of the ions. A schematic
of the apparatus is shown in Fig. 1. A retarding potential
method was used for the analysis of the kinetic energy
of the ions. The retarding voltage was applied to grid 4
located between the grounded grids 3 and 5.

The experimental data obtained with the Geiger
counter at position A are shown in Fig. 2. Although
the statistical accuracy is poor, most of the experimental
points lie below the λ=0 curve and therefore suggest the
axial-vector interaction.

The next experiment in our series represented an
attempt to study the He⁶ decay using a scintillation
counter instead of a Geiger counter. A schematic
diagram of the apparatus is shown in Fig. 3. The
effective source volume was cylindrical in shape and
was defined at one end by the scintillator and at the
other end by the sliding gate A. This gate had an open
and also a closed aperture. An aluminum foil covering
the closed aperture was thick enough to stop the recoil
ions, but also sufficiently thin to transmit the decay
electrons without appreciable scattering or loss of
energy. The difference between the counting rates
observed with the gate open and closed represents the
net rate produced by disintegrations occurring in the


3 Allen, Paneth, and Morrish, Phys. Rev. 75, 570 (1949). 
FIG. 3. The apparatus used for the second He⁶ neutrino experiment. The effective source volume was cylindrical in shape and was
defined at one end by the curved surface of the stilbene crystal and at the other end by the gate A. The velocity spectrum of the recoil
ions was measured by the time-of-flight method. The sliding gate B supported an alpha-particle source used for testing the two detectors.


region between the gate and the scintillator. The
experimentally determined time-of-flight spectrum is
represented by the histogram of Fig. 4. In contrast to
the results of our first He⁶ experiment, the new data
appeared to agree with the tensor rather than the axial-
vector curve.

The results of the experiment of Rustad and Ruby⁴
became available before the completion of our second
He⁶ experiment. These authors measured the electron-
neutrino angular correlation in the decay of He⁶ and
their results clearly indicated that the interaction was
tensor. Since the statistical accuracy of this experiment
was much higher than that obtainable with our method,
we decided to give up our work on He⁶ and to
concentrate our efforts on other allowed decays.


EXPERIMENTS WITH A SINGLE ELECTROSTATIC 
SPECTROMETER 


A recoil apparatus with improved energy resolution
and transmission designed by Maxson and Allen⁵ is
shown in Fig. 5. The disintegration electrons were
detected in a hemispherical plastic scintillator and a
spherical electrostatic spectrometer was used to
measure the energy of the recoils. The front surface
of the source volume was defined, as in our previous
experiments, by means of a sliding gate. This single
spectrometer was used in our studies of Ne¹⁹ and A³⁵.


4 B. M. Rustad and S. L. Ruby, Phys. Rev. 97, 991 (1955).
5 Maxson, Allen, and Jentschke, Phys. Rev. 97, 109 (1955).


Ne¹⁹ is a positron emitter which decays according
to the scheme

Ne¹⁹ → F¹⁹ + e⁺ + ν, log ft = 3.27, (5)

with a spin change of J=1/2 to J=1/2, no change in
nuclear parity. The small ft value indicates that this is
an allowed transition. According to the selection rules
for allowed transitions, this is a mixed decay with both
Fermi and Gamow-Teller interactions present. The
angular correlation coefficient computed from ft values
is approximately −0.04 for the ST and +0.04 for the
VA combination of interactions. The observed energy
spectrum and several theoretical curves are shown in
Fig. 6. A least squares fit of the theoretical curves to the
experimental data yielded λ = −0.21±0.08, where the
indicated error is statistical. This correlation was
somewhat more negative than the value expected for
the ST or VA interactions.

FIG. 4. The time-of-flight spectrum of the recoil ions from the
decay of He⁶. The angle between the recoil
ions and the electrons varied from about 100° to 180°.

FIG. 5. The experimental arrangement used to study the decay
of Ne¹⁹ and A³⁵. The effective source volume is marked in the figure. A spherical
electrostatic spectrometer was used to measure the energy of the
recoils.
A³⁵ is a positron emitter which decays according to
the scheme

A³⁵ → Cl³⁵ + e⁺ + ν, log ft = 3.79. (6)

[...] nuclei is J=3/2 in each case. Although this
[...] transition, the ratio of the intensity of the
Gamow-Teller part of the transition to that of the Fermi
part is only 0.05. The angular correlation coefficient
[...] experiments was λ = +0.85±0.12. This value
indicated that the vector invariant is the [...]
of the Fermi interaction in this particular [...]

[...] together with those from the neutron
[...] apparently demonstrated that
[...] than VT, a
[...] experiment with a [...]
EXPERIMENTS WITH A DOUBLE SPECTROMETER


A second A³⁵ experiment was performed by
Herrmannsfeldt et al.⁶ The energy distribution of the
recoiling ions was measured directly without a
simultaneous measurement of either the direction of motion
or the energy of the associated electrons. The results
were essentially the same as those from our first A³⁵
experiment.

A schematic of the apparatus used in the second
investigation of A³⁵ is shown in Fig. 7. The two
spectrometers were placed in series in order to provide space
for differential pumping. The effective source volume 
for the first spectrometer was inside the cone at the 
left of the figure. The outer part of the source volume 
was defined by the metal wall of the cone and the inner 
part by the acceptance angle of the first spectrometer. 
Despite the use of differential pumping, the back- 
ground counting rate was about ten times the maximum 
true rate. In order to measure this background, a set 
of grids was placed in front of the exit aperture of the 
source volume. A retarding potential could be placed 
on the second grid to shut off the beam of ions coming 
from the source volume. The net effect from the source 
was given by the difference between the counting rates 
observed with the retarding potential on and off. The 
intensity of the source was monitored by a scintillation 
counter viewing part of the source volume. 

The experimental points and the theoretical recoil
energy distributions for the decay of A³⁵ are shown in
Fig. 8. The experimental points correspond to an
angular correlation factor of λ = +0.97±0.14, which is
in excellent agreement with the value of +0.98±0.08
predicted for the AV interaction or the value of
+0.99±0.04 expected for the TV interaction. In each
case the predicted values of the angular correlation
coefficient have been deduced from known ft values.

Since the results of our second A³⁵ experiment
confirmed those of the first, this was strong evidence for
the VA interaction. Additional evidence for the axial-
vector form appeared shortly after the completion of
our second experiment. The most convincing evidence
was supplied by Goldhaber et al.,⁷ who measured the
helicity of the neutrino and found it to be left-handed.
Their experiment clearly indicates that the Gamow-
Teller interaction is axial-vector when it is
supplemented by the information obtained from
measurements of the longitudinal polarization of decay electrons.
These developments suggested that a repetition of the
He⁶ experiment would be highly advisable.

FIG. 6. The spectrum of the F¹⁹ recoil nuclei from the Ne¹⁹ decay.
The angular correlation parameter is −0.21±0.08.

FIG. 7. A schematic of the double spectrometer used in the study of He⁶, Ne¹⁹, Ne²³, and A³⁵. The effective source volume is inside the
cone at the left of the diagram. A retarding potential could be applied to a grid in front of the exit aperture of the source volume in
order to separate the background from the true effect.

A new He⁶ experiment was carried out by
Herrmannsfeldt et al.⁸ using the double spectrometer of
Fig. 7. The He⁶ activity was produced through the
Be⁹(n,α)He⁶ reaction with fast neutrons inside the
CP-5 reactor at the Argonne National Laboratory.
The data from our latest He⁶ experiment are shown in
Fig. 9. The energy spectrum below one-half the
maximum energy is distorted by the presence of doubly
charged ions and cannot be used for a determination
of λ. The angular correlation deduced from the data
above 700 ev was λ = −0.39±0.05, corresponding to the
axial-vector interaction.

This series of angular correlation experiments has
been completed with a measurement of λ in the decay
of Ne²³. This decay is complex, with a 67% pure Gamow-
Teller transition to the ground state of Na²³ with
ΔJ=1, no change of nuclear parity, and a 32% transition
to an excited state of Na²³ with a probable spin change
of ΔJ=0. Although this second transition is mixed,
there are theoretical reasons for believing that the
Fermi matrix element is very small. If this decay
scheme is correct, the expected recoil energy spectrum
should be a composite of two Gamow-Teller type
spectra. Very weak transitions to higher levels of Na²³
can be neglected in the interpretation of this experiment.
The measured spectrum is shown in Fig. 10. The
angular correlation obtained from the upper one-half
of the energy spectrum is λ = −0.37±0.04, where the
indicated error is statistical.


7 Goldhaber, Grodzins, and Sunyar, Phys. Rev. 109, 1015
(1958).

8 Herrmannsfeldt, Burman, Stahelin, Allen, and Braid, Phys. [...]

FIG. 9. The experimental data and predicted recoil energy
distributions for the decay of He⁶. The sharp rise in the spectrum
below 700 ev is caused by the presence of doubly charged Li⁶
recoil ions. The angular correlation factor is λ = −0.39±0.05,
corresponding to the axial-vector interaction.

The results of the Ne²³ experiment can be interpreted
in two different ways. If we assume that the Ne²³ → Na²³
level scheme is correct, the experiment supplies
additional evidence that the Gamow-Teller beta-decay
interaction is axial-vector. The quoted uncertainty in
the intensities of the two transitions⁹ is sufficient to
account for the observed deviation in λ from the value
−1/3. Alternatively, we may consider the Gamow-Teller
interaction to be predominantly axial-vector, and in this
case our experiment proves that the Ne²³ → Na²³ level
scheme is correct. Our results disagree with the value
λ = −0.05±0.10 observed by Ridley.¹⁰


9 J. R. Penning and F. H. Schmidt, Phys. Rev. 105, 647 (1957).

FIG. 10. The energy distribution of the recoil ions from the
decay of Ne²³. The spectrum is a composite of the spectra from the
two principal transitions in the Ne²³ decay scheme. The angular
correlation factor λ = −0.37±0.04 indicates that the axial-vector
interaction is dominant in the Gamow-Teller type of decay.


CONCLUSION 


This series of experiments has removed most of the 
uncertainties concerning the nature of the beta-decay 
interaction which have been present for a number of 
years. We conclude that the dominant interaction form 
is VA. However, the present experiments do not 
exclude a possible tensor admixture of 10% or less. 
A scalar admixture of the same magnitude may also be 
present. 
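The weight of the Gamow-Teller evidence can be illustrated by expressing the two double-spectrometer results as deviations from the pure axial-vector (−1/3) and pure tensor (+1/3) predictions. The sketch below treats the quoted errors as Gaussian standard deviations, which is an assumption; for Ne²³ it also assumes that both branches are effectively pure Gamow-Teller, as argued in the text. The helper name `deviation_in_sigma` is hypothetical.

```python
# Measured angular correlation coefficients quoted in the text: (value, error).
MEASURED = {
    "He6": (-0.39, 0.05),
    "Ne23": (-0.37, 0.04),
}

# Predictions for a pure Gamow-Teller transition.
PREDICTED = {"axial-vector": -1.0 / 3.0, "tensor": +1.0 / 3.0}


def deviation_in_sigma(measured, sigma, predicted):
    """Signed distance of a measurement from a prediction, in units of sigma."""
    return (measured - predicted) / sigma


if __name__ == "__main__":
    for nucleus, (value, sigma) in MEASURED.items():
        for interaction, prediction in PREDICTED.items():
            z = deviation_in_sigma(value, sigma, prediction)
            print(f"{nucleus} vs pure {interaction}: {z:+.1f} sigma")
```

Under these assumptions both measurements lie within about one standard deviation of −1/3 and more than ten standard deviations from +1/3. The sketch does not by itself bound a small admixture, which is the point of the 10% caveat above.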


10 B. W. Ridley, Nuclear Phys. 6, 34 (1958).
In the original Yukawa formulation of meson theory,
the π meson (as we now believe Yukawa’s particle to
be) was to provide a natural explanation for β decay.
The process π → e + ν was regarded as an elementary
interaction and nuclear β decay was imagined to proceed
by the route n → p + π → p + e + ν. There are a variety
of reasons why this scheme fails. Just the opposite point
of view is now generally adopted, namely, that the
nuclear β decay is fundamental and that the observed
decay of the π meson is to be explained in terms of it.
We do not exclude the possibility that β decay be
described in terms of an as yet unknown heavy
intermediate. Nevertheless, the nuclear β decay is to be
regarded as essentially primary. In order to describe the
actual dominant π-meson decay mode π → μ + ν it is
necessary to assume the existence of another β-decay-
like process, μ-meson capture. The elementary process
may be described as μ + p → n + ν, or equally well as
n + p̄ → μ + ν; the first is the experimentally observed
μ-meson absorption reaction, whereas the second, the
annihilation of a neutron and an antiproton, plays an
important role in π-meson decay.

Since this is a conference on weak interactions, I shall
not be able to say that one of the π mesons, the neutral
one, decays into two gamma rays; electromagnetic
interactions are too strong to be mentioned!
Furthermore, I will not be able to point out that a theory of the
π⁰ decay can be given which is very similar to what we
describe for charged pions.

During the past year or two the field of weak
interactions has become a surprisingly orderly one. The two-
component theory of the neutrino, as well as the
principle of lepton conservation, now both seem to be well
established. Both nuclear β decay and μ-meson decay
seem to be describable in terms of a vector (V) and
axial vector (A) coupling. This statement has to be
qualified somewhat in the case of nuclear β decay. There
is the additional fact that in both β decay and μ decay
the vector coupling constants are almost identical.
Rather less is known of the coupling types for the
μ-meson capture reaction, but the dominant couplings
seem to have about the same strength as in β decay.
This is discussed by Primakoff. We tentatively assume
that the apparently universal (V,A) interaction extends
also to this Fermi process.

Precisely what do we mean by a universal interaction?
This can mean only that the basic interaction Lagrangian
contains only these two coupling types. Given this basic
definition let us see whether there is anything surprising
in the observed decays. First, in μ decay the V and A
couplings are forced to be equal if we adopt the two-
component neutrino theory. In β decay, g_A = 1.25 g_V,
which need not be disturbing. The amazing thing is,
with β decay and μ capture involving strongly
interacting particles and μ decay involving only weakly
interacting ones, that there is any kind of universality
whatsoever. One would expect the existence of pions and
other strongly interacting particles to modify greatly
the effective matrix element for transitions between
physical nucleons as compared to the μ-decay process.

Insofar as the vector coupling is concerned, Gershtein
and Zel’dovich and Feynman and Gell-Mann have made
a very attractive suggestion: They propose that there
may be a principle analogous to gauge invariance in
electrodynamics which would insure that the vector
coupling constant in β decay be the same even when the
strong interactions are turned on. Recall that as a
result of current conservation, or, if you prefer, gauge
invariance, the charge of a bare and physical proton is
the same. In order to achieve this goal the β-decay
“vector current density” g_V ψ̄γ_μψ [ψ is a nucleon field
operator] must be augmented by terms which
ultimately couple leptons directly to pions, etc., and such
that the total “current density” j_μ^V satisfies ∂j_μ^V/∂x_μ = 0.
The difference between vector and axial vector couplings
in β decay is attributed to renormalization of the axial
vector interaction.

One troublesome point in connection with this
proposal has been raised by Wightman, Telegdi, and
Michel. When one computes the electromagnetic
radiative corrections for μ decay and β decay, one finds,
to lowest order in all couplings, a finite correction for
μ decay but a logarithmic divergence in β decay. One
may argue that if the nucleons are “dressed” properly
and the radiative corrections are then computed
(something no one knows how to do exactly) the result will be
convergent. Nevertheless, it is not clear why, even if the
β-decay effect is made finite, the two radiatively
corrected vector coupling constants should continue to
be equal.

Let us discuss in a systematic way the role of strong
interactions in Fermi processes. The work to be
reviewed was carried out by Treiman and me and has
been, for the most part, published elsewhere. I apologize
for this, but in order to talk about something new I
would have to make an obviously wrong new theory,
the correct one already having been given.

We suppose that β decay and μ capture are described
by the Lagrangian density,


\[
\mathcal{L}_I = Z_2 f_A\,\bar\psi_\nu(1-\gamma_5)\,i\gamma_\lambda\gamma_5\,\psi_l\,(\bar\psi_n\,i\gamma_\lambda\gamma_5\,\psi_p)
 + Z_2 f_V\,\bar\psi_\nu(1-\gamma_5)\,\gamma_\lambda\,\psi_l\,(\bar\psi_n\,\gamma_\lambda\,\psi_p)
 + \text{Hermitian conjugate}, \qquad (1)
\]


where f_A and f_V are the unrenormalized coupling
constants, and Z_2 is the nucleon wave-function
renormalization constant. The ψ’s are field operators
associated with the particles indicated by the subscripts; l
stands for either an electron or a μ meson. There may be
other interactions of leptons. Among these are the
direct pion couplings of Feynman and Gell-Mann, or
perhaps couplings to baryons other than nucleons. For
the time being we do not consider such possibilities. We
consider the processes (e,μ) + p → n + ν. To lowest order
in the weak interaction, the matrix element computed
from (1) is given by


\[ S = i(2\pi)^4\,\delta(n + p_\nu - p - p_l)\,M, \qquad (2) \]

where n, p_ν, p, p_l are the four-momenta of the neutron,
neutrino, proton, and electron (or μ meson), and

\[
M = \bar u_\nu(1-\gamma_5)\,i\gamma_\lambda\gamma_5\,u_l\,\langle n|P_\lambda|p\rangle
 + \bar u_\nu(1-\gamma_5)\,\gamma_\lambda\,u_l\,\langle n|V_\lambda|p\rangle. \qquad (3)
\]


|n⟩ and |p⟩ represent physical neutron and proton states
and


\[ P_\lambda = Z_2 f_A\,\bar\psi_n\,i\gamma_\lambda\gamma_5\,\psi_p, \qquad V_\lambda = Z_2 f_V\,\bar\psi_n\,\gamma_\lambda\,\psi_p. \qquad (4) \]


The lepton spinors have been normalized according to
ū_l u_l = ū_ν u_ν = 1.

In the Feynman-Gell-Mann theory, V_λ would have
additional terms. We do not use the explicit forms of V_λ
and P_λ.

The general forms of the matrix elements of P_λ, V_λ
required for Eq. (3) may be deduced from invariance
principles, and they are


\[
\langle n|P_\lambda|p\rangle = \left(\frac{m^2}{n_0 p_0}\right)^{1/2}
 \bar u(n)\{a\,i\gamma_\lambda\gamma_5 - b\,(p-n)_\lambda\gamma_5\}u(p), \qquad (5)
\]

\[
\langle n|V_\lambda|p\rangle = \left(\frac{m^2}{n_0 p_0}\right)^{1/2}
 \bar u(n)\{\,\ldots\,\}u(p). \qquad (6)
\]


In these formulas, m is the nucleon mass, and the
spinors are normalized according to ūu = 1. The fact
[...] is a consequence of charge symmetry and time-reversal
invariance in the strong interactions. [...]
momentum transfer squared.

Substituting these matrix elements in Eq. (3) and
using the Dirac equation for the leptons, we find for M
the result [...] of the usual [...]
a[(n−p)²] and c[(n−p)²] are, for zero value of the
momentum transfer, simply the coupling constants g_A
and g_V of β decay. In μ capture (n−p)² ≈ m_μ², but a and c
do not deviate much from g_A and g_V over such an
interval. The second term has the form of a conventional
pseudoscalar interaction with an effective coupling
constant m_μ b. Barring strong dependence of b on
momentum transfer, this term is relatively much less
important in β decay than in μ capture. The last term is
identical with what has been called weak magnetism by
Gell-Mann and is present whether or not the conserved
vector current of Feynman and Gell-Mann is assumed.
The magnitude of d depends critically, however, on this
assumption.

This is as far as one can go on more or less general 
grounds. What we have done is to study the four 
functions a, b, c, and d by dispersion techniques. It is not 
practical to discuss this investigation in detail, so we 
outline the elements that go into such a treatment and 
quote the relevant results. We wish to represent the 
functions in the following form: 


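\[
\begin{aligned}
a(\xi) &= g_A - \frac{\xi}{\pi}\int_{(3m_\pi)^2}^{\infty} d\xi'\,
 \frac{\mathrm{Im}\,a(-\xi')}{\xi'(\xi'+\xi-i\epsilon)},\\
b(\xi) &= -\frac{1}{\pi}\int_{m_\pi^2}^{\infty} d\xi'\,
 \frac{\mathrm{Im}\,b(-\xi')}{\xi'+\xi-i\epsilon},\\
c(\xi) &= g_V - \frac{\xi}{\pi}\int_{(2m_\pi)^2}^{\infty} d\xi'\,
 \frac{\mathrm{Im}\,c(-\xi')}{\xi'(\xi'+\xi-i\epsilon)},\\
d(\xi) &= -\frac{1}{\pi}\int_{(2m_\pi)^2}^{\infty} d\xi'\,
 \frac{\mathrm{Im}\,d(-\xi')}{\xi'+\xi-i\epsilon}.
\end{aligned} \qquad (8)
\]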

We have written the dispersion relations explicitly in
such a way that a(0)=g_A, c(0)=g_V and make
essentially no effort to relate the renormalized coupling
constants g_A, g_V to the unrenormalized ones appearing in the
original Lagrangian. The quantities Im a, etc., represent
the imaginary parts of the various amplitudes (which
are real for positive arguments) and these may be
expressed in terms of the amplitudes for certain real
physical processes.

Consider first the vertex ⟨n|P_λ|p⟩; it is slightly more
convenient to study ⟨0|P_λ|n̄p,in⟩, which is related to our
other amplitude according to


\[
\left(\frac{p_0\bar n_0}{m^2}\right)^{1/2}\langle 0|P_\lambda|\bar n,p,\mathrm{in}\rangle
 = \bar v_{\bar n}\,[a\,i\gamma_\lambda\gamma_5 - b\,(p+\bar n)_\lambda\gamma_5]\,u(p),
\]


where v̄_n̄ is a negative energy spinor and a is now a
function of (p+n̄)². This matrix element may be imagined as
describing the annihilation of a proton-antineutron pair
to produce leptons via the interaction P_λ. The
dependence on the lepton variables may be factored out so that
they no longer appear explicitly. The sort of things that
contribute to this matrix element are shown in Fig. 1.

FIG. 1. The structure of the pseudoscalar matrix element
⟨0|P_λ|n̄p⟩ is pictured: The first “term” is effectively the bare
interaction, the second shows a virtual transition to a π meson
which decays into leptons, the third shows the reaction passing
through a three-pion state, and finally we have nucleon-antinucleon
scattering followed by lepton emission.


The first element is essentially the bare interaction; in
the second the pair annihilates to form a pion which
then undergoes π–μ decay; the third diagram shows the
pair annihilating into three pions, which ultimately
combine to yield the lepton pair; the fourth diagram
shows the pair undergoing a scattering interaction
before annihilating to produce the leptons by the very
matrix element we are studying. We are, thus,
generating an integral equation. Needless to say, there are an
infinite number of diagrams which we have not shown;
we cannot even compute all of the ones we have. The
intermediate state involving three pions is too hard for
us to handle, but the remaining three are manageable
and the integral equation for a and b can be easily
solved.

The resulting solutions involve the ³P₁ and ¹S₀
complex phase shifts for proton-antineutron scattering, the
renormalized strong pion-nucleon coupling constant
(G), the renormalized β-decay constant [g_A = a(0)], and the
experimental π → μ + ν lifetime. The latter enters via
the one-pion intermediate state, which contributes only
to the effective pseudoscalar interaction b. We find


\[
a(\xi) = g_A \exp\left\{-\frac{\xi}{\pi}\int_{4m^2}^{\infty} dy\,
 \frac{\varphi_1(y)}{y\,(y+\xi-i\epsilon)}\right\},
\]

\[
b(\xi) = \ldots
\]
where F(−m_π²) may be related to the π–μ lifetime; φ₀
and φ₁ are related to the ¹S₀ and ³P₁ phase shifts δ₀, δ₁
according to

\[
\tan\varphi = \frac{\mathrm{Re}\,e^{i\delta}\sin\delta}{1-\mathrm{Im}\,e^{i\delta}\sin\delta}.
\]

We find, using the experimental value of the π–μ
lifetime, F = −0.115(√2 G m g_A/2π²). Using this value and
neglecting the contributions from the proton-antineutron
scattering (i.e., setting φ₀ = φ₁ = 0), we find “g_P” = m_μ b(m_μ²)
≈ 8g_A as the effective pseudoscalar coupling constant
in μ capture. The deviations of
a from the value at ξ=0 are of order m_μ²/m²; of course,
the three-pion intermediate state could cause slightly
larger corrections, but one would expect that for β decay
or μ capture the leading terms


\[
a \approx g_A, \qquad b \approx \frac{\sqrt{2}\,G\,F(-m_\pi^2)}{\xi+m_\pi^2}
\]


are certainly adequate. In π decay, one needs a and b
for values of −ξ > 4m², in which case the neglect of the
many less massive states (such as 3, 5, ... pions) could
be much more serious. It is our feeling that since the
leptons are coupled directly only to the nucleon pairs
(or perhaps more generally to other baryon pairs),
such pair states are more important than the lighter pion
states.

The effective vector interaction matrix element may
be analyzed in a manner quite similar to our treatment
of ⟨0|P_λ|n̄p,in⟩. We do not go into the analysis in much
detail since the vector interaction plays no role in π
decay. The matrix element ⟨0|V_λ|n̄p,in⟩ is identical in
form to that encountered in the study of the
electromagnetic structure of nucleons. If we follow Feynman
and Gell-Mann we see that this parallel is essentially
exact. As in the electromagnetic problem, the two-pion
intermediate state is expected to play a dominant role
for the relatively small values of (p+n̄)², namely, about
m_μ², encountered in μ capture. In order to evaluate the
two-pion contribution, we must know the matrix
element for pair annihilation into two pions (even when
the total energy extends into the unphysical region of
total energy W, 4m² > W² > 4m_π²) and also that for the
pions to annihilate into a lepton pair. The latter process
may also be analyzed by dispersion methods and we
have done so in a rather crude fashion. For its evaluation
one requires the matrix element for production of a
proton-antineutron pair by two pions; the pair then
annihilates via the original matrix element ⟨0|V_λ|n̄p,in⟩
(strictly speaking, there is a two-pion intermediate state
also, which we neglect). We approximated all the matrix
elements encountered in the problem by lowest order
perturbation theory and found

\[
d(\xi) = \ldots,
\]

where f²/4π = 0.08 is the effective pseudovector coupling
constant of pion physics. Adopting the Feynman-Gell-Mann theory,
[...] as far as the static (i.e., ξ=0) [...] concerned [...]
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Fic. 2. Dispersion theoretic diagrams for m decay. The first 
shows a transition to an intermediate state with three pions, which 
we neglect; the second shows the usually contemplated transition 
through a nucleon-antinucleon pair. 


the calculation can be carried out exactly. In their 
theory there is a complete analogy with the electro- 
magnetic problem (except for slight numerical isotopic 
spin factors) so that we have without calculation, 


\[ c(0) = g_V, \]
\[ d(0) = (\mu_p - \mu_n)\, g_V/2m, \]


where $\mu_p$, $\mu_n$ are the anomalous moments of proton and
neutron in units of nucleon magnetons. Thus there is a
clear-cut difference between the prediction of $d(0)$ made
by the Feynman-Gell-Mann theory and that of the conventional
theory: the value of $d$ is about fifteen times
larger in their case, which raises it out of the undetectable
range.

An experiment to test the correctness of the
Feynman-Gell-Mann proposal for conserved vector currents has
been proposed by Gell-Mann. It is possible that the
magnetic moment term, $d$, as well as the induced
pseudoscalar interaction, $b$, may be detectable in certain
$\mu$-capture effects as discussed by Primakoff.

We turn now to a discussion of the decay of the $\pi$
meson. It is some relief to be able to say that only the
mode $\pi^- \to \mu^- + \bar\nu$ need be discussed. The correct
branching ratio for the channel $\pi^- \to e^- + \bar\nu$ presumably
then follows from our basic Lagrangian containing only
vector and axial vector couplings. It is easy to see that
only the axial vector coupling plays any role in $\pi$
decay. The leptons are assumed to emerge from a point
(in the sense of an arbitrary Feynman diagram); hence
there is only one momentum vector in the problem, say
that of the pion, $p_\pi$. The pion is presumably a pseudoscalar
and thus it is impossible to construct anything
other than a pseudoscalar or a pseudovector to be
coupled to the leptons. Formally, the S-matrix element
for the transition is proportional to $A$ where


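\[ A = \bar u(p_\mu)\, i\gamma_\lambda\gamma_5 (1+\gamma_5)\, u(p_\nu)\, \langle 0|P_\lambda|\pi\rangle
     + \bar u(p_\mu)\, \gamma_\lambda (1+\gamma_5)\, u(p_\nu)\, \langle 0|V_\lambda|\pi\rangle \]


and the second term vanishes if one assumes parity is
conserved in the strong interactions.
We concentrate attention, therefore, on $\langle 0|P_\lambda|\pi\rangle$,
which we write as

\[ \langle 0|P_\lambda|\pi\rangle = -i\,(p_\pi)_\lambda\, F(p_\pi^2) / (2 p_{\pi 0})^{1/2}. \]


We can easily show that $F(p_\pi^2)$ satisfies a dispersion
relation of the form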


z 1 jimF(— #’/) 
and our task is to express Im$F$ in terms of calculable
quantities. Analyzing the structure of the intermediate
states which can contribute, we find the first few
(judging them in terms of increasing rest mass) are as
shown in Fig. 2.

The first diagram shows the uncomputable transition
from one to three pions which ultimately combine
to yield the leptons. There should then come states
with 5, 7, ... pions, perhaps followed by zero strangeness
states involving K mesons and pions. Finally, one comes
to the neutron-antiproton state. There are three reasons
for concentrating attention on this state: (1) it is the
one conventionally envisaged in a qualitative discussion
of $\pi$ decay; (2) the leptons are directly coupled to
nucleons, hence such states might be expected to be of
great importance; and (3) we can do quite a reasonable
job of evaluating its contribution (and cannot calculate
any of the others).

The individual pieces of our diagram are also treated
by dispersion methods. We have already discussed the
weak vertex in detail and so we now concentrate on the
strong one, describing the virtual dissociation of the
pion into a neutron-antiproton pair. In Fig. 3 the first
diagram shows the "bare" interaction, the second, a
three-pion state which by this time we neglect quite
automatically, and, finally, the one we retain, namely,
that involving a neutron-antiproton pair. This pair (in
the rest system of the pion) is in a $^1S_0$ state (isotopic
triplet) and to the indicated approximation can be
characterized by a complex phase shift, $\delta_0$.

For this vertex function, $K(\xi)$, say, one finds


\[ K(\xi) = \sqrt{2}\,G \exp\left[ -\frac{\xi + m_\pi^2}{\pi} \int_{4m^2}^{\infty} d\xi'\, \frac{\phi_0(\xi')}{(\xi' - m_\pi^2)(\xi' + \xi - i\epsilon)} \right], \]


where $\phi_0$ is the same function introduced in connection
with our previous discussion of $a - \xi b/2m$, the quantity
arising from the weak vertex. Putting our dispersion
pictures together, we find for Im$F(\xi)$, neglecting small
terms $\sim m_\pi^2/m^2$,


VZG 
ImF(£)= -| meat 
4r 


V2GF (—m,*)E] fét+4m?} 
x |e 
E+m,? 

(for — > 4m?, =0 otherwise), 

where $H(\xi)$ is given by

\[ H(\xi) = \exp\left[ -\frac{2(\xi + m_\pi^2)}{\pi}\, P\!\int_{4m^2}^{\infty} d\xi'\, \frac{\phi_0(\xi')}{(\xi' - m_\pi^2)(\xi' + \xi)} \right]. \]

Fig. 3. Dispersion theoretic diagrams for the strong pion-nucleon
vertex. The first one is the direct interaction, the second
the uncomputable three pion state, and finally the important state,
the one involving a nucleon-antinucleon pair.

Here $P$ means that the principal value of the integral is to be
taken at the singularity $\xi' = -\xi$. $H$ is thus, aside from a
factor, $|K(\xi)|^2$. On substituting Im$F(\xi)$ into the dispersion
relation for $F$, we find (neglecting $m_\pi^2/m^2$)


\[ F(-m_\pi^2) = F(0) = -\frac{m\sqrt{2}}{4\pi}\, G g_A\, \frac{J}{1 + (G^2/4\pi)\, J}, \]

where J is given by 
1 7% (&—4m?)} 
J=- f di————]] (— £). 
T Y Am? H 


If there were couplings to other baryon pairs (beside $\bar p n$)
we would have

\[ F(0) = -\frac{m\sqrt{2}}{4\pi}\, g_A\, \frac{\sum_i G_i J_i}{1 + (1/4\pi) \sum_i G_i^2 J_i}, \]


where the $G_i$ are the strong coupling constants and the
$J_i$ are computed from $H_i$ derived from the various $\phi_{0i}$'s.

If we disregard the denominator and set the function
$H$ equal to unity, we get the familiar logarithmically
divergent result of perturbation theory. It has been
conventional to say that if one puts the logarithm thus
obtained equal to unity and computes the lifetime, the
value which is found, about sixty times the experimental one,
is qualitative support for the correctness of the
basic picture of $\pi$ decay. The computation of the $\pi$-decay
lifetime is really impossible for a person who believes
seriously in the renormalization program based on
perturbation theory. The $\pi$ lifetime is a primitively
divergent quantity whose presence must be accounted
for by the existence of a so-called counter term which
evidently then serves to remove the divergence and put
in the observed decay rate by hand. One must prescribe
the renormalized value of this divergent quantity. We
obviously do not subscribe to this philosophy. Our
feeling is that the function $H(\xi)$ plays a critical role and
that the perturbation theoretical indications are
irrelevant.

We obviously do not know enough about the complex
$^1S_0$ phase shift for neutron-antiproton scattering to
make a real quantitative study of $J$. What we have
done, therefore, is to make a few simple models which
have reasonable low energy behavior, and hope that
they are not too insane at high energies. The reason that
this may not be too unrealistic a procedure is that,
provided only $H(\xi) \to 0$, however weakly, for large $\xi$, the
integral $J$ exists. Furthermore, since it occurs in the
denominator, multiplied by $G^2/4\pi = 15$, we see that if
$J \gtrsim 1/15$ the $J$ term dominates the denominator; neglecting
the unity, then, $J$ cancels out. This is a kind of
strong coupling limit leading to $F(0)$ inversely proportional
to $G$, instead of proportional to it, as would be
given in weak coupling. In the case of several types of
baryon loops there would evidently be a kind of mean
value of $1/G$ defined by $\sum_i G_i J_i / \sum_i G_i^2 J_i$. In the global
symmetry strong coupling limit ($G_i = G$) the earlier
result continues to hold.
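
The insensitivity to $J$ described above can be seen numerically. The sketch below is an illustration only: it assumes that the $J$-dependence of $F(0)$ is the factor $J/[1 + (G^2/4\pi)J]$ with $G^2/4\pi = 15$, as quoted in the text; the function name and the sample values of $J$ are hypothetical choices, not taken from the paper.

```python
G2_OVER_4PI = 15.0  # value of G^2/4pi quoted in the text


def reduced_amplitude(J, g2=G2_OVER_4PI):
    """J-dependent factor of F(0): J / (1 + (G^2/4pi) * J)."""
    return J / (1.0 + g2 * J)


# Strong coupling limit: drop the unity in the denominator, so J cancels.
strong_coupling_limit = 1.0 / G2_OVER_4PI

# Sample values of J (hypothetical; the text quotes J of about 0.7).
for J in (0.1, 0.35, 0.7, 1.4, 2.8):
    factor = reduced_amplitude(J)
    ratio = factor / strong_coupling_limit
    print(f"J = {J:4.2f}  factor = {factor:.4f}  ratio to limit = {ratio:.3f}")
```

Under these assumptions a change of $J$ by a factor of two around 0.7 moves the result by only a few percent, which is the sense in which $J$ "cancels out".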
The models treated take for $\delta_0$ the representation

\[ \tan\delta_0 = k(a + ib), \qquad k = [(\xi/4) - m^2]^{1/2}, \]

which leads to

\[ \tan\phi_0 = ka/(1 + kb), \]

and we define $\delta_0(k=0) = 0$. Various limiting cases of this
expression have been studied ($a \gg b$, $b \gg a$) and in every
case we find $J \approx 0.7$. In a very unphysical case, namely,
that of no absorption, $b = 0$, all integrals may be
evaluated analytically, and we find

\[ J = \frac{2}{\pi}\, \frac{ma+1}{ma-1}\, \left\{ 1 - (m^2a^2 - 1)^{-1/2} \tan^{-1}\left[ (m^2a^2 - 1)^{1/2} \right] \right\}; \]

for the not unreasonable value of $ma \approx 3$ we find $J = 0.7$.
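
Two statements of this model can be checked with a few lines of arithmetic. The sketch below assumes that $\phi_0$ is obtained from $\delta_0$ through $\tan\phi = \mathrm{Re}(e^{i\delta}\sin\delta)/[1 - \mathrm{Im}(e^{i\delta}\sin\delta)]$ and that the closed form for $J$ at $b = 0$ reads as reconstructed here; the function names and the sample values of $k$, $a$, $b$ (treated as dimensionless) are hypothetical.

```python
import cmath
import math


def tan_phi_from_tan_delta(tan_delta):
    """tan(phi) = Re(T) / (1 - Im(T)), with T = exp(i*delta) * sin(delta)."""
    delta = cmath.atan(tan_delta)  # complex phase shift
    T = cmath.exp(1j * delta) * cmath.sin(delta)
    return T.real / (1.0 - T.imag)


def J_no_absorption(ma):
    """Closed form for J in the b = 0 model; requires ma > 1."""
    x = math.sqrt(ma * ma - 1.0)
    return (2.0 / math.pi) * (ma + 1.0) / (ma - 1.0) * (1.0 - math.atan(x) / x)


# Check tan(phi_0) = k*a / (1 + k*b) for an arbitrary sample point.
k, a, b = 0.5, 2.0, 1.0
lhs = tan_phi_from_tan_delta(k * (a + 1j * b))
rhs = k * a / (1.0 + k * b)
print(f"tan(phi_0): from definition {lhs:.6f}, from k*a/(1+k*b) {rhs:.6f}")

# Evaluate J for ma = 3.
print(f"J(ma = 3) = {J_no_absorption(3.0):.3f}")
```

With these assumptions the two expressions for $\tan\phi_0$ agree, and $ma = 3$ gives $J$ close to 0.72, consistent with the quoted $J = 0.7$.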
Finally, making our strong coupling approximation,
we obtain

\[ F(0) = -(\sqrt{2}\,G m g_A/2\pi^2)\,[0.11], \]

using $G^2/4\pi = 15$. This is to be compared with the
experimental value given earlier, namely,

\[ F(0) = -(\sqrt{2}\,G m g_A/2\pi^2)\,[0.115]. \]

The agreement is rather impressive. It would be nice to
hope that the neglect of all of the millions of states
which we have made is contained in the 5% discrepancy.
We are not quite so optimistic, but it is our feeling that
the most important elements of the $\pi$-decay problem
have been taken into account, and that a reasonable
quantitative understanding of the process has been
obtained.
THE present paper deals with various aspects of
the theory of muon capture, with emphasis on the
comparison between theory and experiment. The theory is
based on an effective Hamiltonian, $H_{\rm eff}^{(\mu)}$, which
describes muon capture with subsequent neutrino
emission by an aggregate of $A$ dressed nucleons:
This $H_{\rm eff}^{(\mu)}$ corresponds,$^1$ in a nonrelativistic approximation
for the muon and for the nucleons, to the most
general Lorentz covariant transition matrix element for
the reaction $\mu^- + p \to \nu + n$ in a theory where the
lepton-bare nucleon coupling is V and A, where the 
neutrinos are emitted with unit negative helicity, where 
time reversal invariance holds, and where any bare 
hyperon, bare kayon “currents”? which interact with 
the lepton current, have the same transformation 
property under the charge symmetry operation as the 
bare nucleon, bare pion currents.”"* However we neglect 
in Eq. (1) “many body” terms in Herr arising from 
the possibility of exchange of virtual pions, kayons, etc., 
among the nucleons. Such many body terms depend on 
the relative space coordinates of pairs, triplets, . . ., of 
nucleons and are believed, on the basis of a rough 
analysis of the corresponding beta decay situation, to 
be relatively small. 

In Eggs. (1a) to (1c), gv, ga, and gp are vector, 
axial vector and “induced” pseudoscalar muon-dressed 
nucleon coupling constants effective in muon capture 
while gy and g4® are electron-dressed nucleon vector 
and axial vector coupling constants effective in beta 
decay. The numerical relations in Eq. (1c) between 
gv, gy®; ga, ga® arise from the assumption of 
“universality” between the V, A muon-bare nucleon 
and electron-bare nucleon coupling constants which 
implies that gy, ga differ from gy®, ga® only 
because of the differing nucleon four-momentum trans- 
fers in the muon capture and in the beta decay. The 
numerical relation in Eq. (1c) between gp“ and gay = 
due to Goldberger and Treiman? and to Wolfenstein,* 
is based on the assumption that a reaction such as: 
u-+7+ — v takes place predominantly via the sequence Ee 
of “steps”: rt > p-+ +7 — v which implies the 
possibility of muon capture via the ‘four-step proces: n 


ies up= 1.793, un=—1.913 are the proton, ne 
static) anomalous magnetic moments (in uni 
[2mp with e, mp proton charge, mass and 7=1, 
l pear in the interaction effective in 
(1b)] as a consequence of the Gell- 
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Feynman assumption of a “conserved vector current” 5 
which necessitates the existence, for example, of the 
reaction: y -+r* — v+7° and hence implies the possi- 
bility of muon capture via the “three-step process”: 
w+ p> w+ att — vrn — v+n. Also in Egs. 
(1a) and (1b), v=vv; is the neutrino momentum; 1, 1; 
and ø, ø; are 2X2 matrix unit operators and spin 
angular momentum operators for the lepton and the 
ith nucleon; r and r; are space coordinates of the lepton 
and the ith nucleon; r+), r;© are isobaric-spin 
operators which transform a lepton muon state into a 
lepton neutrino state and an ith nucleon proton state 
into an 7th nucleon neutron state; the factor 1/v2 
arises from the normalization of the neutrino rela- 
tivistic wave function; the factor (1—o-y;)/V2 is a 
consequence of the assumption of a maximum parity 
nonconserving two-component neutrino type muon- 
neutrino-nucleon coupling (~ (Wt (1+-75)/V2 rY) 
X (Ynt YYW p), etc.). Though the muon and the nucleons 
are treated nonrelativistically in the derivation of 
Has” all first-order nucleon recoil corrections, i.e., all 
terms in Hett ~v/mp, are nevertheless included. 


2. TOTAL MUON CAPTURE RATE IN CLOSURE 
APPROXIMATION—“ISOTOPE” EFFECT 


With the $H_{\rm eff}^{(\mu)}$ of Eqs. (1a) to (1c) we can obtain
the square of the muon capture transition matrix element,
$\langle |{\rm M.E.}^{(\mu)}|^2 \rangle$, summed over all spin orientations
of the neutrino and averaged over all spin orientations
of the muon. A straightforward calculation gives
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where m,’=m,/[1+(m,/Am,)] is the muon reduced 
mass in the parent mu-mesic atom and the nuclear 
s matrix element is expressed as 


|M. E.n H (a — b) |2 
= (Gy )?| (b| Ds r: exp(—~ivsa: 11) (rs) la)? 
+ (Ga)?| (b| os r: exp(—ivea: r) (ri) o%| 2)? 
+0 (Gp)?—2G4 Gp] 
X | (b| 0s r: exp(—iva:r:) y(r): vı|a)|’. (2b) 


In Eqs. (2a) and (2b), |a}, |b) represent wave 
functions of the two nuclear states involved in the 
capture process; the quantity e(r) i is the muon space 
orbital wave function normalized in such a way that 
g(r) > 1 as Z — 0, i.e., for small Z, 
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where éa is the binding energy of the muon in 
lowest Bohr orbit of the mu-mesic atom and Ee, £ 
are the energies of the nuclear states a, b. Thus, takii 
proper account of the density of final states avail bl 
to the emitted neutrino, the total muon capture ra 
of the parent nucleus in the state a|}, A (a), is 


dy; 
AW (@)=2n- dr X f SIME. |") (a0)? 


X [1+ 0/[(¥ea)?-+ (Amp) PI 
so that using E (2a) to (2c) 
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where 
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or, dropping terms in m,/Amp, : 
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The approximate expression for (nsa)? in Eq. 
involving neglect of the corrections for 
nucleus recoil and parent mu-mesic atom redu 
holds to better than 5% for A>12. 

The sum over b in Eqs. (3a) and (3b) 
energetically accessible states of the daught 
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largest for low lying states b. Thus one may have and which replaces the explicitly H»-dependent quan- 
dence in the accuracy of a closure approximation tities, vèa, va, by suitable averages (v)a, (n)a, which do 
ch extends the sum over all energetically accessible not depend explicitly on Æ». Such a closure approxi- 
ates b to a sum over all states b without restriction, mation applied to Eq. (3b) yields 
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2), (4e) 
a replacement which is justified since the d wave, g wave, --- parts of exp(i(v)avi-7:;) make a relatively small 
contribution to 


dy; j 
—(a NEY Ti Pr exp((v)av1- Tij) g*(r;) olro vı j: Yı »: 
T ij 
The quantity (Zert)‘, introduced by Wheeler in his original estimates of muon capture rates,® describes the vari- 
ation of the muon space orbital wave function over the extent of the nucleus the quantity Da(r) being the (direc- 


tionally-averaged-over) density function of the protons in the parent nucleus—the electrostatic potential appro- 


priate to Da(r) enters into the muon energy eigenfunction-eigenvalue Dirac equation which determines ea and 
g(r). It is clear that ((Zers)*/Z4) <1, and > 1 as Z—0. 


Introducing the expression for the neutron decay rate, 
In2 


1 
— = freul (gv )?-+3(ga®)2];  fneu(74)neu= (118035) sec,’ (5) 
(74)neu (27)? 


we have, from Eqs. (4a) to (4c) and (5), 


2 Ort o* (ri) olroi-o; 
7 


(m,/me)>  m(ln2) 


A‘) (a) = (Zets)4((n)a)» ———-_———(1— 5) 
9 ) ý ) (137)? (fneu(74) new) 
(6a) 
t = (Zets)4((m)a)2(272 sec) R(1— $a) 
with 
R=[(Gy H) +3 (T4) VL (gv )?+3(g4)?], (6b) 
sin ((v)arsj) 


Th lri aj 7:73) ((Gy)2+ Ta) 0; | 
(ol Ex EOE TAO ono] 


edela] a) 
g= ; (6c) 
2(Grye+3(ray)( f | e(r) |? adr) 


The quantity Ja(Ja>0), which would vanish if the nucleus had Z=A [in such a case: (<:-2,;)|a)=|a); 
7;®7;@|a@)=|a)] and also if (v), were >>{(rij)~} mean nucleon momentum within nucleus, describes, within 
the context of the closure approximation, the inhibitory effect of the Pauli exclusion principle on the muon capture 
process. This inhibition may be visualized as arising from the fact that the neutron created in the +p —> n+ 
process cannot be produced in states already occupied by pre-existing neutrons of the parent nucleus, and the 


corresponding Ja may be expressed in terms of appropriate nucleon-nucleon correlation functions in the parent 
nucleus. We have 
D 


(lz 


haley, 
Ei Heri rr OEG H(A) ( 2 ) 


=f =— 
ZEGA FST] 
sin((v)a|r—r’|) 
x f [ot ol Fa (nr )drdr' / [eo loadr 
((v)a| r—1'|) 
NaI, 
(a D4 (er 25-117; ) (Gy P+ (aoro ( 5 ) aX 
TE : 
ZL (Gr)+3(Ta™)?] 

sin((v)a| n= r'|) ‘ 

x f S er earelerindd / flea.) (a 
((v)a| r—r'|) à : 
6 J. A. Wheeler, Revs. Modern Phys. 21, 133 (1949); J. Tiomno and J. A. Wheeler, Revs. Modern Phys. 21, 153 (1949). ANNE 


7Sosnovskii, Spivak, Prokofiev, Kutikov, and Dobrinin, J. Exptl. Theoret. Phys. (U.S.S.R.) 35, 1059 (1958). 
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4 


a 


L {ilr 7: 75) (Gy )?+ La): 0l r)a — 45) ] ( 
J+ Pi; 
Dale r Or OTG H) Ca™)?0;- 05 ]} ( 3 ) a} 


a7 
1+0;:¢; 14<;:4; 
a ( 2 )( 2 ). "o 


F,)(r,r’) are nucleon-nucleon correlation density functions associated with a space symmetric, space 
ymmetric relative motion of the two nucleons since for |a) antisymmetric in rj, rj; 0:5, 6; 75, 7;®; 
= (exchange operator for r;, rj) |a); it is to be noted that Fa (r,r) =0. One has further 


(a| Ea (43-23-73 75) (Gy )?+ Ta H) o: 05} |) 


=—[(Gy™)2+3(P4)2}34+(a| (Gy AE- (TAJ (Pa )2L(¥)24+- (Y)2]] a), (74) 


[2 (4s: 23-7575) [ (Gy™)?+ TaY) 05 ]P35| 0) 


Z(A-—Z | 
=—[(Gy)?+3(La oy ering (Da)? ]{a| (Spr)? (Sneu)?— (Spr+Sneu)?| 2) 


1— 


743) 
Jour (re) 


Ñ TRA 
ee', T=} 3433 WOR OR - Zr, Be;; Sw ( Je; S= ( 7 


2 i 


D ito =D, Bt t53—D alm)’ = (T) — FA. 
ns (7a) to (7e) and (6a) to (6c) yield 


n 


A (a| (Gr JTT) (TORJE TATTO) YH) 2) 


; (272 sec |] Ea 
Š Q sec )Q 1 É ZE(Gr®)?+3(T4™)?] | 


(P) oR (rt) Parr) Med 


> Closur 


ae a Ong a 
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We proceed to evaluate A% (a) for the heavier nuclei, Z>6, A>12. Dropping terms ~1/Z in Eq. (8) yield ‘ 
sine a 


sin((v)q|r—r’|) 
A f noa Oe LCF (r,r')-+FaO (1, ede’ | 
A (a) (Zers)"((n)a)2(272 sec) RY 1 — (=) r—r'|) 


2Z > 
f | (r) |” Da(r)dr 


(y d 
(= a sin ( Ja |r— = o* (r)ol(r) BL Fa (1,1) — Fa (1,1’) ]drdr’ 
2 


((»)alr—¥ 
fleo laar 
Then, adopting physically reasonable functional forms for Fa)(r,r’), Da(r) such as, 


1:|r—r'| Sd 


Fo (r,r')=Ca P Dal Dar) (te fal|r—=r'|)); fallr=r'|)= ; 
0:|r—r'|>d 


1/[(4r/3)ro A rSrA! 
(Ca y= f fo. (r)Da(r’) (1+ fal|r—r'|))drdr'; Da GAG MSNA: 

i> nA 

remembering that [sin((v)a|r— r'|) (al r—1'|) and fa(|r— r'|) are both short ranged functions with comparable 

ranges ((v)a)?~d=roKroA}, and again neglecting terms ~1/Z, we have 


AW (a) (Zets)4((n)a)2(272 sec) R 


sin((v)al r— 
| af [= ) (7) ol) Dale) Dal’) fal lev [dea 


a(S 2A -| on 


7 |e) |2Da(r)ar 


= (Zets)*((n)a)2(272 sea] ‘i= (Z) 


Further, using once more the short ranged character of fa( |r— r'|), we obtain 


(Sa) feood f- hoe / feood 


(Spool Om] 


where we have taken (v)&20.75m, (see below), roœ1.25X 10-% cm=0. 67/m,. Thus, within the present appro 
mation, the “nucleon-nucleon correlation” parameter 6, and the exclusion principle inhibition factor Ja, pr 
tional to the fraction of nucleons which are neutrons and to ôa: $a4((A—Z)/2A)da (Eqs. (6a), (6c), (11a) 
‘11b) ], are both essentially determined by the parameter d/ To, i.e., by the ratio of the characteristic lengths ent 
into fa( [z= r'|) and Da(r). The characteristic length d is the movil of the Pauli-Fermi “correlation 
influence,” e.g., the Pauli-Fermi “correlation hole” surrounding each nucleon, and is determined by the 
of the mucleonentclcon forces and the exclusion principle. It is to be noted that up to terms ~1/Z, Ja 
with vanishing d, i.e., with vanishing nucleon-nucleon correlation. 

To obtain the aeea value of d/ro and so of 5, we consider the expression for the Coulomb en 


parent nucleus. We have ee 
Asin ogee eÈ / ae 
Keou=( 5 aee E (E) : hg oe Se 
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= 808 


‘and using a procedure analogous to that employed in the evaluation of Ja [Eqs. (6c) to (7e) ] we obtain a formula 
first given by Feenberg and Goertzel,* 


2 
Boon =42(Z—1) f f L A (GeP (1,1) 436.0 (1,r))drdr' 


|r—r'| 


LEZ (el Snel) f f a i Ga (r,r')— Ga (r,1’))drdr’, (13a) 


|r—r 
ca e eR 


Ga™(r,r')= (13b) 


FENE 


is the proton-proton correlation density function associated with a space symmetric, space antisymmetric 
relative motion of the two protons. Comparing Eq. (13b) for Ga+(r,r') with Eq. (7b) for Fa‘+)(r,r’) we see that, 
in each case, the same spin and isobaric-spin operators occur in the integrals in the numerator and in the de- 
nominator so that the influence of these spin and isobaric-spin operators may be expected roughly to cancel. ` 
(This is almost rigorously true for nuclei in the 1s shell, e.g., Hes, where |a) factorizes into ®(ry,---,%) 
XKXa(o1™,---, o4®; 71®,---, rs) to a good approximation.) It thus appears reasonable to identity G.)(r,r) 
with Fa® (r,r) so that from Eqs. (13a) and (10), 

Be Gz D) (4 (1) az- el Salo) 
Coul = (34 (Z — Sie +(4Z—(a ylay( 
Bie 5 roA? Ai $ ó SroAt At 


A “best fit” of Eq. (13c) to the experimentally determined Coulomb energy differences of various light and 
medium-heavy mirror nuclei yields 


where 


6 eé a) (130) 


d 

—=1.47; 5.=3.0 [from Eq. (11b)]. (14) 

To 
In this "best fit" the value of $r_0$ used is consistent with the electron-nucleus elastic scattering data; the $d/r_0$
values of the individual nuclei naturally fluctuate somewhat about the aforementioned "best fit" value, so that
an assignment of, say, 10% uncertainty to the resultant value of 3.0 for $\delta_a$ appears quite reasonable. Equations
(11a) and (14) yield values of the total muon capture rate, $\Lambda^{(\mu)}(a)$, which can be compared with experiment.
It is to be noted that the effect of the exclusion principle inhibition is very important numerically since

\[ 1 > \left[ 1 - \left( \frac{A-Z}{2A} \right) \delta_a \right] \]

$= 1/4$ for Ca$_{20}^{40}$; $= 5/32$ for Mo$_{42}^{96}$; $= 19/208$ for Pb$_{82}^{208}$, etc. The associated isotope effect in the total
muon capture rate is thus expected to be quite large, e.g., $\Lambda^{(\mu)}({\rm Ca}_{20}^{40})/\Lambda^{(\mu)}({\rm Ca}_{20}^{48}) \approx 2$, and would be interesting
to study experimentally using separated isotope targets.
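
The quoted inhibition factors follow from simple arithmetic. The sketch below is a check rather than part of the original analysis: it assumes the factor is $1 - [(A-Z)/2A]\,\delta_a$ with $\delta_a = 3.0$ from Eq. (14), and it uses exact rational arithmetic. The function name is hypothetical, and the Ca-48 comparison is an assumed illustration of the isotope effect at fixed $Z$.

```python
from fractions import Fraction

DELTA = Fraction(3)  # delta_a = 3.0, the "best fit" value of Eq. (14)


def pauli_factor(Z, A, delta=DELTA):
    """Exclusion-principle factor 1 - ((A - Z) / (2A)) * delta."""
    return 1 - Fraction(A - Z, 2 * A) * delta


# Nuclei quoted in the text, as (label, Z, A).
for label, Z, A in [("Ca-40", 20, 40), ("Mo-96", 42, 96), ("Pb-208", 82, 208)]:
    print(label, pauli_factor(Z, A))

# Assumed illustration: ratio of factors for two calcium isotopes (same Z).
isotope_ratio = pauli_factor(20, 40) / pauli_factor(20, 48)
print("Ca-40 / Ca-48 factor ratio:", isotope_ratio)
```

Because the two calcium isotopes share the same $Z$, the $(Z_{\rm eff})^4$ factor drops out of the ratio, so the isotope effect in this approximation comes from the bracket alone.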
, one need not appeal to any relation between F.*)(r,r’) and Ga‘+)(r,r’) with Ga) (r,1’) found 
der to determine ôs, but adopt instead the less ambitious procedure of fitting Eq. (11a) for A“ (a) 
for the total muon capture rates, with use of a single Z, A-independent adjustable parameter 
this type would also determine ((n).)?@, i.e., with a reliable estimate of (n)a2<(v)a/m» LEq. 
((Gy)?+3(04™)?)/((gv™)?+3(g4®)?) [Eq. (6b) ]. A more rigorous trea tment 
for A“ (a), without neglect of terms ~1/Z, with S |¢(r)| Da(r)dr= (Zet) /Z 
proton density function Da(r) obtained from electron-nucleus elastic scattering = 
n-nucleon correlation density functions 7,+)(r,r’), again identified with Ga*(1,1') as 
ectron-nucleus scattering.® Such a more rigorous treatment must await — 
the appropriate inelastic electron-nucleus scattering experiments.” 


{1-—JjJjJ= 


ript was completed, has applied the closure approximatio 
xpansion of ex Goa) or of sin((v)ars;)/ (Guay [as 

nly up to Osy 2 in this expansion. It is however to be 

clei (Z>6, A>12) and the results obtained b; 

y Tolhoek, no such expansion is used in 
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TABLE I. 
(»)o/mp ((n)a)? (al (T)?—(T@))?| a) (a| (¥))2-+(¥@))2] a) (a| (Spe)? (Soen)? — (Spe +S 
H 0.94 0.58 1/2 3/2 
H? 0.90 0.65 i f in 3 
He, 0.95 0.78 1/2 3/2 ) 
Hez 0.75 0.50 0 z 0 0 


In concluding this section we apply Eq. (8) for $\Lambda^{(\mu)}(a)$ to the light nuclei H$_1^1$, H$_1^2$, He$_2^3$, He$_2^4$. We have from
Table I and Eq. (8),


\[ \Lambda^{(\mu)}({\rm H}_1^1) = 1^4 \times 0.58 \times 272\ {\rm sec}^{-1} \times R \times 1, \]


yo) 2 Ate 2 i Val r — i 
A“ (H2)=11X0.65X272 sexax i [PE AT ff xara | 
(Gy™)?+3(T4™)? ((v)a| r—1'|) i 
1 sin((v),|r—r’|) 
A“) (Hee?) =2!X 0.78 X 272 sec 1-- — — Ff, (r,r’ 5 
(He;?) X272 sec xax] ALS (lear) (r zarar |] 
A (He4)œ21X0.50X272 sec! XRX | 1— l f f Beko Boat | |, 
v)a| I — 


which exhibits forms for the corresponding exclusion principle inhibition factors, a (Eqs. (15) and (6a) to (6 
With regard to the first two columns of Table I for (v). [Eq. (2c) ] and (n)a L[Eq. (3c) | reasonable estimates hi 
been made for (Z,—E,.)=(excitation energy of daughter nucleus 6)+((2s)ground state— Ha); also 
= exp(— Zm,'r/137) has been approximated by 1 within the integrals. Further, from Eq. (7b), the r; being” 
coordinates relative to the nucleus’ center of mass, 


Fa (r,r) = ff |@a(11,42) |26 (11+ 12)6 (r— 11)5(x/— ro) dridre= |Palr, r’) |78 (r+ r): Hi? 
FP (1,7) = fff |q(11,12,73) [76 (r1 Hro ra) (r— 11) 5 (1 — ra)drıdrəd r3= |®aLr, x’, — (r-+r’) ]|?: Hes? 
FaH (r,r')= f f f f (Palt r2 r314) [26 (11+ ret Fat ra)ó(r— 11)8 (x — re) d rid rod rad T4 


= f 00.0, —(rtr’+rs) ]|%dr3: Hest, 


where the &, are space wave functions in the corresponding H£, Hes, Hest space, spin, isobaric-spin wave 
| a). (11,72, 06 -)Xa(o1,02®, - es 71,72), - --) with (11,02, °° -)=, (ri, ae -)=PiPa(t1,82, $; D 
priate spin, isobaric-spin wave functions, Xa, have been used together with Eq. (7e) in order to obtain 
in the last three columns of Table I. Use of simple variational trial forms for ®, enables evaluation of the Fa 
in Eq. (16) which, together with the (v), estimates of the first column, give the integrals in Eq. (1 
also numerical values of Q quoted below [Eq. (19) ] and a value of (T4/Gy™)?= 1.53 calcula 
(1b), (1c), and (4b) with (g4®/gy®)=—1.21," we obtain ‘ 


\[ \Lambda^{(\mu)}({\rm H}_1^1) = 1^4 \times 0.58 \times 272\ {\rm sec}^{-1} \times R \times 1 = 158\ {\rm sec}^{-1} \times R = 169\ {\rm sec}^{-1}, \]


(Gy)2-+ (Pa)? 
2\owv14 —1 || —— + |? 
A™ (Hy?)=14X0.65 X272 sec xaxlı Pe 


=177 soxax]i-| - 


AW (Hez®)=24X0.18X212 sec XRX (1—4 X0.66)= 23X 10? sec 
AW (Hes) 2224 0,50X 272 sec XRX (1—0.80) 


uÇ, S. Wu, Revs. Modern Phys. 30, 783 (195 
Tennessee (1958). ae KS ; 


The dependence of $\Lambda^{(\mu)}({\rm H}_1^2)$ on the ratio $(\Gamma_A^{(\mu)}/G_V^{(\mu)})^2$
is especially to be noted, as is the enormous isotope
effect (a factor of about 5) between $\Lambda^{(\mu)}({\rm He}_2^3)$ and $\Lambda^{(\mu)}({\rm He}_2^4)$.
Both of these are essentially manifestations of the
inhibitions of the exclusion principle on the total muon
capture rate. Thus in $\mu^- + {\rm H}_1^2 \to \nu + n + n$ the dineutron
must be produced in the $^3P$ state if $\Gamma_A^{(\mu)} = 0$, while for
$|\Gamma_A^{(\mu)}| \gtrsim G_V^{(\mu)}$ production in the dineutron $^1S_0$ state
is possible. Since the dineutron $^1S_0$ state spatially
overlaps far better with the deuteron ground $^3S_1$ state
than does the dineutron $^3P$ state, dineutron production
in the $^1S_0$ state when $|\Gamma_A^{(\mu)}| \gtrsim G_V^{(\mu)}$ is indeed predominant,
and one expects $\Lambda^{(\mu)}({\rm H}_1^2)$ to be greater
when $(\Gamma_A^{(\mu)}/G_V^{(\mu)})^2 > 1$ than when $(\Gamma_A^{(\mu)}/G_V^{(\mu)})^2 < 1$,
just as predicted by Eq. (17). In a similar way the factor
of 5 difference between the rates of $\mu^- + {\rm He}_2^3 \to \nu + {\rm H}_1^3$
and $\mu^- + {\rm He}_2^4 \to \nu + {\rm H}_1^4$ may be viewed as largely
arising from the fact that the H$_1^3$ may be formed in $^2S_{1/2}$
states (bound or unbound) which spatially overlap
well with the $^2S_{1/2}$ ground state of He$_2^3$, while the H$_1^4$ is
formed in (unbound) $^3P_{2,1,0}$, $^1P_1$ states which all have
a poor spatial overlap with the $^1S_0$ ground state of He$_2^4$.



3. COMPARISON OF CLOSURE APPROXIMATION 
EXPRESSION FOR THE TOTAL MUON 
CAPTURE RATE WITH EXPERIMENT 


Equation (11a) for $\Lambda^{(\mu)}(a)$ has been compared by
Telegdi, Sens, Swanson, and Yovanovitch with their
definitive measurements of the total muon capture rates
in 29 elements from C$_6^{12}$ to U$_{92}^{238}$. These investigators
first calculated values of $(Z_{\rm eff})^4$ [according to Eq. (4c)]
for the various elements studied and then found that a
plot of their $[\Lambda^{(\mu)}(a)]_{\rm exper}/(Z_{\rm eff})^4$ values versus the corresponding
$(A-Z)/2A$ values gave a nice straight line
in agreement with Eq. (11a); the values of $\delta_a$ and
$[(\langle\eta\rangle_a)^2 (272\ {\rm sec}^{-1}) R]$, determined by a weighted least
squares fit of their individual experimental points to
this straight line, were


\[ \delta_a = 3.15 \quad [\text{vs } \delta_a = 3.0 \text{ in Eq. (14)}] \qquad (18a) \]


\[ (\langle\eta\rangle_a)^2 (272\ {\rm sec}^{-1}) R = 188\ {\rm sec}^{-1} \quad (\text{experimental}). \qquad (18b) \]

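On the other hand, Eqs. (1b), (1c), (4b), (6b), and
(3d), and $(g_A^{(\beta)}/g_V^{(\beta)}) = -1.21$,$^{11}$ yield:


\[ R = \begin{cases} 1.06: & \langle\eta\rangle_a \cong \langle\nu\rangle_a/m_\mu = 0.75 \\ 1.07: & \langle\eta\rangle_a \cong \langle\nu\rangle_a/m_\mu = 0.85\text{ to }0.95, \end{cases} \qquad (19) \]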

e Phys. Rev. 91, 480 (A) (1953), and Proceedings 
iih ifth Annual Rochester Conference on High Energy Physics, 

k 41955), p. 174; A. Rudik, Doklady Akad. Nauk. 92, 
1953); H . Ube zal and L. Wolfenstein, Nuovo cimento 10, 


a ; Sens, Swanson, Telegdi 
e co Of Ta 1987); J. Sens, Ph.D, 
Phys. Rey. 113, 679 
while the best a priori estimate for


\[ \langle\eta\rangle_a \cong \langle\nu\rangle_a/m_\mu = \{ 1 - (\epsilon_a/m_\mu)
   - (\text{excitation energy of daughter nucleus } b)/m_\mu
   - [(E_b)_{\text{ground state}} - E_a]/m_\mu \}, \]


[Eq. (3d)] supposed valid in the mean for all the various
pairs of nuclei $(b,a)$ involved, is $\langle\eta\rangle_a \cong \langle\nu\rangle_a/m_\mu \cong 0.75 = 80$
Mev$/m_\mu$ with an uncertainty of, say, 10%. This value
of $\langle\nu\rangle_a/m_\mu$ corresponds to an (excitation energy of
daughter nucleus $b$) $\approx 15$ Mev, which is of the order of
those empirically observed. The value of $(\langle\eta\rangle_a)^2 (272\ {\rm sec}^{-1}) R$
for $\langle\eta\rangle_a \cong \langle\nu\rangle_a/m_\mu \cong 0.75 = 80$ Mev$/m_\mu$ is, using
Eq. (19),


((n)a)?(272 sec) R= 161 sec (theoretical) (20) 


which, in view of the uncertainty in the value of $(\langle\eta\rangle_a)^2$, must be considered in essential agreement with the experimental value of 188 sec$^{-1}$ [Eq. (18b)]. Thus, for example, the not unreasonable choice of $\langle\nu\rangle_a/m_\mu = 0.80 = 85$ Mev/$m_\mu$ yields $(\langle\eta\rangle_a)^2 (272\ \mathrm{sec}^{-1}) R = 185$ sec$^{-1}$ (theoretical). In this way one finds support for the combination of basic assumptions on which our effective Hamiltonian $H_{\mathrm{eff}}^{(\mu)}$ [Eqs. (1a) to (1c)] rests, viz.: (a) "universality" between muon-bare nucleon and electron-bare nucleon coupling constants which implies the numerical relations of Eq. (1c) between the effective muon-dressed nucleon and electron-dressed nucleon coupling constants; (b) the presence of an "induced" pseudoscalar interaction with an effective muon-dressed nucleon coupling constant $g_P^{(\mu)} = 8 g_A^{(\beta)}$ [Eq. (1c)]; and (c) the presence of anomalous nucleon magnetic moment contributions in the effective muon-dressed nucleon interaction associated with the assumption of a "conserved vector current." In particular, if this assumption of the conserved vector current is abandoned and the anomalous nucleon magnetic moment contributions $\sim(\mu_p - \mu_n)$ to the effective muon-dressed coupling constants $G_A^{(\mu)}$, $G_P^{(\mu)}$ are omitted from Eqs. (1b), (4b), and (6b), the quantity $R$ of Eq. (6b) is 0.90 rather than 1.06 and

$$(\langle\eta\rangle_a)^2 (272\ \mathrm{sec}^{-1}) R = 137\ \mathrm{sec}^{-1}\ \text{(theoretical)}. \tag{20}$$

It thus appears that somewhat better agreement is reached between theoretical and experimental values of the total muon capture rates if the assumption of a conserved vector current is retained.
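The arithmetic behind the quoted theoretical values can be checked with a short sketch. It simply multiplies out $(\langle\eta\rangle_a)^2 (272\ \mathrm{sec}^{-1}) R$ for the three cases discussed in the text. The function name `capture_product` is illustrative, and the use of $R = 1.06$ at $\langle\eta\rangle_a = 0.80$ is an assumption, since Eq. (19) quotes $R$ only at 0.75 and for 0.85 to 0.95.

```python
# Product (eta_a)^2 * (272 sec^-1) * R for the cases quoted in the text.
PREFACTOR = 272.0   # sec^-1, the constant appearing in Eqs. (18b) and (20)
EXPERIMENT = 188.0  # sec^-1, Eq. (18b)


def capture_product(eta, ratio_r):
    """Return eta^2 * 272 sec^-1 * R for eta = <nu>/m_mu and a given R."""
    return eta ** 2 * PREFACTOR * ratio_r


# (eta, R, label). R = 1.06 at eta = 0.80 is an assumption: Eq. (19) gives
# 1.06 at eta = 0.75 and 1.07 for eta between 0.85 and 0.95.
cases = [
    (0.75, 1.06, "moment terms included, eta = 0.75"),
    (0.80, 1.06, "moment terms included, eta = 0.80"),
    (0.75, 0.90, "moment terms omitted,  eta = 0.75"),
]

for eta, ratio_r, label in cases:
    value = capture_product(eta, ratio_r)
    print(f"{label}: {value:.1f} sec^-1 (experiment {EXPERIMENT:.0f} sec^-1)")
```

With these rounded inputs the products land within about 1 sec$^{-1}$ of the quoted 161, 185, and 137 sec$^{-1}$; the small residual differences presumably reflect rounding of $\langle\eta\rangle_a$ and $R$ in the text.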


4. MUON CAPTURE TO PARTICULAR FINAL
STATES OF DAUGHTER NUCLEUS


The first investigation in which a partial muon capture rate was determined is due to Godfrey. Godfrey studied experimentally the rate of that muon capture reaction

$$\mu^- + {}_6\mathrm{C}^{12} \to \nu + {}_5\mathrm{B}^{12}$$

which was followed by the beta decay of $_5\mathrm{B}^{12}$ and accordingly obtained the partial rate of muon capture to all the bound states of $_5\mathrm{B}^{12}$. He further gave a qualitative argument in favor of the view that most of the muon capture transitions to the bound states of $_5\mathrm{B}^{12}$ actually go to the ground state of $_5\mathrm{B}^{12}$. As a result he identified his observed partial muon capture rate with the rate from $_6\mathrm{C}^{12}$ to the $_5\mathrm{B}^{12}$ ground state. Godfrey then established an approximate theoretical relation between the nuclear matrix elements for muon capture and for beta decay between the ground states of $_6\mathrm{C}^{12}$ and $_5\mathrm{B}^{12}$. By using this relation and comparing his observed muon capture rate with the known $_5\mathrm{B}^{12}$ decay rate he concluded that the Gamow-Teller coupling constants in muon capture and in beta decay are approximately equal.

14 T. N. K. Godfrey, Ph.D. thesis, Princeton University (1...) and Phys. Rev. 92, 512 (1953).

Fujii and Primakoff recently re-examined and refined the relation between the nuclear matrix elements for muon capture and for beta decay between the ground states of $_6\mathrm{C}^{12}$ and $_5\mathrm{B}^{12}$, and also extended the argument to the calculation of the ground state to ground-state partial muon capture rates in the reactions,

$$\mu^- + {}_3\mathrm{Li}^6 \to \nu + {}_2\mathrm{He}^6,$$

$$\mu^- + {}_2\mathrm{He}^3 \to \nu + {}_1\mathrm{H}^3.$$

Using the expressions for the muon capture transition matrix elements, M.E.$^{(\mu)}$, given in Eqs. (2a) and (2b) and the analogous expressions for the corresponding beta-decay transition matrix elements, M.E.$^{(\beta)}$,


|M.E.® |? 


1 
= F(Z, E.) | M.E-nua® (b > a)|?, (21a) 
2m) ® 


|M. E.n ® (b > a) |? 
= (gv )?| (a| Ds r: |b)|? 
+ (ga)?| (a| E: 7ia,|b)|*%. (21b) 


Fujii and Primakoff obtained an expression for the ratio of transition rates of muon capture from $|a)$ to $|b)$ and beta decay from $|b)$ to $|a)$, $\Lambda^{(\mu)}(a \to b)/\Lambda^{(\beta)}(b \to a)$,
Zm 1 


-| oF wE) 


d 
J> D |M. E.ma (a — b) |? 


Agr Mb, Ma 


A“ (a— b) 
A® (b= a) 


x 
| M.E.nua® (b —> a) |? 


Mb.Ma 
2Ja+1 
=kX ( 
2Jat L 


a 


)xx , (22a)
where

$$f_{ba} = \int_1^{(E_e)_{\max}} F_{ba}(Z,E_e)\,((E_e)_{\max} - E_e)^2\, E_e\,((E_e)^2 - 1)^{1/2}\, dE_e, \tag{22b}$$

$F_{ba}(Z,E_e)$ = Fermi function.


Fujii and Primakoff then calculated, employing appropriate approximations for $|a)$, $|b)$, the ratio of the nuclear matrix elements for muon capture and for beta decay, $X$ in Eq. (22a), as:

$$\begin{aligned} X({}_2\mathrm{He}^3 \rightleftarrows {}_1\mathrm{H}^3) &= 0.791 \\ X({}_3\mathrm{Li}^6 \rightleftarrows {}_2\mathrm{He}^6) &= 0.619 \\ X({}_6\mathrm{C}^{12} \rightleftarrows {}_5\mathrm{B}^{12}) &= 0.612 \end{aligned} \tag{23}$$

and, using also known values for $\eta_{ba}$ [Eq. (3c)] and $\Lambda^{(\beta)}(b \to a)/f_{ba} = \ln 2/[f_{ba}\,\tau_{1/2}(b \to a)]$, found, from Eqs. (23) and (22a),


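$$\begin{aligned} \Lambda^{(\mu)}({}_2\mathrm{He}^3 \to {}_1\mathrm{H}^3) &= 1.46 \times 10^3\ \mathrm{sec}^{-1}, \\ \Lambda^{(\mu)}({}_3\mathrm{Li}^6 \to {}_2\mathrm{He}^6) &= 1.79 \times 10^3\ \mathrm{sec}^{-1}, \\ \Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12}) &= 7.86 \times 10^3\ \mathrm{sec}^{-1}, \end{aligned} \tag{24}$$

with an over-all uncertainty of some 10 to 15%. A calculation by Wolfenstein,$^{15}$ based on general assumptions entirely similar to those of Fujii and Primakoff, yields $\Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12}) = 7.4 \times 10^3$ sec$^{-1}$.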

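As regards experimental values of $\Lambda^{(\mu)}(a \to b)$, data are at present available only in the $_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12}$ case and are:

$$[\Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12})]_{\mathrm{exper}} = \begin{cases} (9.05 \pm 0.95) \times 10^3\ \mathrm{sec}^{-1} & \text{(ref. 16)} \\ (9.18 \pm 0.5) \times 10^3\ \mathrm{sec}^{-1} & \text{(ref. 17)} \\ (6.6 \pm 1.1) \times 10^3\ \mathrm{sec}^{-1} & \text{(ref. 18)} \\ (6.8 \pm 1.5) \times 10^3\ \mathrm{sec}^{-1} & \text{(ref. 19)} \\ (5.9 \pm 1.5) \times 10^3\ \mathrm{sec}^{-1} & \text{(ref. 14)} \end{cases} \tag{25}$$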


15 L. Wolfenstein, Conference on Weak Interactions, Gatlinburg, Tennessee (1958, to be published).
16 Argo, Harrison, Kruse, and McGuire, Conference on Weak Interactions, Gatlinburg, Tennessee (1958), and ... (to be published).
17 Burgman, Fischer, Leontic, Lundby, Meunier, ... Teja, Phys. Rev. Letters 1, 469 (1958).
18 Fetkovich, Fields, and McIlwain, Conference on Weak Interactions, Gatlinburg, Tennessee (1958).
19 Love, Marder, Nadelhaft, ..., Conference on Weak Interactions, Gatlinburg, ... (to be published); R. Siegel (private communication).


TABLE II.

$\Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12})$ in $10^3$ sec$^{-1}$    Terms $\sim(\mu_p - \mu_n)$    $g_P^{(\mu)}/g_A^{(\beta)}$
7.86                                                    included                      +8
6.34                                                    omitted                       +8
11.80                                                   included                      -8
10.25                                                   omitted                       -8


It is thus clear, from Eqs. (24) and (25), that the theoretical value of $\Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12})$ agrees, within the over-all theoretical and experimental uncertainties, with the corresponding experimental value. This agreement offers further support for the validity of our effective Hamiltonian $H_{\mathrm{eff}}^{(\mu)}$ [Eqs. (1a) to (1c)].

It may be of interest in concluding this section to append a table, Table II, giving the theoretical values of $\Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12})$ with the assumption of the conserved vector current abandoned, i.e., with anomalous magnetic moment contributions [$\sim(\mu_p - \mu_n)$] to $G_A^{(\mu)}$, $G_P^{(\mu)}$ [Eq. (1b)] omitted and with $g_P^{(\mu)}$ {within $G_P^{(\mu)}$ [Eq. (1c)]} taken as $-8g_A^{(\beta)}$ as well as $+8g_A^{(\beta)}$. Such an ambiguity in the sign of $g_P^{(\mu)}/g_A^{(\beta)}$ may be contemplated since, without a sufficiently detailed theory of the $\pi^+ \to \mu^+ + \nu$ process, one can fix, on the basis of the known $\pi^+$ lifetime, only the square of the effective coupling constant for the $\mu^- + \pi^+ \to \nu$ "step" in the $\mu^- + p \to \mu^- + \pi^+ + n \to \nu + n$ "two-step" process. We find from Eqs. (2a), (2b), (21a) to (22b), (1b), and (1c), the numerical results given in Table II. These results, together with the values of $[\Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12})]_{\mathrm{exper}}$ in Eq. (25), indicate that (1) assumption of $g_P^{(\mu)}/g_A^{(\beta)} = -8$, which is inconsistent with the usually accepted detailed theory of the $\pi^+ \to \mu^+ + \nu$ process which involves the dominance of the $p + \bar{n}$ "intermediate state", yields values of $\Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12})$ which fit experiment less well than values calculated with the assumption of $g_P^{(\mu)}/g_A^{(\beta)} = +8$; (2) if future observations uphold the first pair of experimental values in Eq. (25), use of $g_P^{(\mu)}/g_A^{(\beta)} = +8$ will, for agreement between experiment and theory of $\Lambda^{(\mu)}({}_6\mathrm{C}^{12} \to {}_5\mathrm{B}^{12})$, require inclusion of the anomalous magnetic moment contributions to $G_A^{(\mu)}$, $G_P^{(\mu)}$ and so support the assumption of a conserved vector current. This last conclusion agrees with that reached at the end of Sec. 3 on the basis of a comparison of experiment and theory for the total muon capture rate, $\Lambda^{(\mu)}(a)$.
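The comparison between Table II and Eq. (25) can be made roughly quantitative with a sketch like the following. It is not the author's analysis: it assumes the five measurements are independent with Gaussian errors, reads all rates in units of $10^3$ sec$^{-1}$ (the power of ten is inferred from context), and the names `chi_square` and `rank_rows` are illustrative.

```python
# Rates in units of 10^3 sec^-1 (the power of ten is assumed from context).
# Each row: (theoretical rate, treatment of (mu_p - mu_n) terms, gP/gA sign).
TABLE_II = [
    (7.86, "included", +8),
    (6.34, "omitted", +8),
    (11.80, "included", -8),
    (10.25, "omitted", -8),
]
# (value, uncertainty) pairs of Eq. (25), in the order listed there.
MEASUREMENTS = [(9.05, 0.95), (9.18, 0.5), (6.6, 1.1), (6.8, 1.5), (5.9, 1.5)]


def chi_square(theory, measurements):
    """Sum of squared pulls, treating the measurements as independent."""
    return sum(((value - theory) / error) ** 2 for value, error in measurements)


def rank_rows(measurements):
    """Return Table II rows sorted from smallest to largest chi-square."""
    scored = [(chi_square(rate, measurements), rate, terms, sign)
              for rate, terms, sign in TABLE_II]
    return sorted(scored)


for score, rate, terms, sign in rank_rows(MEASUREMENTS):
    print(f"rate {rate:5.2f}  terms {terms:8s}  gP/gA {sign:+d}  chi2 {score:6.1f}")
```

Under these assumptions the row with $g_P^{(\mu)}/g_A^{(\beta)} = +8$ and the moment terms included gives the smallest sum of squared pulls, and for the first pair of measurements alone the $+8$ row with the terms included fits better than the $+8$ row with them omitted, in line with conclusion (2) of the text.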


5. “HYPERFINE” EFFECT IN MUON CAPTURE 
RATE—MUON CAPTURE IN HYDROGEN 


The total muon capture rate, $\Lambda^{(\mu)}(a)$, calculated in Sec. 2 or the partial muon capture rate, $\Lambda^{(\mu)}(a \to b)$, calculated in Sec. 4 are actually appropriate averages of the total or partial muon capture rates from the two individual hyperfine states of the parent mu-mesic atom, $\Lambda^{(\mu)}(J_a \pm \tfrac12; a)$ or $\Lambda^{(\mu)}(J_a \pm \tfrac12; a \to b)$, e.g.,

$$\Lambda^{(\mu)}(a) = \frac{2J_a + 2}{4J_a + 2}\,\Lambda^{(\mu)}(J_a + \tfrac12; a) + \frac{2J_a}{4J_a + 2}\,\Lambda^{(\mu)}(J_a - \tfrac12; a). \tag{26}$$


Equation (26) represents such an appropriate average for $\Lambda^{(\mu)}(a)$ in terms of $\Lambda^{(\mu)}(J_a \pm \tfrac12; a)$; this average is "incoherent" and is weighted only according to the degeneracies of the two hyperfine states involved since (1) the energy difference, $\delta\epsilon_a$, between these two hyperfine states is much greater than their width, i.e.,

$$\delta\epsilon_a \gg \hbar[\Lambda_{\mathrm{decay}} + \Lambda^{(\mu)}(J_a + \tfrac12; a)], \qquad \delta\epsilon_a \gg \hbar[\Lambda_{\mathrm{decay}} + \Lambda^{(\mu)}(J_a - \tfrac12; a)];$$

and (2) the rate of conversion from one of these states to the other is (with the exception of hydrogen and deuterium) sufficiently smaller than $\Lambda_{\mathrm{decay}} + \Lambda^{(\mu)}(J_a \pm \tfrac12; a)$ for the various mu-mesic atoms of interest.$^{20}$ The physical reason for the difference between $\Lambda^{(\mu)}(J_a + \tfrac12; a)$ and $\Lambda^{(\mu)}(J_a - \tfrac12; a)$ arises, as discussed by Bernstein, Lee, Yang, and Primakoff$^{21}$ (B.L.Y. and P.), from the combined action of the following three effects: (1) the correlation between the spin of the muon, $\tfrac12\boldsymbol{\sigma}$, and the spin of the parent nucleus $\mathbf{J}_a$ is different in the two hyperfine states $F_a^{(\pm)} = J_a \pm \tfrac12$; (2) there is, in general, a correlation between the spin $\tfrac12\boldsymbol{\sigma}_i$ of the proton that captures the muon and $\mathbf{J}_a$; (3) the capture rate of the muon by the proton depends on their relative spin orientation via the terms $G_A^{(\mu)}\,\boldsymbol{\sigma}\cdot\boldsymbol{\sigma}_i$ and $G_P^{(\mu)}\,\boldsymbol{\sigma}\cdot\boldsymbol{\nu}_1\,\boldsymbol{\sigma}_i\cdot\boldsymbol{\nu}_1$ in the effective Hamiltonian $H_{\mathrm{eff}}^{(\mu)}$ of Eqs. (1a) to (1c).
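The statistical weighting in Eq. (26) is easy to illustrate. The sketch below assumes the weights are the sublevel counts $2F + 1$ of the two hyperfine states divided by their sum, $4J_a + 2$; the function names are illustrative, and the nuclear spin is passed as the integer $2J_a$ to keep the arithmetic exact.

```python
from fractions import Fraction


def hyperfine_weights(two_j):
    """Weights of the F = J + 1/2 and F = J - 1/2 states; two_j is 2J."""
    total = 2 * two_j + 2  # (2J + 2) + 2J = 4J + 2 magnetic sublevels in all
    return Fraction(two_j + 2, total), Fraction(two_j, total)


def averaged_rate(two_j, rate_upper, rate_lower):
    """Incoherent average of Eq. (26) over the two hyperfine states."""
    w_upper, w_lower = hyperfine_weights(two_j)
    return w_upper * rate_upper + w_lower * rate_lower


# Hydrogen (J = 1/2) with the triplet and singlet rates quoted later in
# Eqs. (33a) and (33b), in sec^-1.
print(hyperfine_weights(1))
print(float(averaged_rate(1, 13, 636)))
```

For hydrogen the weights are 3/4 and 1/4, and the quoted rates of 13 and 636 sec$^{-1}$ average to 168.75 sec$^{-1}$, which matches the 169 sec$^{-1}$ used in the text.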

The total muon capture rates from the two individual hyperfine states $F_a^{(\pm)} = J_a \pm \tfrac12$, $\Lambda^{(\mu)}(J_a \pm \tfrac12; a)$, can be calculated on the basis of the $H_{\mathrm{eff}}^{(\mu)}$ of Eqs. (1a) to (1c) by the closure approximation method described in Sec. 2. By using a procedure similar to that involved in the derivation of Eqs. (3a) to (8) we find that


20 Conversion from the energetically higher to the energetically lower of the two hyperfine mu-mesic atom states occurs (1) via collisions with atoms of the muon moderating medium (this is important only for the electrically neutral and hence mobile hydrogen and deuterium mu-mesic atoms; see below); (2) via spontaneous magnetic dipole radiation; (3) via Auger electron ejection. The rates of (2) and (3) may be readily estimated and are, respectively:


1 13 m 3m ra 
9 H H 
Ri =10 Ga) (Zest) (me 


mp} h 


$ 1 \6 MeN? Mpc? 
Rey~10 (737) Z (= He 


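so that, for example, $R_{(2)} \approx 1 \times 10^3$ sec$^{-1}$, $R_{(3)} \approx 4 \times 10^5$ sec$^{-1}$ for $_{13}\mathrm{Al}^{27}$; on the other hand, the value of $\{\Lambda_{\mathrm{decay}} + \Lambda^{(\mu)}(J_a \pm \tfrac12; a)\}$ for $_{13}\mathrm{Al}^{27}$ is $\approx 1.2 \times 10^6$ sec$^{-1}$.


21 Bernstein, Lee, Yang, and Primakoff, Phys. Rev. 111, 313 (1958).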


and 


a KS 


AM (Jg+4; a) b) T Spr| a) 


where
- 12+ (Ja+$+4)71-
A“ (a) ( ay" Z 


1 eee Saula) 1 (al2J- el a [— ZZ ah (a|2J- See 


(i= sn ne pa ape OERE end FOC 
. 2 26 (a|2J-S,,|c) 2 (al2d- S,r|@) 2 2 bw) (a|2J-S,,| a) (ota) 
A Ya P A-Z Ya 
1— |Z- Z = |—- ko 
2Z Za) 2 Za) 


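$b^{(\mu)} = 2(G_V^{(\mu)} G_A^{(\mu)} - \tfrac13 G_V^{(\mu)} G_P^{(\mu)}) - 2[(G_A^{(\mu)})^2 - \tfrac23 G_A^{(\mu)} G_P^{(\mu)}]$,
$d^{(\mu)} = 2(G_V^{(\mu)} G_A^{(\mu)} - \tfrac13 G_V^{(\mu)} G_P^{(\mu)}) + 2[(G_A^{(\mu)})^2 - \tfrac23 G_A^{(\mu)} G_P^{(\mu)}]$,
$a^{(\mu)} = (G_V^{(\mu)})^2 + 3(\Gamma_A^{(\mu)})^2 = (G_V^{(\mu)})^2 + 3(G_A^{(\mu)})^2 + (G_P^{(\mu)})^2 - 2G_A^{(\mu)} G_P^{(\mu)}$,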
pH—qw 


= (YOTO+LYATO)) — (== 


w=(fl eolas /( f iv(n|ts.\dr), 


cel Hes =| ff a O, OAE EEEO iar] /| [| e)la odr, (27b) 
vja|r—r 


(En Ja Mal JD: [1+ r:)/20:0(1— r:) | Ea, Ja Ma) 
Itl (Eao Mal E OFO) En Ma) — 
(la| I-14, hlr nOn ON OM +d) (0;+05)— 7 (41X43) O(g —d™) (a;Xa);) ] 
x [6(r—1:)6(r’— r) A+ Pis)/2| a) 
Ease (ni) @ Fa [Eler 1:75) (6-4) (0,05) — 4l: Xr) (bd (a:Xo,;) ] : 
X (1+ Pis)/2|a 


sin((v)a|r—1’|) $ j 
= * (7) o(7’)3 (Fa (r,r) Fa (x,r'))drdr joe) l*2x(nar], 
=| ff may OAE VAs 
p= (a| (Gv (= (TE Ta) 0), 
VaT ((Gy)?— (Pa™)?)(a| (Sp) + (Sneu)? — (SprF Sneu)? |a). 


We now apply Eq. (27a) to the case of an odd $Z$, odd $A$ nucleus so that $(a|2\mathbf{J}\cdot\mathbf{S}_{\mathrm{neu}}|a) = 0$. Considering also the heavier nuclei, $Z > 6$, $A > 12$, and hence dropping terms $\sim 1/Z$ within the curly bracket of Eq. (27a), we have:


—) (YOTOLYOTOLYO X YO), 


Da ()= 


ef- 1 (a|2J- -K|a) nes laa RA 
aa KAS aa 
NESO) A pa ala 2 2(a|2J-Sp.|a)! 2 (28a) 
pened Se) a 5 | msi ean Tn 
A (a) Z 1—[4/2Z Jaat —[(A—Z)/2]00 
so that 1 1 (a|2J- Kja) 


A (Jab; a) -AW (Ja—};a) 2Jatl bO (a|2J:Sprla) af 2 la| 2J-Spla) l 
AW (a) E 2 1E4227 ka ae
The quantity in the curly brackets in Eq. (28a) or (28b) is the ratio of the exclusion principle inhibition factors for $\Lambda^{(\mu)}(J_a + \tfrac12; a) - \Lambda^{(\mu)}(J_a - \tfrac12; a)$ and for $\Lambda^{(\mu)}(a)$ and should be close to unity. Assuming in addition that for the purpose of calculating $(a|2\mathbf{J}\cdot\mathbf{S}_{\mathrm{pr}}|a)$ one can visualize the odd $Z$, odd $A$ nucleus as consisting of an "outside" proton with orbital angular momentum $L_a$ moving about a spinless "core," one has

$$(a|2\mathbf{J}\cdot\mathbf{S}_{\mathrm{pr}}|a) = J_a(J_a + 1) - L_a(L_a + 1) + \tfrac34 = \begin{cases} J_a + 1 & : J_a = L_a + \tfrac12 \\ -J_a & : J_a = L_a - \tfrac12 \end{cases} \tag{29}$$

so that, substituting into Eq. (28b),

$$\frac{\Lambda^{(\mu)}(J_a + \tfrac12; a) - \Lambda^{(\mu)}(J_a - \tfrac12; a)}{\Lambda^{(\mu)}(a)} = \frac{b^{(\mu)}}{a^{(\mu)}}\,\frac{1}{Z} \begin{cases} (2J_a + 1)/J_a & : J_a = L_a + \tfrac12 \\ -(2J_a + 1)/(J_a + 1) & : J_a = L_a - \tfrac12 \end{cases} \tag{30}$$

in essential agreement with the result for $(\Lambda^{(\mu)}(J_a + \tfrac12; a) - \Lambda^{(\mu)}(J_a - \tfrac12; a))/\Lambda^{(\mu)}(a)$ obtained in another way by B.L.Y. and P. The combination of coupling constants in Eq. (30), $b^{(\mu)}/a^{(\mu)}$, has the numerical value, using Eqs. (27b), (1b), (1c), and (3d), and $(g_A^{(\beta)}/g_V^{(\beta)}) = -1.21$,

$$\frac{b^{(\mu)}}{a^{(\mu)}} = -0.945. \tag{31}$$

As an example of the magnitude of the hyperfine effect we calculate the quantity $(\Lambda^{(\mu)}(J_a + \tfrac12; a) - \Lambda^{(\mu)}(J_a - \tfrac12; a))/\Lambda^{(\mu)}(a)$ from Eqs. ... $L_a = 2$, so

$$\frac{\Lambda^{(\mu)}(J_a + \tfrac12; a) - \Lambda^{(\mu)}(J_a - \tfrac12; a)}{\Lambda^{(\mu)}(a)} = -0.17, \tag{32}$$
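The single-particle estimate of Eqs. (29) to (32) can be sketched numerically as follows. The example nucleus is an assumption: the text of the worked example is partly lost, and $Z = 13$, $J_a = 5/2$, $L_a = 2$ (the values for $_{13}\mathrm{Al}^{27}$, the nucleus used in footnote 20) is taken here because it reproduces the quoted $-0.17$. Function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
B_OVER_A = -0.945  # Eq. (31)


def two_j_dot_s(j_a, l_a):
    """(a|2J.S_pr|a) for one outside proton: J(J+1) - L(L+1) + 3/4."""
    return j_a * (j_a + 1) - l_a * (l_a + 1) + Fraction(3, 4)


def relative_hyperfine_difference(z, j_a, l_a, b_over_a=B_OVER_A):
    """Right-hand side of Eq. (30) for J = L + 1/2 or J = L - 1/2."""
    if j_a == l_a + HALF:
        factor = (2 * j_a + 1) / j_a
    elif j_a == l_a - HALF:
        factor = -(2 * j_a + 1) / (j_a + 1)
    else:
        raise ValueError("J must equal L + 1/2 or L - 1/2")
    return b_over_a * float(factor) / z


# Assumed example nucleus: Z = 13, J = 5/2, L = 2.
print(two_j_dot_s(Fraction(5, 2), 2))
print(round(relative_hyperfine_difference(13, Fraction(5, 2), 2), 2))
```

The first line confirms the $J_a + 1$ branch of Eq. (29) for $J_a = L_a + \tfrac12$, and the second reproduces the order of magnitude of Eq. (32) under the stated assumption about the nucleus.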


... appears sufficiently large to be observed with ... experimental techniques.

... and (27b) to the case of


; (al23-s S,.]@)=2-4; 


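-i (al ( or CO
$= -1.21$], $\Lambda^{(\mu)}({}_1\mathrm{H}^1) = 169$ sec$^{-1}$ [Eq. (17)], so that

$$\Lambda^{(\mu)}(\tfrac12 + \tfrac12; {}_1\mathrm{H}^1) = \Lambda^{(\mu)}({}_1\mathrm{H}^1)\left(1 + \frac{b^{(\mu)}}{a^{(\mu)}}\right) = 13\ \mathrm{sec}^{-1}, \tag{33a}$$

$$\Lambda^{(\mu)}(\tfrac12 - \tfrac12; {}_1\mathrm{H}^1) = \Lambda^{(\mu)}({}_1\mathrm{H}^1)\left(1 - 3\frac{b^{(\mu)}}{a^{(\mu)}}\right) = 636\ \mathrm{sec}^{-1}. \tag{33b}$$

Thus the hyperfine effect is enormous:
$(\Lambda^{(\mu)}(\tfrac12 - \tfrac12; {}_1\mathrm{H}^1)/\Lambda^{(\mu)}(\tfrac12 + \tfrac12; {}_1\mathrm{H}^1)) \cong 50$.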
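As a consistency check on the hydrogen numbers, the sketch below reads Eqs. (33a) and (33b) as $\Lambda(1 + b/a)$ and $\Lambda(1 - 3b/a)$, which is an interpretation of the damaged equations based on the spin expectation values 1 and $-3$ quoted for triplet and singlet $[p\mu]$. It then solves each equation for $b^{(\mu)}/a^{(\mu)}$; the function names are illustrative.

```python
LAMBDA_AVERAGE = 169.0  # sec^-1, Eq. (17)
LAMBDA_TRIPLET = 13.0   # sec^-1, Eq. (33a)
LAMBDA_SINGLET = 636.0  # sec^-1, Eq. (33b)


def triplet_rate(average, b_over_a):
    """Lambda (1 + b/a): spin expectation value <sigma.sigma_1> = +1."""
    return average * (1.0 + b_over_a)


def singlet_rate(average, b_over_a):
    """Lambda (1 - 3 b/a): spin expectation value <sigma.sigma_1> = -3."""
    return average * (1.0 - 3.0 * b_over_a)


b_from_triplet = LAMBDA_TRIPLET / LAMBDA_AVERAGE - 1.0
b_from_singlet = (1.0 - LAMBDA_SINGLET / LAMBDA_AVERAGE) / 3.0
print(f"b/a from triplet rate: {b_from_triplet:.3f}")
print(f"b/a from singlet rate: {b_from_singlet:.3f}")
print(f"singlet/triplet ratio: {LAMBDA_SINGLET / LAMBDA_TRIPLET:.0f}")
```

Both rates imply $b^{(\mu)}/a^{(\mu)} \approx -0.92$ for hydrogen, close to but not identical with the $-0.945$ of Eq. (31), which was evaluated with the heavier-nucleus neutrino energy of Eq. (3d). The 3:1 weighted mean of the two rates returns the average rate for any value of $b^{(\mu)}/a^{(\mu)}$.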


Muon capture in hydrogen is also unique in that within a time of the order of the $[p\mu]$ mu-mesic atom mean life $\tau$

$$(\tau = \{\Lambda_{\mathrm{decay}} + \Lambda^{(\mu)}(\tfrac12 \pm \tfrac12; {}_1\mathrm{H}^1)\}^{-1} \cong 1/\Lambda_{\mathrm{decay}} = 2.21 \times 10^{-6}\ \mathrm{sec})$$

there is, under certain circumstances, a high probability of conversion from the energetically higher triplet state with $F_a^{(+)} = \tfrac12 + \tfrac12$ to the energetically lower singlet state with $F_a^{(-)} = \tfrac12 - \tfrac12$. This is a consequence of the fact that, as pointed out by Gershtein and Zeldovich, the $[p\mu]$ mu-mesic atom is electrically neutral and so wanders through the hydrogen gas or liquid making collisions with the $({}_1\mathrm{H}^1)_2$ molecules; these collisions occasionally result in "exchange" between the proton in the incident $[p\mu]$ mu-mesic atom with one direction of spin and a proton in the target $({}_1\mathrm{H}^1)_2$ molecule with the opposite direction of spin, with the net result that the $[p\mu]$ is converted from the triplet state to the singlet state. The rate of such a collisional conversion process is estimated by Gershtein to be $\cong 5 \times 10^{\ldots}$ sec$^{-1}$ at an $({}_1\mathrm{H}^1)_2$ molecule number density, $N/V$, of $2 \times 10^{\ldots}$/cm$^3$; thus at $N/V$ less than, say, $2 \times 10^{17}$/cm$^3$ there is no appreciable collisional triplet to singlet conversion and the muon capture rate is, from Eqs. (33a), (33b), and (26),

$$\tfrac34\,\Lambda^{(\mu)}(\tfrac12 + \tfrac12; {}_1\mathrm{H}^1) + \tfrac14\,\Lambda^{(\mu)}(\tfrac12 - \tfrac12; {}_1\mathrm{H}^1) = \Lambda^{(\mu)}({}_1\mathrm{H}^1) = 169\ \mathrm{sec}^{-1}.$$


With increasing $N/V$ the collisional triplet to singlet conversion becomes progressively more important and at $N/V \sim 10^{\ldots}$/cm$^3$ practically all the $[p\mu]$ mu-mesic atoms are in the singlet state at the instant of muon decay or capture; the corresponding muon capture rate is, from Eq. (33b), $\Lambda^{(\mu)}(\tfrac12 - \tfrac12; {}_1\mathrm{H}^1) = 636$ sec$^{-1}$. At still higher $N/V$, e.g., at $N/V$ of the order of those in liquid hydrogen, formation of mu-mesic hydrogen molecule ions, $[p\mu p]$, becomes dominant and it is expected that most of the muons will be found at the instant of their decay or capture in the lowest Bohr orbit of $[p\mu p]$. Any such $[p\mu p]$ must have a total spin angular momentum of $\tfrac12$ (since one of its parents is a singlet $[p\mu]$) and so possesses a muon spin $\boldsymbol{\sigma}$, proton spin $\boldsymbol{\sigma}_1$, $\boldsymbol{\sigma}_2$ configuration intermediate between that in a triplet and that in a singlet $[p\mu]$; in fact


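$$\tfrac34 = \langle(\tfrac12\boldsymbol{\sigma} + \tfrac12\boldsymbol{\sigma}_1 + \tfrac12\boldsymbol{\sigma}_2)^2\rangle_a = \tfrac34 + \langle(\tfrac12\boldsymbol{\sigma}_1 + \tfrac12\boldsymbol{\sigma}_2)^2 + 2(\tfrac12\boldsymbol{\sigma})\cdot(\tfrac12\boldsymbol{\sigma}_1 + \tfrac12\boldsymbol{\sigma}_2)\rangle_a$$

so that

$$\langle\boldsymbol{\sigma}\cdot\boldsymbol{\sigma}_1\rangle_a = \langle\boldsymbol{\sigma}\cdot\boldsymbol{\sigma}_2\rangle_a = -\langle(\tfrac12\boldsymbol{\sigma}_1 + \tfrac12\boldsymbol{\sigma}_2)^2\rangle_a = 0,\ -2$$

in para, ortho $[p\mu p]$ while $\langle\boldsymbol{\sigma}\cdot\boldsymbol{\sigma}_1\rangle_a = 1,\ -3$ in triplet, singlet $[p\mu]$. Further, since from Eqs. (33a) and (33b),

$$\Lambda^{(\mu)}(\tfrac12 \pm \tfrac12; {}_1\mathrm{H}^1) \sim \left(1 + \langle\boldsymbol{\sigma}\cdot\boldsymbol{\sigma}_1\rangle_a\,\frac{b^{(\mu)}}{a^{(\mu)}}\right)$$

the muon capture rate in para $[p\mu p]$, ortho $[p\mu p]$ will, in an analogous fashion, be

$$\sim \left(1 + \langle\boldsymbol{\sigma}\cdot\boldsymbol{\sigma}_1\rangle_{a;\ \mathrm{para,\ ortho}}\,\frac{b^{(\mu)}}{a^{(\mu)}}\right).$$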
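The spin expectation values quoted above follow from elementary angular momentum algebra, which the following sketch reproduces with exact fractions. It assumes only that $\langle\boldsymbol{\sigma}\cdot\boldsymbol{\sigma}_1\rangle$ can be obtained from the eigenvalues $S(S+1)$ of the squared total and partial spins; the function names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_squared(s):
    """Eigenvalue S(S + 1) of a squared angular momentum."""
    return s * (s + 1)


def sigma_dot_sigma1_atom(f_total):
    """<sigma.sigma_1> in [p mu] with total spin F = 1 (triplet) or 0."""
    # F^2 = S_mu^2 + S_p^2 + (1/2) sigma.sigma_1
    return 2 * (spin_squared(f_total) - 2 * spin_squared(HALF))


def sigma_dot_sigma1_molecule(s_protons, s_total=HALF):
    """<sigma.sigma_1> per proton in [p mu p]; proton-pair spin 0 or 1."""
    # S_total^2 = S_mu^2 + S_pp^2 + 2 S_mu.S_pp, with 2 S_mu.S_pp equal to
    # <sigma.sigma_1> when the two protons are treated symmetrically.
    return spin_squared(s_total) - spin_squared(HALF) - spin_squared(s_protons)


def rate_factor(sigma_dot_sigma1, b_over_a):
    """Spin-dependent factor 1 + <sigma.sigma_1> b/a of the capture rate."""
    return 1 + sigma_dot_sigma1 * b_over_a


print(sigma_dot_sigma1_atom(1), sigma_dot_sigma1_atom(0))
print(sigma_dot_sigma1_molecule(0), sigma_dot_sigma1_molecule(1))
```

The four values are 1 and $-3$ for the atom and 0 and $-2$ for the para and ortho molecule ion, so the para rate factor is exactly 1 whatever the value of $b^{(\mu)}/a^{(\mu)}$.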


It follows that at high N/V the muon capture rate has 


the form 
DIA 

n(—)[aeany(1- )| 
V 


+z(—)conae an] 


+x(=) [enwa] 


[a (Z )rssa( Z) 


ta(Z JEEN eso] se 34) 


6) 
1+ lo . O1)a; para, ortho —) 
K) 


a 


3h) 


qa) 


where $\kappa_1(N/V)$, $\kappa_2(N/V)$, $\kappa_3(N/V)$ are the fractions of muons found in singlet $[p\mu]$, para $[p\mu p]$ and ortho $[p\mu p]$ at the $N/V$ in question, with $\kappa_1(N/V) + \kappa_2(N/V) + \kappa_3(N/V) = 1$, and $\gamma$ is the ratio of the absolute square of the muon orbital wave function at the proton position in $[p\mu p]$ and in $[p\mu]$. Equations (33a), (33b), and (34) show that the muon capture rate in hydrogen, considered as a function of $N/V$, exhibits a maximum, $\cong 636$ sec$^{-1}$, at an intermediate value of $N/V$; if in addition $2\gamma \cong 1$ and $(\kappa_2(N/V)/\kappa_3(N/V)) \gg 1$ this rate falls to values $\cong 169$ sec$^{-1}$ at high $N/V$ as well as at very low $N/V$.
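The rise of the capture rate from about 169 sec$^{-1}$ toward 636 sec$^{-1}$ with increasing density can be illustrated by a two-state sketch. This is not the author's Eq. (34): it ignores molecule-ion formation, treats triplet-to-singlet conversion as one-way with a constant rate, neglects capture compared with decay when the populations are depleted, and works with the dimensionless ratio of conversion rate to decay rate because the density values in the text are not fully legible. The function name is illustrative.

```python
def time_averaged_capture_rate(conversion_over_decay,
                               triplet=13.0, singlet=636.0):
    """Capture rate averaged over the [p mu] population, in sec^-1.

    conversion_over_decay is the triplet-to-singlet conversion rate divided
    by the muon disappearance rate (both treated as constants).
    """
    if conversion_over_decay < 0:
        raise ValueError("rate ratio must be non-negative")
    # Fraction of the atoms' total lifetime spent in the triplet state when
    # 3/4 of them start there and leave it at rate (decay + conversion).
    triplet_time_fraction = 0.75 / (1.0 + conversion_over_decay)
    return singlet - (singlet - triplet) * triplet_time_fraction


for ratio in (0.0, 0.1, 1.0, 10.0, 1000.0):
    print(ratio, round(time_averaged_capture_rate(ratio), 1))
```

With no conversion the sketch returns the statistical average of Eq. (26), and for conversion much faster than decay it approaches the singlet rate of Eq. (33b). The fall-off at still higher density described in the text requires the molecule-ion fractions of Eq. (34) and is outside this sketch.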


R. M. Cantwell, Ph.D. ...

It is necessary for the physical relevance of the whole above discussion that the isotopically natural hydrogen in which the muons stop be purified of deuterium to an extent (factor $\sim$25-50 in relative abundance) sufficient to prevent any frequent occurrence of the "exchange" reaction $[p\mu] + d \to [d\mu] + p$. Any appreciable rate for such an exchange reaction with the subsequent formation of $[d\mu p]$ and the resultant "muon catalysis" of $d + p \to {}_2\mathrm{He}^3 + \gamma$ complicates the situation very considerably. On the other hand, a similar analysis may be given for muons stopping in pure deuterium with the major difference that at high $N/V$ the sequence of reactions:


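$$[d\mu] + d \to [d\mu d]; \qquad [d\mu d] \to \begin{cases} {}_2\mathrm{He}^3 + n + \mu^- \\ {}_1\mathrm{H}^3 + {}_1\mathrm{H}^1 + \mu^- \end{cases}$$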


is anticipated with most of the muons eventually being trapped in Bohr orbits about the $_2\mathrm{He}^3$ nuclei. Thus most muons stopping in pure liquid deuterium which are eventually captured by nuclei, are captured by nuclei of $_2\mathrm{He}^3$, a circumstance which permits study of muon capture in $_2\mathrm{He}^3$ without any anterior possession of a $_2\mathrm{He}^3$ target.

In concluding this section we remark again that the study of muon capture in hydrogen will not only ultimately yield the most reliable values of the effective muon-dressed nucleon coupling constants, $G_V^{(\mu)}$, $G_A^{(\mu)}$, $G_P^{(\mu)}$, i.e., values free of "nuclear physics" uncertainties, but will also shed light on several other interesting effects such as the collisional conversion in $[p\mu]$ and the formation of para $[p\mu p]$ and ortho $[p\mu p]$.


6. RADIATIVE MUON CAPTURE—TOTAL
RATE AND PHOTON-NEUTRINO
ANGULAR CORRELATION


We now discuss the process of radiative muon capture. Here calculations of the total rate and of the shape of the corresponding internal bremsstrahlung (I.B.) momentum spectrum have been made by Cantwell for capture by light and medium-heavy nuclei ($Z/137 \ll 1$). Cantwell uses the effective Hamiltonian of Eqs. (1a) and (1b) but with all nucleon recoil corrections, i.e., all terms $\sim\nu/m_p$, omitted so that $G_V^{(\mu)} = g_V^{(\mu)}$; $G_A^{(\mu)} = g_A^{(\mu)}$; $G_P^{(\mu)} = 0$. The radiative capture is visualized as predominantly due to the "two-step process"

$$\mu^- + p \to \gamma + (\mu^-) + p \to \nu + \gamma + n$$

and a second-order perturbation calculation, ... appropriate (free particle) Green's function ... the virtual intermediate $\mu^-$ states, is ... this way Cantwell obtains a relation ...


4 J. D. Jackson, reference 23.
... of radiative to nonradiative muon capture
(a b; 00)d(Yr0/Vra™*)ay1
4r? 137


exp= 20; T: exp(—tvba- 1s) 9(71) ; 


accessible states of the daughter nucleus; 
‘v1, Yba= Yoa Yı are the momenta of the penino 


€& (Es— E 
aan, (36c) 


(== 


My My 


e second square bracket describes 
I.B. momentum spectrum on the 
|b) of the parent and daughter 


1 (ya


W = (gv )?(1++v1-y1) | (| exp’ | a) |?+ (gaH) A- v1 Y) | (| (exp’)o| a) |? 


) (1—myinadnad | 


dy; 
me Sa'(v1,¥1)) + ((gv™)?— (g4 coy f Die -yı(1— Ia” (v1,71)) 


(gr )2+3 (ga?) (1— 94) 


lusion epbcple inhibition factor given in Eq. (6c) and ga (vnd 


Yba 
dyı 
Vy pe 


+ (ga)?2 Re(vı- (b| (exp’)o|a)*y1-(0| (exp’)o| a))| 


, (362) 
dy; 
Z| f cor lexpl ope (ea l (expela)1 | 
exp Yer epil Hri) rol); (expos Li exp ilow tre) rere Ga 


(exp)o= >; r: exp(— iva: ri) o(r:) 9%. 


nuclei; this second square bracket is however such for $\gamma_{ba}/\gamma_{ba}^{\max} = \gamma_{ba}/\nu_{ba} \cong 0$ and $\cong 1$ that the I.B. momentum spectrum has the allowed K-capture shape both near its low-energy end and its high-energy end. Thus a careful study of the shape of the I.B. momentum spectrum near its high-energy end will yield $\langle\gamma\rangle_a^{\max}$, the average value of $\gamma_{ba}^{\max}$ over the states $b$; since $\langle\gamma\rangle_a^{\max} = \langle\nu\rangle_a$ one will in this way find empirically the average value of the neutrino energy involved in nonradiative muon capture by a particular parent nucleus and so remove the major uncertainty associated with the evaluation of the coupling constant ratio $R$ [Eq. (6b)] from the experimental data [see Sec. 3, Eqs. (19), (20), et seq.].

We now apply the closure approximation for the evaluation of the sums over $b$ in Eq. (36a). By using the techniques of Eqs. (3b) to (6c), we obtain


(37a)
with
("act yey! 1—2 Ea tion energy of daughter nucleus b) _ ls ESS : 
My My My | 
Ya ha 
Va = 
(yay es 


As a rough first approximation to evaluate Eq. (37a), 
we take Ja’ (v1,71)==9e'(v1,71)Ia; in this case the 
I.B. momentum spectrum has the allowed K-capture shape $\sim(1 - x_a)^2 x_a$ and the I.B. photon-neutrino angular correlation function is


(gy™)?— (ga)? 
‘(on™ e 


Equation (38) is rigorous for radiative muon capture by protons and has been derived directly for this case by Huang, Yang, and Lee (H.Y. and L.). As a second approximation, we can use the techniques of Eqs. (7a) to (11b) and obtain for the heavier nuclei, $Z > 6$, $A > 12$

Sa! (v1,Y) Ia" (v1, =| —— | — 
1,71 1,71 2A A 


x [enem onons( E) n| Gw 


ane analogous to the result [Eqs. (6a), (11a), and 
(11b) ] 


a =(DIO) 


x[i )+ i | (39b) 


Equations (37a), (39a), and (39b) yield 
Araa™) (a; (Y)a)dxad Yı 
A® (a) 


a= | 


(ga™)?— (gy™)? | 
(gy™)?+3 (gaH) 


x|t—(1— TE + 


... shape for the I.B. momentum spectrum involves [as does the I.B. photon-neutrino angular correlation, Eq. (38)] the quantity $(g_A^{(\mu)}/g_V^{(\mu)})$ ...
Taking


(g/g H) = (g4 ®/gy®)= (1.21)? 4; a 
(d/r0)?= (1.47)°& (ôa) t, oe 

=3.0 [Eqs. (11b) and (14)]; a 

(rat = (v) ty, EE: 


0.67 
ro=—— [Eq. (11b) et seg.) 
My 


and (A—Z)/2A=4, Eq. (40) gives 


sete —)(- Je a -ayuda 


to the total nonradiative capture rate is — 


‘a(S 


Equations (40), (41), and (42) hold al 
muon capture by a proton upon or 
((y)e2*2/m,)= (3)? and of the 
—0.35/5—this has been shown di 
Eq. (42) by H.Y. and L. 
(37a) since for a 
Ga'(v1,71)=0, Fa’ “(vay 
Cantwell?® "has given, < 
of Eqs. (37a) to Gt ) v 


formi HOS 


in the case $\mu^- + {}_2\mathrm{He}^3 \to \nu + \gamma + {}_1\mathrm{H}^3$. This expression corresponds to a correction factor to the allowed K-capture shape $\cong[1 - 1.8(1 - x_a)x_a]$ (for $(g_A^{(\mu)}/g_V^{(\mu)})^2 = (g_A^{(\beta)}/g_V^{(\beta)})^2 = (1.21)^2$) rather than

$$\cong[1 - 0.35(1 - x_a)x_a]$$

as predicted by Eq. (41) for the heavier nuclei. We should also mention that all of the results in Eqs. (36a) to (42) refer, in the case $J_a \neq 0$, to the appropriate averages [as in Eq. (26)] of the various radiative and nonradiative muon capture rates over the two ($J_a + \tfrac12$, $J_a - \tfrac12$) hyperfine states of the parent mu-mesic atom.

The I.B. momentum spectrum of Eqs. (40) or (41) is susceptible to observational test: a plot of experimental values of $\Lambda_{\mathrm{rad}}^{(\mu)}(a; \langle\gamma\rangle_a)/[(1 - x_a)^2 x_a]$ vs the corresponding $(1 - x_a)x_a$ should, if our approximations are sufficiently accurate, lie on a straight line with a slope whose value yields $(g_A^{(\mu)}/g_V^{(\mu)})^2$ (subject to the uncertainty in the numerical value of $\delta_a$). This I.B. spectrum peaks at $x_a \cong \tfrac13$ or $\langle\gamma\rangle_a \cong 27$ Mev and still has very appreciable values at $\langle\gamma\rangle_a \cong 55$ Mev, i.e., at I.B. photon energies high compared to the energies of any numerous background photons. This fact should permit detection of the muon radiative capture in spite of the relative rarity of the phenomenon.
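The location of the spectrum peak can be checked with a short sketch. It assumes the allowed K-capture shape is $(1 - x_a)^2 x_a$, with $x_a$ the photon energy divided by its maximum, and takes that maximum to be the 80 Mev mean neutrino energy quoted in Sec. 3; the correction coefficient 0.35 is the heavier-nucleus value from the text, and the function names are illustrative.

```python
MEAN_NEUTRINO_ENERGY_MEV = 80.0  # <nu>_a quoted in Sec. 3


def allowed_shape(x):
    """Allowed K-capture shape (1 - x)^2 x, x = photon energy / maximum."""
    return (1.0 - x) ** 2 * x


def corrected_shape(x, coefficient=0.35):
    """Allowed shape times the correction factor [1 - c (1 - x) x]."""
    return allowed_shape(x) * (1.0 - coefficient * (1.0 - x) * x)


def peak_position(shape, steps=30000):
    """Locate the maximum of shape(x) on a uniform grid over [0, 1]."""
    best = max(range(steps + 1), key=lambda i: shape(i / steps))
    return best / steps


x_peak = peak_position(allowed_shape)
print(f"allowed shape peaks at x = {x_peak:.3f}, "
      f"i.e. {x_peak * MEAN_NEUTRINO_ENERGY_MEV:.0f} Mev")
print(f"corrected shape peaks at x = {peak_position(corrected_shape):.3f}")
print(f"height at 55 Mev relative to peak: "
      f"{allowed_shape(55.0 / 80.0) / allowed_shape(x_peak):.2f}")
```

The uncorrected shape peaks at $x_a = 1/3$, about 27 Mev, and at 55 Mev it is still a little under half of its peak height, consistent with the statement that the spectrum remains appreciable there. The correction factor shifts the peak only slightly.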

In concluding the present section it is important to emphasize the interest of a calculation of radiative muon capture using an effective Hamiltonian which includes all nucleon recoil corrections (terms $\sim\nu/m_p$) and in particular contains the "induced" pseudoscalar and the "conserved vector current" anomalous magnetic moment contributions. Such a calculation is now being carried out by Bernstein and is expected to exhibit additional terms in the correction factor to the allowed K-capture shape for the I.B. momentum spectrum.


7. PARITY NONCONSERVATION EFFECTS


The anticipated presence of parity nonconservation effects in maximum amounts in the muon capture process is incorporated into the effective Hamiltonian of Eq. (1a) through the assumption that the emitted neutrino carries unit negative helicity; this is expressed mathematically by the (two-component neutrino coupling type) spin projection operator $(1 - \boldsymbol{\sigma}\cdot\boldsymbol{\nu}_1)/\sqrt{2}$ in the $H_{\mathrm{eff}}^{(\mu)}$ of Eq. (1a). No direct experimental test
Meee eee 


Mb, Ma 
a(a— b)= 


Mb, Ma 


O ee 
DS {Gr™)?| bl exp| a)|?+ 33 (Ga) (Ge)? 2G4Gp™) | (b| (exp)e| a)l?)
of this assumption is as yet available for muon capture but it has now been established that neutrinos emitted in the analogous processes of electron orbital capture and positron beta decay do possess a helicity $= -1$.

It would clearly be of great interest to observe parity nonconservation effects in muon capture and the present section is devoted to a discussion of four phenomena in which pseudoscalar quantities are to be measured; cf. Eqs. (44), (53), (57), and (59).


(a) Angular Distribution of Recoil Nuclei in 
Capture of Polarized Muons 


Experimental evidence is now available that negative muons still retain an appreciable fraction of their spin polarization at the instant of decay or capture from the lowest Bohr orbit of the parent mu-mesic atom. This evidence is based on the observation of an anisotropic angular distribution of the decay electrons relative to a unit vector, $\mathbf{s}_{\mu;1}$, in the direction of the muon spin,


(43a) 
(43b) 


T— Pe (8.,1° Pel; 1), 
(Sy; Ty Pneg. muon; 1) = AF 12° 


and corresponds to a residual muon polarization at the instant of decay or capture, $P_\mu$, of about 15 to 20% for the case of various spin zero parent nuclei. The angular distribution of recoil daughter nuclei in a particular state, the recoils being formed in muon capture by spin zero parent nuclei, is then also expected to exhibit an anisotropy relative to $\mathbf{s}_{\mu;1}$, viz.,

$$1 + P_\mu\,\alpha(a \to b)\,(\mathbf{s}_{\mu;1}\cdot\mathbf{p}_{\mathrm{rec};1}), \tag{44}$$

where $\mathbf{p}_{\mathrm{rec};1}$ is a unit vector in the direction of the recoiling nucleus and the anisotropy coefficient, $\alpha(a \to b)$, is a quantity involving the muon capture nuclear matrix elements of Eq. (2b). To avoid complications associated with the "hyperfine" effect (Sec. 5) we confine our discussion until further notice to the case of zero spin parent nuclei; this case is in addition distinguished by a lack of hyperfine-structure induced muon depolarization which, for example if $J_a = \tfrac12$, cuts down the otherwise effective value of $P_\mu$ by a factor of at least 2 [see Eqs. (52b) to (52d)].

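Calculation of $\alpha(a \to b)$ on the basis of the $H_{\mathrm{eff}}^{(\mu)}$ of Eq. (1a) yields

$$\alpha(a \to b) = \frac{\sum_{M_b,M_a}\{(G_V^{(\mu)})^2\,|(b|\exp|a)|^2 + \tfrac13(-(G_A^{(\mu)})^2 + (G_P^{(\mu)})^2 - 2G_A^{(\mu)}G_P^{(\mu)})\,|(b|(\exp)\boldsymbol{\sigma}|a)|^2\}}{\sum_{M_b,M_a}\{(G_V^{(\mu)})^2\,|(b|\exp|a)|^2 + \tfrac13(3(G_A^{(\mu)})^2 + (G_P^{(\mu)})^2 - 2G_A^{(\mu)}G_P^{(\mu)})\,|(b|(\exp)\boldsymbol{\sigma}|a)|^2\}} \tag{45}$$

subject to the assumption that the wave functions $|a; J_a = 0^+)$, $|b; J_b)$ are such that

$$\sum_{M_b,M_a} |(b|(\exp)\boldsymbol{\sigma}\cdot\boldsymbol{\nu}_1|a)|^2 = \tfrac13 \sum_{M_b,M_a} |(b|(\exp)\boldsymbol{\sigma}|a)|^2.$$

Formulas equivalent to that in Eq. (45), but without inclusion of the anomalous magnetic moment terms in the corresponding $H_{\mathrm{eff}}^{(\mu)}$ and sometimes without the term in $G_P^{(\mu)}$, have been given by Ioffe,$^{31}$ H.Y. and L., Shapiro, Dolinsky, and Blokhintsev,$^{32}$ Wolfenstein,$^{33}$ Überall,$^{34}$ Treiman,$^{35}$ and Fulton.$^{36}$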

In a case such as the ground state ($J_a = 0^+$) to ground state ($J_b = 1^+$) transition in $\mu^- + {}_6\mathrm{C}^{12} \to \nu + {}_5\mathrm{B}^{12}$ (Sec. 4) the spin independent nuclear matrix element $(b|\exp|a)$ [Eq. (36b)] vanishes and Eq. (45) becomes [using also Eqs. (1b) and (1c) with $\nu_{ba} = 0.86\,m_\mu$ [Eq. (2c)], and $(g_A^{(\beta)}/g_V^{(\beta)}) = -1.21$]

$$\alpha(a \to b) = \frac{-(G_A^{(\mu)})^2 + (G_P^{(\mu)})^2 - 2G_A^{(\mu)}G_P^{(\mu)}}{3(G_A^{(\mu)})^2 + (G_P^{(\mu)})^2 - 2G_A^{(\mu)}G_P^{(\mu)}} = -0.73. \tag{46}$$

Thus an accurate measurement of the angular distribution of the ground state $_5\mathrm{B}^{12}$ recoils, $P_\mu$ being determined by a parallel measurement of the muon decay electron angular distribution [Eq. (43a)], would yield information about the ratio $G_P^{(\mu)}/G_A^{(\mu)}$ and hence about the ratio $g_P^{(\mu)}/g_A^{(\beta)}$ [Eq. (1c) et seq.]. The corresponding $\alpha(a \to b)$ [Eq. (46)] is, fortunately, quite sensitive to the exact value of $g_P^{(\mu)}/g_A^{(\beta)}$ and to the omission or inclusion of the terms $\sim(\mu_p - \mu_n)$ being, for example, $-0.33$ if $g_P^{(\mu)}/g_A^{(\beta)} = 0$ and if the terms $\sim(\mu_p - \mu_n)$ are absent.
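The sensitivity of the anisotropy coefficient to the coupling constants can be illustrated by rewriting Eq. (46) in terms of the single ratio $r = G_P^{(\mu)}/G_A^{(\mu)}$. The sketch below assumes that form and takes the quoted value of Eq. (46) to be $-0.73$ (the sign is read as negative, in line with the $-0.33$ limit); it does not evaluate Eqs. (1b) and (1c), and the function names are illustrative.

```python
import math


def alpha_partial(gp_over_ga):
    """Eq. (46) rewritten in terms of r = G_P / G_A."""
    r = gp_over_ga
    return (-1.0 + r * r - 2.0 * r) / (3.0 + r * r - 2.0 * r)


def ratios_for_alpha(alpha):
    """Both values of r = G_P / G_A that give a chosen alpha in [-1, 1)."""
    if not -1.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [-1, 1)")
    # From (1 - alpha) (r^2 - 2 r) = 1 + 3 alpha, i.e. (r - 1)^2 = 2(1 + alpha)/(1 - alpha).
    root = math.sqrt((2.0 + 2.0 * alpha) / (1.0 - alpha))
    return 1.0 - root, 1.0 + root


print(round(alpha_partial(0.0), 2))
print([round(r, 2) for r in ratios_for_alpha(-0.73)])
```

For $r = 0$ the coefficient is $-1/3$, and a value of $-0.73$ corresponds to $r \approx 0.44$ (the second algebraic root, $r \approx 1.56$, is listed only for completeness), which shows how strongly the anisotropy depends on the induced pseudoscalar term.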
We now apply the closure approximation to find the anisotropy coefficient, $\alpha(a)$,


_ Lok (Gv™)?| (| exp|a)|?+-3(— (Ga™)?+ Gr™)?—2G.Gp™) | (b| (exp)o|a)|?} 


“COS (Gr Olexpl ol +4(G(Gx)+(Gr)—26,0°G| 6] (epl 
_ (Gv) (a| Lexp at Lexp a] a)+3(— (Ga) (Gr)? 26.4 “Gp™)(a|[(exp)o_].*-[(exp)o Jala), (47a) 
= (Gy )2a| [exp ]atLexp Ja|a)+3(3(Ga™)?+ (Gr™)?— 2G.4Gp) (a| [exp)o Ja -[(exp)o ]a| a) ? : 
sher 
where Cexph= di r: exp(—i)an-n)o(r); Llexp)ob=Xi r: exp(—ilr)avi rs) (rs) os (47b) 


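which enters into the recoil nucleus angular distribution, $1 + P_\mu\,\alpha(a)\,(\mathbf{s}_{\mu;1}\cdot\mathbf{p}_{\mathrm{rec};1})$, appropriate to the total muon capture rate by the parent nucleus. Employing the techniques of Eqs. (4a) to (11b) then gives, for the heavier nuclei, $Z > 6$, $A > 12$ {using also Eqs. (1b) and (1c) with $\langle\nu\rangle_a = 0.75\,m_\mu$ [Eq. (3d)] and $g_A^{(\beta)}/g_V^{(\beta)} = -1.21$}

$$\alpha(a) = \frac{(G_V^{(\mu)})^2 - (G_A^{(\mu)})^2 + (G_P^{(\mu)})^2 - 2G_A^{(\mu)}G_P^{(\mu)}}{(G_V^{(\mu)})^2 + 3(G_A^{(\mu)})^2 + (G_P^{(\mu)})^2 - 2G_A^{(\mu)}G_P^{(\mu)}} = -0.39. \tag{48}$$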


Further, if the daughter nucleus is unbound even in its 
ground state, e.g., 


paar Megis"6 = PAP {Na’ Sar? Nann}, (49) 


it is not unreasonable to suppose that in the great 


31 B. L. Ioffe, J. Exptl. Theoret. Phys. (U.S.S.R.) 33, 308 (1957).

32 Shapiro, Dolinsky, and Blokhintsev, Nuclear Phys. 4, 273 (1957).

33 L. Wolfenstein, Nuovo cimento 7, 706 (1958).

34 H. Überall, Nuovo cimento 6, 533 (1957).

35 S. B. Treiman, Phys. Rev. 110, 448 (1958).

36 T. Fulton, Nuclear Phys. 6, 319 (1958).

... neutron carries off most of the available recoil momentum. Under such circumstances the angular distribution of the recoiling neutrons is, approximately,

$$1 + P_\mu\,\alpha(a)\,(\mathbf{s}_{\mu;1}\cdot\mathbf{p}_{\mathrm{neu};1}); \tag{50}$$

where $\alpha(a)$ is as in Eq. (48) and $\mathbf{p}_{\mathrm{neu};1}$ is a unit vector in the recoil neutron direction. Überall has considered, on the basis of a Fermi gas model, the effective interaction of the recoiling neutron with the remaining nucleons and has concluded that this effective interaction is probably not sufficiently strong to distort appreciably the angular distribution in Eq. (50).
We now consider recoil anisotropy for the case of ...

Here the anisotropy coefficients are different in the two different hyperfine states of the parent mu-mesic atom, i.e., one must distinguish between $\alpha(J_a + \tfrac12; a)$ and $\alpha(J_a - \tfrac12; a)$. As a general rule it is obvious ... atom hyperfine state is then a singlet ... spherical. As a particular illustration ... formulas for the case of the hydrogen ... $[p\mu]$. Here, by the remark ...

$$\alpha(\tfrac12 - \tfrac12; {}_1\mathrm{H}^1) = 0 \tag{51a}$$

and, as shown in a calculation by Bernstein and Primakoff,
$((Gy“+Ga4 (u)) 2 (Gp™)?—2(GyM+Ga Gp | 


SS o C 
Se eres: 80 (Gy+G4)2+ (Gp)2—2(GyM+G64)Gp] 


(51b) 


Thus at low $({}_1\mathrm{H}^1)_2$ molecule number densities, where there is no appreciable collisional hyperfine triplet to singlet conversion (Sec. 5), the angular distribution of the recoil neutrons is:


GAMES; Hy)[1+ P,a(3+4 ; Hr!) (sy;1- Pnou;1) hre KAE ; Hy’) 


(52a) 


A“ (H,') 
or {using also Eqs. (33a), (33b), (27b), (1b), and (1c) (with $\nu_{ba} = 0.94\,m_\mu$ [Eq. (2c)], and $(g_A^{(\beta)}/g_V^{(\beta)}) = -1.21$)},
12- [(Gy“+Ga w) 24 (Gp) ?2— 2(Gy™+G,4 )Gp™ ] 


while the corresponding muon decay electron angular 
distribution becomes 


Py 
Sa 3 (Su; 1° ol; 1)- (52d) 


$P_\mu$ must be interpreted in Eqs. (52a) to (52d) as what the residual muon polarization would be, at the instant of decay or capture, if the parent proton did not create any hyperfine-structure induced depolarization in the muon $1s$ orbit. It is seen that the numerical values of $G_V^{(\mu)}$, $G_A^{(\mu)}$, $G_P^{(\mu)}$ are such that the recoil neutron distribution is practically isotropic; this is expected since $(\Lambda^{(\mu)}(\tfrac12 + \tfrac12; {}_1\mathrm{H}^1)/\Lambda^{(\mu)}(\tfrac12 - \tfrac12; {}_1\mathrm{H}^1)) \ll 1$ [Eqs. (33a) and (33b)]. At higher $({}_1\mathrm{H}^1)_2$ molecule densities all the $[p\mu]$ are collisionally converted into the hyperfine singlet state (Sec. 5), so that as already noted above and as first pointed out by Gershtein and Zeldovich, the recoil neutron distribution is here certainly expected to be isotropic.


(b) Angular Distribution of Photons in Radiative 
Capture of Polarized Muons 


The internal bremsstrahlung (I.B.) photons emitted 
in the radiative capture of polarized muons are also 


(53) 


i tum vector. 
f = yta Y1 is the I.B. photon momen ctor, 
ere fe) the corresponding anisotropy coefficient, 
: ns mentioned in Sec. 7(a), the dis- 


> 


37 Bernstein and H. Primakoff (to be published).
38 R. E. Cutkosky, Phys. Rev. 107, 330 (1957).


Su;1° Pneu; 1) (52b) 


2 [(GyY2+3(Ga™)?+ (Gp)2— 264 Gp] f 


P, 
= 1 [0.01 (ss 1Pneu;1); (52c) : 


cussion is again confined to the case of zero spin parent
nuclei.

The anisotropy coefficient $\beta(a\to b;\gamma_{ba})$ of Eq. (53)
may be directly calculated or may be obtained on the
basis of a theorem of Cutkosky³⁸ which shows that this
$\beta(a\to b;\gamma_{ba})$ is numerically equal to the helicity of a
massless positron emitted in the beta-decay process
$|a;Z,A\rangle \to |b;Z-1,A\rangle + e^+ + \nu$ ($Z/137 \ll 1$; beta-decay
process allowed or forbidden). Now in a beta-decay
theory described by an effective Hamiltonian of the
type of Eq. (1a) with $G_V^{(\mu)} \to g_V^{(\beta)}$, $G_A^{(\mu)} \to g_A^{(\beta)}$,
$G_P^{(\mu)} \to 0$ the helicity of such a massless positron is +1.
Thus in a muon radiative capture theory with an
$H_{\mathrm{eff}}^{(\mu)}$ characterized by $G_V^{(\mu)} = g_V^{(\mu)}$, $G_A^{(\mu)} = g_A^{(\mu)}$,
$G_P^{(\mu)} = 0$, i.e., with an $H_{\mathrm{eff}}^{(\mu)}$ in which all nucleon recoil
effects (terms $\sim\nu/m_p$) are omitted, one has


$$\beta(a\to b;\gamma_{ba}) = 1. \tag{54}$$


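Thus, summing over all the energetically accessible
states of the daughter nucleus, we obtain


$$\beta(a;\langle\gamma\rangle_a) = \frac{\sum_b \beta(a\to b;\gamma_{ba})\,\Lambda_{\mathrm{rad}}^{(\mu)}(a\to b;\gamma_{ba})}{\sum_b \Lambda_{\mathrm{rad}}^{(\mu)}(a\to b;\gamma_{ba})} = 1, \tag{55}$$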


which last pair of equations have also been explicitly
derived by H.Y. and L. and by Bernstein.

There is now very considerable interest in a calculation
of $\beta(a\to b;\gamma_{ba})$, $\beta(a;\langle\gamma\rangle_a)$, where one includes
nucleon recoil terms $\sim\nu/m_p$ in the appropriate effective
Hamiltonian, i.e., includes in particular the induced
pseudoscalar and the conserved vector current anomalous
magnetic moment contributions; such a calculation is being
carried out by Bernstein. The effect of the pseudoscalar
term can be foreseen qualitatively on the basis of
Cutkosky's theorem, since it is known that with an
effective V, A, P beta-decay interaction the helicity of a
massless positron is less than unity; hence in a theory with
$g_P^{(\mu)} \neq 0$ the calculated values of $\beta(a\to b;\gamma_{ba})$,
$\beta(a;\langle\gamma\rangle_a)$ will also be less than unity. Thus any
measurement of the I.B. photon angular distribution
sufficiently accurate to fix a reliable value for
$(1-\beta(a;\langle\gamma\rangle_a))$ will, upon comparison with Bernstein's
theoretical expression for $\beta(a;\langle\gamma\rangle_a)$, yield an
"experimental" value of $g_P^{(\mu)}$ which can be compared with
the Goldberger-Treiman-Wolfenstein theoretical value of
$8g_A^{(\beta)}$ [Eq. (1c)].


(c) Polarization of Recoil Nuclei in 
Muon Capture 


The recoil daughter nuclei formed in muon capture 
are in general polarized even when the polarization of 
the muon itself at the instant of capture is negligible. 
This effect has been discussed in a fairly general way 


3 AW (4-4; a (Gy™+G64)2+ ee] 1A“ (3—4; Hi) 
3 (Gy +G4™)?+ (Gp™)?— 3 (Gy +G) Gp 


4 A (H,!) 


50 Eee 
(Gy)2+3(Ga)2-+ (Gp)?—2G4Gp 


This is very close to −1, as is indeed expected from the
fact that $[\Lambda^{(\mu)}(\tfrac12+\tfrac12;\mathrm{H}^1)/\Lambda^{(\mu)}(\tfrac12-\tfrac12;\mathrm{H}^1)] \ll 1$ [Eqs.
(33a) and (33b)].

In the calculation given by Treiman³⁵ and by Fulton³⁶
the $H_{\mathrm{eff}}^{(\mu)}$ of Eq. (1a) with $G_V^{(\mu)} = g_V^{(\mu)}$, $G_A^{(\mu)} = g_A^{(\mu)}$,
$G_P^{(\mu)} = 0$, has been used; also, for the reasons mentioned
in Sec. 7a, the discussion should be confined to the case
of zero spin parent nuclei. Making the additional
assumption that the wave functions $|a; J_a=0^+\rangle$, $|b; J_b\rangle$
of the parent and daughter nuclei are such that the
s-wave part of $\exp(-i\boldsymbol{\nu}_{ba}\cdot\mathbf{r}_i)$ predominates over the
d wave, g wave, ... parts in the nuclear matrix elements,
so that $J_b=1^+$, one may calculate the polarization
of the recoil daughter nuclei formed in a particular
state with $J_b=1^+$. Since in a $J_a=0^+ \to J_b=1^+$ transition,
as for example μ⁻ + C¹² → ν + (B¹²) ground state, the
spin independent nuclear matrix element $\langle b|\exp|a\rangle$
[Eq. (36b)] vanishes, the spin dependent nuclear
matrix element $\langle b|(\exp)\boldsymbol{\sigma}|a\rangle$ [Eq. (36b)] necessarily
cancels out of the expression for the recoil polarization
and this becomes


nî 


2 — Preo;:1 H PuSp;1 


Joa ( Pro: D) 
( , if i ) 3 1—4P,Sp;1Preo;1 


(S7) 


Thus the recoil polarization is (anti)parallel to $\mathbf{p}_{\mathrm{rec};1}$
only if the muon polarization at the instant of capture,
$P_\mu$, vanishes. One can also obtain the recoil polarization
averaged over all possible directions of recoil, $\{\langle\mathbf{J}_{b;1}\rangle\}_{\mathrm{av}}$;

by Treiman³⁵ and by Fulton,³⁶ and for the particular
case of the hydrogen mu-mesic atom by Gershtein and
Zeldovich and by Bernstein and Primakoff.³⁷ For
[pμ] in the hyperfine singlet state Gershtein and
Zeldovich point out that conservation of angular
momentum and the assumption that the neutrino has
unit negative helicity ensure, independent of the
magnitudes of the various coupling constants in $H_{\mathrm{eff}}^{(\mu)}$,
that the helicity of the recoil neutron is −1. For [pμ]
in the hyperfine triplet state an explicit calculation by
Bernstein and Primakoff³⁷ shows that the recoil neutron
helicity is


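$$\frac{1}{3}\,\frac{(G_V^{(\mu)}+G_A^{(\mu)})^2+(G_P^{(\mu)})^2-6(G_V^{(\mu)}+G_A^{(\mu)})G_P^{(\mu)}}{(G_V^{(\mu)}+G_A^{(\mu)})^2+(G_P^{(\mu)})^2-\tfrac{2}{3}(G_V^{(\mu)}+G_A^{(\mu)})G_P^{(\mu)}}, \tag{56a}$$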


so that at low (H¹)₂ molecule number densities the
over-all recoil neutron helicity is {using also Eqs. (33a),
(33b), (27b), (1b), and (1c) (with $\langle\nu\rangle_a = 0.94\,m_\mu$ [Eq.
(2c)] and $(g_A^{(\beta)}/g_V^{(\beta)}) = -1.21$)}


ae a en en RO 


= 1 
A hoe) © i 


Gy G4 — (G4)?—Gy Gp 


= −0.99. (56b)


this turns out to be (Jackson, Treiman, and Wyld)
{(Jo;1)}w= FP Sy; (58) 


and is just what $\langle\mathbf{J}_{b;1}(\mathbf{p}_{\mathrm{rec};1})\rangle$ would be in a theory with
a parity conserving $H_{\mathrm{eff}}^{(\mu)}$. The quantity $\{\langle\mathbf{J}_{b;1}\rangle\}_{\mathrm{av}}$ has
recently been measured by Love, Marder, Nadelhaft,
Siegel, and Taylor on the basis of the observation of
the angular anisotropy, relative to $\mathbf{s}_{\mu;1}$, of the decay
electrons of the daughter nucleus (B¹²) ground state:
$\{\langle\mathbf{J}_{b;1}\rangle\}_{\mathrm{av}}$ was found to be positive if one identified the
directions $\mathbf{s}_{\mu;1}$ and Preg. muon;1 [Eq. (43b)] while its
magnitude was appropriate to a reasonable amount of
depolarization of the B¹² by hyperfine interaction with
its atomic electrons.


(d) Polarization of Photons in
Radiative Muon Capture


The internal bremsstrahlung (I.B.) photons emitted 
in radiative muon capture are circularly polarized 
independent of any residual polarization of the muon 
itself. Cutkosky,*® H.Y. and L.” and Bernstein?! have 
shown that the degree of circular polarization, 
B'(a— b; Yea) of any I.B. photon with momentum ie Ea 
B'(a— b; Ye.)= +1 for complete right-hand, left-hand 

Jackson, Treiman, and Wyld, Phys. Rev. 107, 327 (1957).

Love, Marder, Nadelhaft, Siegel, and Taylor, Phys. Rev.
Letters 2, 107 (1959).


a— b; a) which is the corresponding 
coefficient of the I.B. photon momentum 
distribution [Eq. (53)]. The conclusions 
Sec. (7b) about B(a— b; Yia), Bla; (Ya) 
n be applied to the corresponding quantities 


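$$\beta'(a;\langle\gamma\rangle_a) = \frac{\sum_b \beta'(a\to b;\gamma_{ba})\,\Lambda_{\mathrm{rad}}^{(\mu)}(a\to b;\gamma_{ba})}{\sum_b \Lambda_{\mathrm{rad}}^{(\mu)}(a\to b;\gamma_{ba})}, \tag{59}$$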


so that in particular one expects deviations from 
complete right circular polarization of the I.B. photons 
only to the extent that the induced pseudoscalar inter- 
action is present. 
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REVIEWS OF MODERN PHYSICS 


Strange Particle Decay Processes and the Fermi
Interaction*
R. H. Dalitz


Enrico Fermi Institute for Nuclear Studies, University of Chicago, Chicago, Illinois


1. INTRODUCTION 


It recently has become apparent that the processes
of beta decay, muon decay, and nuclear capture of
μ⁻ mesons may each be well described in terms of a
four-fermion coupling of the form V−A, each of these
couplings having a strength $g m_\pi^2 \sim 10^{-7}$. Feynman and
Gell-Mann¹ have suggested that these interactions may
be parts of a more general weak interaction expressed
in terms of a weak interaction current $J_\mu$, this general
interaction having the form


$$H \propto J_\mu^{\dagger} J_\mu, \tag{1.1}$$
where the current $J_\mu$ consists of the sum
$$J_\mu = g^{1/2}\,\bar\nu\gamma_\mu(1+\gamma_5)e + g^{1/2}\,\bar\nu\gamma_\mu(1+\gamma_5)\mu + g^{1/2}\,\bar n\gamma_\mu(1+\gamma_5)p + \cdots. \tag{1.2}$$


These authors, and earlier Gershtein and Zeldovich,²
have suggested further that the terms of $J_\mu$ which
involve strongly interacting particles, but allow no
change of strangeness, may obey a conservation
principle.

As a result of these successes of the four-fermion
interaction, attention is confined here to a survey of the
possibility of accounting for strange particle decays by
the addition of further terms to the current $J_\mu$, terms
which have the same form as those of Eq. (1.2) but
which do not conserve strangeness. There are many
possibilities for additional terms of this kind, for
example $\bar\Lambda\gamma_\mu(1+\gamma_5)p$, $\bar\Sigma\gamma_\mu(1+\gamma_5)n$, $\bar\Xi\gamma_\mu(1+\gamma_5)\Lambda$, etc.
The coupling of these terms to those of (1.2) which
involve strongly interacting particles and no change in
strangeness then leads directly to the pionic decay
modes which are observed for the hyperons; their
coupling to the leptonic terms of (1.2) leads to leptonic
decay modes for the hyperons. The decay of K mesons
is then interpreted as due to their coupling to virtual
hyperon-nucleon pairs which allow weak decays for the
K mesons through the four-fermion couplings of the
expression (1.2), for example

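$$K^+ \xrightarrow[\text{strong}]{} p + \bar\Lambda \xrightarrow[\text{weak}]{} \mu^+ + \nu, \tag{1.3a}$$
$$K^+ \xrightarrow[\text{strong}]{} p + \bar\Lambda \xrightarrow[\text{weak, strong}]{} \pi^+ + \pi^+ + \pi^-. \tag{1.3b}$$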


* Work done under the auspices of the U. S. Atomic Energy 
Commission. 
1 R. Feynman and M. Gell-Mann, Phys. Rev. 109, 193 (1958).
2 S. Gershtein and J. Zeldovich, Soviet Phys. JETP 2, 576 (1957).

The possibility of a qualitative account of the decay
modes of hyperons and K particles on the basis of such
four-fermion couplings has been known for some time.
It was first put forward by Dallaporta³ and by
Gell-Mann,⁴ and has been discussed in some detail by
Gell-Mann and Rosenfeld⁵ in their recent review article.

At this stage it is necessary to inquire to what extent
this scheme offers the possibility of accounting
quantitatively⁶ for the branching ratios and detailed
characteristics of these decay modes. However, such a complete
program is not carried through here. Our present purpose
is simply to discuss the degree of agreement between
the data and those expectations from this model
which do not depend on detailed theoretical calculations
of its consequences.


2. PIONIC DECAY MODES FOR STRANGE PARTICLES
First the Λ decay modes,


$$\Lambda \to p + \pi^-, \tag{2.1a}$$
$$\Lambda \to n + \pi^0, \tag{2.1b}$$


are considered. These modes are regarded as the result
of weak interactions connecting $(\bar\Lambda p)$ and $(\bar n p)$. With
a parity nonconserving weak interaction, each of these
modes requires two parameters $(s_-,p_-)$ and $(s_0,p_0)$,
which denote the amplitudes for emission of s- and
p-wave pions in the π⁻- and π⁰-decay processes (2.1),
respectively. Assuming time-reversal invariance to be
valid for both strong and weak interactions, the phases
of these amplitudes arise only from the scattering in the
final pion-nucleon state, as first pointed out by Takeda.⁷
Since the relevant pion-nucleon phase shifts are known
to be small at the energy of Λ decay, the parameters s


3 N. Dallaporta, Nuovo cimento 1, 962 (1953); G. Costa and
N. Dallaporta, Nuovo cimento 2, 519 (1955).
4 M. Gell-Mann, Proceedings of the Sixth Annual Rochester
Conference on High-Energy Physics (Interscience Publishers,
New York, 1956).
5 M. Gell-Mann and A. H. Rosenfeld, Ann. Rev. Nuclear Sci.
7, 407 (1957).
6 The absence of evidence for the beta decay of the A pi 
appeared a stumbling block for this scheme for a long tim 
ever this process has now been observed by Crawfor 
Good, Kalbfleisch, Stevenson, and Ticho [Phys Rev. 
377 (1958)] and by Nordin, Orear, Reed, Rosenfeld 
Taft, and Tripp [Phys. Rev. Letters 1, 380 (1958). Thi 
rate suggests (cf. Sec. 4) that the term of J, associat 
rocess may have an amplitude of order 0.3 relative 
ta Seay of the neutron. > 
7 G. Takeda, Phys. Rev. 101, 1547 (1956).
FIG. 1. The simplest graphs leading to pionic Λ decay
through the Fermi interaction.


and p will be assumed real. The experimental information
bearing on these parameters is at present limited
to the following⁸:


$$(s_0^2+p_0^2)/(s_-^2+p_-^2) = 0.59 \pm 0.07, \tag{2.2a}$$
$$\alpha_- = 2s_-p_-/(s_-^2+p_-^2) \geq 0.73 \pm 0.14, \tag{2.2b}$$


the first representing the branching ratio between the
modes (2.1), the second representing the information
available on the up-down asymmetry in the decay of
polarized Λ particles. The condition (2.2b) limits $p_-/s_-$
to lie between the limits


$$0.45 \leq |p_-/s_-| \leq 2.25. \tag{2.3}$$
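The step from an asymmetry bound to limits on $|p_-/s_-|$ can be checked numerically. The sketch below is illustrative only: it assumes the asymmetry has the form $\alpha = 2x/(1+x^2)$ with $x = p_-/s_-$, as in Eq. (2.2b), and the function names are hypothetical. With the central value 0.73 taken as the bound, the allowed interval comes out at roughly 0.43 to 2.31; the limits 0.45 and 2.25 quoted in (2.3) correspond to an asymmetry of about 0.74 to 0.75.

```python
import math

def asymmetry(x):
    """alpha = 2 s p / (s^2 + p^2), written in terms of x = p/s."""
    return 2.0 * x / (1.0 + x * x)

def ratio_bounds(alpha_min):
    """Interval of x = |p/s| on which asymmetry(x) >= alpha_min (0 < alpha_min <= 1).

    The endpoints are the two roots of alpha_min * x^2 - 2 x + alpha_min = 0.
    """
    root = math.sqrt(1.0 - alpha_min ** 2)
    return (1.0 - root) / alpha_min, (1.0 + root) / alpha_min

lo, hi = ratio_bounds(0.73)  # 0.73 is the central value read from Eq. (2.2b)
print(f"alpha >= 0.73  ->  {lo:.2f} <= |p/s| <= {hi:.2f}")
print(f"alpha at |p/s| = 0.45: {asymmetry(0.45):.3f}")
print(f"alpha at |p/s| = 2.25: {asymmetry(2.25):.3f}")
```

Because the two roots multiply to unity, the allowed interval is always symmetric under $x \to 1/x$, which is why the limits in (2.3) are nearly reciprocal.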


Calculation of the amplitudes for the processes (2.1)
from the four-fermion weak couplings of the type
$(\bar\Lambda p)(\bar p n)$ would involve consideration of many complicated
radiative corrections arising from the strong pion
couplings of these fermions, a task beyond our present
ability. For the purpose of orientation, we confine
attention to the simplest possible graphs leading to
these decay processes, those shown in Fig. 1. For this
graph the amplitude for process (2.1a) takes the form


$$\sqrt{2}\,C\,(\bar\Lambda\gamma_\mu(1+\gamma_5)p)\,\partial\phi_\pi/\partial x_\mu$$
$$= \sqrt{2}\,C\{(m_\Lambda-m)+(m_\Lambda+m)\,\boldsymbol{\sigma}\cdot\mathbf{q}/2m\}, \tag{2.4}$$


in the nonrelativistic limit appropriate to the low
energy release in Λ decay, where q denotes the pion
momentum. This crude estimate leads to a ratio
$p_-/s_- = (m_\Lambda+m)q/2m(m_\Lambda-m) = 0.64$, corresponding to
a value $\alpha_- = +0.9$ for the polarization coefficient in Λ
decay. This value of $p_-/s_-$ lies within the limits (2.3)
from the up-down asymmetry observed⁸ in polarized
Λ-particle decay and corresponds to the sign recently⁹
deduced for α from the polarization observed for protons
resulting from the decay of unpolarized Λ particles.
Such a large value of $p_-/s_-$ is characteristic of a coupling
which involves V coupling with the chirality factor

8 ... Proceedings of the 1958 Annual International
Conference on High Energy Physics at CERN, p. 265; Glaser,
... Morrison, Proceedings of the 1958 Annual International
Conference on High Energy Physics at CERN, p. 270.
9 ... Caldwell and Pal, Phys. Rev. Letters 1, 256


$(1+\gamma_5)$; in the same approximation, S coupling (with
this chirality factor) would lead to a ratio $p_-/s_- \approx q/2m
= 0.05$, and the large lower limit observed for $\alpha_-$ would
then be difficult to understand in terms of a chirality
factor.
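The numbers in this estimate can be reproduced with two-body kinematics. The following is a minimal sketch, not the author's calculation: the masses are modern rounded values (an assumption, since the article does not list the ones it used), the amplitudes are read off Eq. (2.4) as $s \propto (m_\Lambda - m)$ and $p \propto (m_\Lambda + m)q/2m$, and all function names are hypothetical. With these inputs the V-coupling ratio comes out near 0.62 and the asymmetry near 0.9, close to the quoted 0.64 and +0.9, while the S-coupling ratio is about 0.05.

```python
import math

# Assumed masses in MeV/c^2 (modern rounded values, not taken from the article).
M_LAMBDA = 1115.68
M_PROTON = 938.27
M_PION_MINUS = 139.57

def two_body_momentum(parent, d1, d2):
    """Momentum of either decay product in the parent rest frame."""
    s = parent * parent
    term = (s - (d1 + d2) ** 2) * (s - (d1 - d2) ** 2)
    return math.sqrt(term) / (2.0 * parent)

def p_over_s_vector(m_lambda, m_nucleon, q):
    """p/s from Eq. (2.4): s ~ (m_L - m), p ~ (m_L + m) q / 2m."""
    return (m_lambda + m_nucleon) * q / (2.0 * m_nucleon * (m_lambda - m_nucleon))

def p_over_s_scalar(m_nucleon, q):
    """Rough S-coupling estimate quoted in the text: p/s ~ q / 2m."""
    return q / (2.0 * m_nucleon)

def asymmetry(x):
    """alpha = 2 s p / (s^2 + p^2) in terms of x = p/s."""
    return 2.0 * x / (1.0 + x * x)

q = two_body_momentum(M_LAMBDA, M_PROTON, M_PION_MINUS)
x_v = p_over_s_vector(M_LAMBDA, M_PROTON, q)
x_s = p_over_s_scalar(M_PROTON, q)

print(f"pion momentum q      = {q:.1f} MeV/c")
print(f"V coupling: p/s      = {x_v:.2f}, alpha = {asymmetry(x_v):.2f}")
print(f"S coupling: p/s      = {x_s:.3f}, alpha = {asymmetry(x_s):.2f}")
```

The large difference between the two cases comes entirely from the small factor $(m_\Lambda - m)$ multiplying the s-wave amplitude for V coupling.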

Further evidence concerning this ratio $p_-/s_-$ may be
obtained from the data on the frequency of the two-body
decay mode


$$_\Lambda\mathrm{H}^4 \to \pi^- + \mathrm{He}^4, \tag{2.5}$$


relative to that for all the three-body π⁻-decay modes
of $_\Lambda\mathrm{H}^4$, for example π⁻+p+H³ or π⁻+n+He³. Systematic
investigation by the EFINS-NU emulsion groups¹⁰
at Chicago has established 35 examples of the two-body
mode (2.5), compared with a maximum of 27 examples
for other π⁻-mesonic modes of $_\Lambda\mathrm{H}^4$ decay. This last
figure includes six events which probably represent
three-body modes of $_\Lambda\mathrm{H}^4$ decay but which could not
be uniquely established as representing $_\Lambda\mathrm{H}^4$ decay
events. From this evidence the proportion R of π⁻-mesonic
$_\Lambda\mathrm{H}^4$ decays which proceed through the two-body
mode (2.5) is rather high, in fact,


$$R \geq 0.6 \pm 0.1. \tag{2.6}$$
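The bound (2.6) follows from simple counting. The sketch below uses only the event numbers quoted above; the binomial error is an assumed error model (the article does not say how its ±0.1 was obtained), and the names are hypothetical. Counting all 27 other events gives a fraction of about 0.56 with a statistical error of about 0.06; dropping the six uncertain events raises it to 0.625.

```python
import math

TWO_BODY = 35    # established examples of the two-body mode (2.5)
OTHER_MAX = 27   # upper limit on other pi-minus mesonic decays
UNCERTAIN = 6    # events within OTHER_MAX not uniquely identified

def fraction(two_body, other):
    """Fraction of pi-minus mesonic decays that go through the two-body mode."""
    return two_body / (two_body + other)

def binomial_error(frac, n):
    """Binomial standard error; an assumed error model, not the article's."""
    return math.sqrt(frac * (1.0 - frac) / n)

r_low = fraction(TWO_BODY, OTHER_MAX)               # all 27 counted against
r_high = fraction(TWO_BODY, OTHER_MAX - UNCERTAIN)  # uncertain events dropped
err = binomial_error(r_low, TWO_BODY + OTHER_MAX)

print(f"R (all 27 others counted)   = {r_low:.3f} +/- {err:.3f}")
print(f"R (six uncertain excluded)  = {r_high:.3f}")
```

Since 27 is quoted as a maximum for the competing modes, the first figure is a lower bound on R, which is the sense of the inequality in (2.6).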


If the ground state of $_\Lambda\mathrm{H}^4$ has spin J=1, the process
(2.5) must involve the emission of a p-wave pion. This
can take place only through the p channel of Λ decay
since the initial and final nuclear systems consist
predominantly of s states. According to a recent
estimate,¹¹ the value $R_1$ for this case cannot exceed 0.25,
reaching this value only with the upper limit (2.3) for
$p_-/s_-$. On the other hand, with J=0, the pion emission
can proceed through the s channel of the Λ-decay
interaction and the corresponding estimate $R_0$ can
reach a value of 0.45 with the use of the lower limit
(2.3) for $p_-/s_-$. There is now some reason¹² to believe
that these calculations may involve some overestimate
of the probability of the three-body modes, and
correction for this will allow correspondingly larger
estimates (by as much as 20%) for R. However, the
qualitative conclusion appears definite, that it is difficult
to account for the value (2.6) observed for R unless
J=0 for $_\Lambda\mathrm{H}^4$ and $p_-/s_- < 1$. With this last conclusion,


10 Levi-Setti, Ammar, Slater, Limentani, Roberts, Schlein, and
Steinberg (Nuovo cimento, to be published).

11 R. H. Dalitz, Phys. Rev. 112, 605 (1958).

12 R. H. Dalitz and L. Liu (to be published). [Note added in
proof. Re-examination of the calculation reported in reference 11
has shown that the conclusions drawn above are too strong. With
J=0, the values calculated for R are consistent with the
experimental result (2.6), within the stated error, for values of $p_-/s_-$
less than 1.5. With J=1, the calculated value of R is within two
standard deviations of (2.6) for $p_-/s_- > 1$. Consequently, no
conclusion concerning the value of $p_-/s_-$ can be reached from these
data alone until the spin of $_\Lambda\mathrm{H}^4$ is known independently.
Conversely, if a value of $p_-/s_-$ could be obtained directly (for example,
from observations on the proton polarization resulting from
polarized Λ decay), this argument may allow the $_\Lambda\mathrm{H}^4$ spin to be
deduced. The evidence on the nonmesonic decay rates for Λ
hypernuclei mentioned below does suggest that it is rather unlikely
that $p_-/s_-$ should exceed unity.]

it appears that the $p_-/s_-$ ratio of 0.64 obtained from
this crude model of Λ decay does not differ widely from
the actual situation.

A similar crude calculation may be made for the
mode (2.1b) on the basis of the V−A interaction
$(\bar\Lambda p)(\bar p n)$, with the result¹³ $C(\bar\Lambda\gamma_\mu(1+\gamma_5)n)\,\partial\phi_0/\partial x_\mu$.
This estimate predicts that


$$s_0/s_- = p_0/p_- = 1/\sqrt{2}. \tag{2.7}$$


This is consistent with the observed ratio (2.2a)
between the (π⁰+n) and (π⁻+p) modes of Λ decay,
and it further requires that the polarization properties of
these two modes should be identical, a prediction of
current interest, since the ratio α₀/α₋ may soon be
determined from Λ particles by experiments in bubble
chambers with working fluid of high density.
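The consistency claimed here is easy to check. This sketch assumes the branching ratio is $(s_0^2+p_0^2)/(s_-^2+p_-^2)$ as in Eq. (2.2a), takes the observed value 0.59 ± 0.07 from that equation, and uses hypothetical function names. With both amplitude ratios equal to $\pm 1/\sqrt{2}$ the predicted ratio is exactly 1/2 for any $p_-/s_-$, which lies about 1.3 standard deviations below the observed value.

```python
import math

def neutral_to_charged(s0_over_sm, p0_over_pm, p_over_s):
    """(s0^2 + p0^2) / (s-^2 + p-^2) given the amplitude ratios and x = p-/s-."""
    x2 = p_over_s ** 2
    return (s0_over_sm ** 2 + p0_over_pm ** 2 * x2) / (1.0 + x2)

inv_sqrt2 = 1.0 / math.sqrt(2.0)

# Amplitude ratios of the crude estimate; p-/s- = 0.64 is the value quoted in the text.
ratio_plus = neutral_to_charged(inv_sqrt2, inv_sqrt2, 0.64)
# Opposite relative sign: the rate ratio cannot tell the two cases apart.
ratio_minus = neutral_to_charged(-inv_sqrt2, -inv_sqrt2, 0.64)

observed, sigma = 0.59, 0.07  # read from Eq. (2.2a)
pull = (observed - ratio_plus) / sigma

print(f"predicted neutral/charged ratio = {ratio_plus:.3f} (sign flipped: {ratio_minus:.3f})")
print(f"deviation from observed value   = {pull:.2f} standard deviations")
```

The identical result for both signs illustrates why a rate measurement alone cannot fix the relative sign of the neutral and charged amplitudes.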

However, these experimental predictions would also
follow directly from the ΔT=½ selection rule proposed
by Gell-Mann and Pais¹⁴ for strange particle decays.
With T=0 for the Λ particle, this would require T=½
for the (π+N) system and therefore that


$$s_0/s_- = p_0/p_- = -1/\sqrt{2}. \tag{2.8}$$


It is obviously difficult to distinguish between the
cases (2.7) and (2.8), since they differ only in the
relative sign of the matrix elements for two different
physical processes. However, this relative sign is, in
principle, a physical observable and can have physical
consequences, as shown in the following. At present
there is no theoretical basis to favor the existence of
such a ΔT=½ selection rule. The proposal of this
selection rule is based entirely on the empirical evidence
that is discussed in the next section. It is possible to
construct a four-fermion interaction which leads to this
ΔT=½ selection rule; in fact, for an interaction
involving only the Λ hyperon, this would have the
unique (V−A) form


$$\{(\bar\Lambda p)(\bar p n) + (\bar\Lambda n)(\bar n n)\}. \tag{2.9}$$


However, this interaction does not fit with the viewpoint
of Feynman and Gell-Mann as expressed in
Eq. (1.1), since the only terms which they permit for
the current $J_\mu$ involve a charge change ΔQ of one unit.
This restriction appears necessary in order to forbid
certain types of decay process (e.g., μ⁺ → e⁺+e⁺+e⁻,
K⁺ → π⁺+ν+ν̄) whose occurrence has not been observed,
and excludes terms of the type $(\bar\nu\nu)$, $(\bar\Lambda n)$ or
$(\bar n n)$ from $J_\mu$. The four-fermion interaction allowed by
these considerations is composed of the current terms


13 With the V−A form of interaction, the interactions $(\bar a b)(\bar c d)$
and $(\bar a d)(\bar c b)$ are identical. Consequently, with $(\bar\Lambda n)(\bar p p)$ replacing
$(\bar\Lambda p)(\bar p n)$, the same factor C appears here, but without the
factor √2 since the emission is now of a π⁰ meson instead of a
π⁻ meson.

14 M. Gell-Mann and A. Pais, Proceedings of the International
Conference on High Energy Physics (Pergamon Press, London,
1955).


$(\bar\Lambda p)$ and $(\bar p n)$, leading to the form
$$(\bar\Lambda p)(\bar p n). \tag{2.10}$$

This interaction allows isotopic spin changes of ΔT=½
and 3/2. The crude calculation of the Λ-decay amplitudes
given in the foregoing, which is based on this interaction,
therefore leads to both T=½ and T=3/2 components
in the final pion-nucleon state. These combine
fortuitously to give the result (2.7) and the ratio 2:1
for the (π⁻+p)/(π⁰+n) modes, in agreement with
experiment. Inclusion of radiative corrections may be
expected to modify the (π⁰+n) and (π⁻+p) amplitudes
in quite different ways. On this view, the agreement
between (2.2a) and the prediction of the ΔT=½
rule would be regarded as fortuitous and would give no
basis for expecting α₀/α₋ to be unity, as would follow
if the ΔT=½ rule held. Observation of the polarization
properties of Λ → n+π⁰ decay will therefore have
considerable relevance for understanding of the (π⁻+p)/
(π⁰+n) ratio in Λ decay and of the basic mechanism
underlying Λ decay.

The process of nonmesonic decay of Λ hypernuclei is
of special interest concerning the mechanism underlying
Λ decay, since the elementary four-fermion interactions
(2.9) and (2.10) mentioned above actually
correspond to processes of nonmesonic de-excitation of
the Λ particle,¹⁵ thus


$$\Lambda + p \to n + p \tag{2.11}$$


from interaction (2.10). The strength $\sqrt{f}$ appropriate
to a current term $(\bar\Lambda p)$ in $J_\mu$ is not well known empirically.
The hypothesis of universality for the four-fermion
weak interactions would require that f should
equal g, the coupling strength appropriate to the terms
$(\bar\nu e)$ or $(\bar\nu\mu)$; in fact, as shown later (cf. §4), there is
reason to believe that f may be an order of magnitude
smaller than g. To an accuracy of order 10%, it is
sufficient to consider only the nonrelativistic
approximation to the (V−A) interaction for (2.10),
namely,


(fg) } (PaYa(1 +ys)Yp) (Von U +ys)Yn) f 
œ(fg) Paypin (Yasay p) 3 (W,onyn)}. (2.12) 


This dominant term is parity conserving (convention- 
ally, the A parity has been defined to be even); parity 
nonconserving terms are of order v/c~0.25 relative to 
this term. There are many radiative corrections due to 
the strong pion and K-meson couplings of these parti- 
cles, which modify the form of the amplitude for the 
nonmesonic capture process (2.11) from this form 
(2.12). Several of the many possibilities are sketched 
in Fig. 2. Terms corresponding to Figs. 2(b) and 2(c) 
are of special importance and have been discussed in 
detail by Karplus and Ruderman.¹⁶ They may be


15 S. B. Treiman, Proceedings of the 1958 Annual International
Conference on High Energy Physics at CERN, p. 276.
16 M. Ruderman and R. Karplus, Phys. Rev. 76, 1458 (1949).
FIG. 2. The elementary four-fermion interaction (a) and some
examples (b), (c), and (d) of pionic radiative corrections to this.


expressed phenomenologically in terms of the amplitudes
for Λ decay and for pion-nucleon coupling. In
physical terms, they represent the process of internal
conversion of the pion field generated by the Λ → N+π
decay interaction, due to the presence of a neighboring
nucleon. The sum of these terms and the basic
interaction (2.12) is


ET SE A a BSE TOS 


V2G 4r (s—— p-o, —on 
Ca o,-ox)+ eet mao) = ew 9) 


2m 4 (ma—m)?—g?—m? 
Am (sot poa: 4) (ow q) 


(2.13) 
2m 4(ma—m)?— q?—m,2 

on here q is the momentum of the outgoing proton, G 
den tes the pion-nucleon coupling parameter (G?/hc 
, Sa is the spin vector of the initial A particle 
proton, ey is that of the initial proton or final 
n, and Pay’ denotes the spin exchange operator 
.-on)/2. Since these last terms are coherent, as 
alized by Cerulus,!” the relative signs of (s_,p_) 
a affect the rate computed for the nonmesonic 
capt rocess. The direct amplitude (2. 12) interferes 
only with the ~_ and po terms of (2. 13) in the total 
non mesonic capture rate. The expression (2.13) now 
: o illustrate two points: 

t the relative sign between s_ and so, and 
$- and ad does have physical consequences, 


es yin orientation of the A particle 
sent. The direct term (2.12)

light hypernuclei. For example, in $_\Lambda\mathrm{H}^4$, the (Λp) spin
orientation appears to be singlet, from the arguments
given above on the decay modes of $_\Lambda\mathrm{H}^4$, whereas in the
mirror nucleus $_\Lambda\mathrm{He}^4$, there are two protons and the
(Λp) spin orientations are randomly distributed. The
ratio of the nonmesonic capture process (2.11) in these
systems therefore reflects the spin dependence of the
amplitude for this process. To illustrate this effect, we
give in Table I correction coefficients $F_s$ and $F_p$ for
the Ruderman-Karplus calculation of the internal
conversion coefficient R, as function of the (Λp) spin
state. In terms of $F_s$, $F_p$ the internal conversion
coefficient R = (nonmesonic capture rate)/(π⁻-mesonic
capture rate) is given by


$$R = (F_s s_-^2 R_s + F_p p_-^2 R_p)/(s_-^2+p_-^2), \tag{2.14}$$


where $R_s$, $R_p$ are the coefficients computed for l=0 and
1 by Ruderman and Karplus. This estimate includes
only the "internal conversion" graphs of Figs. 2(b)
and 2(c) and neglects the direct term (2.12) as well as
all other radiative corrections. It is intended only to
illustrate the order of magnitude of these effects.

No example of nonmesonic decay for $_\Lambda\mathrm{H}^4$ has yet been
reported although many examples are known of nonmesonic
decay for $_\Lambda\mathrm{He}^4$ and $_\Lambda\mathrm{He}^5$. This contrast may
possibly be the result of experimental bias, since $_\Lambda\mathrm{H}^4$
nonmesonic decay can lead only to a one-pronged star.

To sum up, we emphasize again the difficulty of
making a quantitative estimate of the rate of nonmesonic
Λ-hypernuclear decay in terms of the elementary
four-fermion interaction, owing to the complications
of the radiative corrections possible. However,
there is one conclusion from the hypernuclear decay
evidence which appears unlikely to be modified by the
effect of further radiative corrections; with the estimates
$R_s \approx 1$, $R_p \approx 17$ of Ruderman and Karplus for
$_\Lambda$He hypernuclei, the observed value of ~1.5 for $R(_\Lambda\mathrm{He})$
obtained by Schlein¹⁸ and by Silverstein¹⁹ still requires
that the amplitude $p_-$ should be smaller than $s_-$. If
p-channel emission were dominant relative to s-channel
emission in free Λ decay, it would require very detailed
cancellations to account for such a low value of R as
that observed.
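The size of this constraint can be illustrated by inverting the expression for R. The sketch below is a rough illustration under stated assumptions: it takes Eq. (2.14) in the form $R = (F_s s_-^2 R_s + F_p p_-^2 R_p)/(s_-^2 + p_-^2)$, sets both correction factors to unity (the uncorrected Ruderman-Karplus case), and uses the quoted estimates $R_s \approx 1$, $R_p \approx 17$, $R \approx 1.5$. The function names are hypothetical. Under these assumptions $p_-/s_-$ comes out near 0.18, while equal amplitudes would give R = 9.

```python
import math

def internal_conversion(x2, r_s, r_p, f_s=1.0, f_p=1.0):
    """R as a function of x2 = (p-/s-)^2, in the assumed form of Eq. (2.14)."""
    return (f_s * r_s + f_p * r_p * x2) / (1.0 + x2)

def solve_x2(r_obs, r_s, r_p, f_s=1.0, f_p=1.0):
    """Invert the expression above for x2.

    The result is negative when r_obs lies below f_s * r_s, i.e. when no
    real amplitude ratio can reproduce the observed value.
    """
    return (r_obs - f_s * r_s) / (f_p * r_p - r_obs)

R_S, R_P, R_OBS = 1.0, 17.0, 1.5  # estimates quoted in the text

x2 = solve_x2(R_OBS, R_S, R_P)
print(f"(p/s)^2 = {x2:.4f},  p/s = {math.sqrt(x2):.2f}")
print(f"R for equal amplitudes (p = s): {internal_conversion(1.0, R_S, R_P):.1f}")
```

Passing the spin-averaged factors of Table I as `f_s` and `f_p` shifts the numbers but not the qualitative conclusion that $p_-$ must be well below $s_-$.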


TABLE I. Correction factors to the internal conversion
coefficient in hypernuclear decay.ᵃ


                          (a) ΔT = ½         (b) Amplitudes (2.7)
Spin state                F_s      F_p       F_s      F_p
S=0                       9/4      9/4       1/4      1/4
S=1                       19/12    1/4       11/12    9/4
Spin average              7/4      3/4       3/4      7/4
(appropriate for
ΛHe⁴ and ΛHe⁵)


ᵃ These correction factors have been obtained in joint work with S.
Eckstein.
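The spin-average row of Table I can be checked against the S=0 and S=1 rows. The sketch below assumes the statistical weights 1/4 (singlet) and 3/4 (triplet) for randomly oriented (Λp) spins; the article does not state the weights explicitly, so this is an assumption, and the names are hypothetical. Exact fractions are used so that the comparison with the tabulated entries is unambiguous. The weighting gives 7/4, 3/4, 3/4 for the first three columns, and 7/4 for the last column.

```python
from fractions import Fraction as F

# Columns: (a) F_s, (a) F_p, (b) F_s, (b) F_p, as read from Table I.
SINGLET = [F(9, 4), F(9, 4), F(1, 4), F(1, 4)]
TRIPLET = [F(19, 12), F(1, 4), F(11, 12), F(9, 4)]

def spin_average(singlet, triplet):
    """Statistical average: weight 1/4 for S=0 and 3/4 for S=1 (assumed)."""
    return [F(1, 4) * s + F(3, 4) * t for s, t in zip(singlet, triplet)]

averages = spin_average(SINGLET, TRIPLET)
for label, value in zip(["(a) F_s", "(a) F_p", "(b) F_s", "(b) F_p"], averages):
    print(f"{label}: {value}")
```

In every column the singlet and triplet entries are of the form 5/4 ± B, so the two amplitude assignments (a) and (b) differ only in the sign of the interference term B.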


18 P. E. Schlein, Phys. Rev. Letters 2, 220 (1959).


The decay processes of the Σ particles,


$$\Sigma^+ \to n + \pi^+, \tag{2.15a}$$
$$\Sigma^+ \to p + \pi^0, \tag{2.15b}$$
$$\Sigma^- \to n + \pi^-, \tag{2.15c}$$


could already take place through the interaction (2.10),
since the Σ and Λ particles are strongly coupled, but
probably occur also through additional four-fermion
couplings involving the Σ particles. The most general
such coupling would allow isotopic spin changes ΔT up
to 5/2 in strange particle decays. However, the isotopic
spin changes may be limited to ΔT ≤ 3/2 by considering
the interaction


1 
{ S-n)-+— 9) | on (2.16) 


It is attractive to combine this with $(fg/2)^{1/2}$ times the
term (2.10) to give the form


(Je)? (ZN) (pn) 
A+2 


= (46) Em = 


p) ja, em 


where F, Z denote the combinations 


2 (“iam ), 


á ) es 


“a (e 


of Σ and Λ states appropriate to the hypothesis of a
universal pion-hyperon coupling, advanced by
Gell-Mann²⁰ and by Schwinger.²¹ The interaction (2.17)
appears naturally on the hypothesis that the strangeness
nonconserving current $J_\mu'$ consists of the terms


Jul = ft Zu (ltrs) N +2 uit) ¥), (2.19) 
each of which is a spinor under isotopic spin rotations.
The hypothesis of a T=½ form for the strangeness
nonconserving current has been put forward especially
by Okubo et al.²²


FIG. 3. Some of the lowest order graphs which lead to the Σ⁺ decay
processes through the four-fermion interaction $(\bar Z N)(\bar N N)$.


20 M. Gell-Mann, Phys. Rev. 106, 1296 (1957).

21 J. S. Schwinger, Ann. Phys. 2, 407 (1957).

22 Okubo, Marshak, Sudarshan, et al., Phys.
Rev. 112, 665 (1958).
FIG. 4. The Gell-Mann-Rosenfeld diagram for representation of
the matrix elements for Σ⁺ and Σ⁻ pionic decay modes.


The interaction current (2.19) is a T=½ form which
corresponds to the rule ΔQ/Δs=+1, whose empirical
basis has been discussed recently by Gell-Mann and
Feynman.¹ The main argument in support of this rule
is the absence of evidence for Δs=±2 decay interactions
of strength comparable to those observed for
Δs=±1, since the combination of two interaction
currents with ΔQ/Δs=+1 and ΔQ/Δs=−1 would
generate a four-fermion interaction of amplitude ~g
which gives rise to processes Δs=±2. In accord with
this rule, expression (2.19) does not include any term
corresponding to decay of the Σ⁺ particle. This means
that the beta-decay process Σ⁺ → n+e⁺+ν is forbidden
with these interactions (however the beta decay
Σ⁺ → Λ+e⁺+ν is still an allowed process). The pionic
processes (2.15) of Σ⁺ decay are forbidden in the lowest
approximation corresponding to the graphs of Fig. 1
for Λ decay, but take place through processes involving
pionic radiative corrections, examples of which are
shown in Fig. 3. The Σ⁻ → n+π⁻ decay, on the other
hand, can take place already through the lowest order
process analogous to that shown for Λ decay in Fig. 1.

For the phenomenological discussion of the decay
processes (2.15), Gell-Mann and Rosenfeld⁵ point out
that it is convenient to represent each of the decay
amplitudes for these processes as a vector M in an
(s,p) space. With the assumption of time-reversal
invariance, the smallness of the $s_{1/2}$ and $p_{1/2}$ pion-nucleon phase
shifts implies that the vectors M⁺, M⁰, M⁻ representing
the processes (2.15) may be taken to be real. The
existence of a ΔT=½ rule would imply the following
relationship between these real vectors,


$$\mathbf{M}^- = \mathbf{M}^+ + \sqrt{2}\,\mathbf{M}^0.$$
The decay rates observed for the three pro 
are very nearly equal, which requires abou 


magnitudes for M-, M+, and M° 
the vectors M~, Mt, and v2!
(a) $K_{\pi 2}$ decay


FIG. 5. Graphs illustrating the complexity of $K_{\pi 2}$ and $K_{\pi 3}$ decay
processes occurring through an elementary Fermi interaction.


angle β₋ ≈ 40°. As remarked, there is no corresponding
lowest order calculation for Σ⁺ decay. However the
ΔT=½ rule would then require that β₊ should have the
value −50° or 130°, and that M⁰ should be parallel to
either the s or the p axis.

The observations of Cool et al.²³ on up-down asymmetry
in Σ⁺ and Σ⁻ decay are completely opposite to
these predictions, for they found that evidence for a
polarization effect appears only in the (π⁰+p) mode
(2.15b), no polarization effect being observed for the
(π⁺+n) mode nor for Σ⁻ decay. Their observations are
not inconsistent with the ΔT=½ rule but imply that
the triangle of Fig. 4 should have its sides OA, OB
quite close to the two axes, with M⁰ at an angle of
order ±45° to the axes. This failure of the lowest order
calculation for Σ⁻ decay to agree with the data certainly
casts doubt on its relevance in the case of Λ decay;
there is certainly no reason to believe that the pionic
radiative corrections should not have a large effect and
the degree of agreement noted above for Λ decay may
well be fortuitous. Alternatively, it may be that the
ΔT=½ rule does not hold here and that no up-down
asymmetry has been observed for Σ⁻ decay because
the relevant Σ⁻ production processes happen to give
little polarization; if ΔT=3/2 transitions are also
effective, the present data certainly allow many other
interpretations.

In terms of the interaction (2.19), Kπ2 and Kπ3 decay modes occur
through even more complicated sequences of virtual processes, such as
those of Fig. 5. With this situation, it is reasonable to expect the
ratio of the matrix elements for Kπ2 and Kπ3 decays to be of the order of
magnitude 1/M, M being the nucleon mass. Taking into account the ratio of
phase space for 2π and 3π systems, the observed ratio
P(θ₁⁰)/P(τ⁺) = 3×10³ is in reasonable accord with this expectation. The
situation for Kπ2⁺ decay appears exceptional and provided the first
evidence suggesting the possibility of an isotopic spin selection rule in
strange particle decays, the topic discussed in the next section.

23 Cool, Cork, Cronin, and Wenzel, Bull. Am. Phys. Soc. Ser. II, 4, 83
(1959).

25 R. H. Dalitz,


3. ISOTOPIC SPIN RELATIONSHIPS IN 
STRANGE PARTICLE DECAY 


The possibility of a ΔT=½ selection rule for strange particle decays was
first suggested by the empirical data, especially by the large ratio
between the partial lifetimes for the Kπ2⁺ and K₁⁰ decays. More recently,
the possibility of isotopic spin relationships in weak decay processes
has been considered in connection with the isotopic spin character of the
weak interaction currents. Gell-Mann and Feynman¹ have suggested that the
strangeness-conserving current involving strongly interacting particles
may bear a close relationship to the m_T=1 component of the isotopic spin
vector for these particles; in fact, the current (p̄n) of expression
(1.2) is already the m_T=1 component of a T=1 operator. On the other
hand, the strangeness nonconserving current (2.19) has a spinor character
in isotopic spin. This would mean that the interaction (1.1) formed from
(Λ̄N) and (p̄n) would be a combination of T=½ and 3/2 terms, so that
ΔT=½ and ΔT=3/2 interactions would contribute to pionic decay processes
for strange particles. The leptonic modes, which would result from
combinations such as (Λ̄p)(ν̄e), would instead allow only an isotopic
spin change ΔT=½, as far as concerns the strongly interacting particles
among the decay products. The empirical evidence bearing on these
possibilities is discussed in the following.


Kπ2 Decay


A strict ΔT=½ rule would forbid the Kπ2⁺ decay since the final pions
have even angular momentum and therefore isotopic spin T=2 (T=0 being
excluded since the system has nonzero charge). Existence of Kπ2⁺ decay
implies the presence of some ΔT=3/2 interaction, with an amplitude of
about 4% of the ΔT=½ amplitude leading to K₁⁰ decay. Presence of this
ΔT=3/2 amplitude will also affect the K₁⁰ branching ratios; a strict
ΔT=½ rule implies the value 0.33 for the ratio of decay probabilities,


r₀ = R(K₁⁰ → π⁰+π⁰) / [R(K₁⁰ → π⁰+π⁰) + R(K₁⁰ → π⁺+π⁻)]. (3.1)


Inclusion²⁴,²⁵ of a ΔT=3/2 amplitude which accounts for the Kπ2⁺ partial
lifetime allows this ratio r₀ to be between the limits


0.33{1 ± 4(2T₀/3T₊)^½}, (3.2)


24 M. Gell-Mann, Nuovo cimento 5, 758 (1957).
Proc. Phys. Soc. (London) A69, 527 (1956). 


where T₊ and T₀ are the Kπ2⁺ and K₁⁰ partial lifetimes, that is, between
0.29 and 0.37. The empirical value of this ratio is still rather
uncertain. The observation of γ rays from the neutral K₁⁰ mode leads to
the value 0.14±0.06, whereas the observation of Λ particles produced in
the π⁻+p reaction, but unaccompanied by K₁⁰ decay in the charged mode,
leads to the value† 0.30±0.06.
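
As a rough numerical check of the limits quoted above, the sketch below evaluates expression (3.2), read here as 0.33{1 ± 4(2T₀/3T₊)^½}. The lifetime ratio T₀/T₊ = 1.3×10⁻³ is an illustrative assumption chosen for this example and is not a value taken from the text; the function name is likewise hypothetical.

```python
import math


def r0_limits(t0_over_tplus):
    """Limits on r0 from expression (3.2), read as (1/3){1 +/- 4 sqrt(2 T0 / 3 T+)}."""
    spread = 4.0 * math.sqrt(2.0 * t0_over_tplus / 3.0)
    return (1.0 - spread) / 3.0, (1.0 + spread) / 3.0


# Assumed, illustrative lifetime ratio T0/T+ (not quoted in the text).
ASSUMED_T0_OVER_TPLUS = 1.3e-3

low, high = r0_limits(ASSUMED_T0_OVER_TPLUS)
print(f"r0 between {low:.2f} and {high:.2f}")
```

With this assumed ratio the limits come out close to the 0.29 and 0.37 quoted in the text.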


Λ and Σ Decay


The branching ratio R(Λ → p+π⁻)/(R(Λ → p+π⁻) + R(Λ → n+π⁰)) is observed
to have the value 0.63±0.03, in close agreement with the expectation (⅔)
from ΔT=½. As emphasized by Okubo et al., this agreement could also be
fortuitous, since there is a certain combination of final T=½ and 3/2
states which also corresponds to the observed ratio.

The Σ-decay situation was discussed in the foregoing. The present
evidence is compatible with the ΔT=½ rule but provides no strong argument
for it.
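
The expectation ⅔ for the charged fraction in Λ decay can be checked with the standard Clebsch-Gordan coefficients for a pion (T=1) and a nucleon (T=½) coupled to T₃=−½. The sketch below is a simplified illustration only: it assumes a single partial wave and a real relative amplitude between the final T=½ and T=3/2 states, and the function name is hypothetical.

```python
import math


def charged_fraction(a_half, a_three_halves):
    """Fraction of Lambda decays going to p + pi-, for real final-state
    amplitudes a_half (T=1/2) and a_three_halves (T=3/2)."""
    # Standard Clebsch-Gordan decomposition for T3 = -1/2:
    #   |T=1/2> = -sqrt(2/3)|p pi-> + sqrt(1/3)|n pi0>
    #   |T=3/2> =  sqrt(1/3)|p pi-> + sqrt(2/3)|n pi0>
    amp_charged = -math.sqrt(2 / 3) * a_half + math.sqrt(1 / 3) * a_three_halves
    amp_neutral = math.sqrt(1 / 3) * a_half + math.sqrt(2 / 3) * a_three_halves
    return amp_charged**2 / (amp_charged**2 + amp_neutral**2)


pure_half = charged_fraction(1.0, 0.0)
# A particular mixture, chosen by hand, that gives the same charged fraction.
mixed = charged_fraction(1.0, -2.0 * math.sqrt(2.0))
print(pure_half, mixed)
```

Within these assumptions both calls give a charged fraction of ⅔, which illustrates the remark that a particular mixture of T=½ and T=3/2 final states reproduces the ΔT=½ value.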


K3π-Decay Modes


From the spectrum observed for τ⁺ decay, it is reasonable to conclude
that the final 3π state is predominantly symmetrical between the three
pions; this is also the theoretical expectation for the decay of a K
particle of spin zero. There are only two totally symmetric
configurations for three pions, one with T=1 and the other with T=3. If
the amplitudes of these states in K⁺ decay are denoted by I₁ and I₃, the
ratio of the τ′ and τ modes in K⁺ decay is then given by²⁶


R(τ′)/R(τ) = 1.295 (I₁ − 2I₃)²/(2I₁ + I₃)². (3.3)


The observed ratio 0.32±0.05 obtained from the present data²⁷ agrees well
with the value 0.325 expected for a pure T=1 state; however, there is
also a nonzero solution for I₃/I₁ which gives the observed value.
Accepting the solution I₃/I₁ ≈ 0, this agreement with experiment still
does not provide any support for the ΔT=½ hypothesis, since both ΔT=½ and
ΔT=3/2 interactions can reach only the T=1 state. The experimental ratio
will be attained with any interaction which produces pions predominantly
in a state of total symmetry and which allows only ΔT=½ and ΔT=3/2.
However, this situation does not hold for the comparison of the partial
lifetimes for 3π decay of the K⁺ and K₂⁰ mesons. With the ΔT=½ rule these
partial lifetimes should be equal²⁵,²⁸ and ⅖ of the K₂⁰ → 3π decay
processes should be of the mode (π⁺+π⁻+π⁰). With the known decay
probability R(K3π⁺) = 6.0×10⁶ sec⁻¹, the ΔT=½ rule therefore predicts
that this mode should have decay probability


R(K₂⁰ → π⁺+π⁻+π⁰) = 2.4×10⁶ sec⁻¹. (3.4)


In contrast, the empirical lifetime of the K₂⁰ meson is
8.1 (+3.2, −2.4)×10⁻⁸ sec, and it is stated by Bardon et al.²⁹ that not
more than 15% of the K₂⁰ decay events giving charged particles can
represent the (π⁺+π⁻+π⁰) mode. This corresponds to the empirical
statement


R(K₂⁰ → π⁺+π⁻+π⁰) < 1.8(±0.6)×10⁶ sec⁻¹, (3.5)


which certainly provides a considerable overestimate for the empirical
value of this decay probability, since it includes events for which this
interpretation is doubtful and omits the possibility of unobserved
neutral modes of decay. As a result, there is some degree of disagreement
at this point which may provide evidence of the influence of ΔT=3/2
interactions on this decay mode. If I₁ and I₁′ denote the contributions
of ΔT=½ and ΔT=3/2 interactions, respectively, to the final T=1 state,
the amplitude of the T=1 configuration is (I₁ − ½I₁′) for K3π⁺ decay, but
(I₁ + I₁′) for K₂⁰ → 3π decay. Generally, assuming a totally symmetric
state for the final three pions, the ratio of all 3π modes for K₂⁰ and K⁺
particles is


R(K₂⁰ → 3π)/R(K⁺ → 3π) = [(I₁ + I₁′)² + I₃²]/[(I₁ − ½I₁′)² + I₃²]. (3.6)


With I₃=0 (absence of ΔT=5/2 interactions), this ratio can fall to a
value of 0.5 for a ΔT=3/2 admixture as small as I₁′/I₁ = −0.2. The ratio
(3π⁰)/(π⁺+π⁻+π⁰) for K₂⁰ decay should still be 1.5, and it would be of
interest to have this prediction of the K₂⁰ → 3π⁰ decay rate checked, for
example by exposure of a dense bubble chamber in a K₂⁰ beam.

† Note added in proof. Crawford, Cresti, Douglass, Good, Kalbfleisch,
Stevenson, and Ticho [Phys. Rev. Letters 2, 266 (1959)] have now obtained
the value 0.32±0.04. Their observation of γ rays from three K₁⁰ decay
events raises the former value to 0.18±0.05, still rather low.

26 See R. H. Dalitz, Repts. Progr. in Phys. 20, 163 (1957).

27 Birge, Perkins, Peterson, Stork, and Whitehead, Nuovo cimento 4, 834
(1956); Alexander, Johnston, and O'Ceallaigh, Nuovo cimento 6, 478
(1957).
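
The numbers in this subsection can be reproduced with a few lines of arithmetic. The sketch below takes (3.3) in the form R(τ′)/R(τ) = 1.295(I₁ − 2I₃)²/(2I₁ + I₃)² and (3.6) in the form [(I₁ + I₁′)² + I₃²]/[(I₁ − ½I₁′)² + I₃²], with real amplitudes assumed throughout; the function names are hypothetical and introduced only for illustration.

```python
import math


def tau_prime_over_tau(i3_over_i1):
    """Ratio (3.3) as a function of x = I3/I1 (real amplitudes assumed)."""
    x = i3_over_i1
    return 1.295 * (1.0 - 2.0 * x) ** 2 / (2.0 + x) ** 2


def i3_over_i1_solutions(observed_ratio):
    """Both real solutions x = I3/I1 of (3.3) for a given observed ratio."""
    root = math.sqrt(observed_ratio / 1.295)
    # (1 - 2x)/(2 + x) = s with s = +root or -root  =>  x = (1 - 2s)/(2 + s)
    return [(1.0 - 2.0 * s) / (2.0 + s) for s in (root, -root)]


def k2_over_kplus_3pi(i1_prime_over_i1, i3_over_i1=0.0):
    """Ratio (3.6) of all 3-pion modes, K2 over K+."""
    y, z = i1_prime_over_i1, i3_over_i1
    return ((1.0 + y) ** 2 + z ** 2) / ((1.0 - 0.5 * y) ** 2 + z ** 2)


# Prediction (3.4): two fifths of the K+ three-pion rate of 6.0e6 per second.
predicted_charged_rate = 0.4 * 6.0e6

print(tau_prime_over_tau(0.0))
print(i3_over_i1_solutions(0.32))
print(k2_over_kplus_3pi(-0.2))
print(predicted_charged_rate)
```

The second call makes the "nonzero solution" explicit: besides I₃/I₁ close to zero, a value of I₃/I₁ somewhat above unity also reproduces the observed ratio 0.32.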




Leptonic Modes 


Here the hypothesis is that ΔT=½ holds for the strongly interacting
products of this decay. In this sense, interactions for which ΔT=½
certainly do exist, as shown by the existence of the Kμ2⁺ mode, for which
the only strongly-interacting particle taking part is the K⁺ meson
itself; the question is whether these are the only interactions. At
present the main physical consequence of this rule is a relation between
the states, namely


M(K⁰ → π⁻+μ⁺+ν) = √2 M(K⁺ → π⁰+μ⁺+ν), (3.7)


and a similar relation for the electron modes. Okubo et al. point out
that, irrespective of whether or not time-reversal invariance holds for
these interactions, this result implies a definite relation between the
K₂⁰ and K⁺ decay rates


R(K₂⁰ → π^±+μ^∓+ν) = 2R(K⁺ → π⁰+μ⁺+ν), (3.8)


and a similar relation for the electron modes. With the decay rates
3.4×10⁶ sec⁻¹ and 3.3×10⁶ sec⁻¹ for the Kμ3⁺ and Ke3⁺ modes, as given by
Gell-Mann and Rosenfeld, a total decay probability of 13.5×10⁶ sec⁻¹ is
therefore predicted for the leptonic modes of K₂⁰ decay. Since there are
observed some K₂⁰ → 3π decay events (the ΔT=½ rule would require a total
of 6.0×10⁶ sec⁻¹ for charged and neutral modes) in addition to these
leptonic modes, the observed K₂⁰ decay probability, given as
12.3×10⁶ sec⁻¹ by Bardon et al.,²⁹ is somewhat lower than these remarks
would suggest. However, this does not necessarily vitiate the ΔT=½ rule
for the leptonic modes, for it may simply indicate some degree of
cancellation between I₁ and I₁′ for the 3π modes of K₂⁰ decay. The
empirical lower limit for K₂⁰ → (π⁺+π⁻+π⁰) events is 5% of all charged
modes, and the addition of 7.5% for K₂⁰ → 3π⁰ modes (required if T=1
holds for the final 3π state) would then bring the predicted probability
only up to 15.3×10⁶ sec⁻¹, well within the experimental errors of the
observed value.

28 This has also been pointed out by A. Pais and
(private communication). See also Okubo a
Phys. Rev. 106, 1106 (1957).

29 Bardon, Fuchs, Lande, Lederman, Chin
Phys. Rev. 110, 780 (1958); Bardon, Lanc
Chinowsky, Ann. Phys. (N. Y.) 5, 15
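
The rate bookkeeping of this paragraph can be laid out explicitly. The sketch below uses only the figures quoted in the text (rates in sec⁻¹); the variable and function names are illustrative, and the split into charged and neutral modes follows the 5% and 7.5% fractions stated above.

```python
# Rates in sec^-1, as quoted in the text for the K+ three-body leptonic modes.
K_PLUS_MU3_RATE = 3.4e6
K_PLUS_E3_RATE = 3.3e6

# Relation (3.8) and its electron analogue: each K2 leptonic rate is twice
# the corresponding K+ rate.
k2_leptonic_rate = 2.0 * (K_PLUS_MU3_RATE + K_PLUS_E3_RATE)


def k2_total_rate(leptonic_rate, charged_3pi_fraction=0.05, neutral_3pi_fraction=0.075):
    """Total K2 rate when (pi+ pi- pi0) makes up a given fraction of all charged
    modes and the 3 pi0 mode adds a further fraction of the charged modes."""
    charged_rate = leptonic_rate / (1.0 - charged_3pi_fraction)
    return charged_rate * (1.0 + neutral_3pi_fraction)


# Start from the leptonic total as quoted in the text (13.5e6 per second).
total_from_quoted_value = k2_total_rate(13.5e6)
print(k2_leptonic_rate, total_from_quoted_value)
```

The doubled sum of the two quoted K⁺ rates is 13.4×10⁶ sec⁻¹, which the text gives as 13.5×10⁶ sec⁻¹; starting from the quoted value, the total comes out near 15.3×10⁶ sec⁻¹.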
To sum up, the only evidence pointing strongly to the ΔT=½ selection
rule consists of the long partial lifetime for the Kπ2⁺ decay and the
(π⁻+p)/(π⁰+n) branching ratio in Λ decay. In these cases, further tests
of the ΔT=½ rule will be possible in the comparison of the polarization
properties of the two Λ-decay modes, and in a definitive measurement of
the branching ratio in K₁⁰ decay. On the other hand, none of the data at


whether the lepton be electron or muon, and
only of V and A terms, is at present in
accord with the evidence on leptonic modes
assumptions lead to an interaction of the form


C_K [ψ̄_l γ_μ(1+γ₅)ψ_ν] ∂φ_K/∂x_μ = C_K m_l (ψ̄_l(1+γ₅)ψ_ν) φ_K (4.1)


for Kl2 decay, which implies a Ke2⁺/Kμ2⁺ ratio of 2.5×10⁻⁵. This
corresponds to the prediction πe2⁺/πμ2⁺ = 1.25×10⁻⁴ for the pion, now
borne out by experiment, and is to be compared with the experimental
upper limit, Ke2⁺/Kμ2⁺ < 0.01, at present.

The decay probability for the Kμ2⁺ mode (48×10⁶ sec⁻¹) is only about 20%
greater than that for πμ2⁺ decay, despite the much greater energy
release. These lifetimes correspond to a ratio (C_K/C_π)² ≈ 1/15.
However, the relation between C_K and C_π is not necessarily simple but
may depend on the following factors:


(a) The parity of the K meson (relative to even parity for the Λ
particle). The expressions for the intermediate baryon loops between the
initial meson and the point at which the four-fermion weak interaction is
effective are quite different for a scalar K meson from those for
π-meson decay. Even for a pseudoscalar K meson, the graphs for π-meson
and K-meson leptonic decays do not generally correspond in detail unless
the Λ and Σ particles have the same parity, their virtual K-meson
interactions are neglected and their pion interactions have the global
symmetry, and their weak interactions are closely correlated in a similar
manner.

(b) The amplitude f² of the strangeness nonconserving current need not
be the same as that for the strangeness conserving current. The radiative
corrections to the former current have quite different structure from
those for the latter, so that, even if the weak interaction current
strengths are the same in the approximation where the strong pion and
K-meson interactions are turned off, these coupling strengths may be
modified by different effective renormalizations. In fact, the
experimental data on the beta decay of the hyperons (see the following)
require that f should be of an order of magnitude smaller than g.

(c) The K-hyperon coupling strength G_K²/ħc appears to be about an order
of magnitude weaker than the pion-nucleon coupling strength
G²/ħc ≈ 13.5.


Consider now the three-body leptonic modes Kμ3 and Ke3. Assuming only V
and A interactions to be effective, the matrix elements for these
processes may generally be written in the form‡


{R p_Kα + S(p_Kα − p_πα)} (ψ̄_l γ_α(1+γ₅)ψ_ν)
    = ψ̄_l {R m_K γ₀(1+γ₅) + S m_l(1+γ₅)} ψ_ν. (4.2)


R and S are scalar functions of p_K·p_π/M² = m_K ω_π/M², which are
independent of m_l. If time-reversal invariance holds for both strong and
weak interactions, the ratio R/S will be real.³⁰ In Ke3 decay, the term
proportional to m_l may be neglected and the only contribution is from
the R term of (4.2). Since the variation of p_K·p_π/M² over the allowed
range in Ke3 decay is only about 0.05, it appears a reasonable first
approximation to neglect the energy dependence of R, in which case the
shape predicted for the electron spectrum is unique. This spectrum³¹ is
shown in Fig. 6, where comparison is made with the available data³² on
Ke3⁺ decay. The agreement is rather poor, but the data are subject to
experimental bias whose effect is difficult to estimate. If the
hypothesis of a T=½ weak interaction current is valid, the expression
(4.2) holds for both Ke3⁺ decay and for the K⁰ modes (e^±+ν+π^∓).

‡ Note added in proof. See, for example, A. Pais and S. B. Treiman,
Phys. Rev. 105, 1616 (1957), R. Gatto, Phys. Rev. 111, 1426 (1958), and
references cited there. A recent preprint, 'Decay of Hyperons and Mesons
from the Universal Fermi Interac

In Kμ3 decay, both R and S terms contribute and there is therefore a
range of theoretical possibilities for the Kμ3 spectrum, of which two
examples³¹ are shown in Fig. 7. This figure also shows the data obtained
on Kμ3 decay in the emulsion investigation of Alexander et al.,²⁷ who
attempted to assess the effects of various empirical biases on this
distribution and who conclude that these are so many and so little
understood that no detailed comparison with the theoretical distributions
is justified at present. With neglect of the energy variation of R and S,
the ratio of the decay probabilities for Ke3 and Kμ3 decay is


R(Ke3)/R(Kμ3) = R²/(0.80R² + 0.33RS + 0.075S²). (4.3)


This ratio is limited to values less than 2.3, which does not disagree
with the present empirical ratio of about unity. The ratio of the decay
probabilities for Ke3⁺ and Kμ2⁺ modes is about 0.07, which corresponds to
a value R/C_K ≈ 2/(3M), where M is the nucleon mass and C_K the
coefficient in expression (4.1). This ratio is therefore in reasonable
accord with qualitative expectation on the basis of the Fermi coupling
model, since the additional pion is then emitted from the intermediate
baryon pairs and the effective radius of the system is ħ/Mc.

FIG. 6. The electron energy spectrum in Ke3⁺ decay, compared with the
theoretical prediction.

FIG. 7. The muon energy spectrum for Kμ3⁺ decay according to Alexander
et al.,²⁷ and two examples of the Kμ3⁺ spectrum predicted by the
interaction (4.2).

30 In the literature, it has frequently been stated that this ratio will
be real as long as time-reversal invariance is valid for the strong
interactions alone. When the strong interactions have this property, this
conclusion is generally correct only if the Kl3 mode arises from just one
four-fermion interaction [for example, (p̄Λ)(ν̄l)]. When there are
several such four-fermion interactions, for example (p̄Λ)(ν̄l) and
(Λ̄Σ)(ν̄l), the Kl3-decay amplitude will consist of several coherent
terms, and reality will hold for R/S only if the ratio of the coupling
coefficients of these four-fermion interactions is also real; that is, if
the weak interaction also satisfies time-reversal invariance.

31 These Ke3 and Kμ3 spectra are taken from the paper by Furuichi,
Kodama, Ogawa, Sugahara, Wakasa, and Yonezawa, Progr. Theoret. Phys. 17,
89 (1957).

32 Bruin, Holthuizen, and Jongejans, Nuovo cimento 9, 422 (1958), have
collected together the data obtained with the same selection criteria.
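
The upper limit of 2.3 follows directly from the quadratic form in the denominator of (4.3). As a minimal sketch, assuming R and S are real and writing t = S/R, the ratio is 1/(0.80 + 0.33t + 0.075t²), which is largest where the denominator is smallest. The function name below is hypothetical.

```python
def ke3_over_kmu3(s_over_r):
    """Ratio (4.3) written in terms of t = S/R, for real R and S."""
    t = s_over_r
    return 1.0 / (0.80 + 0.33 * t + 0.075 * t * t)


# The denominator is a parabola in t; its minimum lies at t = -b / (2a).
best_s_over_r = -0.33 / (2.0 * 0.075)
max_ratio = ke3_over_kmu3(best_s_over_r)

print(best_s_over_r, max_ratio)
```

The maximum is reached for S/R near −2.2; for S=0 the same expression gives 1/0.80 = 1.25.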


Hyperon Decays 


One of the most direct consequences of Gell-Mann's tetrahedral scheme of
four-fermion interactions is the prediction of a beta-decay process for
the Λ and Σ⁻ hyperons, arising from the couplings (Λ̄p)(ēν) and
(Σ̄⁻n)(ēν) generated by expressions (1.1) and (1.2). With the strength f²
of the interaction currents (2.19) equal to g², the expectation is that
the beta decay and muon decay of the Λ particle,


Λ → p + e⁻ + ν̄, (4.4a)
Λ → p + μ⁻ + ν̄, (4.4b)


should have rates 0.8% and 0.15%, respectively,
decay. Recently, several events of the type (4.4a) have been reported,
but their rate appears to be significantly smaller than these
expectations. Including an earlier Σ⁻ event³⁴ probably representing the
mode (4.5a), or possibly (4.5b), the present situation is that, with f=g,
12 Λ_e events and 2.5 Λ_μ events would have been expected in the
experiments to date, compared with the observation of 2 Λ_e events and no
Λ_μ events, and that 11 Σ_e⁻ events and 5 Σ_μ⁻ events would have been
expected, compared with one Σ_e⁻ or Σ_μ⁻ event. These results are
compatible with a four-fermion coupling based on the interaction (2.19)
only for a value f²/g² ≈ 0.3. The calculation of leptonic decay
probabilities for K mesons in terms of an elementary four-fermion
interaction involves divergences and many other uncertainties, so that
the conclusion that the strangeness nonconserving interaction current is
weaker than the strangeness conserving current by a factor of order 3
does not conflict with any evidence on K-meson decay.


5. TIME-REVERSAL INVARIANCE FOR 
WEAK INTERACTIONS 


It has frequently been assumed that the weak 
interactions are invariant under time-reversal. In the 
present framework, this may be expressed as the 
assumption that the interaction constants for each 
term in the interaction current (1.2) may all be chosen 
to have the same phase. 

There is very little direct information available at present on the
validity of this assumption. In Λ decay, if the amplitudes s₋ and p₋ had
relative phase φ, the expression for the polarization parameter α would
have an additional factor cos φ. The experimental limitation (2.2b) shows
that the angle φ cannot deviate by more than about 45° from 0 or π, the
values allowed with time-reversal invariance (neglecting the small
pion-nucleon scattering phase shifts). No test is possible from the study
of Kμ2-, Ke3-, or τ-decay modes. For Kμ3⁺ decay, Sakurai³⁵ has pointed
out that the violation of time-reversal for the weak interactions would
generally imply that the muon polarization would have a component
perpendicular to the (π⁰,ν) plane. This possibility has yet to be
examined experimentally. However, Sakurai's formulas show that, when the
Kμ3 interaction is limited to the form (4.2), the existence of a relative
phase between the coefficients R and S, which can arise if the
strangeness nonconserving current (2.19) is not invariant under
time-reversal, does not imply a normal component for the muon
polarization.

Weinberg³⁶ has suggested recently that the most severe test of
time-reversal at present may be the absence of 2π modes for K₂⁰ decay. As
pointed out by Lee et al., the K₁⁰, K₂⁰ states are generally expressible
in terms of the K⁰, K̄⁰ states by the relations


|K₁⁰⟩ = p|K⁰⟩ + q|K̄⁰⟩, (5.1a)
|K₂⁰⟩ = q|K⁰⟩ − p|K̄⁰⟩, (5.1b)


where p and q are generally complex numbers and |p|² + |q|² = 1. If
time-reversal invariance holds, which means that CP invariance is valid,
then p=q and both may be chosen real. Weinberg remarked that, for J=0,
there are two final 2π states, one with T=0, the other T=2. If δ₀ and δ₂
denote the s-wave scattering phases for the pion-pion systems of T=0 and
T=2, then the amplitudes for 2π decay of the K⁰ system may be written


⟨K⁰|2π, T=0⟩ = a₀ e^{iδ₀}, ⟨K⁰|2π, T=2⟩ = a₂ e^{iδ₂}. (5.2)


The corresponding amplitudes for K̄⁰ decay are directly related to these,


⟨K̄⁰|2π, T=0⟩ = a₀* e^{iδ₀}, ⟨K̄⁰|2π, T=2⟩ = a₂* e^{iδ₂}. (5.3)


From these and Eqs. (5.1), the amplitudes for K₁⁰ and K₂⁰ decay to 2π⁰
and π⁺+π⁻ states may then be deduced, for example,


⟨K₂⁰|2π⁰⟩ = √⅓ (q a₀ − p a₀*) e^{iδ₀} + √⅔ (q a₂ − p a₂*) e^{iδ₂}, (5.4a)
⟨K₂⁰|π⁺+π⁻⟩ = √⅔ (q a₀ − p a₀*) e^{iδ₀} − √⅓ (q a₂ − p a₂*) e^{iδ₂}. (5.4b)


From the experimental work of Bardon et al., the decay probability for
K₂⁰ → π⁺+π⁻ is known to be less than 10⁻⁵ that for K₁⁰ decay, so that


|⟨K₂⁰|π⁺+π⁻⟩| ≲ 0.3×10⁻² {|p a₀ + q a₀*|² + |p a₂ + q a₂*|²}^½. (5.5a)


An upper limit for the decay probability K₂⁰ → π⁰+π⁰ is not as well
known. It is certainly less than 10⁻³ of the K₁⁰-decay probability, and
Weinberg³⁶ gives an argument that this ratio is actually less than
2×10⁻⁴. From this


|⟨K₂⁰|π⁰+π⁰⟩| ≲ 1.5×10⁻² {|p a₀ + q a₀*|² + |p a₂ + q a₂*|²}^½. (5.5b)


If the right-hand sides of these inequalities were zero, these conditions
would require, according to (5.4), that


a₀*/a₀ = q/p = a₂*/a₂, (5.6)


and that a₀ and a₂ have the same phase, just the relationship which
time-reversal invariance for the interaction K⁰ → 2π would require.

34 J. Hornbostel and E. O. Salant, Phys. Rev. 102, 502 (1956).

35 J. J. Sakurai, Phys. Rev. 109, 980 (1958).

36 S. Weinberg, Phys. Rev. 110, 782 (1958).
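
The way time-reversal invariance forces the K₂⁰ → 2π amplitudes to vanish can be seen numerically. The sketch below evaluates the amplitudes in the form ⟨K₂⁰|2π⁰⟩ = √⅓ t₀ + √⅔ t₂ and ⟨K₂⁰|π⁺+π⁻⟩ = √⅔ t₀ − √⅓ t₂, with t_T = (q a_T − p a_T*) e^{iδ_T}, which is how (5.4) is read here. The numerical values of a₀, a₂, the scattering phases and the violating phase are arbitrary illustrative assumptions, and the function name is hypothetical.

```python
import cmath
import math


def k2_two_pion_amplitudes(p, q, a0, a2, delta0=0.0, delta2=0.0):
    """Amplitudes for K2 -> 2 pi0 and K2 -> pi+ pi-, following the form of (5.4)."""
    a0, a2 = complex(a0), complex(a2)
    t0 = (q * a0 - p * a0.conjugate()) * cmath.exp(1j * delta0)
    t2 = (q * a2 - p * a2.conjugate()) * cmath.exp(1j * delta2)
    neutral = math.sqrt(1 / 3) * t0 + math.sqrt(2 / 3) * t2
    charged = math.sqrt(2 / 3) * t0 - math.sqrt(1 / 3) * t2
    return neutral, charged


P = Q = 1.0 / math.sqrt(2.0)  # p = q, real, with |p|^2 + |q|^2 = 1

# Time-reversal invariant case: real a0 and a2, so both amplitudes vanish.
invariant = k2_two_pion_amplitudes(P, Q, 1.0, 0.04, delta0=0.3, delta2=-0.1)

# Illustrative violation: a2 carries an assumed phase of 0.5 rad relative to a0.
violating = k2_two_pion_amplitudes(
    P, Q, 1.0, 0.04 * cmath.exp(0.5j), delta0=0.3, delta2=-0.1
)

print(abs(invariant[0]), abs(invariant[1]))
print(abs(violating[0]), abs(violating[1]))
```

With p=q real, each bracket reduces to a multiple of the imaginary part of a_T, so only a relative phase between a₀ and a₂ (or a complex q/p) can feed the 2π modes of K₂⁰.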

However, owing to the uncertainty in the K₁⁰ branching ratio, it is not
clear exactly what restriction on the relative phases of a₂ and a₀ is
implied by the empirical inequalities (5.5). If the ΔT=½ rule held
inequalities (5.5) would then imply only that a₀*/a₀ = q/p, a statement
about the phase of a₀ which carries no implications concerning the
time-reversal invariance of the decay interaction. If the ratio of
|(p a₂ + q a₂*)| and |(p a₀ + q a₀*)| is taken from the ratio of the Kπ2⁺
and K₁⁰ decay probabilities, then it is still true that the phases of a₀²
and q/p are very close; however, the limitation on the phase of a₂ is
given by

|q a₂ − p a₂*| / |p a₂ + q a₂*| < 0.014 |p a₀ + q a₀*| / |p a₂ + q a₂*|, (5.7)


which involves almost no restriction on the relative phase of a₂² and
q/p. This argument would lead to a significant restriction on the
relative phase of a₂ and a₀ if the present upper limit for the
K₂⁰ → π⁰+π⁰ decay probability were improved by an order of magnitude or
if the K₁⁰ branching ratio were confirmed to lie in the range 0.1-0.2, as
indicated by some of the present experiments.

So far, the only serious test of the assumed property of time-reversal
invariance for the weak interactions has been provided by the experiments
on the beta decay of polarized neutrons reported recently by Clark et
al.³⁷ and by Burgy et al.³⁸ These experiments have shown that the Fermi
and Gamow-Teller matrix elements for neutron decay do not differ in phase
by more than ±8°. For the strange particle decays, the only tests which
have been available to the present have been rather inconclusive or have
provided only very weak evidence in support of this property for the
strangeness nonconserving weak interactions.


37 Clark, Robson, and Nathans, Phys. Rev. Letters 1, 100 (1958).

38 Burgy, Krohn, Novey, Ringo, and Telegdi, Phys. Rev. Letters 1, 324
(1958).


THE invited speakers have done an excellent job of describing the
various topics within the field of weak interactions. The contributed
papers have presented many fascinating results. Furthermore, what is rare
these days at a physics conference, there has been plenty of discussion,
which has brought out, I think, every point that was overlooked in the
speeches. Evidently, there is no need for a summary.

A few words about what appear to me to be the 
principal questions that we would like to have answered 
in the future are given in the following. 

Let us begin with those two beautiful and mysterious particles, the
electron and the muon. The recent revolution in weak interactions has
brought about measurements of the muon magnetic moment, confirming to a
high degree of accuracy that the muon, like the electron, is a pure Dirac
particle with electrodynamic coupling of the conventional form. It now
seems clear also that the weak couplings of electron and muon are
identical in form and strength, both involving the neutrino (presumably
the same neutrino, although we have no way of proving that at the
moment). Both electron and muon lack, so far as is known, any other
interaction whatever, except the gravitational. And here appears the only
known difference between them, their masses. Why do these two otherwise
identical objects have different masses? No one has the slightest idea.
This is perhaps the most interesting question in particle physics today.
If it were not for this example, we might say that differences in mass
among the elementary particles are always owing to differences in
interaction. Or at least we might say that all the different particles
are distinguished from one another by the values of symmetry quantum
numbers. But the electron and muon seem to provide the first inkling of a
radial quantum number in particle physics.

It is evidently important to refine existing measurements of the muon
and electron still further, and to compare electron and muon scattering
from the same target, in order to see whether the equivalence of the two
particles really persists down to small distances.

The decay of the muon, involving just μ, e, and ν, is an example of a
pure weak interaction. The only corrections are the electrodynamic ones,
which are finite, have been calculated, and are easily taken into
account. To within present accuracies, all our knowledge of the muon
decay agrees with the very simple theory of the interaction given by the
following

+ Hermitian conjugate (h.c.)

or, for short,


ℒ_μ = √2 G (ēν)(μ̄ν)† + h.c. (1)


We may describe this interaction by saying that it is the direct
coupling in which e⁻, μ⁻, and ν appear with negative helicity only.

It seems, in fact, that in all weak interactions these 
particles appear with negative helicity and their anti- 
particles with positive helicity. It is a remarkable fact 
that this same grouping is followed by the law of con- 
servation of leptons, which states that the number of 
leptons minus the number of antileptons is always 
conserved. The leptons all have negative helicity in the 
weak interactions, the antileptons all positive helicity. 

Now there is a law of baryon conservation, too, and 
therefore we suspect that all baryons enter the weak 
interactions with the same helicity. Future experiments, 
suitably interpreted, will enable us to test this idea. 

To return to the μ decay, it is of the highest importance to refine all
experiments on the spectrum and asymmetries of μ decay in order to see
whether any departures from the simple contact-interaction theory of
Eq. (1) can be found. Accurate knowledge of the rate of decay is also
valuable, because only in the μ decay can we be reasonably sure of
measuring the true Fermi constant G, unaltered by the effects of strong
couplings.

It is just these strong couplings that make β decay complicated. It is
now well established that the β-decay interaction resembles that in
μ decay; it is vector and axial vector and the strength is about the
same. We do not, however, measure the interaction Lagrangian directly,
because of the strong couplings. At low momentum transfers, such as we
have in nuclear β decay, we measure an effective Lagrangian density


n̄[G_V γ_α + (−G_A)γ_α γ₅]p (1/√2)[ēγ_α(1+γ₅)ν]† + h.c., (2)


to be compared with


n̄[Gγ_α + Gγ_α γ₅]p (1/√2)[ēγ_α(1+γ₅)ν]† + h.c.


as in Eq. (1) for μ decay. Here G_V and −G_A are the effective or
renormalized Fermi constants, containing effects of the strong
interactions of nucleons.

G_V is determined experimentally from the rate of a 0 → 0 transition
such as the decay of O¹⁴. Remarkably, it appears to be within one or two
percent of the "pure" Fermi constant G determined from μ decay. There are
two methods of measuring −G_A/G_V without estimating nuclear matrix
elements. One is the Argonne experiment on the asymmetry of electrons in
the decay of polarized neutrons and gives −G_A/G_V = 1.25±0.04. The other
is a comparison of neutron and O¹⁴ lifetimes; it gives only the absolute
value and yields


|−G_A/G_V| = 1.19±0.04.


It is very tempting, then, to guess that the original Lagrangian for
β decay is just like that for μ decay,


ℒ_β = √2 G (n̄p)(ēν)† + h.c., (3)


with negative helicity for the nucleon. It is surprising, though, that
the vector and axial vector renormalization factors are about 0.99 and
1.2, respectively; nearly 1 in spite of the strength of nuclear
interactions. No one has any explanation for the closeness of the latter
to unity. For the former, the closeness to unity is really striking, and
a possible explanation has been suggested: the speculation that there is
a "conserved vector current."

This speculation is suggested by an analogy between the vector weak
interaction of baryons and mesons (without change of strangeness) and
electromagnetism. For electromagnetism, a law of universality holds: all
charged elementary particles have the same charge ±e. The universality is
not disturbed by strong interactions, which are present for the proton,
for example, and not for the positron. The reason is that the
electromagnetic current j_α is a conserved quantity: ∂j_α/∂x_α = 0.

Now the vector operator describing the nucleons in β decay is n̄γ_αp,
which is not, in the Yukawa theory of strong interactions, a conserved
quantity. It is, however, one term of a conserved quantity, the x+iy
component of the isotopic spin current,


Ont On 
Sa =a p HVZ — m — Vit — 
Xa OX 


HVIS y2? VIS SHE, (8) 


which obeys the conservation law ∂J_α/∂x_α = 0 apart from
electromagnetic effects.

If, then, in the original β-decay Lagrangian we replace n̄γ_α(1+γ₅)p by
the expression


Ont On? 
firya(1-+7s) pO iran 
OXa OVa 


FEVE a(1 4y)? — VEY (1y) RES (5) 


we have a theory in which $G_V$ must equal G to within 
an electromagnetic correction of a percent or so. It is 
evidently worthwhile to refine the measurements and 
calculations involved in the determination of $G_V/G$. 

It would also be good to have independent tests of 
the conserved current hypothesis. The simplest one is 
unfortunately very difficult; it involves measuring the 
rate of the rare decay π⁺ → π⁰ + e⁺ + ν, never so far 
observed. If the conserved current hypothesis is correct, 
the rate* must be 0.37 ± 0.07/sec, giving a branching 
ratio of (1.0 ± 0.2) × 10⁻⁸. Other β-decay theories will also 
lead to decay rates of this order of magnitude, but not 
(except accidentally) to the same number. 
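
The relation between the predicted rate and the quoted branching ratio is simple arithmetic and can be checked in a few lines. The sketch below is only an illustration: it assumes a charged-pion mean life of about 2.6 × 10⁻⁸ sec, a value that is not given in the text, and the function name is hypothetical.

```python
# Assumed value, not quoted in the text: charged-pion mean life in seconds.
PION_MEAN_LIFE_SEC = 2.6e-8


def branching_ratio(partial_rate_per_sec, mean_life_sec=PION_MEAN_LIFE_SEC):
    """Branching ratio = partial decay rate x total mean life."""
    return partial_rate_per_sec * mean_life_sec


# Predicted rate from the text: 0.37 +/- 0.07 per second.
central = branching_ratio(0.37)
low = branching_ratio(0.37 - 0.07)
high = branching_ratio(0.37 + 0.07)
print(f"branching ratio ~ {central:.2e} (range {low:.2e} to {high:.2e})")
```

With the assumed mean life, the central value comes out close to 10⁻⁸, in line with the figure quoted above.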

An easier test of the conserved current idea has been 
proposed, involving the ratio of the β-decay spectra of 
B¹² and N¹². At this meeting, certain theoretical corrections 
have been presented that must be applied to the 
analysis of the results, but they do not vitiate the conclusion 
that the experiment can distinguish whether 
or not pions participate directly in the β-decay interaction 
as in Eq. (5). The B¹²–N¹² experiment is being 
performed at the California Institute of Technology by 
Hilton and Sörgel, who will announce their results 
shortly. I would like to encourage other groups to do 
the same experiment, however, since it is a hard one and 
recent experience should have taught us that hard experiments 
should be repeated. 

We have seen how the vector and axial vector coupling 
constants in β decay can be renormalized by effects of 
the pion cloud around the nucleon, giving, at low 
momentum transfer, the effective Lagrangian of Eq. (2). 
When we take into account finite momentum transfer 
to the nucleon, other pionic effects must show up, so that 
$G\gamma_\alpha(1+\gamma_5)$ is replaced, not simply by $G_V\gamma_\alpha + (-G_A)\gamma_\alpha\gamma_5$, 
but by the more complicated expression 


$$G_V F_1(q^2)\gamma_\alpha + (-G_A)F_2(q^2)\gamma_\alpha\gamma_5 + A F_3(q^2)\sigma_{\alpha\beta}q_\beta + B F_4(q^2)\gamma_5 q_\alpha.$$ (6)


Here $q_\alpha$ is the four-momentum transfer and the form 
factors $F_i(q^2)$ all have $F_i(0) = 1$. Besides the renormalizations 
and the form factors, the pion cloud has induced 
two new interactions. The first, with coefficient A, we 
might call “weak magnetism,” since it bears the same 
relation to the vector coupling that an anomalous Pauli 
moment bears to the electric charge. In fact, in the 
conserved current theory, the value of A can be predicted 
from the anomalous moments of proton and 
neutron. It is just this weak magnetism that is tested 
in the B¹²–N¹² experiment. (The conserved current 
theory, by the way, also permits $F_1$ and $F_3$ to be calculated 
from the electromagnetic form factors of the 
nucleon.) 

The last interaction, with coefficient B, is sometimes 
called the induced pseudoscalar. It can be shown that 
in the calculation of B by field theory, one particular 
type of diagram predominates, in which the nucleon 
radiates a virtual pion, which decays into electron and 
neutrino. This contribution can be evaluated, 
giving a fairly reliable estimate of B, or at least its 
absolute value. 


* The principal uncertainty arises from the mass difference 
of π⁺ and π⁰. 

All these effects are barely detectable in β decay, 
because of the low energies involved. They may, however, 
be studied in the absorption of μ⁻ by nuclei. But 
how is the basic interaction of nucleons with μ and ν related 
to that with e and ν? The simplest assumption is 
that it is absolutely identical. There is very little evidence 
on this point from μ⁻ absorption itself. It has not 
even been proved experimentally that the interaction is 
V−A or that parity conservation is violated. However, 
the theory of identical μ and e couplings leads to the 
famous prediction of the branching ratio for π → e + ν, 
which has now been confirmed experimentally with an 
accuracy that is constantly increasing. 

Pending further experimental work, we are certainly 
justified in supposing that the basic coupling in μ⁻ 
absorption is known. The absorption rates and other 
quantities may then be used to yield information about 
the pionic corrections of Eq. (6). For this purpose, the 
ideal nucleus would be the proton, but that poses tremendous 
experimental problems, and for a while we 
will have to be content with the current experiments 
on C¹² and other light nuclei. 

There remains the problem of calculating the absolute 
rate of the charged pion decay. We have heard from 
Professor Goldberger an excellent account of a recent 
attempt to do that. I think we will agree that the 
remarkable success of the calculation is greater than 
might have been expected from the approximations involved. 
But it is very instructive in any case, particularly 
the result that for large meson coupling constant g², the 
rate of pion decay is inversely proportional to g². 

All the phenomena we have discussed so far involve 
the interaction of eν, μν, and a series of particle pairs 
beginning with np. The structure of the interaction is 
rather peculiar; pairs of particles, one neutral and one 
charged, interacting with one another in identical 
fashion. It suggests that the coupling Lagrangian is of 
the form $J_\alpha^\dagger J_\alpha$, where 


1+ys 1+ys 
Je=2iv6l ere vI V+ Eye v2 y 
E (7) 
W a one O 
y: Z p 


This idea of a current coupled to itself has certain 
new consequences, but they are very hard to detect 
by experiment. One is neutrino-electron scattering with 
a cross section comparable to that of neutrino absorp- 
tion by nuclei. (Of course, the absorption has been 
detected by a coincidence technique, while the scattering 
gives only electron recoils.) It should be noted that the 
theory gives the longitudinal or “two-component” neutrino 
zero magnetic moment, so that if neutrino-electron 
scattering is ever found with the predicted 
cross section, it should be attributed to the direct 
(eν)(eν) coupling. 

A second consequence, just as difficult to test, is the 
existence of a parity nonconserving nuclear force arising 
from the (np)(np) term of the weak coupling. The 
violation of parity conservation should amount to 
something like 10⁻⁷ in amplitude. Present experiments 
are capable of measuring only one part in 10³ or 10⁴. 

If we accept the form $J_\alpha^\dagger J_\alpha$ for the weak interactions, 
we are still puzzled by it. True, it is reminiscent of 
electromagnetism, although weaker, of short range, and 
not parity conserving. But the electromagnetic current 
$j_\alpha$ interacts with itself through the photon. Is there a 
boson to carry the weak interactions? 

Let us examine the properties of such a hypothetical 
boson, a particle that Feynman and I like to call the 
uxl (symbol X±). It must be a charged vector meson, 
say of mass M, with coupling constant (analogous to 
1/137) equal to $\sqrt{2}\,(GM^2/4\pi)$. If M is of the order of the 
nucleon mass, this comes to around 10⁻⁶. Processes involving 
creation or destruction of the uxl have probabilities 
proportional to this coupling constant, while 
ordinary weak processes involve the square. Thus a 
K particle, for example, would much rather decay into 
X+π or even X+γ than into its actual disintegration 
products, provided these channels involving X were 
open. It follows that M must be ≳ m_K. 
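
The order of magnitude quoted for the coupling constant can be reproduced numerically. The following is a minimal sketch in natural units (ħ = c = 1); the numerical values of the Fermi constant and the nucleon mass are assumptions supplied here, not figures from the text, and the function name is hypothetical.

```python
import math

# Assumed inputs, not quoted in the text (natural units, hbar = c = 1):
G_FERMI_GEV2 = 1.166e-5   # Fermi constant in GeV^-2
NUCLEON_MASS_GEV = 0.938  # nucleon mass in GeV


def uxl_coupling(mass_gev, g=G_FERMI_GEV2):
    """Dimensionless coupling sqrt(2) * G * M^2 / (4 * pi) for a boson of mass M."""
    return math.sqrt(2.0) * g * mass_gev ** 2 / (4.0 * math.pi)


alpha_weak = uxl_coupling(NUCLEON_MASS_GEV)
print(f"coupling for M = nucleon mass: {alpha_weak:.2e}")
print(f"ratio to 1/137: {alpha_weak * 137:.1e}")
```

Because the coupling grows as M², doubling the assumed mass raises it by a factor of four; the estimate is therefore only as good as the guess for M.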

If M is really around the nucleon mass, then the 
cross section for uxl production in, for example, nucleon-nucleon 
collisions of several Bev, must be around 10⁻⁶ 
of geometrical. The time of decay into, say, e+ν, must 
be something like 10⁻¹⁸ sec. Evidently the direct observation 
of a particle that is rarely produced and 
decays immediately is a most difficult matter. If M is 
very much greater than 1 Bev, then the threshold for 
uxl production becomes a problem. 

Fortunately, an indirect test of the existence of the 
uxl can be made. If X exists, it induces the decay 
μ → e + γ with a rate that can be estimated. The 
mechanism is best described by the Feynman diagrams 
involved: 


The rate comes out logarithmically divergent, so that 
we must introduce a cutoff momentum Λ. The fraction 
of muon decays into e+γ then comes out 


$$\Gamma(\mu \to e+\gamma)/\Gamma(\mu \to e+\nu+\bar{\nu}) = (3/8\pi)(1/137)\,f(M/\Lambda),$$ (8)


where for large cutoffs Λ ≫ M, we have $f \to (\ln \Lambda^2/M^2)^2$. 
For reasonable values of the cutoff, we should expect 
a fraction like 10⁻³ or maybe 10⁻⁴. Experimentally the 
branching ratio is less than 2 × 10⁻⁵. It appears, then, 
that the uxl has flunked the only test of its existence 
that is available. (Of course, if the neutrinos 
associated with e and μ are not identical, then this was 
no test at all.) 
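
To see how the experimental limit constrains Formula (8), it helps to evaluate its numerical coefficient. The sketch below takes that coefficient to be (3/8π)(1/137), which is a reading of the printed formula and should be treated as an assumption; the function names are hypothetical and no particular form of f is assumed.

```python
import math

FINE_STRUCTURE = 1.0 / 137.0       # value used in Formula (8)
EXPERIMENTAL_UPPER_LIMIT = 2e-5    # upper limit on the branching ratio quoted in the text


def prefactor():
    """Numerical coefficient of f(M/Lambda): (3 / 8 pi) * (1 / 137)."""
    return 3.0 / (8.0 * math.pi) * FINE_STRUCTURE


def largest_allowed_f(limit=EXPERIMENTAL_UPPER_LIMIT):
    """Largest value of f(M/Lambda) compatible with the experimental limit."""
    return limit / prefactor()


print(f"coefficient of f: {prefactor():.2e}")
print(f"f must be below about {largest_allowed_f():.3f}")
```

The coefficient alone is a little under 10⁻³, so the experimental limit is met only if f is a few hundredths or less.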

Formula (8) can also be used to describe what happens 
when there is no uxl but instead a point coupling of 
(μν) to (eν). Mathematically, that corresponds to the 
limit of infinite M with the cutoff Λ held fixed. Under 
these conditions f(M/Λ) → 0 and there is no decay 
μ → e + γ. 

We have not succeeded, then, in understanding the 
proposed form $J_\alpha^\dagger J_\alpha$ of the weak couplings. Nevertheless, 
I shall assume it in the remaining discussion, which 
concerns the weak decays of strange particles. 

In order to describe decays involving a change of 
strangeness, we must add new terms to Eq. (7) for the 
current $J_\alpha$, which already consists of two parts: the 
leptonic current $J_\alpha^{(0)}$ and the current $J_\alpha^{(1)}$ comprising 
(n̄p) and probably other terms like (π⁰π⁺), (Σ̄⁻Σ⁰), etc., 
all involving baryons or mesons, but with no change of 
strangeness. 

The existence of decays like K⁺ → μ⁺ + ν requires us 
to add a current $J_\alpha^{(2)}$ consisting of pairs like (Λ̄p), 
(Σ̄⁻n), etc., in which the partner with higher charge 
also has strangeness higher by one unit. It is easy to 
see that the interaction of $J_\alpha^{(2)}$ with $J_\alpha^{(0)}$ and $J_\alpha^{(1)}$ is 
sufficient to account qualitatively for all known weak 
decays of the strange particles. 

The question now arises whether the current 
$J_\alpha^{(0)} + J_\alpha^{(1)} + J_\alpha^{(2)}$ is complete, or whether other terms of 
still a different character must be included. Such 
additional terms could be of two kinds: pairs like (n̄Σ⁺) 
in which the partner with higher charge has strangeness 
lower by one unit and pairs like (Ξ̄⁻n) in which the 
partners differ by two units of strangeness. Let us call 
the corresponding currents $J_\alpha^{(3)}$ and $J_\alpha^{(4)}$, respectively. 
There are a number of experimental tests of the 
existence of $J^{(3)}$ and $J^{(4)}$: 


(1) The decay Σ⁺ → n + e⁺ + ν will have appreciable 
probability if and only if $J^{(3)}$ is present. (Abbreviated, 
Σ⁺ → n + e⁺ + ν ~ $J^{(3)}$.) 

(2) Ξ⁻ → n + e⁻ + ν̄, Ξ⁰ → p + e⁻ + ν̄ ~ $J^{(4)}$. 

(3) Ξ⁻ → n + π⁻, Ξ⁰ → p + π⁻ ~ $J^{(3)}$ or $J^{(4)}$. 

(4) If $J^{(3)}$ or $J^{(4)}$ exists, so that the weak interactions 
can induce strangeness changes of two units in the 
lowest order of G, then the transitions K⁰ ↔ K̄⁰ occur 
with an amplitude proportional to G instead of G². The 
mass difference between K₁⁰ and K₂⁰ is then of order 
G instead of G², corresponding to a frequency of, say, 
10¹⁷/sec instead of, say, 10¹⁰/sec. A beam of K⁰ particles, 
then, will be converted immediately (that is, in something 
like 10⁻¹⁷ sec) into a rapidly oscillating half-and-half 
mixture of K⁰ and K̄⁰. Thus a K⁰ produced in a 
reaction like π⁻ + p → Λ + K⁰ will be “immediately” 
capable of behaving like a K̄⁰ in, for example, the 
reaction K̄⁰ + p → Λ + π⁺. 

(5) Only if $J^{(3)}$ exists can the leptonic decay rates of 
K₁⁰ and K₂⁰ be different. If they are the same, the 
probability of a neutral K particle decaying into leptons 
during the lifetime of the K₁⁰ component (~10⁻¹⁰ sec) 
is only around 10⁻³. (It should be noted, of course, that 
during the first 10⁻¹⁰ sec the amplitudes for leptonic 
decay of K₁⁰ and K₂⁰ interfere with each other.) 

None of these tests has yet given conclusive results. 
There is some evidence that a beam of K⁰ particles 
does not immediately behave like a 50-50 mixture of K⁰ 
and K̄⁰. If this is confirmed, we can presumably throw 
out both $J^{(3)}$ and $J^{(4)}$, but at the moment we cannot 
be sure. Let us continue, however, on the assumption 
that $J^{(3)}$ and $J^{(4)}$ are not present. 

We must still clear up the forms of $J^{(1)}$ and $J^{(2)}$. The 
conserved vector current hypothesis, if we take it 
altogether seriously, together with the assumption that 
baryons always occur with the factor 1+γ₅ (negative 
helicity), gives a definite form for $J^{(1)}$, but we are far 
from being able to test such an assertion today. 

However, we can test a much less specific suggestion, 
in the form of a symmetry rule, that $J^{(1)}$ behaves like 
an isotopic vector and $J^{(2)}$ like an isotopic spinor. Such 
a rule is obeyed, for example, by the term (n̄p) in $J^{(1)}$ 
and the hypothetical term (Λ̄p) in $J^{(2)}$. If this rule is 
right, then apart from electromagnetic corrections, we 
have |ΔI| = 1/2 or 3/2 in nonleptonic decays and |ΔI| = 1/2 
in leptonic decays (with the leptons defined to carry 
no isotopic spin). Some experimental consequences are 
the following. 


(1) The fraction of K⁺ → 3π decays that yield π⁺ + 2π⁰ 
is 1/5, assuming a totally symmetric spatial wave function. 
(2) The fraction of K₂⁰ → 3π decays that yield 3π⁰ 
is 3/5, assuming a totally symmetric spatial wave function. 
(3) The fraction of K₁⁰ → 2π decays that yield 2π⁰ is 
between 0.28 and 0.38. (We make use here of the experimental 
rates of K₁⁰ → 2π and K⁺ → 2π.) 
(4) The rate of leptonic decay of K₂⁰ (or of K₁⁰, if we 
forget interference) is twice that of K⁺ and the spectra, 
relative proportions of e and μ, etc., are all the same for 
K₂⁰ and K⁺. 


All these statements seem to be roughly true except 
perhaps the third one, but more experiments are 
necessary. 

We come now to some mysterious features of strange 
particle decay, which, if we could understand them, 
would probably help us to pin down the interactions 
further. The most striking is the fact that K₁⁰ decays 
into 2π at a rate about 500 times faster than that of 
K⁺ → 2π. An explanation has been suggested in the 
form of an approximate rule (supposedly good to 5% 
or so in amplitude) that |ΔI| = 1/2 in nonleptonic decays. 
However, it is not known how to derive any such rule 
in a convincing way from an interaction of the form 
$J_\alpha^\dagger J_\alpha$. Several ideas have been discussed, e.g., (a) to 
divide the strong interactions into classes, of which 
one is assumed to be less strong and approximately 
negligible. The stronger one may have a higher symmetry 
than charge independence, which 
together with a hypothetical symmetry of the 
currents $J_\alpha^{(1)}$ and $J_\alpha^{(2)}$ may lead to something like the 
|ΔI| = 1/2 rule. (b) to add to the interaction $J_\alpha^\dagger J_\alpha$ a 
term $K_\alpha^\dagger K_\alpha$ in which the particle pairs have the same 
charge. The “neutral” current $K_\alpha$ must not contain 
leptons, however, since emission of pairs like ν̄ν and μ̄e 
has never been observed. An interaction $J_\alpha^\dagger J_\alpha + K_\alpha^\dagger K_\alpha$ 
can be made to yield the |ΔI| = 1/2 rule. 

The first suggestion seems much more attractive 
than the second. Actually, neither has so far resulted in 
a convincing theory. For a third suggestion, see below. 

In any case, it is worthwhile to see whether the 
approximate |ΔI| = 1/2 rule works experimentally. Besides 
the K → 2π situation, it explains the fact that Λ → n + π⁰ 
occurs at about half the rate of Λ → p + π⁻ and certain 
relations among the nonleptonic decays of Σ±. It may 
be further tested by other predictions, including the 
following: 


(1) The asymmetry of Λ → n + π⁰ for polarized Λ 
must be the same as that of Λ → p + π⁻. 

(2) The rate of K₂⁰ → 3π must be twice that of 
K⁺ → 3π. At the moment, there seems to be some 
difficulty with the second prediction, but again more 
experiments are needed. 


Another mystery involves the rate of leptonic decays 
of hyperons. The decay Λ → p + e⁻ + ν̄ has been observed, 
but it is reported to be very rare, occurring in 
much less than 1% of all Λ disintegrations. The decay 
Σ⁻ → n + e⁻ + ν̄ seems also to be very rare, and has 
never been detected for certain. Now if the current 
$J^{(2)}$ contains terms (Λ̄p) or (Σ̄⁻n), if the coefficient of 
each of these terms is unity, and if renormalization 
effects are not large (and they seem to be small in the 
β decay of nucleons), then the fractions of β disintegrations 
of Λ and Σ⁻ should be 1.6 and 5.6%, respectively. 

While awaiting further experimental results, let us 
assume the discrepancy is real. If it is really large, it is 
unlikely to be a renormalization effect but rather an 
indication that the strangeness-changing weak couplings 
are even weaker (by a factor f, perhaps ~1/20) than 
those that conserve strangeness. Such a situation would 
permit another explanation of the approximate |ΔI| = 1/2 
rule, namely that it characterizes a special set of diagrams, 
occurring only in nonleptonic decays, which are 
so much larger than other diagrams that they make up 
for the small factor f and give a “normal” rate for such 
a process as Λ → N + π. 

It is clear, in any case, that we have an enormous 
body of information about the weak decays of strange 
particles but that we need still more information if we 
are to understand them. 
Errata: Comparative Study of Potential Energy Equation (11-29) should read l 
E Functions for Diatomic Molecules san me SE ; 
[Revs. Modern Phys. 29, 664 (1957)] NODE zo GHO blia]. 4 
YAWENDRA PAL VARSHNI f 
Department of Physics, Allahabad University, Allahabad, India Equation (I-31) should read: dFo/dz= — (dFı/dx) i 
N page 670, Ea, QE = —Fooo1+F i010. d 
a P (¢ 1 
QC) pase 670, Eq. (40) should read 1150 for “antilog 4030-5” read “antilog (4.030-5).” Ei 
a =0. 1155 Bottom line, left. Actually, Table V-6 shows that the E} 
In line 4, read “0” in place of “2.” largest observed Fī» for hydrogen is at 4 kev in argon F 
- On page 671, Linnett function, Eq. (55) should read 1162 ANKE 6: for “oy” read “ory.” 
avy 11 17- 
1164 Center, right, for “from Eqs. (V-1) and (V-2)” read “from 
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On page 672, Lippincott function, Eq. (61) should read 
a,=0. 
In line 24, delete “As (61) shows this appears to be in error.” 
On page 677, Table IX, correct F values are 
Linnett (8+3/—#)/3(4—1) 
Lippincott 0. 


On page 679, Linnett function, change the discussion for a as 
follows: 


“Too high at low A. Suitable for CO, Ns, NO, 02.” 


Errata: Experimental Results on Charge-Changing 
== Collisions of Hydrogen and Helium Atoms and 
Ions at Kinetic Energies above 0.2 kev 


(Revs. Modern Phys. 30, 1137 (1958)] 
SAMUEL K. ALLISON 


Enrico Fermi Institute for Nuclear Studies, University of Chicago, 
Chicago, Illinois 


Page 
1138 In text, above Eq. (II-1), for “dois/dé would be a function 
of angle” read “dois/d2 would be a function of angle.” 
In text, above Eq. (II-5) for “and exists into high vacuum” 
read “and exits into high vacuum.” 
1140 Equation (II-12) should read 
1 Fi= Fis HCP (si) explo) +N (si) exp(—79)] 
Xexp(—4x X ais). 


1139 


1141 Equation (IT-28) should read 


PLT) = Foy oa(s-9)— bh} 
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equations in Sec. VA-3, page 1156.” 


Errata: Spin Waves 


[Revs. Modern Phys. 30, 1 (1958)] 
J. VAN KRANENDONK AND J. H. VAN VLECK 
Harvard University, Cambridge, Massachusetts 


In Eq. (28), 0.1187 should be replaced by 4 (0.1173). 


In the third member of Eq. (32), 15/8 should be replaced by 
15/32. 

On the top of page 8, 0.96 Nk/2S should be replaced by 0.96 
(2NRS). 

In Eq. (97), a term −N2JS should be added to the right-hand 
side. 

In Eq. (120), 0.1187 should be replaced by 0.1173/(S₁−S₂) 
and 2JS₁S₂ should be replaced by 4JS₁S₂. 

Below Eq. (121), S₁S₂/S(S₁−S₂) should appear as twice this 
quantity, and (g₁S₁−g₂S₂)β should appear multiplied by a 
factor (4). 


Erratum : Dynamics of a Lattice Universe by 
the Schwarzschild-Cell Method 


[Revs. Modern Phys. 29, 432 (1957)] 
RICHARD W. LINDQUIST AND JOHN A. WHEELER 
Palmer Physical Laboratory, Princeton, New Jersey 


We are indebted to Professor G. C. McVittie for pointing 
out to us that the printed version of Eq. (3) omits a 
factor.¹ It should read 


$$ds^2 = a^2(T)\left[\frac{dr^2}{1-r^2} + r^2\,(d\theta^2 + \sin^2\theta\, d\phi^2)\right] - dT^2$$


$$= a^2(T)\,[du_1^2 + du_2^2 + du_3^2 + du_4^2] - dT^2,$$
with 
$u_1 = \sin\chi \sin\theta \sin\phi$, 
$u_2 = \sin\chi \sin\theta \cos\phi$, 
$u_3 = \sin\chi \cos\theta$, 
$u_4 = \cos\chi$, 
and 
$r = \sin\chi$. 
The subsequent calculations and the conclusions drawn from 
it about the dynamics of a lattice universe remain unchanged. 


¹ R. C. Tolman, Relativity, Thermodynamics and Cosmology (Clarendon 
Press, Oxford, England, 1934), Eqs. (149.5) and (149.10). 




Theoretical Results on Orbital Capture 


[Revs. Modern Phys. 30, 1169 (1958)] 
H. Brysk AND M. E. ROSE 
Oak Ridge National Laboratory, Oak Ridge, Tennessee 


ould be deleted. 
paragraph following Eq. (4) the sentence beginning in 
eventh line should read: 
Re Geslete is weakened by the fact that the numerical 


; as Aidsi ee decay; however, this appears A 
set the qualitative features.” 
Errata: Pair Production and Bremsstrahlung 
in the Field of Free and Bound Electrons 


[Revs. Modern Phys. 30, 354 (1958)] 
J. JOSEPH AND F. ROHRLICH 


Department of Physics, State University of Iowa, Iowa City, Iowa 


In Eq. (2) a factor 4 should be added to the constant factor on 
the right-hand side. The ratio of Eqs. (2) and (1) then becomes 
2/9 as stated in the text. 
In Eq. (48) ψ₀(q) should be replaced by ρ₀(q), the Fourier 
transform of ρ₀(r) = ψ₀*(r)ψ₀(r). 
1. INTRODUCTION 


The study of hindered internal rotation in molecules 
has been a subject of interest for nearly 
thirty years, and numerous methods have been devised 
for investigating this phenomenon. We discuss one 
of these, the method of microwave spectroscopy, 
which has, within the last ten years, been rather extensively 
employed. Internal rotation interacts with the 
over-all rotation of a molecule and produces certain 
effects in its rotational spectrum which can be observed 
in the microwave region under high resolution. Each 
rotational transition exhibits a fine structure, the complexity 
of which depends on the height of the potential 
barrier hindering the internal rotation. An analysis of 
this fine structure leads to an accurate evaluation of the 
potential barrier. Considerable work has been devoted 
to the studies of the microwave spectra of molecules in 
which one group may rotate internally with respect to 
another. This review gives an account of the past work 
in this field, including certain details of the methods 
employed. 

Potential barriers are presumably caused by the inter- 
actions of two groups of electrons and nuclei. In prin- 
ciple, it should be possible to determine the barrier 
heights from straightforward quantum-mechanical cal- 
culations. The mathematical complexity of such a 
treatment, however, is so great that a rigorous com- 
putation seems highly impractical at present. An 
alternative approach, which is perhaps somewhat em- 
pirical, is to try to describe the origin of the barriers 
in terms of the forces which appear in the study of 
intermolecular interactions, such as Van der Waals 
forces and resonance forces. Although many such 
analyses have been published, none of these results is 
completely satisfying. With the new data on the barrier 
heights in various molecules, it may be possible ulti- 
mately to formulate a simple theory of the origin of 
barriers which not only accounts for all the known re- 
sults but also may serve to make reliable predictions, 
at least in a semiquantitative way. In view of the pre- 
sumed similarity in nature between the origin of the 
potential barriers and the general problem of inter- 
molecular forces, a satisfactory working theory of the 
former may throw some light on the study of the latter. 
The bibliography lists a number of papers which con- 
sider the origin of potential barriers and various em- 
pirical correlations of the barrier heights with the 
molecular structures. A review of these papers is not 
covered here. A brief survey of this subject is given in 
two recent articles by Wilson.°7-!8{ 

In addition to the method of microwave spectroscopy, 
various other procedures have been utilized in order to 
evaluate potential barriers. Such thermodynamic prop- 
erties as entropy and vapor heat capacity are probably 
the most commonly used to calculate barrier heights. 
Generally, these thermodynamic methods complement 
the microwave methods and are applicable for molecules 
with high barriers, greater than say 3 kcal/mole (1000 
cm⁻¹). Although the thermodynamic procedure does 
not usually give quite as accurate barrier heights as 
the best results from microwave work, the former is 
applicable to a much larger class of molecules since the 
latter is limited to molecules with a dipole moment. We 
do not, however, review the thermodynamic results 
or methods here. The readers are directed to the re- 
views by Wilson"? and by Pitzer.® 
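
Barrier heights are quoted both in kcal/mole and in cm⁻¹ in this field, so a conversion helper is convenient. The sketch below is illustrative only; it uses the standard values of the physical constants and the thermochemical calorie, and the function name is hypothetical.

```python
# Physical constants (SI, with c in cm/s so that wavenumbers come out in cm^-1).
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_CM_S = 2.99792458e10
AVOGADRO = 6.02214076e23
JOULE_PER_KCAL = 4184.0  # thermochemical calorie


def kcal_per_mole_to_wavenumber(barrier_kcal_per_mole):
    """Convert a barrier height from kcal/mole to cm^-1."""
    joule_per_molecule = barrier_kcal_per_mole * JOULE_PER_KCAL / AVOGADRO
    return joule_per_molecule / (PLANCK_J_S * LIGHT_SPEED_CM_S)


for barrier in (3.0, 5.0, 20.0):
    print(f"{barrier:5.1f} kcal/mole = {kcal_per_mole_to_wavenumber(barrier):7.0f} cm^-1")
```

A barrier of 3 kcal/mole comes out near 1050 cm⁻¹, consistent with the round figure of 1000 cm⁻¹ used above.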

Infrared and Raman spectra” have been used to 
determine the frequencies of the internal torsional 
oscillations. In principle, this would be the direct 
method for determining the torsional frequencies and 


† References will be found in the Bibliography at the end of 
the paper. 






the potential barriers. Unfortunately, these torsional 
oscillations are usually inactive and lie in the very far 
infrared region so that their detection and assignment 
are usually difficult, if at all possible. Nuclear magnetic 
resonance" 17,859 also has been used recently to give a 
measure of the magnitude of barriers in liquids. Barrier 
heights in the range of 5 to 20 kcal/mole may be deter- 
mined by measuring the widths of the resonance lines 
as functions of temperature. The time scale of nuclear 
magnetic resonance is such that if the barrier height is 
less than about 5 kcal/mole, the internal rotation 
appears to be free. 

The phenomenon of internal rotation is similar to the 
inversion in ammonia,’ a pyramidal molecule in which 
the nitrogen atom may be situated at either side of the 
plane of the hydrogen atoms. Although a small poten- 
tial barrier restricts the back-and-forth motion and 
causes the pyramidal structure, the nitrogen atom may 
move from one side of the plane to the other through 
the quantum-mechanical tunneling effect, which splits 
the doubly degenerate vibrational energy levels below 
the barrier into pairs of levels (one symmetric and one 
antisymmetric state). Transitions between these states 
are observed in the microwave region, and the separa- 
tion is very sensitive to the height of the potential 
barrier. 

For the case of internal rotation of molecules such as 
ethane, CH₃–CH₃, the situation is similar. In ammonia 
the nitrogen atom can move back and forth between the 
two equivalent positions, while in CH₃–CH₃ one 
methyl group can rotate into one of the three positions 
which are equivalent with respect to the other methyl 
group. If each of these configurations were considered 
to be independent, the torsional energy levels would all 
be triply degenerate. However, the tunneling effect, 
analogous to the case of ammonia, splits each torsional 
level into nondegenerate (A species of the C₃ group) 
and degenerate (E species) sublevels. This is shown in 
Fig. 1 for an assumed cosine potential function. On the 
left are the torsional state quantum numbers. The 
torsional sublevel spacing increases as the torsional 
energy increases. 
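
The splitting of each torsional level into A and E sublevels can be illustrated numerically. The following is a minimal sketch, not the method of any particular author: it assumes a one-dimensional hindered rotor H = F p² + (V₃/2)(1 − cos 3α), expands it in free-rotor functions exp(imα), and finds the lowest eigenvalues of the resulting tridiagonal blocks by Sturm-sequence bisection. The values of F and V₃ and all function names are hypothetical.

```python
def count_below(diag, off, x):
    """Sturm count: number of eigenvalues below x for a symmetric tridiagonal
    matrix with diagonal `diag` and a constant off-diagonal element `off`."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        coupling = off * off / q if i > 0 else 0.0
        q = d - x - coupling
        if q == 0.0:
            q = 1e-200  # avoid division by zero on the next step
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, iterations=200):
    """k-th lowest eigenvalue (k = 0, 1, ...) located by bisection."""
    lo = min(diag) - 2.0 * abs(off)
    hi = max(diag) + 2.0 * abs(off)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def torsional_levels(sigma, f_const, v3, n_levels=3, k_max=20):
    """Lowest levels in the basis exp(i m a) with m = 3k + sigma;
    sigma = 0 gives the A sublevels, sigma = +1 or -1 the E sublevels."""
    diag = [f_const * (3 * k + sigma) ** 2 + 0.5 * v3 for k in range(-k_max, k_max + 1)]
    off = -0.25 * v3  # matrix element of -(V3/2) cos 3a between m and m +/- 3
    return [kth_eigenvalue(diag, off, n) for n in range(n_levels)]


F_CONST = 5.0   # hypothetical reduced rotational constant, cm^-1
V3 = 200.0      # hypothetical barrier height, cm^-1

a_levels = torsional_levels(0, F_CONST, V3)
e_levels = torsional_levels(1, F_CONST, V3)
for v, (a, e) in enumerate(zip(a_levels, e_levels)):
    print(f"v = {v}: A = {a:8.3f}  E = {e:8.3f}  |E - A| = {abs(e - a):.4f} cm^-1")
```

The blocks with σ = +1 and σ = −1 give identical eigenvalues, which is the twofold degeneracy of the E species, and the A–E separation printed for each v grows with the torsional energy.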

Associated with each of the torsional sublevels, one 
has a set of energy levels arising from the over-all 
rotation of the entire molecule (referred to as over-all 
rotational levels). Because of the interaction between 
the over-all and internal rotation, the spacings of the 
over-all rotational levels are different in the different 
torsional states. A given rotational transition then 
appears as a number of spectral lines corresponding to 
the transitions in different torsional states. The separa- 
tions between these lines are utilized to determine the 
potential barrier height. 

In this review we first formulate the Hamiltonians 
for various molecular models. The calculations of the 
eigenvalues and eigenfunctions based on various ap- 
proximations are given and discussed. The practical 
aspects of the analysis of microwave spectra are pre- 


Fig. 1. Hindered rotational energy levels. A cosine potential 
barrier V₃ and its associated internal rotational energy levels. The 
torsional quantum numbers v are shown on the left ordinate. The 
sublevels are denoted by their symmetry under the C₃ group. The 
free rotational quantum numbers m are shown on the right 
ordinate. 


sented in Sec. 5. This section is intended for those 
interested in evaluating potential barriers from micro- 
wave spectra. Finally, the application of group theory, 
vibration-torsion-rotation interaction, and experimental 
results are discussed in that order. The casual reader 
may find it convenient to omit some of the more mathe- 
matical sections in this paper. It is recommended that 
he first read the sections on “Symmetric Molecules” in 
Secs. 2 and 3 and also Sec. 4 and Sec. 8. The notation 
has been standardized. A glossary is given in Appendix 4 
and a comparison of the notation used by other authors 
is given in Appendix 5. 


2. HAMILTONIAN 


In order to understand the problem of internal rota- 
tion and study its effects, the approximate Hamiltonian 
of the system must be known. The potential energy 
hindering a symmetric internal rotor is discussed first. 
Then the kinetic energy is written for rigid symmetric 
molecules and for rigid asymmetric molecules with an 
attached symmetric rotor. Methods of calculation of 
approximate eigenvalues and eigenfunctions are covered 
in succeeding sections. 


I. Potential Energy 


Since the origin of the potential barrier is not clearly 
understood, the only requirement that can be imposed 
on the potential function is that it be periodic in the 
relative angle α between the two parts of the molecule. 
Within the interval of α = 0 and α = 2π, the potential 
function must repeat itself N times, where N is the 
number of equivalent configurations in one complete 
internal revolution. For example, N is equal to three 
for acetaldehyde (CH₃CHO) and six for nitromethane 
(CH₃NO₂). In most cases the symmetry of the molecule 
is such that the potential function can be expressed as 
an even function of the angle α. The potential energy 
can then be expanded in a cosine series as 

$$V(\alpha) = \sum_k a_k \cos kN\alpha.$$ (2-1a)


By a shift in the reference level of the potential energy, 
this may be written as 


$$V(\alpha) = \frac{V_3}{2}(1-\cos 3\alpha) + \frac{V_6}{2}(1-\cos 6\alpha) + \cdots$$ (2-1b)


for the case of a threefold barrier. However, it has been 
customary to take the potential function as simply‡ 


$$V(\alpha) = \frac{V_3}{2}(1-\cos 3\alpha)$$ (2-2)


without rigorous justification (Fig. 1). 

Serious effort has been made recently?8:39.25b to ex- 
amine the effects and magnitude of the V6 term, and the 
experimental data strongly suggest that the sixfold 
term is indeed much smaller than the V₃ term, the ratio 
being of the order of one-hundredth or less. If this is the 
considering only the threefold terms in the Fourier ex- 
pansion of the potential energy, i.e., Eq. (2-2). The use 
of this simple potential function leads to solutions for 
the torsional wave equation in terms of Mathieu func- 
tions. In case greater accuracy is desired, the higher 
terms can be included and the corrections on the energy 
levels calculated by perturbation methods. 
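
The size of the sixfold correction can be seen by evaluating the potential directly. The sketch below assumes the threefold form with a sixfold correction, V(α) = (V₃/2)(1 − cos 3α) + (V₆/2)(1 − cos 6α); the barrier height is a hypothetical value, the ratio V₆/V₃ = 0.01 is taken from the estimate in the text, and the function name is illustrative.

```python
import math


def torsional_potential(alpha, v3, v6=0.0):
    """V(a) = (V3/2)(1 - cos 3a) + (V6/2)(1 - cos 6a), with alpha in radians."""
    return 0.5 * v3 * (1.0 - math.cos(3.0 * alpha)) + 0.5 * v6 * (1.0 - math.cos(6.0 * alpha))


V3 = 400.0       # hypothetical barrier height, cm^-1
V6 = 0.01 * V3   # sixfold term taken as one-hundredth of V3

for degrees in (0, 30, 60, 90, 120):
    alpha = math.radians(degrees)
    pure = torsional_potential(alpha, V3)
    full = torsional_potential(alpha, V3, V6)
    print(f"alpha = {degrees:3d} deg: V3 only = {pure:7.2f}, with V6 = {full:7.2f} cm^-1")
```

The sixfold term vanishes at the minima and at the top of the barrier, so it changes the shape of the potential without changing the barrier height.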


II. Kinetic Energy and Hamiltonian 


In order to derive the kinetic energy, a model is used 
which consists of two rigid groups connected by a 
bond. At least one of these groups is a symmetric top. 
For convenience the symmetric group is regarded as ro- 
tating internally about the bond with respect to the other 
group which is taken as the framework. The entire molecule is also rotating in
space. Thus there are four degrees of freedom: the three Eulerian angles
$\theta$, $\phi$, $\chi$ of the framework and the relative angle $\alpha$ between
the two groups. The method of solution is dictated by the
functional form of the Hamiltonian, which depends on
the coordinate axes used. In the literature two methods 
of solution have appeared corresponding to two different 
frames of reference. 

In that originated by Wilson and Crawford et al., the set of principal axes of
the whole molecule is used as the coordinate system (hereafter referred to as the
principal axes method or PAM). Since the rotating top possesses an axis of
symmetry, the principal axes of the molecule are not altered by a change of the
relative orientation of the top and the frame, and the coordinate system may be
regarded as rigidly attached to the framework. The Hamiltonian function consists
of three groups. The first is recognized as the energy of the rigid rotating
system whose moments of inertia are different from those of the molecule. The
second corresponds to the Hamiltonian of a simple hindered rotor with one degree
of freedom. The third appears as the product of the angular momentum of the
over-all rotation and the angular momentum associated with the internal motion of
the top. This third group represents the interaction of the two kinds of
rotation. In the solution the first two groups are taken as the unperturbed
system and the cross terms, i.e., the third group, are treated as a perturbation.

[Footnote to Eq. (2-2): Some authors have used the form $1+\cos 3\alpha'$, which
merely involves a shift in $\alpha$, i.e., $\alpha'=\pi-\alpha$.]

In an alternative method originated by Nielsen”! and 
Dennison et al.,15:8 the axis about which the top executes 
internal rotation is chosen as one of the coordinate axes 
(hereafter referred to as the internal axis method or 
IAM). The other two axes are fixed with respect to the 
framework and their orientation is, in principle, arbi- 
trary, but the choice is usually determined to some 
extent by the symmetry of the molecule. In this co- 
ordinate system the terms which describe the inter- 
action between the over-all and internal rotation are 
considerably smaller than those from PAM. Thus the 
interaction terms in IAM lend themselves more readily
to simple perturbation treatment. The chief disad- 
vantage of this procedure is that, since the coordinate 
axes are, in general, not principal axes, the Hamiltonian 
is complicated by the presence of the terms containing 
the products of inertia. 


A. Internal Axis Method 


In the derivation of the Hamiltonian by the internal axis method (IAM), the case
of a symmetric top (e.g., CH3SiH3) is treated first. An asymmetric molecule with
a plane of symmetry is then discussed. Methyl alcohol (CH3OH) and acetaldehyde
(CH3CHO) are examples of this class of molecules.

1. Symmetric molecules.7°48—A set of coordinate axes 
(a,b,c) is chosen with the c axis along the symmetry 
axis of the molecule through the center of mass. One 
of the two symmetric groups is designated the frame (a 
convention set up for comparison with the asymmetric 
case) and the other symmetric group is designated the 
internal rotor or top. The a and b axes can be fixed 
arbitrarily with respect to the frame because of the 
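symmetric nature of this end. The kinetic energy $T$ can then be expressed in
tensor notation as (see Appendix 1)

$$T=\tfrac12\,\boldsymbol\omega^{\dagger}\cdot\mathbf I\cdot\boldsymbol\omega
+\tfrac12\,(d\boldsymbol\alpha/dt)^{\dagger}\cdot\mathbf I_\alpha\cdot(d\boldsymbol\alpha/dt)
+\tfrac12\left[\boldsymbol\omega^{\dagger}\cdot\mathbf I_\alpha\cdot(d\boldsymbol\alpha/dt)
+(d\boldsymbol\alpha/dt)^{\dagger}\cdot\mathbf I_\alpha\cdot\boldsymbol\omega\right], \tag{2-3}$$

or in terms of the components along the axes as

$$\begin{aligned}
T&=\tfrac12 I_a\omega_a^2+\tfrac12 I_b\omega_b^2+\tfrac12 I_c\omega_c^2+\tfrac12 I_\alpha\dot\alpha^2+I_\alpha\omega_c\dot\alpha\\
&=\tfrac12 I_a\omega_a^2+\tfrac12 I_b\omega_b^2+\tfrac12 (I_c-I_\alpha)\omega_c^2+\tfrac12 I_\alpha(\omega_c+\dot\alpha)^2,
\end{aligned} \tag{2-4}$$

where $\mathbf I$ is the inertial tensor and $I_a$, $I_b$ ($I_a=I_b$), and $I_c$
are the three moments of inertia of the whole molecule about the three coordinate
axes; $I_\alpha$ is the moment of inertia of the top about its symmetry axis,
i.e., the $c$ axis; $\omega_a$, $\omega_b$, and $\omega_c$ are the three
components of angular velocity $\boldsymbol\omega$ about the $a$, $b$, and $c$
axes; and $\dot\alpha$ is the angular velocity of the top relative to the frame.
Introduction of the momenta

$$\begin{aligned}
P_a&=\partial T/\partial\omega_a=I_a\omega_a,\\
P_b&=\partial T/\partial\omega_b=I_b\omega_b,\\
P_c&=\partial T/\partial\omega_c=I_c\omega_c+I_\alpha\dot\alpha,
\end{aligned} \tag{2-5}$$

and

$$p=\partial T/\partial\dot\alpha=I_\alpha(\omega_c+\dot\alpha),$$

results in the classical Hamiltonian

$$H=\frac{P_a^2}{2I_a}+\frac{P_b^2}{2I_b}+\frac{P_c^2}{2(I_c-I_\alpha)}
-\frac{pP_c}{I_c-I_\alpha}+\frac{I_c\,p^2}{2I_\alpha(I_c-I_\alpha)}+V(\alpha). \tag{2-6}$$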
The physical meaning of $P_a$, $P_b$, and $P_c$ is not immediately obvious. The
results of Appendix 1 show that $P_a$, $P_b$, and $P_c$ are the components of
the total angular momentum (including the internal rotation) about the $a$, $b$,
and $c$ axes. Therefore they satisfy the usual commutation relations. The
quantum-mechanical operators for $P_a$, $P_b$, and $P_c$ in the coordinate
representation depend only on the Eulerian angles $\theta$, $\phi$, $\chi$ of the
framework of the molecule but not on $\alpha$ (see Sec. IIC2). In this case $p$
is the total angular momentum of the top including both the external and internal
rotation. As a quantum-mechanical differential operator, $p$ may be expressed as

$$p\rightarrow(\hbar/i)(\partial/\partial\alpha)_{\theta,\phi,\chi}, \tag{2-7}$$

and therefore $p$ commutes with $P_a$, $P_b$, and $P_c$.

At the limit of a very high potential barrier, Eq. 
(2-6) does not reduce readily to the case of a rigid rotor 
plus a simple restricted internal rotor. The reason is 
that the operator p defined in Eq. (2-5) contains not 
only the angular momentum of the internal motion of 
the top but also the contribution from the over-all 
rotation. When the internal torsion is completely frozen, 
the classical angular velocity $\dot\alpha$ becomes zero and thus
$p$ becomes $I_\alpha\omega_c=(I_\alpha/I_c)P_c$. When this expression for $p$
is substituted into Eq. (2-6), the usual energy equation 
for a rigid symmetric top is recovered. 

The connection between Eq. (2-6) and the Hamiltonian for the limiting case of a
rigid rotor can be brought out more explicitly through the Nielsen transformation,

$$p'=p-(I_\alpha/I_c)P_c=p-(1-r)P_c,\qquad P_c'=P_c, \tag{2-8}$$

which eliminates the cross product $pP_c$ and gives

$$H=\frac{P_a^2}{2I_a}+\frac{P_b^2}{2I_b}+\frac{P_c'^2}{2I_c}
+\frac{I_c\,p'^2}{2I_\alpha(I_c-I_\alpha)}+V(\alpha). \tag{2-9}$$

[Footnote to Eq. (2-8): Here, $r$, the reducing factor, is an important quantity
used in both the IAM and the PAM.]


The quantity $p'$ depends solely on $\dot\alpha$, so that it vanishes at the
classical high barrier limit when $\alpha$ is a constant. Here $p'$ is not to be
taken as the angular momentum of the internal motion (which is
$I_\alpha\dot\alpha$) but rather $rI_\alpha\dot\alpha$. Unlike Eq. (2-6), the
coefficients of the $P$'s in Eq. (2-9) are the actual rotational constants of the
entire molecule. Furthermore $p'$ does not commute with $P_a$ and $P_b$ since,
from Eq. (2-8),

$$p'P_a-P_ap'=i\hbar(I_\alpha/I_c)P_b. \tag{2-10}$$

On the other hand, $p'$ does commute with $P_a^2+P_b^2$ and hence the Hamiltonian
in Eq. (2-9) is separable into two parts corresponding to the over-all rotation
and the internal rotation.
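The effect of the Nielsen transformation can be checked numerically in the classical limit. The sketch below is illustrative only: the function names and the numerical values of the moments of inertia and momenta are hypothetical, and the potential $V(\alpha)$ is omitted since it is the same in both forms.

```python
def kinetic_2_6(Pa, Pb, Pc, p, Ia, Ic, Ialpha):
    """Kinetic part of Eq. (2-6) for a symmetric molecule (Ia = Ib)."""
    return ((Pa**2 + Pb**2) / (2.0 * Ia)
            + Pc**2 / (2.0 * (Ic - Ialpha))
            - p * Pc / (Ic - Ialpha)
            + Ic * p**2 / (2.0 * Ialpha * (Ic - Ialpha)))


def kinetic_2_9(Pa, Pb, Pc, p_prime, Ia, Ic, Ialpha):
    """Kinetic part of Eq. (2-9): no cross term between Pc and p'."""
    return ((Pa**2 + Pb**2) / (2.0 * Ia)
            + Pc**2 / (2.0 * Ic)
            + Ic * p_prime**2 / (2.0 * Ialpha * (Ic - Ialpha)))


def nielsen(p, Pc, Ic, Ialpha):
    """Eq. (2-8): p' = p - (I_alpha / I_c) P_c."""
    return p - (Ialpha / Ic) * Pc


# Hypothetical values in arbitrary consistent units.
Ia, Ic, Ialpha = 50.0, 10.0, 3.0
Pa, Pb, Pc, p = 1.2, -0.7, 2.5, 0.9

p_prime = nielsen(p, Pc, Ic, Ialpha)
print(kinetic_2_6(Pa, Pb, Pc, p, Ia, Ic, Ialpha))
print(kinetic_2_9(Pa, Pb, Pc, p_prime, Ia, Ic, Ialpha))
```

Both calls print the same energy. Setting `p = (Ialpha / Ic) * Pc` (frozen torsion) gives `p_prime = 0`, and the expression reduces to the rigid symmetric top energy.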

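Equation (2-9) may also be derived through a transformation of the angular
velocities. In terms of $\omega_c'$ and $\dot\alpha'$ defined as

$$\omega_c'=\omega_c+(I_\alpha/I_c)\dot\alpha,\qquad \dot\alpha'=\dot\alpha, \tag{2-11}$$

Eq. (2-4) has the form

$$T=\tfrac12 I_a(\omega_a^2+\omega_b^2)+\tfrac12 I_c\omega_c'^2+\tfrac12 rI_\alpha\dot\alpha'^2. \tag{2-12}$$

With the new variable $\omega_c'$ it is natural to define

$$\begin{aligned}
P_c'&=(\partial T/\partial\omega_c')_{\omega_a,\omega_b,\dot\alpha'}=I_c\omega_c'=I_c\omega_c+I_\alpha\dot\alpha,\\
p'&=(\partial T/\partial\dot\alpha')_{\omega_a,\omega_b,\omega_c'}=[1-(I_\alpha/I_c)]I_\alpha\dot\alpha=rI_\alpha\dot\alpha.
\end{aligned} \tag{2-13}$$

In addition, $P_a$ and $P_b$ are unchanged, i.e.,

$$\begin{aligned}
P_a&=(\partial T/\partial\omega_a)_{\omega_b,\omega_c',\dot\alpha'}=(\partial T/\partial\omega_a)_{\omega_b,\omega_c,\dot\alpha},\\
P_b&=(\partial T/\partial\omega_b)_{\omega_a,\omega_c',\dot\alpha'}=(\partial T/\partial\omega_b)_{\omega_a,\omega_c,\dot\alpha}.
\end{aligned} \tag{2-14}$$

Now the momenta $P_c'$ and $p'$ defined in Eq. (2-13) are identical to the
momenta introduced in Eq. (2-8). Substitution of Eqs. (2-13) and (2-14) into
Eq. (2-12) gives Eq. (2-9).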
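The step from Eq. (2-4) to Eq. (2-12) amounts to completing the square in $\omega_c$. A short worked version, using only the symbols already defined and $r=1-I_\alpha/I_c$, may be written as:

```latex
\begin{aligned}
\tfrac12 I_c\omega_c^2 + I_\alpha\omega_c\dot\alpha + \tfrac12 I_\alpha\dot\alpha^2
 &= \tfrac12 I_c\Bigl(\omega_c + \frac{I_\alpha}{I_c}\dot\alpha\Bigr)^2
    - \tfrac12\frac{I_\alpha^2}{I_c}\dot\alpha^2 + \tfrac12 I_\alpha\dot\alpha^2 \\
 &= \tfrac12 I_c\,\omega_c'^2
    + \tfrac12 I_\alpha\Bigl(1 - \frac{I_\alpha}{I_c}\Bigr)\dot\alpha'^2
  = \tfrac12 I_c\,\omega_c'^2 + \tfrac12\, r I_\alpha\,\dot\alpha'^2 .
\end{aligned}
```

The cross term $I_\alpha\omega_c\dot\alpha$ is absorbed into $\omega_c'$, which is why no product of $P_c'$ and $p'$ survives in Eq. (2-9).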

The coordinate transformation corresponding to Eq. (2-11) and also Eq. (2-8) is
simply

$$\chi'=\chi+(I_\alpha/I_c)\alpha=(1/I_c)[I_\alpha\chi_2+(I_c-I_\alpha)\chi_1],\qquad \alpha'=\alpha. \tag{2-15}$$


Note that $\chi$ and $\alpha$ are related to the Eulerian angles $\chi_1$ and
$\chi_2$ of the two groups of the molecule as $\chi=\chi_1$,
$\alpha=\chi_2-\chi_1$. One may regard $\theta$, $\phi$, $\chi'$ as the Eulerian
angles for a new set of axes. This coordinate system is rotating with respect to
the one attached to the framework with an angular velocity of
$(I_\alpha/I_c)\dot\alpha$ about the axis of symmetry of the molecule. These new
coordinate


Tx is the same as ẹ used in references 8 and 48. 4 


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axes are hereafter referred to as “internal rotation 
axes” (called the ‘‘molecule-fixed” system by Hecht 
and Dennison”). It should be remembered that to an 
observer located in this frame of reference, both the 
framework and the top appear to be moving. The angular momentum ($z$ component)
arising from the internal motion is

$$I_\alpha(\dot\chi_2-\dot\chi')+(I_c-I_\alpha)(\dot\chi_1-\dot\chi')=0. \tag{2-16}$$


In treating the problem of vibration-rotation interaction the moving axes are
usually chosen so that the angular momentum due to the internal motion vanishes
to the first approximation. For the present problem it would seem proper to use
the internal rotation axes as the frame of reference. By letting the three
components of the total angular momentum in this coordinate system be $P_a'$,
$P_b'$, and $P_c'$, it follows that

$$\begin{aligned}
P_a'&=(\cos\rho\alpha)P_a+(\sin\rho\alpha)P_b,\\
P_b'&=(-\sin\rho\alpha)P_a+(\cos\rho\alpha)P_b,\\
P_c'&=P_c,
\end{aligned} \tag{2-17}$$

where

$$\rho=I_\alpha/I_c. \tag{2-18}$$

Insertion of Eq. (2-17) into Eq. (2-9) gives

$$H=\frac{P_a'^2}{2I_a}+\frac{P_b'^2}{2I_b}+\frac{P_c'^2}{2I_c}
+\frac{I_c\,p'^2}{2I_\alpha(I_c-I_\alpha)}+V(\alpha). \tag{2-20}$$

[Footnote to Eq. (2-18): For asymmetric molecules, the quantity $\rho$ is a more
complicated function of the moments of inertia:

$$\rho=\frac{(I_{bb}^2+I_{bc}^2)^{1/2}\,I_\alpha}{I_{bb}I_{cc}-I_{bc}^2}, \tag{2-19}$$

where $I_{bb}$ and $I_{cc}$ are the moments of inertia about the $b$ and $c$
axis, respectively, and $I_{bc}$ is the product of inertia [see Eq. (2-23)]. For
symmetric molecules one has $I_{bc}=0$, so Eq. (2-19) reduces to Eq. (2-18).]


Equations (2-20) and (2-9) have an identical form. This situation occurs because
the two principal moments of inertia $I_a$ and $I_b$ of the molecule are equal.
One may question the basic differences between Eqs. (2-20) and (2-9) or between
$P_a'$, $P_b'$ and $P_a$, $P_b$. The answer is that $P_a$ and $P_b$, as pointed
out previously, do not commute with $p'$ while both $P_a'$ and $P_b'$ do.
Therefore the matrix elements (diagonal and nondiagonal) of $P_a'$ and $P_b'$ do
not involve the quantum number associated with the operator

$$[I_c/2I_\alpha(I_c-I_\alpha)]\,p'^2+V(\alpha). \tag{2-21}$$

The matrices of $P_a$ and $P_b$, on the other hand, are characterized by the
quantum number of the torsional operator in Eq. (2-21) as well as by the over-all
rotational quantum numbers. Consequently the Heisenberg matrix elements of $P_a$
and $P_b$ contain both the over-all and internal rotational frequencies. The same
results can be anticipated from a classical consideration. When hindered rotation
is present the set of internal rotation axes of the molecule rotates in the same
way as the principal axes of a rigid symmetric top, while the two groups execute
rapid torsion with respect to each other. As $P_a$ and $P_b$ represent the
components of the angular momentum along the axes fixed in one of the groups, the
values of these components vary at the rate of the torsional frequency.

[Footnote to Eq. (2-19): The quantity $\rho$ is the same as the corresponding
quantity in Itoh's notation and can be defined somewhat more simply in the
principal axes coordinate system.]

The cross term $pP_c$ in Eq. (2-6) implies a coupling
between the internal and over-all rotation, while 
according to Eq. (2-20) the internal and external 
motion are separable. The discrepancy arises from the 
different choice of the coordinate systems. When the 
internal rotation axes are used as the frame of reference 
as in Eq. (2-20), the internal torsion and over-all rota- 
tion are independent of each other and hence the 
Hamiltonian is separable. When the framework of the 
molecule is taken as the moving frame, the motion of 
this set of coordinate axes consists of both rotation and 
the rapid torsional motion. Mathematically, this is re- 
flected by the cross term in Eq. (2-6). Because of the 
difference in the frames of reference in Eqs. (2-6) and 
(2-20), the meaning of the term “over-all rotation” is 
different in these two cases. 

2. Asymmetric molecules with a plane of sym- 
metry.®:0.31.32,81__For an asymmetric molecule with a 
plane of symmetry, a set of coordinate axes is chosen 
similar to the symmetric top case with the c axis through 
the center of mass of the whole molecule and parallel 
to the top (symmetric internal rotor) axis, and the b 
axis through the center of mass and lying in the plane 
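of symmetry. The inertial tensor assumes the form

$$\mathbf I=\begin{pmatrix} I_{aa} & 0 & 0\\ 0 & I_{bb} & -I_{bc}\\ 0 & -I_{bc} & I_{cc}\end{pmatrix}, \tag{2-22}$$

where $I_{aa}$, $I_{bb}$, and $I_{cc}$ are the moments of inertia and $I_{bc}$
is the product of inertia. The kinetic energy, according to Eq. (2-3), can then
be expressed as

$$T=\tfrac12 I_{aa}\omega_a^2+\tfrac12 I_{bb}\omega_b^2+\tfrac12 I_{cc}\omega_c^2
-I_{bc}\omega_b\omega_c+\tfrac12 I_\alpha\dot\alpha^2+I_\alpha\omega_c\dot\alpha. \tag{2-23}$$

Similarly, the momenta are defined as

$$\begin{aligned}
P_a&=\partial T/\partial\omega_a=I_{aa}\omega_a,\\
P_b&=\partial T/\partial\omega_b=I_{bb}\omega_b-I_{bc}\omega_c,\\
P_c&=\partial T/\partial\omega_c=I_{cc}\omega_c-I_{bc}\omega_b+I_\alpha\dot\alpha,\\
p&=\partial T/\partial\dot\alpha=I_\alpha(\dot\alpha+\omega_c).
\end{aligned} \tag{2-24}$$

As with symmetric molecules, $P_a$, $P_b$, and $P_c$ are again the three
components of the total angular momentum. With the substitution of Eq. (2-24)
into Eq. (2-23) the following quantum-mechanical Hamiltonian is obtained: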


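$$H=A_aP_a^2+B_bP_b^2+C_cP_c^2+D_{bc}(P_bP_c+P_cP_b)-2D_bP_bp-2C_cP_cp+Fp^2+V(\alpha), \tag{2-25}$$

where

$$\begin{aligned}
A_a&=\frac{\hbar^2}{2I_{aa}},\\
B_b&=\frac{\hbar^2(I_{cc}-I_\alpha)}{2(I_{bb}I_{cc}-I_{bb}I_\alpha-I_{bc}^2)}
=\frac{\hbar^2(I_{cc}-I_\alpha)}{2r(I_{bb}I_{cc}-I_{bc}^2)}
=\frac{(I_{cc}-I_\alpha)\hbar^2}{2d},\\
C_c&=\frac{\hbar^2I_{bb}}{2(I_{bb}I_{cc}-I_{bb}I_\alpha-I_{bc}^2)}
=\frac{\hbar^2I_{bb}}{2r(I_{bb}I_{cc}-I_{bc}^2)}
=\frac{I_{bb}\hbar^2}{2d},\\
D_{bc}=D_b&=\frac{\hbar^2I_{bc}}{2(I_{bb}I_{cc}-I_{bb}I_\alpha-I_{bc}^2)}
=\frac{\hbar^2I_{bc}}{2r(I_{bb}I_{cc}-I_{bc}^2)}
=\frac{I_{bc}\hbar^2}{2d},\\
F&=\frac{\hbar^2(I_{bb}I_{cc}-I_{bc}^2)}{2I_\alpha(I_{bb}I_{cc}-I_{bb}I_\alpha-I_{bc}^2)}
=\frac{\hbar^2}{2rI_\alpha},\\
r&=\frac{I_{bb}I_{cc}-I_{bb}I_\alpha-I_{bc}^2}{I_{bb}I_{cc}-I_{bc}^2}
=1-\frac{\lambda_y^2I_\alpha}{I_y}-\frac{\lambda_z^2I_\alpha}{I_z},\quad\text{and}\\
d&=I_{bb}I_{cc}-I_{bb}I_\alpha-I_{bc}^2=r(I_{bb}I_{cc}-I_{bc}^2)=rI_yI_z.
\end{aligned}$$

Here $I_x$, $I_y$, and $I_z$ are the principal moments of inertia of the whole
molecule, and $\lambda_x$, $\lambda_y$, and $\lambda_z$ are the direction cosines
of the internal axis with respect to the principal axes.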
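The closed-form coefficients of Eq. (2-25) can be cross-checked by inverting the kinetic-energy matrix of Eq. (2-23) directly. The following is a sketch under stated assumptions: $\hbar=1$, the moments of inertia are hypothetical numbers, and `invert` and `iam_constants` are illustrative helper names, not part of any cited work.

```python
def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix given as a list of lists."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def iam_constants(Iaa, Ibb, Icc, Ibc, Ialpha, hbar=1.0):
    """Closed-form coefficients of Eq. (2-25)."""
    d = Ibb * Icc - Ibb * Ialpha - Ibc**2
    r = d / (Ibb * Icc - Ibc**2)
    h2 = hbar**2
    return {"A_a": h2 / (2 * Iaa), "B_b": h2 * (Icc - Ialpha) / (2 * d),
            "C_c": h2 * Ibb / (2 * d), "D_bc": h2 * Ibc / (2 * d),
            "F": h2 / (2 * r * Ialpha), "r": r, "d": d}


# Hypothetical moments of inertia (arbitrary units).
Iaa, Ibb, Icc, Ibc, Ialpha = 20.0, 15.0, 8.0, 2.0, 3.0

# 2T = v^T M v with v = (omega_a, omega_b, omega_c, alpha_dot), from Eq. (2-23).
M = [[Iaa, 0.0, 0.0, 0.0],
     [0.0, Ibb, -Ibc, 0.0],
     [0.0, -Ibc, Icc, Ialpha],
     [0.0, 0.0, Ialpha, Ialpha]]
Minv = invert(M)  # H = (1/2) P^T M^-1 P with P = (P_a, P_b, P_c, p)

closed = iam_constants(Iaa, Ibb, Icc, Ibc, Ialpha)
numeric = {"A_a": 0.5 * Minv[0][0], "B_b": 0.5 * Minv[1][1],
           "C_c": 0.5 * Minv[2][2], "D_bc": 0.5 * Minv[1][2],
           "F": 0.5 * Minv[3][3]}
for name, value in numeric.items():
    print(f"{name}: matrix inverse {value:.6f}   closed form {closed[name]:.6f}")
```

In the same notation the coupling coefficients are `-0.5 * Minv[1][3]` for $D_b$ and `-0.5 * Minv[2][3]` for the $P_cp$ term, which reproduce $D_{bc}$ and $C_c$ respectively.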

By following the method used for the symmetric case, we can apply the
transformation in Eq. (2-8) to eliminate the coupling between over-all rotation
and internal rotation. Unfortunately, this transformation does not completely
remove the coupling for asymmetric molecules. The results of the transformation
on $H$ are given in Eq. (2-26),

$$\begin{aligned}
H={}&A_aP_a^2+B_bP_b^2+C_c'P_c'^2+\frac{I_{cc}-I_\alpha}{I_{cc}}D_{bc}(P_bP_c'+P_c'P_b)\\
&-D_b(P_bp'+p'P_b)-2\frac{I_{bc}}{I_{cc}}D_bP_c'p'+Fp'^2+V(\alpha),
\end{aligned} \tag{2-26}$$

where

$$C_c'=C_c-2\frac{I_\alpha}{I_{cc}}C_c+\left(\frac{I_\alpha}{I_{cc}}\right)^2F.$$

Equation (2-26) is essentially the same Hamiltonian employed by Burkhard and
Dennison to obtain the Schroedinger equation. The equivalence between Eq. (2-26)
and Burkhard and Dennison's wave equation is discussed later in this section.
Again $p'$ in Eq. (2-26) does not commute with $P_a$ and $P_b$.

Although the coupling between $p$ and $P_c$ can be eliminated for the symmetric
case by the transformation (2-8), it is not removed completely for the asymmetric
case but is reduced by the factor $I_{bc}^2/I_{bb}I_{cc}$. In order to minimize
the coupling term the internal angular momentum must be made to vanish. This is
brought about by a slightly different transformation given by Hecht and Dennison
or by a modification in the formulation given by Itoh.


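Hecht and Dennison apply three transformations to the Hamiltonian given in
Eq. (2-25). First, a rotation is performed in order to eliminate the $P_bp$
coupling. Secondly, a modified Nielsen transformation is applied to remove the
$P_cp$ coupling. These two transformations are given in Eqs. (2-27) and (2-28),

$$\begin{aligned}
P_a''&=P_a,\\
P_b''&=(I_{bb}P_b-I_{bc}P_c)(I_{bb}^2+I_{bc}^2)^{-1/2},\\
P_c''&=(I_{bc}P_b+I_{bb}P_c)(I_{bb}^2+I_{bc}^2)^{-1/2},
\end{aligned} \tag{2-27}$$

and

$$p'=p-\rho P_c''=p-I_\alpha(I_{bc}^2+I_{bb}^2)^{1/2}P_c''/(I_{bb}I_{cc}-I_{bc}^2). \tag{2-28}$$

The transformed Hamiltonian which results from these two operations is

$$H''=A_aP_a''^2+B_b''P_b''^2+C_c''P_c''^2+D_{bc}''(P_b''P_c''+P_c''P_b'')+Fp'^2+V(\alpha), \tag{2-29}$$

where

$$\begin{aligned}
B_b''&=\frac{\hbar^2}{2}\,\frac{I_{bb}}{I_{bb}^2+I_{bc}^2}=rI_\alpha\hbar^2(1-r)/2\rho^2d,\\
C_c''&=\frac{\hbar^2}{2}\left[\frac{I_{bb}+I_{cc}}{I_{bb}I_{cc}-I_{bc}^2}-\frac{I_{bb}}{I_{bb}^2+I_{bc}^2}\right]
=\frac{\hbar^2}{2}\left[\frac{1}{I_y}+\frac{1}{I_z}\right]-B_b'',\\
D_{bc}''&=\frac{\hbar^2}{2}\,\frac{I_{bc}}{I_{bb}^2+I_{bc}^2}=(rI_\alpha)^2D_{bc}/\rho^2d,\\
\rho&=\frac{(I_{bb}^2+I_{bc}^2)^{1/2}I_\alpha}{I_{bb}I_{cc}-I_{bc}^2}
=\left[\sum_{g=x,y,z}\left(\frac{\lambda_gI_\alpha}{I_g}\right)^2\right]^{1/2},\qquad
\rho_g=\lambda_gI_\alpha/I_g.
\end{aligned} \tag{2-30}$$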


Finally, the transformation described in Eq. (2-17), with $\rho$ given by the
more complicated expression Eq. (2-30), is applied to Eq. (2-29). This
transformation defines the "internal rotational axes system." Referred to these
axes the internal angular momentum vanishes. The resulting Hamiltonian is

$$\begin{aligned}
H={}&\tfrac12(A_a+B_b'')(P_a'^2+P_b'^2)+C_c''P_c'^2+Fp'^2+V(\alpha)\\
&+\tfrac12(A_a-B_b'')\left[(P_a'^2-P_b'^2)\cos 2\rho\alpha-(P_a'P_b'+P_b'P_a')\sin 2\rho\alpha\right]\\
&+D_{bc}''\left[(P_b'P_c'+P_c'P_b')\cos\rho\alpha+(P_a'P_c'+P_c'P_a')\sin\rho\alpha\right].
\end{aligned} \tag{2-31}$$

It can be easily shown that $P_a'$, $P_b'$, and $P_c'$ satisfy the usual
commutation rules and they all commute with $p'$.
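The rotational constants in Eq. (2-29) are the elements of the inverse inertia tensor referred to the rotated $b''$, $c''$ axes of Eq. (2-27). A minimal sketch, assuming $\hbar=1$ and hypothetical moments of inertia (the function names are illustrative), compares this direct rotation with the closed forms $\hbar^2I_{bb}/2(I_{bb}^2+I_{bc}^2)$ and $\hbar^2I_{bc}/2(I_{bb}^2+I_{bc}^2)$:

```python
import math


def rotated_constants(Ibb, Icc, Ibc, hbar=1.0):
    """B_b'', C_c'', D_bc'' obtained by rotating the inverse of the (b, c) block of Eq. (2-22)."""
    delta = Ibb * Icc - Ibc**2
    inv = [[Icc / delta, Ibc / delta],
           [Ibc / delta, Ibb / delta]]      # inverse of [[Ibb, -Ibc], [-Ibc, Icc]]
    s = math.hypot(Ibb, Ibc)
    b_axis = (Ibb / s, -Ibc / s)            # direction of P_b'' in Eq. (2-27)
    c_axis = (Ibc / s, Ibb / s)             # direction of P_c'' in Eq. (2-27)

    def form(u, v):
        return sum(u[i] * inv[i][j] * v[j] for i in range(2) for j in range(2))

    h2 = hbar**2
    return 0.5 * h2 * form(b_axis, b_axis), 0.5 * h2 * form(c_axis, c_axis), \
        0.5 * h2 * form(b_axis, c_axis)


def closed_forms(Ibb, Icc, Ibc, hbar=1.0):
    """Closed-form B_b'', C_c'', D_bc'' in terms of the moments of inertia."""
    s2 = Ibb**2 + Ibc**2
    delta = Ibb * Icc - Ibc**2
    h2 = hbar**2
    b = 0.5 * h2 * Ibb / s2
    c = 0.5 * h2 * ((Ibb + Icc) / delta - Ibb / s2)
    dbc = 0.5 * h2 * Ibc / s2
    return b, c, dbc


print(rotated_constants(15.0, 8.0, 2.0))
print(closed_forms(15.0, 8.0, 2.0))
```

The two printed tuples agree, and $B_b''+C_c''$ equals half the trace of the inverse inertia block, as expected for a rotation.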

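Itoh has derived the Hamiltonian (2-29) in an interesting and useful way. He
also expressed the Hamiltonian in a form convenient for comparison with the PAM
formulation. The over-all angular velocity is divided into two parts
$\boldsymbol\omega_1$ and $\boldsymbol\omega_2$, such that the angular momentum
of the whole molecule associated with $\boldsymbol\omega_1$ balances the internal
angular momentum. This is equivalent to choosing a set of axes in which the
internal angular momentum vanishes. Thus one has

$$\mathbf I\cdot\boldsymbol\omega_1+\mathbf I_\alpha\cdot(d\boldsymbol\alpha/dt)=0, \tag{2-32a}$$

or

$$\boldsymbol\omega_1=-\mathbf I^{-1}\cdot(I_\alpha\dot{\boldsymbol\alpha})=-\boldsymbol\rho\,\dot\alpha, \tag{2-32b}$$

where

$$\boldsymbol\rho=I_\alpha(\mathbf I^{-1}\cdot\mathbf k). \tag{2-32c}$$

Here $\mathbf I$ is the inertia tensor and $\mathbf k$ is a unit vector in the
direction of $(d\boldsymbol\alpha/dt)$. The components of the vector
$\boldsymbol\rho$ can be found from Eqs. (2-22) and (2-32c). Referred to the $a$,
$b$, and $c$ axes, one obtains

$$\rho_a=0,\qquad \rho_b=I_\alpha I_{bc}/(I_{bb}I_{cc}-I_{bc}^2),\qquad\text{and}\qquad
\rho_c=I_\alpha I_{bb}/(I_{bb}I_{cc}-I_{bc}^2). \tag{2-33}$$

The three components of $\boldsymbol\rho$ along the principal axes and the
magnitude of the vector are given by Eq. (2-30). By expressing Eq. (2-3) in terms
of $\boldsymbol\omega$, $\boldsymbol\omega_1$, and $\dot\alpha$, one obtains

$$T=\tfrac12(\boldsymbol\omega-\boldsymbol\omega_1)^{\dagger}\cdot\mathbf I\cdot(\boldsymbol\omega-\boldsymbol\omega_1)
-\tfrac12\,\boldsymbol\omega_1^{\dagger}\cdot\mathbf I\cdot\boldsymbol\omega_1
+\tfrac12\,(d\boldsymbol\alpha/dt)^{\dagger}\cdot\mathbf I_\alpha\cdot(d\boldsymbol\alpha/dt). \tag{2-34}$$

The substitution of Eq. (2-32b) into Eq. (2-34) yields

$$\begin{aligned}
T&=\tfrac12(\boldsymbol\omega+\boldsymbol\rho\dot\alpha)^{\dagger}\cdot\mathbf I\cdot(\boldsymbol\omega+\boldsymbol\rho\dot\alpha)
+\tfrac12(I_\alpha-\boldsymbol\rho^{\dagger}\cdot\mathbf I\cdot\boldsymbol\rho)\dot\alpha^2\\
&=\tfrac12\sum_{j,k}I_{jk}(\omega_j+\rho_j\dot\alpha)(\omega_k+\rho_k\dot\alpha)+\tfrac12 rI_\alpha\dot\alpha^2.
\end{aligned}$$

The resulting Hamiltonian is

$$H=\frac{\hbar^2}{2}\sum_{j,k}(\mathbf I^{-1})_{jk}P_jP_k+\frac{\hbar^2}{2}\left(\frac{1}{rI_\alpha}\right)(p-\mathcal P)^2+V(\alpha), \tag{2-35}$$

where

$$\mathcal P=\rho_bP_b+\rho_cP_c=\boldsymbol\rho\cdot\mathbf P.$$

This Hamiltonian is the same as Eq. (2-25). A rotation of the $c$ axis to the
$\boldsymbol\rho$ direction ($c'$ direction) brings the Hamiltonian to the form
given in Eq. (2-29), which then can be transformed to Eq. (2-31). This
transformation is not strictly necessary since the quantum-mechanical matrix
elements can be determined from Eq. (2-29) as well as Eq. (2-31) with only a
different torsional wave function. This point is discussed in Sec. 3.

Finally, two other types of molecular symmetry have been considered in the
literature. Molecules with two perpendicular planes of symmetry, e.g.,
nitromethane, have been treated by Wilson et al. and Tannenbaum, Johnson, Meyers,
and Gwinn. Molecules with no planes of symmetry have been treated by Hecht and
Dennison and by Burkhard, and the method is similar to that outlined in this
section.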


B. Principal Axes Method (PAM)™+3710,19,121 


In the PAM the symmetric top Hamiltonian is given by Eq. (2-6) with the principal
axes $x$, $y$, and $z$ coincident with the $a$, $b$, and $c$ axes. The
transformation (2-8) is not applied. In this method (for an asymmetric molecule)
a coordinate system $x$, $y$, and $z$, rigidly attached to the framework of the
molecule, is used. Furthermore, this set of axes is chosen to be the principal
axes of the entire molecule with the origin at the center of mass so that the
principal moments of inertia $I_x$, $I_y$, and $I_z$ of the molecule are used.
The orientation of the principal axes relative to the framework is not altered by
the rotation of the top because of its cylindrical symmetry. Therefore from
Eq. (2-3), the kinetic energy for an asymmetric molecule has the form

$$T=\tfrac12 I_x\omega_x^2+\tfrac12 I_y\omega_y^2+\tfrac12 I_z\omega_z^2+\tfrac12 I_\alpha\dot\alpha^2
+I_\alpha\dot\alpha(\lambda_x\omega_x+\lambda_y\omega_y+\lambda_z\omega_z),$$

where $\omega_x$, $\omega_y$, and $\omega_z$ are the components of the angular
velocity along the principal axes, and $\lambda_x$, $\lambda_y$, and $\lambda_z$
are the direction cosines of the symmetry axis of the top to the principal axes.
In terms of the momenta defined as

$$P_i=\partial T/\partial\omega_i\quad(i=x,y,z),\qquad p=\partial T/\partial\dot\alpha, \tag{2-36}$$

the Hamiltonian can be written in the form

$$H=A_xP_x^2+B_yP_y^2+C_zP_z^2+\tfrac12\sum_{i\neq j}D_{ij}(P_iP_j+P_jP_i)-2\sum_{i}Q_iP_ip+Fp^2+V(\alpha), \tag{2-37}$$

where

$$A_x=\frac{\hbar^2}{2}\left[\frac{1}{I_x}+\frac{\lambda_x^2I_\alpha}{rI_x^2}\right]=\frac{\hbar^2}{2I_x}+F\rho_x^2,$$

and similarly for $B_y$ and $C_z$ by the permutation $(x, y, z)$,

$$\begin{aligned}
D_{ij}&=\frac{\hbar^2\lambda_i\lambda_jI_\alpha}{2rI_iI_j}=F\rho_i\rho_j\quad(i,j=x,y,z\ \text{and}\ i\neq j),\\
Q_i&=\frac{\hbar^2\lambda_i}{2rI_i}=F\rho_i\quad(i=x,y,z),\\
F&=\hbar^2/2rI_\alpha,\\
\rho&=(\rho_x^2+\rho_y^2+\rho_z^2)^{1/2}=\left[\sum_{g=x,y,z}\left(\frac{\lambda_gI_\alpha}{I_g}\right)^2\right]^{1/2},\qquad \rho_g=\lambda_gI_\alpha/I_g,\\
r&=1-\sum_{g=x,y,z}\frac{\lambda_g^2I_\alpha}{I_g}.
\end{aligned}$$

[Footnote: Note that the notation $I_x$, $I_y$, and $I_z$ is used instead of
$I_{xx}$, $I_{yy}$, and $I_{zz}$.]

Alternatively one may express the Hamiltonian as

$$H=(\hbar^2/2rI_\alpha)(p-\mathcal P)^2+(\hbar^2/2)\sum_{g}P_g^2/I_g+V(\alpha), \tag{2-38}$$

where

$$\mathcal P=\boldsymbol\rho\cdot\mathbf P=\sum_g\rho_gP_g
=(\lambda_xI_\alpha/I_x)P_x+(\lambda_yI_\alpha/I_y)P_y+(\lambda_zI_\alpha/I_z)P_z.$$

The similarity between Eqs. (2-38) and (2-35) should be noted.

From Eq. (A1-7) it is clear that $P_x$, $P_y$, and $P_z$ represent the three
components of the total angular momentum defined in the usual way. The
commutation relations between the various momenta can be summarized as

$$(P_i,P_j)=-i\hbar P_k\quad(i,j,k=x,y,z\ \text{in cyclic order}),\qquad (p,P_i)=0.$$

Furthermore, the coupling between over-all rotation and internal rotation is
apparent in the Hamiltonian by the cross terms $pP_i$, etc.

For the molecules with a plane of symmetry in their framework all the above
formulation applies with $\lambda_x=0$ (or $\lambda_y=0$). With two planes of
symmetry both of the direction cosines are set to zero.


C. Comparison of the Two Methods 


The chief difference between the two methods lies in 
the different choice of coordinate axes. Hence, the 
Hamiltonian functions can be transformed to each 
other simply through a coordinate transformation which 
is given in the first part of this section. In the second 
part the Schroedinger equations in terms of the Eulerian 
angles are given for each method. While it is not neces- 
sary to express the Hamiltonian and the wave function 
in the coordinate representation in order to find the 
energy levels, a comparison of the momentum operators 
in terms of the Eulerian angles does show more clearly 
the difference between the two formulations. 

1. Connecting relations. For the purpose of demonstrating the connection between
the two procedures a molecule with a plane of symmetry is taken. The relation
between the components of a vector $\mathbf r$ in the principal axes $x$, $y$,
$z$ and the set of axes $a$, $b$, $c$ defined in the internal axis method is

$$\begin{pmatrix} r_a\\ r_b\\ r_c\end{pmatrix}=
\begin{pmatrix} 1 & 0 & 0\\ 0 & \lambda_z & -\lambda_y\\ 0 & \lambda_y & \lambda_z\end{pmatrix}
\begin{pmatrix} r_x\\ r_y\\ r_z\end{pmatrix}.$$

The various terms in these two systems are then related by the equations

$$\begin{aligned}
I_{bb}&=\lambda_z^2I_y+\lambda_y^2I_z,\\
I_{cc}&=\lambda_y^2I_y+\lambda_z^2I_z,\\
I_{bc}&=\lambda_y\lambda_z(I_z-I_y),\\
I_{aa}&=I_x.
\end{aligned} \tag{2-39}$$

From these formulas one can prove the equivalence of
Eq. (2-25) or Eq. (2-35) and Eq. (2-37) or Eq. (2-38).
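The connecting relations can be exercised numerically. The sketch below is illustrative: the principal moments, the top moment, and the tilt angle of the internal axis are hypothetical values, and `iam_from_principal` is an assumed helper name. It checks that the reducing factor $r$ and the magnitude $\rho$ come out the same in the IAM and PAM forms.

```python
import math


def iam_from_principal(Ix, Iy, Iz, lam_y, lam_z):
    """Eq. (2-39): moments and product of inertia in the a, b, c axes (lambda_x = 0)."""
    Ibb = lam_z**2 * Iy + lam_y**2 * Iz
    Icc = lam_y**2 * Iy + lam_z**2 * Iz
    Ibc = lam_y * lam_z * (Iz - Iy)
    return Ix, Ibb, Icc, Ibc


# Hypothetical molecule: principal moments, top moment, and tilt of the top axis.
Ix, Iy, Iz, Ialpha = 60.0, 45.0, 12.0, 3.0
tilt = math.radians(25.0)                  # angle between the top axis and the z axis
lam_y, lam_z = math.sin(tilt), math.cos(tilt)

Iaa, Ibb, Icc, Ibc = iam_from_principal(Ix, Iy, Iz, lam_y, lam_z)
delta = Ibb * Icc - Ibc**2

r_iam = (Ibb * Icc - Ibb * Ialpha - Ibc**2) / delta
r_pam = 1.0 - Ialpha * (lam_y**2 / Iy + lam_z**2 / Iz)

rho_iam = Ialpha * math.hypot(Ibb, Ibc) / delta
rho_pam = Ialpha * math.hypot(lam_y / Iy, lam_z / Iz)

print(f"r:   IAM {r_iam:.6f}   PAM {r_pam:.6f}")
print(f"rho: IAM {rho_iam:.6f}   PAM {rho_pam:.6f}")
```

The agreement rests on two invariants of the rotation about the $a$ axis: $I_{bb}+I_{cc}=I_y+I_z$ and $I_{bb}I_{cc}-I_{bc}^2=I_yI_z$.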
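2. Comparison by Eulerian angles. (a) Principal axes method: To express the
angular momentum operators in coordinate representation use is made of the
relation

$$\begin{aligned}
P_x=\frac{\partial T}{\partial\omega_x}
&=\frac{\partial T}{\partial\dot\theta}\frac{\partial\dot\theta}{\partial\omega_x}
+\frac{\partial T}{\partial\dot\phi}\frac{\partial\dot\phi}{\partial\omega_x}
+\frac{\partial T}{\partial\dot\chi}\frac{\partial\dot\chi}{\partial\omega_x}\\
&=\frac{\partial\dot\theta}{\partial\omega_x}p_\theta+\frac{\partial\dot\phi}{\partial\omega_x}p_\phi+\frac{\partial\dot\chi}{\partial\omega_x}p_\chi.
\end{aligned} \tag{2-41}$$

Since $\omega_x$, $\omega_y$, and $\omega_z$ constitute the angular velocity of
the framework of the molecule, they are related to the Eulerian angles describing
the orientation of the framework through the equations

$$\begin{aligned}
\omega_x&=\dot\theta\sin\chi-\dot\phi\sin\theta\cos\chi,\\
\omega_y&=\dot\theta\cos\chi+\dot\phi\sin\theta\sin\chi,\\
\omega_z&=\dot\chi+\dot\phi\cos\theta.
\end{aligned} \tag{2-42}$$

The quantum-mechanical operators for $P_x$, $P_y$, and $P_z$ are readily obtained
if one replaces $p_\theta$ by $(\hbar/i)(\partial/\partial\theta)_{\phi,\chi,\alpha}$,
etc. It follows that

$$\begin{aligned}
\frac{iP_x}{\hbar}&=\sin\chi\left(\frac{\partial}{\partial\theta}\right)_{\phi,\chi,\alpha}
-\cos\chi\csc\theta\left(\frac{\partial}{\partial\phi}\right)_{\theta,\chi,\alpha}
+\cos\chi\cot\theta\left(\frac{\partial}{\partial\chi}\right)_{\theta,\phi,\alpha},\\
\frac{iP_y}{\hbar}&=\cos\chi\left(\frac{\partial}{\partial\theta}\right)_{\phi,\chi,\alpha}
+\sin\chi\csc\theta\left(\frac{\partial}{\partial\phi}\right)_{\theta,\chi,\alpha}
-\sin\chi\cot\theta\left(\frac{\partial}{\partial\chi}\right)_{\theta,\phi,\alpha},\\
\frac{iP_z}{\hbar}&=\left(\frac{\partial}{\partial\chi}\right)_{\theta,\phi,\alpha}.
\end{aligned} \tag{2-43}$$

These expressions for the angular momentum are identical to those for a rigid
rotor, and the customary commutation rules can thereby be established. Since
$\omega_x$, $\omega_y$, $\omega_z$ have no dependence on $\dot\alpha$ and
$\alpha$, one can see that

$$p=(\partial T/\partial\dot\alpha)_{\omega_x,\omega_y,\omega_z}=(\partial T/\partial\dot\alpha)_{\dot\theta,\dot\phi,\dot\chi}, \tag{2-44}$$

and the quantum-mechanical operator for $p$ is, accordingly,

$$p=(\hbar/i)(\partial/\partial\alpha)_{\theta,\phi,\chi}. \tag{2-45}$$

Inspection of Eqs. (2-43) and (2-45) confirms the commutation relation

$$pP_i-P_ip=0,\qquad i=x,y,z.$$

Equations (2-43) and (2-45) may now be substituted in the Hamiltonian in order to
derive the desired wave equation.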
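The coefficients in Eq. (2-43) are the partial derivatives $\partial\dot\theta/\partial\omega_x$, etc., obtained by inverting Eq. (2-42). A small numerical sketch (function names and angle values are hypothetical) verifies that inversion by a round trip:

```python
import math


def body_rates(theta, chi, theta_dot, phi_dot, chi_dot):
    """Eq. (2-42): angular velocity components along the framework-fixed axes."""
    wx = theta_dot * math.sin(chi) - phi_dot * math.sin(theta) * math.cos(chi)
    wy = theta_dot * math.cos(chi) + phi_dot * math.sin(theta) * math.sin(chi)
    wz = chi_dot + phi_dot * math.cos(theta)
    return wx, wy, wz


def euler_rates(theta, chi, wx, wy, wz):
    """Inverse of Eq. (2-42); its coefficients are those appearing in Eq. (2-43)."""
    theta_dot = wx * math.sin(chi) + wy * math.cos(chi)
    phi_dot = (-wx * math.cos(chi) + wy * math.sin(chi)) / math.sin(theta)
    chi_dot = wz - phi_dot * math.cos(theta)
    return theta_dot, phi_dot, chi_dot


theta, chi = 0.8, 2.1                      # hypothetical orientation (radians)
rates = (0.3, -1.1, 0.7)                   # hypothetical theta_dot, phi_dot, chi_dot
omega = body_rates(theta, chi, *rates)
print(euler_rates(theta, chi, *omega))     # reproduces `rates` up to rounding
```

Reading off the coefficient of `wx` in `euler_rates` gives $\sin\chi$, $-\cos\chi\csc\theta$, and $\cos\chi\cot\theta$, which are the three factors multiplying $p_\theta$, $p_\phi$, and $p_\chi$ in $P_x$.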

(b) Internal axis method: The coordinate transformation from $\theta$, $\phi$,
$\chi$, $\alpha$ to $\theta$, $\phi$, $\chi'$, $\alpha'$, as described in
Eq. (2-15), can be used to express $P_a$, $P_b$, $P_c'$, and $p'$ in terms of the
new set of variables

$$\begin{aligned}
(\partial/\partial\chi)_\alpha&=(\partial/\partial\chi')_{\alpha'},\\
(\partial/\partial\alpha)_\chi&=\rho(\partial/\partial\chi')_{\alpha'}+(\partial/\partial\alpha')_{\chi'}.
\end{aligned} \tag{2-46}$$

Equation (2-46) is valid so long as the $c$ axis is parallel to the bond
connecting the framework and the top. However, after the transformation
Eq. (2-27), this is no longer true and the Eulerian angles $\chi$, $\alpha$,
$\chi'$, and $\alpha'$ are related by more complicated expressions [see Eq. (8)
of reference (20)]. In order to simplify the mathematical details, only
Eq. (2-46) is used.

It is now possible to write, with the aid of Eqs. (2-15) and (2-46), the
following angular momenta,

$$\begin{aligned}
P_a&=\cos\rho\alpha'\,P_a^{\chi'\alpha'}-\sin\rho\alpha'\,P_b^{\chi'\alpha'},\\
P_b&=\sin\rho\alpha'\,P_a^{\chi'\alpha'}+\cos\rho\alpha'\,P_b^{\chi'\alpha'},\\
P_c'&=\frac{\hbar}{i}\left(\frac{\partial}{\partial\chi'}\right)_{\theta,\phi,\alpha'},
\end{aligned} \tag{2-47}$$

where

$$\begin{aligned}
P_a^{\chi'\alpha'}&=\frac{\hbar}{i}\left[\sin\chi'\left(\frac{\partial}{\partial\theta}\right)_{\phi,\chi',\alpha'}
-\cos\chi'\csc\theta\left(\frac{\partial}{\partial\phi}\right)_{\theta,\chi',\alpha'}
+\cos\chi'\cot\theta\left(\frac{\partial}{\partial\chi'}\right)_{\theta,\phi,\alpha'}\right],\\
P_b^{\chi'\alpha'}&=\frac{\hbar}{i}\left[\cos\chi'\left(\frac{\partial}{\partial\theta}\right)_{\phi,\chi',\alpha'}
+\sin\chi'\csc\theta\left(\frac{\partial}{\partial\phi}\right)_{\theta,\chi',\alpha'}
-\sin\chi'\cot\theta\left(\frac{\partial}{\partial\chi'}\right)_{\theta,\phi,\alpha'}\right].
\end{aligned} \tag{2-48}$$

[Footnote to Eq. (2-47): These may be obtained by a procedure similar to that
employed in Eq. (2-41), i.e., from the relations
$P_a=(\partial\dot\theta/\partial\omega_a)p_\theta+(\partial\dot\phi/\partial\omega_a)p_\phi
+(\partial\dot\chi'/\partial\omega_a)p_{\chi'}+(\partial\dot\alpha'/\partial\omega_a)p_{\alpha'}$, etc.]

The two superscripts on $P_a$ and $P_b$ denote two of the four "independent
variables" (the other two being always $\theta$ and $\phi$) one chooses in taking
the partial derivatives. One might well recognize that $P_a^{\chi'\alpha'}$ and
$P_b^{\chi'\alpha'}$ have almost the same form as $P_x$ and $P_y$ in Eq. (2-43)
(which, in our present notation, would be written as $P_x^{\chi\alpha}$ and
$P_y^{\chi\alpha}$) aside from the fact that $\chi'$ and $\alpha'$ are used as
variables for one case while $\chi$ and $\alpha$ are used for the other. It is
important to specify the independent variables in the partial derivatives because
Eq. (2-46) shows that $(\partial/\partial\alpha)_\chi$ is different from
$(\partial/\partial\alpha)_{\chi'}$ and also from
$(\partial/\partial\alpha')_{\chi'}$. Since $P_a$ and $P_b$ involve all four
angular variables, the eigenfunctions of the operator

$$P_a^2/2I_a+P_b^2/2I_b+P_c'^2/2I_c$$

depend on the internal coordinate $\alpha'$, in addition to the three Eulerian
angles, in a rather complex manner. For example, these functions are not
identical to the asymmetric top wave function. It is apparent that the absence of
the cross product between the two kinds of momentum operators in Eq. (2-29) does
not signify a complete separation between the external and internal coordinates
inasmuch as the $P$'s are actually functions of $\alpha$. When the momentum
operators in Eq. (2-26) are written in their differential form given in
Eqs. (2-47) and (2-48) [where $\rho$ is $I_\alpha/I_{cc}$ from Eq. (2-8)], the
Schroedinger equation is obtained. This equation was given by Burkhard and
Dennison.

The rather unsatisfactory feature of having $\alpha'$ mixed in the $P$'s can be
removed by a transformation which is the inverse of Eq. (2-47), i.e., introducing
$P_a'$ and $P_b'$ so that

$$\begin{pmatrix} P_a'\\ P_b'\end{pmatrix}=
\begin{pmatrix}\cos\rho\alpha & \sin\rho\alpha\\ -\sin\rho\alpha & \cos\rho\alpha\end{pmatrix}
\begin{pmatrix} P_a\\ P_b\end{pmatrix}=
\begin{pmatrix} P_a^{\chi'\alpha'}\\ P_b^{\chi'\alpha'}\end{pmatrix}. \tag{2-49}$$

This transformation is similar to the one employed previously in Eq. (2-17) in
connection with the internal rotation axes. Equation (2-49) shows that $P_a'$ and
$P_b'$ do not contain $\alpha$ explicitly and thus commute with $p'$. The
Hamiltonian may then be written in the Schroedinger representation with the aid
of Eqs. (2-48) and (2-49).


3. HIGH BARRIER APPROXIMATIONS 


In Sec. 2 the Hamiltonians were derived for various 
coordinate systems and for molecules of various kinds 
of symmetry. These Hamiltonians can generally be 
divided into three parts: the over-all rotational part, 
the internal rotation torsional part, and a coupling 
between over-all rotation and internal rotation. Since 
the separation is not complete, except in the case of a 
symmetric type molecule, the Schroedinger equation 
cannot be solved exactly; perturbation theory is usually 
applied. For ease in computation we use a matrix for- 
mulation, and two approximate methods for diagonaliz- 
ing the matrix are commonly used. The high barrier 
approximation is applicable to the cases where the 
separations of the torsional energy levels are large 
compared with the rotational energy separations. The 
other is the low barrier approximation where the prob- 
lem of the free internal rotor is first solved and the 
barrier treated as a perturbation. Section 4 deals with 
this low barrier approximation. 
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The Schroedinger equation for a high barrier can be 
solved by using either the PAM or IAM.

In the PAM the energy matrix is constructed in a 
representation in which the over-all rotation part and 
the internal torsion part of the Hamiltonian are sepa- 
rately diagonal. The cross terms (coupling terms) be- 
tween the total angular momentum and the internal 
angular momentum are chosen as the perturbation. The 
basis functions are then the product of the rigid sym- 
metric rotor wave functions and the functions of the 
internal angle a, known as the torsional functions which 
are related to the tabulated Mathieu functions. The 
perturbation terms are treated by the Van Vleck trans- 
formation* so that after this transformation the secular 
equation can be approximately factored into blocks 
corresponding to the different torsional states. Associ- 
ated with each torsional state there are two different 
submatrices in the energy matrix and thus two sets of 
rotational levels. From the separation of the pair of 
spectral lines arising from a given rotational transition 
in the two sets of rotational levels, the barrier hindering 
internal rotation is determined. 

In the IAM the Nielsen transformation (or the 
modified form given by Hecht and Dennison” or Itoh?!) 
is applied to the Hamiltonian so that the coupling terms 
between the internal and external coordinates in the 
transformed Hamiltonian are completely removed or 
become much smaller than those which occur in the 
PAM, and hence can be treated by perturbation method 
more readily. The basis functions of the energy matrix 
are again taken as the product of the rigid symmetric 
rotor functions and the torsional functions which are 
the solution of a differential equation similar to the 
Mathieu equation. The torsional functions here are 
different from those in PAM because of the different 
boundary conditions as explained in Sec. I. B. Because 
of the interaction of the external and internal rotation, 
the torsional functions depend on the quantum number 
K and, in principle, the differential equation for the 
torsional functions should be solved for each value of 
K. Since it is necessary to have the torsional functions 
in order to evaluate the elements of the energy matrix, 
the determination of the torsional functions becomes one 
of the major difficulties of this method. Approximate 
formulas?” have been derived for the matrix elements 
involving the torsional functions. At the high barrier 
approximation the infinite energy matrix can be reduced 
to a series of $(2J+1)\times(2J+1)$ blocks, corresponding
to the case of limiting rigid rotor. Since the coordinate 
axes used in setting up the Hamiltonian in Eq. (2-25) 
are not the principal axes for the asymmetric rotor, a 
coordinate transformation to the principal axes should 
be made so as to simplify the solution of the secular 
equation. When this is done, each of the submatrices 
becomes very similar to that of a rigid rotor with some 
additional elements which represent the effect of the 
internal torsion on the over-all rotation. From the
structure of the energy levels it was found that each of
the rigid-rotor spectral lines is split into a doublet, as
also predicted by the PAM. The relations between the
doublet separations and the barrier height have
been given by Hecht and Dennison and also by Lide
and Mann. These approximate equations are equivalent
to those derived by the PAM, so that in many
cases either method, PAM or IAM, can be used equally
well. In the discussion to follow the symmetric
molecule is covered first by the PAM and the IAM.
With these principles in mind the effect of asymmetry in 
the molecular frame is introduced into both the PAM 
and the IAM. 


I. Symmetric Molecules 
A. Principal Axes Method (PAM) 


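As derived in the previous section [see Eq. (2-6)],
the Hamiltonian for this class of molecules can be
written as

$$H = H_R + H_T + H_{TR}, \tag{3-1}$$
$$H_R = A_x(P_x^2 + P_y^2) + C_z P_z^2,$$
$$H_T = F p^2 + V(\alpha),$$
$$H_{TR} = -2 C_z P_z\, p,$$

where

$$A_x = \frac{\hbar^2}{2I_x} = \frac{\hbar^2}{2I_y}, \qquad
C_z = \frac{\hbar^2}{2(I_z - I_\alpha)} = \frac{\hbar^2}{2I_z} + F\rho^2, \qquad
F = \frac{\hbar^2}{2rI_\alpha} = \frac{\hbar^2 I_z}{2I_\alpha(I_z - I_\alpha)}. \tag{3-2}$$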


Here, $H_R$ and $H_T$ comprise, respectively, the terms describing
the over-all rotation and the internal rotation, and
$H_{TR}$ represents the interaction term.|||| In the calculation
of the energy levels of $H$, we take $H_R + H_T$ as the
unperturbed Hamiltonian and treat $H_{TR}$ as the perturbation.
Since $H_{TR}$ does not vanish in the limit of
infinite barrier, the effect of the perturbation is not
small. Consequently, for molecules with an intermediate
barrier, a fourth-order perturbation procedure must be
applied in order to obtain accurate results for the barrier
heights. Nevertheless, the advantages of this approach
will be seen.

1. Rotational equation $H_R$.—The eigenfunctions of
$H_R$ are the symmetric rotor functions, and they are
usually expressed as

$$S_{JKM}(\theta,\varphi)\,e^{iK\chi},$$

where $J$ is the total angular momentum and $K$ and $M$
are the projections of $J$ on the body-fixed and space-fixed
axes, respectively. The energy and the matrix
elements of $P$ are

$$E = A_x J(J+1) + (C_z - A_x)K^2,$$
$$(JK|P^2|JK) = J(J+1)\hbar^2,$$
$$(JK|P_z|JK) = K\hbar.$$

|||| For a discussion of the “interaction” between over-all and
internal rotation see the last paragraph of Sec. 2 II. A1.


The rotational constant, $C_z$, associated with $P_z^2$ is not
that of the limiting rigid rotor and hence $H_R$ does not
represent the entire Hamiltonian for the case of infinite
barrier height. The reduction of $H$ into the rigid-rotor
Hamiltonian for the limiting high barrier is discussed
in Sec. I. A5.

2. Torsional equation $H_T$.—(a) Eigenfunctions and
eigenvalues: The eigenfunctions $U(\alpha)$ and the eigenvalues
$E$ of $H_T$ are to be determined from the differential
equation

$$\left[-F\,\frac{d^2}{d\alpha^2} + V(\alpha)\right]U(\alpha) = E\,U(\alpha).$$

Physically we know that $U(\alpha)$ must be periodic in $2\pi$.
When $V(\alpha)$ is taken as a simple sinusoidal function
such as $\tfrac12 V_3(1-\cos 3\alpha)$, the above equation can readily
be transformed through the substitutions

$$3\alpha + \pi = 2x, \qquad s = 4V_3/9F, \tag{3-3}$$
$$U(\alpha) = M(x), \qquad b = 4E/9F, \tag{3-4}$$

to the Mathieu equation

$$\frac{d^2M(x)}{dx^2} + \left[\left(b - \tfrac12 s\right) - \tfrac12 s\cos 2x\right]M(x) = 0. \tag{3-5}$$


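The solutions of Eq. (3-5) may be expanded in a Fourier
series

$$M(x) = \sum_k \left(c_k\cos kx + d_k\sin kx\right).$$

Since Eq. (3-5) is invariant under the operations
$x \to -x$ and $x \to x+\pi$, the eigenfunctions of Eq. (3-5) have
one of the following forms:

$$Se_{2r}(s,x) = \sum_k De_{2k}^{(2r)}\cos 2kx \quad (\text{period } \pi) \tag{3-6a}$$
$$Se_{2r+1}(s,x) = \sum_k De_{2k+1}^{(2r+1)}\cos(2k+1)x \quad (\text{period } 2\pi) \tag{3-6b}$$
$$So_{2r}(s,x) = \sum_k Do_{2k}^{(2r)}\sin 2kx \quad (\text{period } \pi) \tag{3-6c}$$
$$So_{2r+1}(s,x) = \sum_k Do_{2k+1}^{(2r+1)}\sin(2k+1)x \quad (\text{period } 2\pi). \tag{3-6d}$$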

Here $s$ is a parameter associated with the differential
equation (3-5). For each type of the functions, $Se(s,x)$
and $So(s,x)$, one finds a series of eigenfunctions, and
they are labeled by the index $r$. The values of $b$
corresponding to these functions are denoted by $be_{2r}$,
$be_{2r+1}$, $bo_{2r}$, $bo_{2r+1}$. When Eqs. (3-6) are substituted in
Eq. (3-5), four different infinite secular equations are
obtained from which the eigenvalues can be determined.
This procedure can be carried out in a systematic
fashion which is described in Appendix 2. The Fourier
coefficients of the eigenfunctions are then obtained from
the recursion relations associated with the infinite
secular equations. The eigenvalues and the
Fourier coefficients of the associated eigenfunctions can
be found in the Tables Relating to Mathieu Functions.
From the torsional equation one may readily see
that since the $C_3$ group operations leave Eq. (3-3)
invariant,††† the solution of this equation can be
expanded in the form

$$\sum_k A_{3k}\,e^{i3k\alpha} \quad (A) \tag{3-7a}$$
$$\sum_k A_{3k+1}\,e^{i(3k+1)\alpha} \quad (E_1) \tag{3-7b}$$
$$\sum_k A_{3k-1}\,e^{i(3k-1)\alpha} \quad (E_2) \tag{3-7c}$$

corresponding to the symmetry species $A$ and $E$ of the
$C_3$ group. The first form of the expansion in Eq. (3-7)
gives rise to nondegenerate eigenfunctions while the
last two constitute the doubly degenerate pairs. The
series of eigenvalues and eigenfunctions of the nondegenerate
solutions can be obtained from the tabulated
Mathieu functions with period $\pi$, Eqs. (3-6a) and
(3-6c), after the slight modification as set down in
Eq. (3-4). The degenerate solutions have been tabulated
recently by Kilb. The relations between the notations
adopted here and those used in the various tabulations
may be summarized as follows:
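$$E_{v\sigma} = \tfrac94 F\,b_{v\sigma}, \qquad s = 4V_3/9F, \qquad 2x = 3\alpha + \pi.$$

        A levels ($\sigma = 0$)            E levels ($\sigma = \pm1$)
v    $b_{v\sigma}$    reference 104     $b_{v\sigma}$                reference 36
0    $b_{00}$     $be_0$            $b_{01}$ ($b_{0,-1}$)      bo
1    $b_{10}$     $bo_2$            $b_{11}$ ($b_{1,-1}$)      by
2    $b_{20}$     $be_2$            $b_{21}$ ($b_{2,-1}$)      bı
3    $b_{30}$     $bo_4$            $b_{31}$ ($b_{3,-1}$)      Ba
4    $b_{40}$     $be_4$            $b_{41}$ ($b_{4,-1}$)      be
                                                            (3-8)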


Each member of the series of the degenerate and nondegenerate
solutions is labeled by the index $v$, with
$v=0$ for the lowest eigenvalue. At very high barriers
the spacings between the nondegenerate and degenerate
energy levels of the torsional equation associated with
a given $v$ are much smaller than those between levels
with different $v$ (see Fig. 1). For this reason, the index $v$
is called the torsional quantum number, and the different
energy levels associated with a given $v$ are
thought of as the torsional sublevels belonging to the
same torsional state. The sublevels are distinguished
by the index $\sigma$, with $\sigma=0$ for the nondegenerate level
(species $A$) and $\sigma=\pm1$ for the levels of species $E$
($E_1$: $\sigma=+1$; $E_2$: $\sigma=-1$). In Fig. 2 the energy levels
are plotted as a function of the barrier height from free
internal rotation to a relatively high barrier. The solution
for the free-rotation energy levels and the meaning
of the “$m$” quantum number are discussed in Sec. 4.

A solution of the torsional equation can then be
written as

$$U_{v\sigma}(\alpha) = \sum_k A_{3k+\sigma}^{(v)}\,e^{i(3k+\sigma)\alpha}, \tag{3-9}$$

and the eigenvalues associated with these functions are
correspondingly denoted by $E_{v\sigma}$.

††† Strictly speaking, Eq. (3-3) belongs to the $C_{3v}$ group. However,
in the present discussion, it is more convenient to consider …

(b) Harmonic oscillator approximation: With a very
high barrier the internal motion degenerates into small
oscillations, and the potential energy $\tfrac12 V_3(1-\cos 3\alpha)$
may be expanded about the equilibrium point, yielding
$(9/4)V_3\alpha^2$. The torsional wave function in the vicinity
of the equilibrium configuration can be approximated
by the harmonic oscillator functions, which are denoted
as $H_v(\alpha)$. The corresponding eigenvalues are

$$E_v = 3(V_3F)^{1/2}\left(v+\tfrac12\right). \tag{3-10}$$

There are three functions $H_v(\alpha)$, one situated at each
of the three potential minima. These functions are denoted
by $H_v^{(1)}$, $H_v^{(2)}$, and $H_v^{(3)}$. When the tunneling effect is
considered, the correct zeroth-order torsional functions
are

$$U_{v,0}(\alpha) = \frac{1}{\sqrt3}\left[H_v^{(1)} + H_v^{(2)} + H_v^{(3)}\right],$$
$$U_{v,1}(\alpha) = \frac{1}{\sqrt3}\left[H_v^{(1)} + \omega H_v^{(2)} + \omega^2 H_v^{(3)}\right], \tag{3-11}$$
$$U_{v,-1}(\alpha) = \frac{1}{\sqrt3}\left[H_v^{(1)} + \omega^2 H_v^{(2)} + \omega H_v^{(3)}\right],$$

where $\omega = \exp(2\pi i/3)$ and $v$, as before, denotes the torsional
quantum number. The first function in Eq.
(3-11) is nondegenerate ($A$ levels) and the last two are
degenerate ($E$ levels).

Fig. 2. The hindered internal rotational energy levels as a
function of the barrier height. The free rotation quantum numbers
$m$ are at the left ordinate, i.e., a zero barrier, and the torsional
quantum numbers $v$ are on the right ordinate, i.e., a relatively
high barrier. The diagonal line $V_3$ shows the actual height of the
barrier in relation to the energy levels. Those energy levels below
the top of the barrier are clearly discernible.

Fig. 3. The torsional matrix. Each block on the diagonal contains
the over-all rotation and internal rotational matrix elements
associated with that torsional level. The shaded area represents
the coupling between the rotational levels of the various torsional
states. With the Van Vleck transformation the matrix elements
in these shaded areas are folded into the blocks on the diagonal.
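
The torsional levels discussed in this section can be checked numerically. The sketch below is only an illustration, not the tabulation procedure of the text: it assumes the potential $\tfrac12 V_3(1-\cos 3\alpha)$ and the free-rotor basis $e^{i(3k+\sigma)\alpha}$ of Eq. (3-9), in which the torsional Hamiltonian is tridiagonal (diagonal $F(3k+\sigma)^2 + V_3/2$, off-diagonal $-V_3/4$), and it finds eigenvalues by a Sturm-sequence bisection. The function names, the basis cutoff `kmax`, and the numerical values of `F` and `V3` are hypothetical choices made for the example.

```python
def count_below(diag, off, x):
    """Number of eigenvalues below x for a symmetric tridiagonal matrix with
    diagonal `diag` and constant off-diagonal element `off` (Sturm count)."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - x - (off * off / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-30  # avoid a division by zero on the next step
        if q < 0.0:
            count += 1
    return count


def torsional_levels(F, V3, sigma, nlevels=3, kmax=30, tol=1e-10):
    """Lowest eigenvalues E_{v,sigma} of -F d^2/da^2 + (V3/2)(1 - cos 3a)
    in the basis exp[i(3k + sigma)a], k = -kmax..kmax (assumed cutoff)."""
    diag = [F * (3 * k + sigma) ** 2 + 0.5 * V3 for k in range(-kmax, kmax + 1)]
    off = -0.25 * V3
    lo0 = min(diag) - 2.0 * abs(off)   # Gershgorin bounds on the spectrum
    hi0 = max(diag) + 2.0 * abs(off)
    levels = []
    for n in range(nlevels):
        lo, hi = lo0, hi0
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if count_below(diag, off, mid) > n:
                hi = mid
            else:
                lo = mid
        levels.append(0.5 * (lo + hi))
    return levels


if __name__ == "__main__":
    F, V3 = 1.0, 100.0                      # hypothetical values, same units
    e_a = torsional_levels(F, V3, sigma=0)  # A sublevels
    e_e = torsional_levels(F, V3, sigma=1)  # E sublevels
    print("A levels:", e_a)
    print("E levels:", e_e)
    print("harmonic estimate for v=0, Eq. (3-10):", 3.0 * (V3 * F) ** 0.5 * 0.5)
```

With these inputs the $A$ and $E$ sublevels of $v=0$ should lie close together and somewhat below the harmonic estimate of Eq. (3-10), while for $V_3=0$ the routine returns the free-rotor values $Fm^2$.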

3. Perturbation treatment.—The energy matrix for $H$
as given in Eqs. (3-1) and (3-2) is now constructed
in a representation in which $H_R$ and $H_T$ are diagonal.
From Eq. (3-9) it is obvious that $p$ is diagonal in the
quantum number $\sigma$ but not in $v$. Consequently, the
secular equation can be grouped into blocks corresponding
to different $v$ and $\sigma$. Within each block the
Hamiltonian is given by $H_R + H_T$, and blocks of different
$v$ are connected by the matrix elements of $H_{TR}$.
If the energy separations between different torsional
states are large compared to those between rotational
levels of the same $v$, a Van Vleck transformation may
be applied to the energy matrix to reduce the matrix
elements nondiagonal in $v$ to the second order; such
matrix elements can then contribute only fourth-order
terms to the energy and can usually be neglected. The
elements diagonal in $v$ are, of course, modified by this
transformation, and the effect of this transformation is
to fold the elements not diagonal in $v$ onto the $v$
blocks.*** The matrix is illustrated in Fig. 3, where the
shaded areas contain those matrix elements off diagonal
in $v$ which are transformed into the $v$ blocks shown on
the diagonal. Therefore, the infinite secular equation is
effectively factored into separate blocks corresponding
to each $v$ and $\sigma$, so that only one
block need be considered at a time. To the second-order
approximation the matrix elements of a $v$, $\sigma$ block are
given by the Hamiltonian operator

$$H_{v\sigma} = A_x(P_x^2+P_y^2) + C_{v\sigma}P_z^2 - 2C_z\,p_{v\sigma,v\sigma}P_z + E_{v\sigma}, \tag{3-12}$$

where

$$p_{v\sigma,v'\sigma} = \int_0^{2\pi} U_{v\sigma}^*(\alpha)\,\frac{1}{i}\frac{d}{d\alpha}\,U_{v'\sigma}(\alpha)\,d\alpha,$$
$$C_{v\sigma} = C_z\left(1 + 4C_z\sum_{v'\neq v}\frac{|p_{v\sigma,v'\sigma}|^2}{E_{v\sigma}-E_{v'\sigma}}\right)
= C + F\rho^2\left(1 + 4F\sum_{v'\neq v}\frac{|p_{v\sigma,v'\sigma}|^2}{E_{v\sigma}-E_{v'\sigma}}\right),$$
$$C = \hbar^2/2I_z.$$

*** For this particular class of molecules, both $H_R$ and $H_{TR}$
are diagonal in $K$, and $v$ is the only nondiagonal index; consequently
the Van Vleck transformation is identical to the
perturbation treatment. However, the difference between the
two methods becomes apparent for the asymmetric molecules.

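Here $H_{v\sigma}$ can be regarded as the effective rotational
Hamiltonian for a given torsional state. The energy
is then

$$E(JKMv\sigma) = A_xJ(J+1) + (C_{v\sigma}-A_x)K^2 - 2C_zK\,p_{v\sigma,v\sigma} + E_{v\sigma}. \tag{3-13a}$$

The perturbation treatment can also be carried to
higher order. The energy obtained from the $n$th-order
perturbation calculation can be expressed simply as

$$E_{v\sigma} = AJ(J+1) + (C-A)K^2 + F\sum_n w_{v\sigma}^{(n)}(\rho K)^n, \tag{3-13b}$$

where

$$\rho = I_\alpha/I_z,$$
$$w_{v\sigma}^{(0)} = 9b_{v\sigma}/4,$$
$$w_{v\sigma}^{(1)} = -2p_{v\sigma,v\sigma},$$
$$w_{v\sigma}^{(2)} = 1 + \frac{16}{9}\sum_{v'\neq v}\frac{|p_{v\sigma,v'\sigma}|^2}{b_{v\sigma}-b_{v'\sigma}}, \quad \text{etc.}$$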

The first two terms of Eq. (3-13b) are the energy of a
rigid symmetric rotor, while the last term, which is a
power series in $\rho K$, represents the effect of the internal
torsion on the rotational energy levels.

The convergence of the power series depends mainly
on the magnitude of $\rho K$. Although the coefficients $w_{v\sigma}^{(n)}$
are strongly dependent on the barrier height (more
precisely, on $s$), being of the approximate form $As^{B}\exp(-C\sqrt{s})$,
they do not converge very rapidly with increasing $n$
(see Fig. 8). Consequently the magnitude of the factor
$(\rho K)^n$ must be examined and the perturbation calculation
should then be carried out to the desired accuracy.
4. Energy levels and spectra.—For a given torsional
state $v$ there are two torsional sublevels characterized
by the symmetry species $A$ and $E$ (or $\sigma = 0$ and $\pm1$). …
Associated with each such torsional sublevel is a set of
rotational levels, the structure of which is …

For the nondegenerate $A$ level the torsional functions
are real and hence the diagonal elements of odd order
in $p$ vanish, i.e., $w_{v0}^{(1)} = 0$. The rotational Hamiltonian
is given to the second order by the first two terms of
Eq. (3-12). This is formally identical to the Hamiltonian
of a rigid symmetric rotor with a modified rotational
constant $C_{v\sigma}$. Consequently, the rotational levels
associated with a nondegenerate $A$ type torsional level
(hereafter referred to as $A$ rotational levels or simply $A$
levels) have the same structure as the levels of a rigid
rotor (referred to also as the pseudo-rigid rotor levels).
For the case of the degenerate torsional levels, $p_{v\pm1,v\pm1}$ is
different from zero, so that the structure of the rotational
levels here (called $E$ rotational levels or $E$ levels)
differs from that of a rigid rotor. The linear term in $P_z$
in the Hamiltonian has the effect of removing the $K$
degeneracy characteristic of the rigid symmetric rotors.
Furthermore, since the second-order perturbation term
$w_{v\sigma}^{(2)}$ in Eq. (3-13b) is different for the $A$ and $E$ levels,
each torsional sublevel will have a different effective
rotational constant, $C_{v\sigma}$. The rotational levels with
$\sigma=\pm1$ are doubly degenerate with respect to $\sigma$ but
nondegenerate with respect to $K$, and vice versa for
the levels with $\sigma=0$. Hence, all the energy levels are
doubly degenerate with the exception of the ones for
which $\sigma=K=0$. Here the spatial $M$ degeneracy has
been disregarded. The three pairs of doubly degenerate
levels, ($\sigma=0$, $\pm K$); ($\sigma=+1$, $+K$; $\sigma=-1$, $-K$); and
($\sigma=-1$, $+K$; $\sigma=+1$, $-K$), are shown in Fig. 4.
Physically, the presence of two sets of rotational
levels with $\sigma=0$ and $\pm1$ may be understood from a
semiclassical point of view. For a high barrier the internal
motion can be pictured as a back-and-forth
oscillation about the equilibrium configuration of the
molecule. The over-all rotation of the molecule along
with this mode of internal torsion gives rise to the set
of the $A$ levels. The $E$ levels can be associated with a
different mode of internal motion in which the molecule
passes from one equilibrium configuration to another in
a circulatory manner through the tunneling effect.
The double degeneracy of the $E$ torsional levels is
related to the two senses (counterclockwise and clockwise)
of tunneling internal rotation. One may then
expect the $E$ levels and wave functions to have some
free rotational character.

Fig. 4. Correlation of rotational energy levels. The rotational
energy levels and rotational parallel ($\Delta K=0$) transitions for
various conditions of asymmetry and barrier height. At either end
the symmetric rotor energy levels and transition are shown. Introduction
of hindered internal rotation splits each energy level into
three levels for a threefold potential barrier. The three transitions,
however, are coincident, giving only one line. Asymmetry splits
the $A$ level ($K$ doubling), but only shifts the $E$ levels. Two weaker
lines on each side of the pattern are possible transitions between
the $E$ levels, but these are not shown in the figure. If the barrier
height is raised, the energy levels tend toward the asymmetry
doublets, and two doublets are observed in the spectrum. Finally,
if the asymmetry is removed we are back to the symmetric rotor.
The thickness of the lines for each energy level is approximately …

As the dipole moment of the molecule is independent
of $\alpha$, the selection rules for the rotational transitions are
$\Delta J=\pm1$, $\Delta K=0$, and $\Delta\sigma=0$ (see Sec. 6). Equation
(3-13) shows that the frequencies of the rotational lines
are not changed by the internal rotation. Both the $A$
and $E$ levels give rise to the same spectral lines and the
three transitions shown in Fig. 4 are all coincident.
Thus the effects of the internal rotation in the symmetric
molecules are not detectable in the rotational
spectra associated with the ground torsional state. This
result can also be understood from the classical description
by noticing that the rotation of a symmetric top
about its figure axis causes no change in the orientation
of the dipole moment and thus no dipole interaction
with the radiation. When internal rotation is present
in a symmetric top molecule, only the angular velocities
of the two groups along the figure axis are affected.
In the excited torsional states, however, the frequencies
of rotational transitions are appreciably altered by the
internal rotation through the interaction with the other
molecular vibrations. The shifts of the frequencies of
these lines then serve as a means of determining the
barrier heights for the symmetric top molecules. This
method is more fully discussed in Sec. 7.

5. Reduction to a rigid rotor at a very high barrier.—
In the limit of a very high barrier, the torsional functions
can be taken as Eq. (3-11). Furthermore, the overlap
between $H_v^{(1)}$, $H_v^{(2)}$, and $H_v^{(3)}$ can be neglected,
i.e., the diagonal elements of $p$ (i.e., $w_{v\sigma}^{(1)}$) approach
zero. At first sight one might conclude that the effect
of $H_{TR}$ vanishes for infinitely high barriers and the
Hamiltonian is then given by $H_R$. This, however, is
not correct because the rotational constant of $P_z^2$ in
$H_R$ is not the rotational constant of the rigid molecule.
The reason for this is that the off-diagonal elements of
$p$ increase with the barrier height so that the second-order
term

$$\sum_{v'\neq v}\frac{|p_{v\sigma,v'\sigma}|^2}{b_{v\sigma}-b_{v'\sigma}}$$

remains finite in such a way that $w_{v\sigma}^{(2)}$ tends to zero.
This latter fact would be expected since Eq. (3-13b) is
in the form of a rigid molecule with the actual rotational
constant. With Eq. (3-11) it can be shown that the
only nonvanishing matrix elements are

$$p_{v,\sigma;\,v+1,\sigma} = \frac{1}{\sqrt2\,i}\left(\frac{9V_3}{4F}\right)^{1/4}(v+1)^{1/2},$$
$$p_{v,\sigma;\,v-1,\sigma} = -\frac{1}{\sqrt2\,i}\left(\frac{9V_3}{4F}\right)^{1/4}v^{1/2}. \tag{3-14}$$

Furthermore, the energy of a torsional harmonic oscillator
was given by Eqs. (3-4) and (3-10) as

$$E_v = 3\left(v+\tfrac12\right)(V_3F)^{1/2} = \tfrac94 F\,b_v.$$

When this equation and Eq. (3-14) are substituted
into Eq. (3-13), the second-order term $w_{v\sigma}^{(2)}$ becomes
zero. Therefore, the effective rotational constant $C_{v\sigma}$
becomes the rigid rotational constant $C$.
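
The cancellation stated above can be verified with a few lines of arithmetic. The sketch below assumes the harmonic-oscillator matrix elements of Eq. (3-14), the level spacing $b_{v+1}-b_v = \tfrac43 (V_3/F)^{1/2}$ that follows from Eqs. (3-4) and (3-10), and the definition of $w_{v\sigma}^{(2)}$ given with Eq. (3-13b); the function name and the sample values of `V3` and `F` are hypothetical.

```python
import math


def w2_harmonic(v, V3, F):
    """Second-order coefficient w^(2) in the harmonic (very high barrier) limit."""
    r = math.sqrt(V3 / F)
    p2_up = 0.75 * r * (v + 1)    # |p_{v,v+1}|^2 from Eq. (3-14)
    p2_down = 0.75 * r * v        # |p_{v,v-1}|^2 from Eq. (3-14)
    db = (4.0 / 3.0) * r          # b_{v+1} - b_v = b_v - b_{v-1}
    # b_v - b_{v+1} = -db and b_v - b_{v-1} = +db
    return 1.0 + (16.0 / 9.0) * (p2_up / (-db) + p2_down / db)


if __name__ == "__main__":
    for v in range(4):
        print(v, w2_harmonic(v, V3=1000.0, F=5.0))  # hypothetical values
```

The result vanishes for every $v$ and for any $V_3$ and $F$, because the $(v+1)$ and $v$ contributions differ by exactly $-9/16$.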


B. Internal Axis Method (IAM)


The essential feature of the IAM is the application
of the Nielsen transformation to remove the coupling 
term between the angular momentum for over-all rota- 
tion and that associated with the internal rotation. 
Therefore, external and internal rotation become sepa- 
rable for the case of symmetric molecules. The IAM, 
however, has the disadvantage that the torsional func- 
tions are difficult to determine and are not conveniently 
tabulated. 

For the PAM the coupling term $H_{TR}$ (which is not
small) is reduced by the Van Vleck transformation to a
power series in $K$ whose coefficients vanish in the limit
of an infinite barrier. On the other hand, the coupling
term $H_{TR}$ is removed in the IAM by the Nielsen transformation,
Eq. (2-8). The difference between these two
transformations is that the Van Vleck is barrier dependent
while the Nielsen is barrier independent.

The Hamiltonian was given in Eq. (2-20) as

$$H = A_a(P_a'^2 + P_b'^2) + C_cP_c'^2 + Fp'^2 + V(\alpha'),$$

where

$$A_a = A_x = \hbar^2/2I_x, \qquad C_c = C = \hbar^2/2I_z \ (\neq C_z).$$

The $a$, $b$, $c$ axes are the same as the principal axes for
symmetric molecules, and the differential operators for
$P_a'$, $P_b'$, $P_c'$, and $p'$ are given in Sec. 2 II. C2(b). Since
the $P'$'s do not contain $\alpha'$, the wave function can be written
as a product of the symmetric rotor wave functions
and the torsional wave function:

$$\Psi = S_{JKM}(\theta,\varphi)\,e^{iK\chi'}\,M_{Kv\sigma}(\alpha'). \tag{3-15}$$


The symmetric rotor functions are the eigenfunctions
of the first three terms of the Hamiltonian Eq. (2-20)
and $M_{Kv\sigma}(\alpha')$ represent the eigenfunctions of the
equation

$$\left[-F\frac{d^2}{d\alpha'^2} + V(\alpha')\right]M(\alpha') = E\,M(\alpha'), \tag{3-16}$$

with the appropriate boundary conditions. The symmetric
top wave function of Eq. (3-15) contains $\chi'$ instead
of $\chi$ as a variable. The angle $\chi'$ was defined in
Eq. (2-15).

The boundary condition for Eq. (3-16) must now be
considered. Obviously, the wave function must be
periodic in $2\pi$ with respect to $\chi_1$ and $\chi_2$ (the Eulerian
angles of the two parts), i.e.,

$$\psi(\theta,\varphi,\chi_1,\chi_2) = \psi(\theta,\varphi,\chi_1+2\pi n_1,\chi_2+2\pi n_2), \tag{3-17}$$

where $n_1$ and $n_2$ are integers. In terms of $\chi'$ and $\alpha'$, the
wave function $\psi$ becomes

$$\psi(\theta,\varphi,\chi',\alpha') = \psi\{\theta,\varphi,\chi'+(2\pi/I_z)[I_\alpha n_2+(I_z-I_\alpha)n_1],\ \alpha'+2\pi(n_2-n_1)\}. \tag{3-18}$$

In order to satisfy this condition use is made of the
fact that, according to Floquet's theorem, a particular
(nonperiodic) solution of the Mathieu equation can be
written as

$$M(\alpha) = e^{if\alpha}P(\alpha), \tag{3-19}$$

where $P(\alpha)$ is periodic in $2\pi$ and $f$ is a real constant
which is chosen so that Eq. (3-19) has the periodicity
demanded by the physical situation. For the present
case one has

$$(K/I_z)[I_\alpha n_2+(I_z-I_\alpha)n_1] + f(n_2-n_1) = n \quad (\text{an integer}). \tag{3-20}$$

This can be satisfied by setting $f$ equal to either $-\rho K$
[i.e., $-(I_\alpha/I_z)K$] or $(1-\rho)K$ [i.e., $+((I_z-I_\alpha)/I_z)K$].
In the first of the series of papers on the internal rotation
in methyl alcohol, Koehler and Dennison chose
$(1-\rho)K$. [Also their internal angle $x$ is equal to $-\alpha$
which is used here (see Appendix 4).] However, it turns
out, for most cases, to be convenient to take $f=-\rho K$.
The resulting wave function is then

$$\Psi = S_{JKM}(\theta,\varphi)\,e^{iK\chi'}\,e^{-i\rho K\alpha'}P(\alpha'). \tag{3-21}$$
Equation (3-21) shows that the internal and external
motion are separable inasmuch as the wave function
can be expressed as a product of a function of the three
external Eulerian angles and a function of $\alpha$ ($\alpha'=\alpha$)
alone. The fact that the quantum number $K$ appears
in the torsional part of the wave function does not
signify a physical coupling between the over-all rotation
and the internal motion, in the sense that the
torsion of one group of the molecule about the other
does not affect the motion of the “internal rotation
axes” (see Sec. 2 II. A1). On the other hand, the reader
may recall that in the treatment of the symmetric
molecules by PAM there is a coupling term $H_{TR}$ between
the external and internal motion. In the PAM
the set of moving axes is rigidly attached to one group
of the molecule, while in the IAM the internal rotation
axes, which are used as the coordinate system, are
moving with respect to both of the groups. Since only
the rotation of the internal rotation axes is unaltered
by the internal motion, the coordinate frame used in
PAM also oscillates at the frequency of the internal
torsion.
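
The two admissible choices of the Floquet constant $f$ in Eq. (3-20) can be confirmed by exact rational arithmetic. The sketch below assumes the form of Eq. (3-20) in which the coefficient of $K$ is $\rho n_2 + (1-\rho)n_1$, with $\rho = I_\alpha/I_z$; the function names and the value of `rho` are hypothetical. With $f=-\rho K$ the left-hand side reduces to $Kn_1$, and with $f=(1-\rho)K$ it reduces to $Kn_2$, both integers.

```python
from fractions import Fraction


def phase_turns(K, rho, f, n1, n2):
    """Left-hand side of Eq. (3-20): the change of K*chi' + f*alpha' in units
    of 2*pi when chi_1 and chi_2 advance by 2*pi*n1 and 2*pi*n2."""
    return K * (rho * n2 + (1 - rho) * n1) + f * (n2 - n1)


def check_boundary_condition(rho, kmax=3, nmax=2):
    """Return True if both choices of f give an integer for all K, n1, n2 tried."""
    for K in range(-kmax, kmax + 1):
        for n1 in range(-nmax, nmax + 1):
            for n2 in range(-nmax, nmax + 1):
                if phase_turns(K, rho, -rho * K, n1, n2) != K * n1:
                    return False
                if phase_turns(K, rho, (1 - rho) * K, n1, n2) != K * n2:
                    return False
    return True


if __name__ == "__main__":
    print(check_boundary_condition(Fraction(1, 17)))  # hypothetical rho
```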

1. Rotational equation.—The first three terms of the
Hamiltonian Eq. (2-20) are the same as those of a rigid symmetric
rotor, with Hamiltonian and energies

$$H_R = A(P_a'^2+P_b'^2) + CP_c'^2,$$
$$E_{JKM} = AJ(J+1) + (C-A)K^2, \tag{3-22}$$

where

$$A = \hbar^2/2I_x \quad \text{and} \quad C = \hbar^2/2I_z.$$


2. Torsional function.—Although the torsional equations
for the PAM and the IAM are identical, the
torsional functions for these two cases are not the same
because of the different boundary conditions. As explained
above, the torsional function of IAM may be
written as

$$M(\alpha) = e^{-i\rho K\alpha}P(\alpha). \tag{3-23}$$

The differential equation for $P(\alpha)$ is obtained by substituting
Eq. (3-23) into Eq. (3-16), with the result

$$-F\frac{d^2P}{d\alpha^2} + 2iF\rho K\frac{dP}{d\alpha} + \tfrac12V_3(1-\cos 3\alpha)P = (E-F\rho^2K^2)P, \tag{3-24}$$

where

$$F\rho = \frac{\hbar^2}{2(I_z-I_\alpha)} \quad \text{and} \quad F\rho^2 = \frac{\hbar^2I_\alpha}{2I_z(I_z-I_\alpha)}.$$

This equation [or Eq. (3-16)] can be solved in the
most general way by the method of continued fractions,
which is discussed in Appendix 2.

(a) Properties of the torsional levels: Since the solutions
of Eq. (3-23) are different for different $K$, the
value of $K$ should be included as a labeling index for
$P(\alpha)$ and $E$. Furthermore, as $\alpha$ appears explicitly in
Eq. (3-16) only as $\cos 3\alpha$, an eigenfunction $M(\alpha)$ can
be classified into one of the three general types corresponding
to $\sigma=0,\pm1$‡‡‡ in the expression

$$P_{Kv\sigma}(\alpha) = \sum_k A_{3k+\sigma}^{(Kv)}\,e^{i(3k+\sigma)\alpha}$$

or

$$M_{Kv\sigma}(\alpha) = e^{-i\rho K\alpha}\sum_k A_{3k+\sigma}^{(Kv)}\,e^{i(3k+\sigma)\alpha}. \tag{3-25}$$

Analogous to the solution of the torsional equation in
the PAM, for each given $\sigma$ one has a family of the functions
$P_{Kv\sigma}(\alpha)$, and these functions are again labeled by
the index $v$. Unlike the $U(\alpha)$'s given in Eq. (3-9), the
eigenfunctions $P_{K,v,1}(\alpha)$ here are not degenerate with
$P_{K,v,-1}(\alpha)$ because the differential Eq. (3-24) for $P(\alpha)$
contains imaginary coefficients. On the other hand,
since $P^*(\alpha)$ is a solution of the equation obtained by
taking the complex conjugate of Eq. (3-24) (the same
as the differential equation for $P(\alpha)$ with $-K$), we have

$$P_{K,v,1}(\alpha) = P^*_{-K,v,-1}(\alpha),$$
$$P_{K,v,0}(\alpha) = P^*_{-K,v,0}(\alpha),$$
$$M_{K,v,1}(\alpha) = M^*_{-K,v,-1}(\alpha),$$
$$M_{K,v,0}(\alpha) = M^*_{-K,v,0}(\alpha), \tag{3-26}$$

and

$$E_{K,v,1} = E_{-K,v,-1},$$
$$E_{K,v,0} = E_{-K,v,0}.$$

‡‡‡ The quantity $\sigma$ is the same as Itoh's $-p$ and is related to $\tau$
used by Dennison et al. by the expression $K+\tau=\sigma+1$
(mod 3).

At high barriers the arrangement of the eigenvalues
($E_{Kv\sigma}$) in the IAM is similar to that in the PAM, i.e.,
the levels are grouped in a number of clusters corresponding
to different values of the torsional quantum
number $v$. Within each torsional state there are three
distinct levels (rather than two, in contrast with the
results of the PAM) characterized by $\sigma=0, 1, -1$, for
$K\neq0$. The three levels are shown in Fig. 4. When $K$
is equal to zero, the levels with $\sigma=1$ and $\sigma=-1$ are
degenerate. The result that the eigenvalues are different
for the PAM and the IAM comes as no surprise, for the
torsional equation is actually defined differently in
these two methods. Physically, this discrepancy arises
from the fact that the external rotation in the IAM is
not the same as that in the PAM. However, the composite
energy levels for the combination of the external
and internal rotation calculated by both methods are
identical, as one would expect. This is illustrated in
Sec. I. B3.

(b) Eigenvalues of the torsional equation: The tabulated
solutions of the Mathieu equation are not directly
applicable to the torsional function in the IAM since
these functions do not have the periodicity of $2\pi$. The
general method for obtaining the solution by the use of
continued fractions is described in Appendix 2. The
procedure for determining the eigenfunctions is in
general complicated, but at a very high barrier these
functions can be approximated by the harmonic oscillator
functions. In addition, the eigenvalues can be
simply obtained by the following approximate method:
when $\rho K-\sigma$ is replaced by $\rho K-\sigma+3$, the form of Eq.
(3-25) is not altered, and therefore the eigenvalues remain
unchanged. One may then regard the eigenvalues
as periodic functions of $(2\pi/3)(\rho K-\sigma)$ and expand
them in a Fourier series as

$$E_{Kv\sigma} = F\sum_n a_n^{(v)}\cos\left[\frac{2\pi n}{3}(\rho K-\sigma)\right]. \tag{3-27}$$

If $\rho K-\sigma$ is equal to zero or a whole multiple of three,
then Eq. (3-25) reduces to the solution of the Mathieu
equation (with a periodicity of $\pi$) when the substitutions
given in Eq. (3-4) are made, i.e.,

$$E(\rho K-\sigma=0) \to b(\pi).$$

Here $b$ and $E$ differ by the scale factor given by Eq.
(3-4). Similarly, one can show that

$$E(\rho K-\sigma=\tfrac32) \to b(2\pi),$$
$$E(\rho K-\sigma=1) \to b(3\pi),$$

and more generally

$$E(\rho K-\sigma=x) \to b(3\pi/x).$$

Up to this time only the low integral values of $3/x$ have
been tabulated (see Appendix 3).

If the series in Eq. (3-27) converges so rapidly that
$E$ may be approximated by the first two terms in the
Fourier expansion, i.e.,

$$E = F\left[a_0 + a_1\cos\frac{2\pi}{3}(\rho K-\sigma)\right], \tag{3-28}$$

then the coefficients $a_0$ and $a_1$ may be determined from
the tabulated values of $b(\pi)$ and $b(2\pi)$. When this is
done, Eq. (3-28) can be used to calculate the approximate
values of $b(N\pi)$. To this approximation the equation
for $b(N\pi)$ is

$$b(N\pi) = \frac12\left(1+\cos\frac{2\pi}{N}\right)b(\pi) + \frac12\left(1-\cos\frac{2\pi}{N}\right)b(2\pi), \tag{3-29}$$

such that any Mathieu eigenvalue can be approximated
from $b(\pi)$ and $b(2\pi)$ provided that the Fourier
coefficient $a_2$ is small. If the coefficient $a_2$ is not small,
then $b(3\pi)$, $b(4\pi)$, etc., can be used to obtain the coefficients
$a_2$ and $a_3$ and a better value of $b(N\pi)$ (see
Table III). Therefore, the Mathieu eigenvalues can
be readily approximated for any periodicity.
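
As an illustration of Eq. (3-29), the two-term interpolation can be written as a short function. This is a sketch under the stated two-term assumption of Eq. (3-28); the function name and the sample eigenvalues are hypothetical and are not taken from any table.

```python
import math


def b_interp(N, b_pi, b_2pi):
    """Two-term Fourier estimate of the Mathieu eigenvalue b(N*pi), Eq. (3-29),
    from the tabulated values b(pi) and b(2*pi)."""
    c = math.cos(2.0 * math.pi / N)
    return 0.5 * (1.0 + c) * b_pi + 0.5 * (1.0 - c) * b_2pi


if __name__ == "__main__":
    b_pi, b_2pi = 6.40, 6.43   # hypothetical eigenvalues for one torsional state
    for N in (1, 2, 3, 4, 6):
        print(N, b_interp(N, b_pi, b_2pi))
```

For $N=1$ and $N=2$ the formula returns the two input values, and for $N=3$ (the periodicity of the $E$ levels in $x$) it gives $\tfrac14 b(\pi)+\tfrac34 b(2\pi)$.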

(c) Harmonic oscillator approximation: As in the case
of the PAM, the torsional function can be approximated
by harmonic oscillator wave functions at very
high barriers. Since the classical amplitude of the internal
torsion approaches zero in the limit of very high
barriers, the factor $e^{-i\rho K\alpha}$ in the torsional function
$M(\alpha)$ may be replaced by unity and $M_{Kv\sigma}(\alpha)\to
P_{Kv\sigma}(\alpha)$. Under such circumstances $M(\alpha)$ can be approximated
by Eq. (3-11). The energy levels of Eq.
(3-16) are then given by Eq. (3-10).

3. Energy levels and spectra.—The total energy of a
symmetric top exhibiting internal rotation is equal to
the sum of the energy associated with the over-all rotation
$E_{JKM}$ and the torsional energy $E_{Kv\sigma}$, or

$$E(JKMv\sigma) = E_{JKM} + E_{Kv\sigma}. \tag{3-30}$$

Since the $K$ degeneracy remains in the rigid symmetric
rotor energy, from Eq. (3-26) it may be seen that

$$E(J,K,M,v,\sigma) = E(J,-K,M,v,-\sigma).$$

Therefore, all the energy levels are doubly degenerate
except the $K=0$, $\sigma=0$ levels, a result which was also
found by the PAM. Now substituting Eqs. (3-22) and
(3-27) into Eq. (3-30), we obtain

$$E(JKMv\sigma) = AJ(J+1) + (C-A)K^2 + F\sum_n a_n^{(v)}\cos\left[\frac{2\pi n}{3}(\rho K-\sigma)\right]. \tag{3-31a}$$

The last term in this equation represents the effect of
the internal torsion on the rotational energy levels. If
$\rho K$ is reasonably small and the Fourier series converges
rapidly, the cosine terms in the last member of Eq.
(3-31) can be expanded in the following manner:

$$\sum_n a_n^{(v)}\cos\left[\frac{2\pi n}{3}(\rho K-\sigma)\right]
= \sum_n a_n^{(v)}\cos\frac{2\pi n\sigma}{3}\left[1-\frac12\left(\frac{2\pi n}{3}\right)^2(\rho K)^2+\cdots\right]
+ \sum_n a_n^{(v)}\sin\frac{2\pi n\sigma}{3}\left[\frac{2\pi n}{3}\rho K-\cdots\right]. \tag{3-31b}$$

For $\sigma=0$ the $K^2$ term from Eq. (3-31b) can be incorporated
into the term $(C-A)K^2$ in Eq. (3-31a), and
therefore it has the effect of modifying the rotational
constant. The rotational levels for this case have the
same pattern as those of a rigid symmetric rotor (the pseudo-rigid
rotor levels). In the case of the degenerate torsional
levels ($\sigma=\pm1$), the expansion in Eq. (3-31b) introduces
a term linear in $K$ into the energy equation. This
causes a splitting of the $K$ degeneracy of the rigid
symmetric rotors. The conclusions obtained here are
therefore in complete agreement with those derived
from the PAM. When $\rho K$ becomes larger, the $(\rho K)^3$
and $(\rho K)^4$ terms should be included in the expansion of
$\cos(2\pi/3)(\rho K-\sigma)$. In terms of the PAM, the higher-order
Van Vleck perturbation treatment must be
applied to the Hamiltonian in order to obtain accurate
results.

By following the selection rules $\Delta J=\pm1$, $\Delta K=0$, and
$\Delta\sigma=0$ one finds again that the rotational frequencies
depend only on the rotational constant $A$ and the
quantum number $J$, and are not affected by the internal
torsion. Therefore, only one line is observed, as shown
in Fig. 4 for this model.
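
The coincidence of the three transitions can be seen directly from Eq. (3-31a), since the torsional term depends on $K$ and $\sigma$ but not on $J$. The following sketch evaluates Eq. (3-31a) for a truncated Fourier series; every numerical value (`A`, `C`, `F`, `rho`, and the coefficients `a`) is a hypothetical placeholder, and the function names are illustrative only.

```python
import math


def energy(J, K, sigma, A, C, F, rho, a):
    """Eq. (3-31a) with Fourier coefficients a = [a_0, a_1, ...]."""
    x = rho * K - sigma
    torsion = F * sum(a_n * math.cos(2.0 * math.pi * n / 3.0 * x)
                      for n, a_n in enumerate(a))
    return A * J * (J + 1) + (C - A) * K * K + torsion


def line_frequency(J, K, sigma, **constants):
    """Transition J -> J+1 with Delta K = 0 and Delta sigma = 0."""
    return energy(J + 1, K, sigma, **constants) - energy(J, K, sigma, **constants)


if __name__ == "__main__":
    consts = dict(A=9.0, C=5.0, F=150.0, rho=0.06, a=[6.4, -0.01, 0.0005])
    for sigma in (0, 1, -1):
        print(sigma, line_frequency(2, 1, sigma, **consts))  # each equals 2*A*(J+1)
```

The same function also reproduces the double degeneracy $E(J,K,\sigma)=E(J,-K,-\sigma)$ noted above.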


C. Comparison of the Two Methods 


The main difference between the two methods is that
the IAM gives a relatively simple Hamiltonian but
more complicated torsional functions, whereas the PAM
gives a more complex Hamiltonian but simpler torsional
functions. The chief difficulty in the IAM is associated
with the determination of the torsional eigenfunctions
and eigenvalues. On the other hand, the coupling term
$pP_z$ in the PAM is not a small perturbation, so a fourth-order
perturbation calculation is sometimes necessary
to obtain accurate results for the energy levels.

A comparison of Eqs. (3-13b) and (3-31a) shows that
the effects of the internal torsion on the rotational
energy levels are expressed by a power series in the PAM
and a Fourier series in the IAM. Generally, calculation
of the first- and second-order terms in the power series
is comparatively simple, but the
higher-order terms are more difficult to evaluate.
Herschbach has suggested that the tabulated eigenvalues
of the Mathieu equation and the first two perturbation
terms in PAM be used to determine the
Fourier coefficients in the IAM, which are otherwise
tedious to calculate by the method of continued fractions.
Since the Fourier series in IAM converges much
faster than the power series in Eq. (3-13b), the former
may be used to obtain the rotational energy levels and
also the higher-order perturbation terms of the PAM.
This technique is of particular interest, for the problem
is solved approximately by the PAM, where
the boundary conditions are simple, and the result is then
converted to the more exact solution of the IAM.


II. Asymmetric Molecules 
A. Principal Axis Method (PAM) 


The Hamiltonian given in Eq. (2-37) can be grouped 
in the following manner: 


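$$H_0 = \tfrac{1}{2}(A_x + B_y)(P_x^2 + P_y^2) + C_z P_z^2 + F p^2 + V(\alpha), \qquad \text{(3-32a)}$$

$$H_1 = -2\sum_i Q_i P_i\, p, \qquad \text{(3-32b)}$$

and

$$H_2 = \tfrac{1}{2}(A_x - B_y)(P_x^2 - P_y^2) + \tfrac{1}{2}\sum_{i \neq j} D_{ij}(P_iP_j + P_jP_i). \qquad \text{(3-32c)}$$

The energy matrix is now constructed with the basis
functions

$$S_{JKM}(\theta,\phi)\, e^{iK\chi}\, U_{v\sigma}(\alpha).$$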


In this representation H₀ is completely diagonal; H₂
is diagonal in v, but not in K; and H₁ has diagonal and
off-diagonal matrix elements in K and v. As in the case
of symmetric molecules, the Van Vleck transformation
is applied to remove the nondiagonal elements in v so
that the new energy matrix assumes the block structure
shown in Fig. 3. The effective rotational Hamiltonian,
correct to second order, for the vth torsional state is‡‡‡


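$$H_{v\sigma} = A_{v\sigma}P_x^2 + B_{v\sigma}P_y^2 + C_{v\sigma}P_z^2 + E_{v\sigma} + \tfrac{1}{2}\sum_{i \neq j} D_{ij}\, w_{v\sigma}^{(2)}(P_iP_j + P_jP_i) + \sum_i Q_i P_i\, w_{v\sigma}^{(1)}, \qquad \text{(3-33)}$$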


where 


h? Nes 
A =| 140,00, 


DII Alz 
ho = =n 
16 vo,v'o Z 
Wy = 1-+— yll 
9 boro — bo'o 


‡‡‡ Here we have assumed that the torsional levels of different
v are much more widely spaced than the rotational levels.
When this is not the case, the (K|K′) elements, introduced by H₁
to the diagonal v blocks through the Van Vleck transformation,
have a very complicated form.25b

and similarly for B_{vσ} and C_{vσ} with the appropriate permutations
of x, y, and z.

Alternatively the Hamiltonian given in Eq. (2-38)
could be used for the perturbation treatment. The Van
Vleck transformation now gives the following:

$$H_{v\sigma} = H_r + F\sum_n w_{v\sigma}^{(n)}\, \mathcal{P}^{\,n}, \qquad \text{(3-34)}$$

where

$$\mathcal{P} = \boldsymbol{\rho}\cdot\mathbf{P},$$

and the $w_{v\sigma}^{(n)}$'s are the same perturbation terms given
in Eq. (3-13b). Naturally, Eqs. (3-33) and (3-34) are
identical. Equation (3-33) shows the perturbation calculation
more clearly and shows that the rotational constants
are modified by the internal rotation. Equation
(3-34) shows the similarity to the symmetric molecule
formulas and leads more naturally, as we will see later,
to higher-order perturbation corrections.

1. Energy levels and spectra. From Eq. (3-33) the
rotational constants for the over-all rotation of the
molecule are modified by the internal rotation. Also the
orientation of the "effective" principal axes is changed
by the cross terms D_{ij}(P_iP_j + P_jP_i). Both of these
changes depend on the torsional state v and σ. As in
the case of the symmetric molecules, there exist two
sets of rotational levels for each torsional state. The
Hamiltonian for the pseudo-rigid rotor levels (σ = 0)
to second order is given by Eq. (3-35),


$$H_{v0} = A_{v0}P_x^2 + B_{v0}P_y^2 + C_{v0}P_z^2 + \tfrac{1}{2}\sum_{i \neq j} D_{ij}\, w_{v0}^{(2)}(P_iP_j + P_jP_i), \qquad \text{(3-35)}$$


where the rotational constants A_{v0}, etc., differ from the
respective constants of a rigid molecule by terms proportional
to $w_{v0}^{(2)}$. Also the cross terms between the
P's contain $w_{v0}^{(2)}$. As mentioned before, the quantity
$w_{v0}^{(2)}$ decreases rapidly with increasing value of the
reduced barrier height s [defined in Eq. (3-4)] and
approaches zero for a very high barrier.

For the E levels the Hamiltonian is also given by
Eq. (3-33). Because of the terms linear in P_x, P_y, and
P_z, the spacings of the E levels are different even qualitatively
from those of a pseudo-rigid rotor. However,
in some cases, especially for high barriers and low K
energy levels, when the asymmetry splitting of the K
doublet is not too small, the effect of the terms containing
$w_{v\sigma}^{(1)}$ is negligible.§§§ The effective rotational
constants A_{v,±1}, etc., are, however, appreciably different
from those for the A states as well as those for the rigid
molecule. Under such circumstances both sets of the A
and E levels are similar in structure but have different
effective rotational constants.


§§§ It should be noted that the effect of the linear terms is
negligible only when it connects no nearly degenerate levels. Also,
if the quantity ρK becomes large, higher-order perturbation terms
are needed [see Eq. (3-34) and Sec. 3].
The selection rules are identical to those for a rigid
asymmetric top with the additional restriction of Δσ = 0,
since the dipole moment is independent of the angle α.
The most intense spectrum arises from the molecules in 
the lowest torsional state, i.e., v=0. For each rotational 
transition one finds two lines; one originates from the 
rotational levels associated with the nondegenerate 
torsional levels and the other one from the degenerate 
E levels. The essential feature of the spectrum is that 
doublets occur instead of a single line as in the case of a 
rigid molecule. This is illustrated in Fig. 4. The lines 
belonging to the A species follow a pseudo-rigid spec- 
trum to the second order and the E lines may or may not 
be fitted to a pseudo-rigid Hamiltonian depending on 
the height of the barrier and the asymmetry of the 
molecule. The two members of the doublet are of more 
or less the same intensity; the ratio of the intensity is 
governed by the relative nuclear statistical weight 
assigned to the A and E levels (see Sec. 6). The separa- 
tion of the two lines of a doublet depends very critically 
on the height of the barrier. It is from these splittings 
that we determine the barrier height. In the next sec- 
tion we discuss the calculation of the doublet splittings. 
The methods of evaluation of the barrier height from 
the measured values of the splittings are presented in 
See, &: 

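2. Calculation of energy and doublet splittings. The
matrix elements of H_{vσ} in Eq. (3-33) are

$$\begin{aligned}
(K|H_{v\sigma}|K) &= \tfrac{1}{2}(A_{v\sigma} + B_{v\sigma})J(J+1) + [C_{v\sigma} - \tfrac{1}{2}(A_{v\sigma} + B_{v\sigma})]K^2 + Q_z w_{v\sigma}^{(1)} K, \\
(K|H_{v\sigma}|K\pm1) &= \tfrac{1}{2}[(D_{yz} \pm iD_{xz})\, w_{v\sigma}^{(2)}(2K\pm1) + (Q_y \pm iQ_x)\, w_{v\sigma}^{(1)}]\,[J(J+1) - K(K\pm1)]^{1/2}, \\
(K|H_{v\sigma}|K\pm2) &= -\tfrac{1}{4}[A_{v\sigma} - B_{v\sigma} \pm 2iD_{xy}\, w_{v\sigma}^{(2)}]\,[J(J+1) - K(K\pm1)]^{1/2}\,[J(J+1) - (K\pm1)(K\pm2)]^{1/2}.
\end{aligned} \qquad \text{(3-36)}$$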


The indices J, M have been deleted since H_{vσ} is
diagonal in these quantum numbers, and the constant
term E_{vσ} has been dropped. Here, in addition to the
(K|K±2) contribution from the usual asymmetry of
H₂, there are off-diagonal elements from the cross terms
in $D_{ij}w_{v\sigma}^{(2)}$. Also H₁ has contributed both a diagonal
term in $Q_z w_{v\sigma}^{(1)}$ and off-diagonal terms in $Q_x w_{v\sigma}^{(1)}$
and $Q_y w_{v\sigma}^{(1)}$.
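To make the structure of this energy matrix concrete, the sketch below builds the (2J+1)×(2J+1) block for a given J under simplifying assumptions: the cross terms D_ij and the Q_x, Q_y contributions are set to zero, so only the diagonal element ½(A+B)J(J+1) + [C − ½(A+B)]K² + qK (with q standing for Q_z w⁽¹⁾) and the (K|K±2) element −¼(A−B)[J(J+1) − K(K±1)]^{1/2}[J(J+1) − (K±1)(K±2)]^{1/2} are kept. The function names and the numerical constants are illustrative only.

```python
import math

def rotor_matrix(J, A, B, C, q=0.0):
    """Energy block for one J in the symmetric-top basis K = -J..J (real symmetric)."""
    ks = list(range(-J, J + 1))
    n = len(ks)
    jj = J * (J + 1)
    H = [[0.0] * n for _ in range(n)]
    for i, K in enumerate(ks):
        # diagonal element, including the term linear in K (zero for A levels)
        H[i][i] = 0.5 * (A + B) * jj + (C - 0.5 * (A + B)) * K * K + q * K
        if i + 2 < n:
            # asymmetry element connecting K and K + 2
            elem = -0.25 * (A - B) * math.sqrt((jj - K * (K + 1)) * (jj - (K + 1) * (K + 2)))
            H[i][i + 2] = H[i + 2][i] = elem
    return H

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in M]
    n = len(m)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-300:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return d

def secular(H, energy):
    """Value of det(H - energy * 1); it vanishes when energy is an eigenvalue."""
    n = len(H)
    return det([[H[i][j] - (energy if i == j else 0.0) for j in range(n)] for i in range(n)])
```

With q = 0 and J = 1 the roots are the familiar rigid-rotor values A+B, A+C, and B+C. With q ≠ 0 the K = ±1 pair moves to ½(A+B) + C ± [q² + ¼(A−B)²]^{1/2}, which shows how the linear term and the asymmetry compete for that pair.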

The Hamiltonian H_{vσ} in Eq. (3-33) may be simplified
by performing a rotation of coordinate axes to eliminate
the cross terms between the components of the total
angular momentum P. Referred to this new coordinate
axes system, which is the "effective principal axes"
system for a torsional level, the effective Hamiltonian
becomes

$$H_{v\sigma} = A_{v\sigma}'P_x'^2 + B_{v\sigma}'P_y'^2 + C_{v\sigma}'P_z'^2 + \sum_i Q_i'\, w_{v\sigma}^{(1)} P_i'. \qquad \text{(3-37)}$$

The orientation between the two sets of axes can be
determined either by diagonalizing the 3×3 matrix in
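TABLE I. Low J energy levels.ᵃ

$$\begin{aligned}
E_{00} &= 0 \\
E_{10} &= A_{v\sigma}' + B_{v\sigma}' - (g^+ + g^-) \\
E_{11} &= \tfrac{1}{2}(A_{v\sigma}' + B_{v\sigma}') + C_{v\sigma}' + \tfrac{1}{2}(g^+ + g^-) \pm [(Q_z' w_{v\sigma}^{(1)})^2 + \tfrac{1}{4}(A_{v\sigma}' - B_{v\sigma}')^2 - (A_{v\sigma}' - B_{v\sigma}')(g_y - g_x)]^{1/2} \\
E_{20} &= W_r(A_{v\sigma}', B_{v\sigma}', C_{v\sigma}';\ J = 2,\ K = 0) - 3(g^+ + g^-) + O[(A - B)(2C - A - B)^{-1} g] \\
E_{21} &= \tfrac{5}{2}(A_{v\sigma}' + B_{v\sigma}') + C_{v\sigma}' + \tfrac{3}{2}(g^+ + g^-) - (k^+ + k^-) \pm [(Q_z' w_{v\sigma}^{(1)})^2 + \tfrac{9}{4}(A_{v\sigma}' - B_{v\sigma}')^2 - 9(A_{v\sigma}' - B_{v\sigma}')(g_y - g_x)]^{1/2}
\end{aligned}$$

where

$$\begin{aligned}
g^{\pm} &= (Q_x'^2 + Q_y'^2)[w_{v\sigma}^{(1)}]^2\,[2C_{v\sigma}' - A_{v\sigma}' - B_{v\sigma}' \pm 2Q_z' w_{v\sigma}^{(1)}]^{-1} \\
g_i &= Q_i'^2 [w_{v\sigma}^{(1)}]^2\,[2C_{v\sigma}' - A_{v\sigma}' - B_{v\sigma}']^{-1} \quad (i = x, y) \\
g_x + g_y &\approx \tfrac{1}{2}(g^+ + g^-) \\
k^{\pm} &= 2(Q_x'^2 + Q_y'^2)[w_{v\sigma}^{(1)}]^2\,[6C_{v\sigma}' - 3A_{v\sigma}' - 3B_{v\sigma}' \pm 2Q_z' w_{v\sigma}^{(1)}]^{-1}
\end{aligned}$$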

ᵃ The A_{vσ}′, B_{vσ}′, and C_{vσ}′ are not necessarily identical with the A, B,
and C of King et al.,41 and the quantum number K is no longer a good quantum
number, but it is still used for labeling. In the expression for the E₂₀
energy, W_r is the rigid rotor energy with the rotational constants A_{vσ}′, B_{vσ}′,
and C_{vσ}′.


the usual manner or by applying successive 2×2 rotations.
The relations between the new coefficients A_{vσ}′,
etc., in Eq. (3-37) and the old ones in Eq. (3-33) can
be set down by noting that A, B, C, D_{xy}, etc., transform
like the six components of a symmetric second-rank
tensor and Q_x, Q_y, Q_z transform like the components
of a vector. Very often the change in the
effective rotational constants introduced by such a rotation
is negligible. For instance, in the case of acetaldehyde
(CH₃CHO) the cross term P_iP_j is removed by
rotating the body axes through an angle of approximately
one degree.

For the nondegenerate A levels the last term of Eq.
(3-37) vanishes. The rotational energies are just those
for a pseudo-rigid rotor with slightly corrected rotational
constants and can be calculated by well-known
methods or obtained from published tables.41,97,98 In
the case of the E levels the energy matrices for both
Eq. (3-33) and Eq. (3-37) contain a term linear in K
on the diagonal and also (K|K±1) elements. The two
kinds of factoring available for rigid asymmetric tops
are spoiled, and for a given J, a (2J+1)×(2J+1) determinantal
equation must be solved. Except for low J,
the straightforward diagonalization of this matrix is
very difficult. The procedure of diagonalization may be
somewhat simplified by applying a Van Vleck transformation
to reduce the (K|K±1) elements. Except for
the energy levels with small asymmetry splitting, after
a second-order Van Vleck transformation the new
(K|K±1) elements are usually such that their contributions
to the diagonal terms are negligible for low
J levels. The approximate formulas for some of the
energy levels are given in Table I. Also approximate
formulas for the splitting of the A and E levels (not the ...)
have been derived; these are given in Table II.

When the barrier is very high the value of $w_{v\sigma}^{(1)}$
becomes so small that the E levels for low K also conform
to the pattern of a rigid rotor with effective rotational
constants different from those belonging to the
A levels. The energy splitting between the A and E
levels, ΔW, is simply equal to the difference between
the "rigid rotor energy" of the two types of levels,
W_A and W_E, as


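$$\Delta W = W_A - W_E = (\partial W/\partial A)\,\Delta A + (\partial W/\partial B)\,\Delta B + (\partial W/\partial C)\,\Delta C. \qquad \text{(3-38)}$$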


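If the rigid rotor energy is written in the usual form41

$$W = \tfrac{1}{2}(A + C)J(J+1) + \tfrac{1}{2}(A - C)E(\kappa),\ \|\|\| \qquad \text{(3-39)}$$

then it follows that

$$\begin{aligned}
\partial W/\partial A &= \tfrac{1}{2}J(J+1) + \tfrac{1}{2}E(\kappa) - \tfrac{1}{2}(\kappa + 1)[\partial E(\kappa)/\partial\kappa], \\
\partial W/\partial B &= \partial E(\kappa)/\partial\kappa, \\
\partial W/\partial C &= \tfrac{1}{2}J(J+1) - \tfrac{1}{2}E(\kappa) + \tfrac{1}{2}(\kappa - 1)[\partial E(\kappa)/\partial\kappa].
\end{aligned} \qquad \text{(3-40)}$$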


The values of [∂E(κ)/∂κ] can be estimated from the
tabulated eigenvalues of rigid rotors.41,98 By substituting
the results of Eq. (3-40) into Eq. (3-38) the energy
separation between the A and E levels is obtained.
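As a worked check of Eqs. (3-38) to (3-40), the sketch below evaluates the partial derivatives for the three J = 1 levels, for which the reduced energies are known in closed form: E(κ) = κ − 1, 0, and κ + 1. It assumes the usual asymmetry parameter κ = (2B − A − C)/(A − C) with A > B > C; the level labels, function names, and numerical values are illustrative only.

```python
def kappa(A, B, C):
    """Asymmetry parameter, assuming A > B > C."""
    return (2.0 * B - A - C) / (A - C)

# J = 1 reduced energies E(kappa) and their derivatives dE/dkappa
J1_LEVELS = {
    "1_01": (lambda k: k - 1.0, 1.0),
    "1_11": (lambda k: 0.0, 0.0),
    "1_10": (lambda k: k + 1.0, 1.0),
}

def rigid_energy(A, B, C, level):
    """W = (1/2)(A+C)J(J+1) + (1/2)(A-C)E(kappa) for J = 1."""
    E, _ = J1_LEVELS[level]
    return 0.5 * (A + C) * 2.0 + 0.5 * (A - C) * E(kappa(A, B, C))

def partials(A, B, C, level):
    """(dW/dA, dW/dB, dW/dC) in the form of Eq. (3-40) for J = 1."""
    E, dE = J1_LEVELS[level]
    k = kappa(A, B, C)
    dWdA = 0.5 * 2.0 + 0.5 * E(k) - 0.5 * (k + 1.0) * dE
    dWdB = dE
    dWdC = 0.5 * 2.0 - 0.5 * E(k) + 0.5 * (k - 1.0) * dE
    return dWdA, dWdB, dWdC

def splitting(A, B, C, dA, dB, dC, level):
    """A-E separation from Eq. (3-38) for given shifts in the constants."""
    a, b, c = partials(A, B, C, level)
    return a * dA + b * dB + c * dC
```

For J = 1 the energies are linear in A, B, and C (B+C, A+C, and A+B), so the first-order expression is exact and the partial derivatives reduce to (0, 1, 1), (1, 0, 1), and (1, 1, 0). For higher J the derivative ∂E/∂κ would have to be taken from tabulated values.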

3. Higher-order effects. (a) Third- and fourth-order
perturbation terms: Up to this point the discussion
of the PAM represents the second-order theory in the
Van Vleck perturbation treatment. When the barrier is
not sufficiently high and/or the quantity ρK is large
(as in the case of CH₃COOH), the third- and fourth-order
perturbation terms should be taken into consideration.
The form of the effective Hamiltonian
given in Eq. (3-34) is very convenient for the calculation
of the higher-order terms. Also, the perturbation
coefficients are the same as those given in Eq. (3-13b)
for a symmetric molecule. The higher-order perturbation
corrections can be calculated directly from the
matrix elements of p or by the labor-saving method
given in Sec. I. C. Herschbach25b has tabulated a number
of these terms. The reader is directed to Sec. 5 for
more detail on this method. With the higher-order terms,

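TABLE II. Low J energy splitting (A − E).

$$\begin{aligned}
\Delta E_{00} &= 0 \\
\Delta E_{10} &= \Delta A + \Delta B + g^+ + g^- \\
\Delta E_{11} &= \Delta C + \tfrac{1}{2}(\Delta A + \Delta B) - \tfrac{1}{2}(g^+ + g^-) \pm \left[\tfrac{1}{2}(\Delta A - \Delta B) + (g_y - g_x) - \frac{(Q_z w_{v\sigma}^{(1)})^2}{A - B}\right] \\
\Delta E_{21} &= \Delta C + \tfrac{5}{2}(\Delta A + \Delta B) - \tfrac{3}{2}(g^+ + g^-) + (k^+ + k^-) \pm \left[\tfrac{3}{2}(\Delta A - \Delta B) + 3(g_y - g_x) - \frac{(Q_z w_{v\sigma}^{(1)})^2}{3(A - B)}\right]
\end{aligned}$$

where

$$|Q_z w_{v\sigma}^{(1)}| \ll |A - B|.$$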


‖‖‖ In this equation the rotational constants A, B, C are such
that A > B > C, and they are not necessarily identical to A_{vσ},
B_{vσ}, and C_{vσ}.


matrix elements of the form (K|K±3) and (K|K±4)
are introduced, and these can be treated in a manner
similar to Kivelson and Wilson's treatment of the
centrifugal distortion problem.

(b) Sixfold potential energy term, V₆: So far we
have assumed that the potential function is sinusoidal
in α without rigorous justification. If the expansion of
the potential function in Eq. (2-1b) converges rapidly,
the V₆ term can be treated as a perturbation. When
this is done, the V₆ contribution merely changes the
values of all the perturbation coefficients $w_{v\sigma}^{(n)}$ in such
a way that its contribution cannot be separated from
that of V₃ without data from more than one torsional
state. In order to evaluate the V₆ coefficient, it is
necessary then to study the internal rotational effects
in more than one torsional state.¶¶¶ This V₆ term
should generally be smaller than the threefold potential
by a factor ranging from ten to a thousand.


B. Internal Axis Method (IAM) 


Since the treatment by the IAM for the most general
type of molecule (no symmetry aside from that of the
CH₃ group) is algebraically very complicated, the discussion
here is confined to those molecules which have
a planar framework. Papers by Pitzer and Gwinn,
Hecht and Dennison,20 and Burkhard discuss the
general type of asymmetric molecule by the IAM.

1. Hamiltonian and matrix elements. In Sec. 2 three
different forms of the Hamiltonian were derived. They
were given in Eqs. (2-26), (2-29), and (2-31). We are
now at liberty to use any one of these Hamiltonians to
solve for the energy levels. Let us first consider the
Hamiltonian (2-29). This Hamiltonian can be
divided as follows:


Ho=4(Aat Bo") (Pa? HPY) P! +E p?+ V (a), 
Hy=3(Ao— By")(Po!2— Py!) Doel! (Pu Pd + Pel Py). 


P_a″, P_b″, and P_c″ are the components of the total
angular momentum along the axes fixed in the framework
of the molecule (rather than the internal axes
which rotate with respect to both the framework and
the top), i.e., the orientation of this axes system is
described by the Eulerian angles θ, φ, and χ (but not
χ′). Consequently, we choose the eigenfunctions of H₀
as the basis functions for constructing the energy
matrix. These functions are

$$\psi = S_{JKM}(\theta,\phi)\, e^{iK\chi}\, Q(\alpha), \qquad \text{(3-41)}$$

where Q(α) is the solution of the torsional equation

$$[F p'^2 + V(\alpha)]\,Q(\alpha) = E\,Q(\alpha). \qquad \text{(3-42)}$$


¶¶¶ Kilb et al. calculated a V₆ term, but their approximation
is now realized to be inadequate because they ignored the third-order
correction in the evaluation of the V₆ term from the splittings
in the ground torsional states.
Since P_c″ is diagonal in H₀, using Eq. (2-28) we can
write

$$p' = p - \rho P_c'' = \frac{1}{i}\left(\frac{\partial}{\partial\alpha}\right)_{\theta,\phi,\chi} - \rho K, \qquad \text{(3-43)}$$

which can then be substituted into Eq. (3-42) with the
result

$$\left[F\left(-i\frac{d}{d\alpha} - \rho K\right)^{2} + V(\alpha)\right]Q(\alpha) = E\,Q(\alpha), \qquad \text{(3-44a)}$$

$$\left[-F\frac{d^{2}}{d\alpha^{2}} + V(\alpha)\right]M(\alpha) = E\,M(\alpha), \qquad \text{(3-44b)}$$

where M(α) is $e^{-i\rho K\alpha}Q(\alpha)$. Equations (3-44) are just
the same as Eqs. (3-16) and (3-24), only with the scale
change in ρ and F [see Eq. (2-30)]. Analogous to the
function P(α) in Eq. (3-25), the eigenfunction Q(α) can
be expanded in a Fourier series as


$$Q(\alpha) = \sum_k A_k^{Kv\sigma}\, e^{i(3k+\sigma)\alpha}. \qquad \text{(3-45)}$$

Furthermore, it can be easily seen that

$$Q_{Kv\sigma}(\alpha) = Q^{*}_{-K,v,-\sigma}(\alpha), \qquad E_{Kv\sigma} = E_{-K,v,-\sigma}. \qquad \text{(3-46)}$$

Before deriving the matrix elements of H, let us consider
the Hamiltonian given in Eq. (2-31). This
Hamiltonian can be divided as follows:


Ho=3(Aat Be’) (Pa? + Po?) HO Pe? + Fp?-+ V (a) 
Hy=4(A.— By") 
XL (Pe — Ps”) cos2pa— (Pa Po’ + Po’ Pa’) sin2pa | 
+ Doel’ (po pet P Pi) cospa 
+ (pa P +P Pa’) sinpa]. (3-47) 
Here P_a′, P_b′, and P_c′ are components of the total
angular momentum along the internal rotational axes.
The orientation of this axes system is described by the
Eulerian angles θ, φ, χ′, where

$$\chi' = \chi + \rho\alpha,$$

[see Eq. (2-28)]. The basis functions again are chosen
to diagonalize H₀. They are

$$\psi = S_{JKM}(\theta,\phi)\, e^{iK\chi'}\, M(\alpha'), \qquad \text{(3-48)}$$

where M(α′) is the solution of the torsional equation
(3-44b) which must satisfy the boundary conditions
[see Eq. (3-20), Sec. I. B]. Furthermore, we can see
that the basis functions given in Eqs. (3-41) and (3-48)
are indeed identical, as

$$e^{iK\chi}Q(\alpha) = e^{iK\chi'}e^{-i\rho K\alpha}Q(\alpha) = e^{iK\chi'}M(\alpha').$$

The energy matrix can then be constructed from the 
Hamiltonian Eq. (2-29) with Eq. (3-41) as the basis 


functions or from Eqs. (2-31) and (3-48). The results 
of these two schemes are identical. 


The matrix elements of the Hamiltonian are as
follows:20,31


(Kro|H| Kro)=3(4 at Be )LJ (J+1)— K?] 


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(Krol H| K+1, o)= De” (K+3) 
[U4 —K(K+1) Pl ryv, 


(Kvo|H|K+2, vo) 
=—1(4,—B,”")[J (J+1)—K(K+1) }? 
XE OIH) (K+1) (K+ 2) M krv", 


where 


(3-49b) 


(3-49c) 


2 


211 
Tzw | Okre* (a)Qk'rrola)da.*** (3-50) 
0 


If the Hamiltonian form given in Eq. (2-26) were to
be used, then the scale factors ρ and F would be
those corresponding to the symmetric molecule, i.e.,
ρ = I_α/I_cc and F = (ħ²/2)[I_cc/I_α(I_cc − I_α)]. Unfortunately,
there are more matrix elements (those linear in
p) which need to be calculated, and generally this
formulation has no advantage over the other two. As a
result we do not discuss it further.

2. Integrals involving the torsional functions. In
order to obtain the energy levels it is necessary first to
compute the integrals given in Eq. (3-50). Evaluation
of such integrals is generally quite complicated because
of the difficulty involved in the determination of the
torsional functions by the procedure of continued fractions.
Fortunately, for the case of high barriers (s ≥ 20),
one may use the harmonic oscillator approximation for
the torsional functions, and the integrals can be obtained
in a much simpler way.

At the limit of infinite barrier, $I_{KK'}^{vv'\sigma}$ approaches
zero for v ≠ v′ and unity for v = v′. It is convenient then
to introduce the terms


ÔK4, vo = i= Í Okro (@)OK+1,»c(a)da, 
0 
; (3-51) 
6K+42,n08 "= 1- f Okro (a@)Ox+2,v0(a)da. 
0 


By using the harmonic oscillator approximation for the
torsional wave function, Hecht and Dennison20 obtained
the following formulas for the ground torsional state
v = 0:
ôK+1,0 = f (0/3), 
6x+42,00%°"= f(2p/3), 


f(x) “+ 


(3-52) 


where i 
EA. 

2s 

Expressions for $I_{KK'}^{vv'\sigma}$ (v ≠ v′) have also been derived
by Hecht and Dennison from the harmonic oscillator
approximation [see Eq. (28) of reference 20]. An important
property of the integrals in Eq. (3-50) is their
relative independence of the quantum numbers K and
σ. Furthermore, unlike the case of the PAM, the matrix
elements off-diagonal in v are usually very small and
their contributions to the energy are rather unimportant.

**** Hecht and Dennison20 chose the torsional function in the
form $M_{Kv\sigma}(\alpha) = e^{-i\rho K\alpha}Q_{Kv\sigma}(\alpha)$, which
leads to the same integrals $I_{KK'}^{vv'\sigma}$ in different forms. The
form given above is the same as ...
3. Diagonalization of the energy matrix. The removal
of those elements off-diagonal in v can
be accomplished by a Van Vleck transformation. After
this transformation the energy matrix takes on a block
structure in which each block corresponds to a certain
value of J with the dimensions (2J+1)×(2J+1). Both
(K|K±1) and (K|K±2) elements are present in these
submatrices because the coordinate axes are not principal
axes. One then applies a transformation S (= S₁S₂)
which diagonalizes the limiting rigid rotor Hamiltonian.
The off-diagonal elements of the transformed Hamiltonian
now consist of only the terms containing the
integrals $I_{KK'}^{vv'\sigma}$ and E_{Kvσ}. Examples of the transformed
Hamiltonian have been given by Hecht and
Dennison for J = 1.†††† The energy levels can then be
obtained by the usual perturbation method (or a
modified procedure may be recommended in case near
degeneracy occurs). One may, of course, apply a transformation
S′ to H to bring it directly into the diagonal
form. The advantage of using S instead of S′ is that the
former transformation matrix can be set down in a
systematic fashion.

Let S be written as S₁S₂, where S₁ transforms the coordinate
axes to the principal axes system and S₂ is
the matrix which diagonalizes the asymmetric rigid-rotor
Hamiltonian (referred to principal axes) in the
symmetric top representation. The elements of S₂ can
be determined from the asymmetric rotor energy levels
by well-known techniques. The elements of S₁ can be
written directly from the representation of the rotational
group as given by Wigner as

$$\psi_{JK_p} = \sum_K D^{J}(\beta)_{K_pK}\,\psi_{JK} = \sum_K (S_1)_{K_pK}\,\psi_{JK}, \qquad \text{(3-53)}$$

where K_p and K refer to the quantum number along
the z principal axis and the internal axis, respectively,
and

$$D^{J}(\beta)_{K_pK} = \sum_n (-1)^n \frac{[(J+K)!\,(J-K)!\,(J+K_p)!\,(J-K_p)!]^{1/2}}{(J-K_p-n)!\,(J+K-n)!\,n!\,(n+K_p-K)!}\left(\cos\frac{\beta}{2}\right)^{2J+K-K_p-2n}\left(\sin\frac{\beta}{2}\right)^{2n+K_p-K}. \qquad \text{(3-54)}$$

The summation is from the larger of 0 or (K − K_p) to
the smaller of (J + K) or (J − K_p). The angle β is the
angle between the vector ρ and the z principal axis, i.e.,
the angle between the c′ and the z axis:

$$\sin\beta = \rho_y/\rho.$$

†††† See p. 38 of reference 20. In these matrices the elements
nondiagonal in v have been neglected because they do not contribute
to the splittings of the A and E levels but only cause a
shift in frequencies.


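For small angles of β, Eq. (3-54) can be expanded (to
the order of β²) with the results:

$$\begin{aligned}
(S_1)_{K,K} &= 1 - \frac{\beta^2}{4}[J(J+1) - K^2], \\
(S_1)_{K,K+1} &= -(S_1)_{K+1,K} = -\frac{\beta}{2}[(J-K)(J+K+1)]^{1/2}, \\
(S_1)_{K,K+2} &= (S_1)_{K+2,K} = \frac{\beta^2}{8}[(J-K)(J-K-1)(J+K+1)(J+K+2)]^{1/2},
\end{aligned} \qquad \text{(3-55)}$$

(β = ρ_y/ρ, α = ρ_z/ρ).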
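The small-angle elements can be compared with the full rotation matrix numerically. The sketch below is a minimal illustration, assuming integer J and the factorial sum with phase (−1)ⁿ, summed from max(0, K − K_p) to min(J + K, J − K_p); other phase conventions differ only by signs of the odd off-diagonal elements. The function names and test angles are illustrative.

```python
import math

def rotation_element(J, Kp, K, beta):
    """Element D^J(beta)_{Kp,K} from the factorial sum (integer J assumed)."""
    f = math.factorial
    pref = math.sqrt(f(J + K) * f(J - K) * f(J + Kp) * f(J - Kp))
    c, s = math.cos(beta / 2.0), math.sin(beta / 2.0)
    total = 0.0
    for n in range(max(0, K - Kp), min(J + K, J - Kp) + 1):
        denom = f(J - Kp - n) * f(J + K - n) * f(n) * f(n + Kp - K)
        total += (-1) ** n * c ** (2 * J + K - Kp - 2 * n) * s ** (2 * n + Kp - K) / denom
    return pref * total

def small_angle_diagonal(J, K, beta):
    """1 - (beta^2/4)[J(J+1) - K^2]."""
    return 1.0 - 0.25 * beta ** 2 * (J * (J + 1) - K * K)

def small_angle_first_offdiagonal(J, K, beta):
    """Magnitude (beta/2)[(J-K)(J+K+1)]^(1/2) of the K, K+1 elements."""
    return 0.5 * beta * math.sqrt((J - K) * (J + K + 1))

def small_angle_second_offdiagonal(J, K, beta):
    """(beta^2/8)[(J-K)(J-K-1)(J+K+1)(J+K+2)]^(1/2) for the K, K+2 elements."""
    return 0.125 * beta ** 2 * math.sqrt((J - K) * (J - K - 1) * (J + K + 1) * (J + K + 2))
```

Because the full matrix is orthogonal for any β, checking that its rows are orthonormal is a quick test that the sum has been coded correctly before the small-angle forms are used.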


4. Energy levels and spectra. Since the energy matrix
given in Eq. (3-49) is diagonal in σ, one may expect to
have three series of energy levels corresponding to σ = 0,
1, −1. From Eq. (3-46) the energy submatrices for
σ = 1 and σ = −1 are identical; therefore the energy
levels for σ = ±1 are doubly degenerate. The eigenfunctions
for σ = ±1 correspond to the symmetry
species E of the C₃ group, and the levels with σ = 0
have the symmetry properties of the A species. Since v
is nearly a good quantum number (the elements off-diagonal
in v are small), a torsional level is labeled by v
and σ, and associated with each torsional level one has a
set of rotational energy levels characterized by the
quantum numbers J, K, M.

The selection rules Δσ = 0 (derived in Sec. 6) predict
two lines for each rotational transition associated with
the ground torsional state v = 0 for relatively high
barriers, one from the nondegenerate torsional levels
σ = 0, and one from σ = ±1. Figure 4 shows two pairs
of such transitions arising from the two K doublets of
a slightly asymmetric top. The splitting of the doublets
depends mainly on the tunneling, and the separation
between the two pairs of doublets depends on the asymmetry.
When the splitting and the separation become
comparable in magnitude, a mixing of states occurs
such that "forbidden lines" can now occur. This is
discussed in Sec. 5.

The doublet structure in the microwave spectra is
characteristic of the threefold high barrier case, and
when the splitting is not too large, the doublets can
easily be recognized because the Stark effect is virtually
identical for the two lines. Since the δ's and the integrals
I are to a good approximation independent of K
and σ, the difference between the two corresponding
eigenvalues of the degenerate and nondegenerate submatrices
for a given J arises only from the terms containing
E_{Kvσ}. Therefore, the splittings of the doublets
in the microwave spectrum depend to the first order
only on E_{Kvσ}, but not on the δ's and I's, which cause
only a shifting in the positions of the lines. The effect
of the δ's and I's can be included, but the relations are
rather complicated. After the transformation S, the resulting
Hamiltonian contains some elements which
carry the δ's as coefficients and have the same structure
as the ordinary rigid asymmetric rotor matrix elements.
As a result the δ's add a correction to the rotational
constants which is the same for the degenerate and nondegenerate
levels.

Since the energy levels (except K = 0, σ = 0) are all
degenerate for symmetric tops with high barriers, it
can be seen that the presence of the asymmetry splits
some of these degenerate levels so that the levels with
σ = 0 are now nondegenerate. The splitting of the levels
by asymmetry can be seen in the following way: to
split a pair of degenerate levels of the unperturbed
system (the symmetric molecule), say K = 2, σ = 0, and
K = −2, σ = 0, it is necessary that the perturbation
(the asymmetry) offer matrix elements connecting
these two levels directly or through high-order terms.
In the present case these two levels do indeed interact
with each other through the asymmetry terms as

K = 2, σ = 0 ↔ K = 0, σ = 0 ↔ K = −2, σ = 0.

On the other hand, since the asymmetry terms are
diagonal in σ, the pairs of levels with quantum numbers
K, σ = 1, and −K, σ = −1, which are degenerate in the
symmetric top case, can never be connected with each other by
any high-order interaction. Consequently all the levels
with σ = ±1 remain doubly degenerate. This is shown
in Fig. 4, where the thickness of the levels was drawn
to illustrate the degeneracy. Each J, K level has a
weight of six from the two K levels and the three internal
rotor levels.

At the limit of infinitely high barriers, the δ's and
I's approach zero and the energy matrix in Eq. (3-49)
reduces to that of a rigid rotor.

5. Splitting of the hindered rotation doublets. The
splittings of doublets associated with the ground torsional
level are utilized to determine the barrier height.
Hecht and Dennison20 have given expressions for the
differences between the A and E components of a rotational
state in the ground torsional level up to J = 3.
With these formulas the barrier can be evaluated from
the splittings of the low J rotational lines. For higher J
levels the splitting of the A and E levels can be calculated
as follows: A transformation S₁S₂′ is applied to
the Hamiltonian, where S₁ corresponds to a rotation of
the axes as defined previously and S₂′ is the matrix
which changes the basis vectors from the symmetric
top wave functions to the Wang symmetric and antisymmetric
combinations. Note that S₂′ is different from
S₂. Let the transformed Hamiltonian be written as
H_R + H_T + H_TT, where H_R is the rotational part, H_T is
the internal rotation part dependent on E_{Kvσ}, and H_TT
is the internal rotation part dependent on the integrals
which, as pointed out before, have little effect on the
splittings of the internal rotation doublets. For slightly
asymmetric tops the splittings are caused primarily by
the difference in the diagonal elements of H_T between
the A and E levels, provided the asymmetry splitting is
not too small.

Approximate formulas for the diagonal elements in
H_T were derived by Lide and Mann for the case of
small values of β. These formulas for the splitting of
the energy levels follow directly from Eq. (3-55) and
are given in Eq. (3-56):


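$$\begin{aligned}
\Delta E_{JK} &= \Delta_K + (\beta^2/4)\{(J+K)(J-K+1)\Delta_{K-1} - 2[J(J+1) - K^2]\Delta_K + (J-K)(J+K+1)\Delta_{K+1}\}, \\
\Delta E_{J1} &= \Delta_1 + (\beta^2/4)[(J-1)(J+2)(\Delta_2 - \Delta_1)], \quad (J + K_{-1} + K_1\ \text{even}), \\
\Delta E_{J1} &= \Delta_1 + (\beta^2/4)[2J(J+1)\Delta_0 - (3J^2 + 3J - 2)\Delta_1 + (J-1)(J+2)\Delta_2], \quad (J + K_{-1} + K_1\ \text{odd}), \\
\Delta E_{J0} &= \Delta_0 + (\beta^2/2)J(J+1)(\Delta_1 - \Delta_0),
\end{aligned} \qquad \text{(3-56)}$$

where

$$\Delta_K = -\tfrac{1}{2}(E_{Kv1} + E_{Kv-1}) + E_{Kv0} \simeq \tfrac{3}{2}F a_1 \cos\!\left(\frac{2\pi}{3}\rho K\right) = \Delta_0 \cos\!\left(\frac{2\pi}{3}\rho K\right), \qquad \text{(3-57a)}$$

$$\Delta_0 = \tfrac{3}{2}F a_1 = \tfrac{9}{4}F\,\Delta b.\ \ddagger\ddagger\ddagger\ddagger \qquad \text{(3-57b)}$$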
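The cosine dependence of Δ_K on ρK can be checked with a two-term Fourier model. The sketch below assumes E_{Kvσ} = a₀ + a₁ cos[(2π/3)(ρK − σ)], with a₀ and a₁ in energy units (the factor F absorbed) and all higher Fourier coefficients neglected; the names and numbers are illustrative only.

```python
import math

def torsional_energy(K, sigma, rho, a0, a1):
    """Two-term Fourier model of E_{K v sigma}; a0 and a1 are in energy units."""
    return a0 + a1 * math.cos(2.0 * math.pi / 3.0 * (rho * K - sigma))

def delta_K(K, rho, a0, a1):
    """A-level energy minus the mean of the sigma = +1 and sigma = -1 energies."""
    e = {s: torsional_energy(K, s, rho, a0, a1) for s in (-1, 0, 1)}
    return e[0] - 0.5 * (e[1] + e[-1])

def delta_K_closed_form(K, rho, a1):
    """(3/2) a1 cos(2*pi*rho*K/3), i.e. Delta_0 cos(2*pi*rho*K/3)."""
    return 1.5 * a1 * math.cos(2.0 * math.pi * rho * K / 3.0)
```

In this model the constant a₀ cancels, and the K = 0 value Δ₀ = (3/2)a₁ is just the A−E separation of the torsional sublevels.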


In general the off-diagonal elements of H_T should also
be considered for the E levels, especially when the asymmetry
splitting is small. They connect the states corresponding
to the symmetric and antisymmetric combinations
of the symmetric top functions and have the
form of


Q 
GAEO r= H (Ere E_xee) 


EN n(2x) (3-58) 
aE G 
where 
$$\alpha = \rho_z/\rho = (1 - \beta^2)^{1/2}.$$

Both the quantities Δ_K and δ_K can be approximated by
the trigonometric form only when the barrier is relatively
high, such that E_{Kvσ} can be approximated accurately
by the Fourier coefficients a₀ and a₁ with the
neglect of the higher terms. The term δ_K is equivalent
to the linear terms in P_z in the PAM, and the term Δ_K
is equivalent to the shift in the rotational constants,
i.e., C_{v0} − C_{v1}.

The asymmetry term is of the form

$$\delta_{\text{asy}} = -\tfrac{1}{4}(A - B)[(J-K)(J-K-1)(J+K+1)(J+K+2)]^{1/2}. \qquad \text{(3-59)}$$

‡‡‡‡ Here, Δ₀ is the negative of Δ₀ in reference 20 because we
take the A level minus the E level. Here Δb is the difference
in the values of b [see Eq. (3-4)] for the A and E sublevels.
For the K = 1 levels only, the asymmetry and the off-diagonal
elements of H_T compete directly with one
another for the shifting of the levels. If the two effects
are of the same order of magnitude, then the correction
can be determined from

$$\pm\tfrac{1}{2}(4\delta_K^{2} + \delta_{\text{asy}}'^{\,2})^{1/2}. \qquad \text{(3-60)}$$

Here δ_asy′ is the asymmetry splitting and not the asymmetry
term (note that for K = 1, they are the same).
If δ_K ≫ δ_asy′, the suitable zeroth-order basis functions
are no longer the Wang functions but rather the symmetric
rotor functions. After the transformation S₁,
the energy matrix can then be diagonalized exactly or
by perturbation methods. If, on the other hand,
δ_asy′ ≫ δ_K, then the splitting due to the off-diagonal
elements of H_T must be calculated by the proper asymmetric
rotor wave functions, i.e., the effects of β² and
higher-order terms should be considered. Lide and
Mann have derived expressions to the order β² for
the frequencies of the parallel transition, J+1, K ← J, K.
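The competition between the two effects for K = 1 is a two-level problem, and its limits are easy to see numerically. The sketch below assumes a real symmetric 2×2 matrix in the Wang basis with diagonal elements ±δ_asy′/2 and off-diagonal element δ_K; this matrix form is an illustrative assumption chosen to reproduce the square-root expression, and the names are hypothetical.

```python
import math

def k1_correction(delta_k, delta_asy_split):
    """Positive root of +/- (1/2) sqrt(4 delta_K^2 + delta_asy'^2)."""
    return 0.5 * math.sqrt(4.0 * delta_k ** 2 + delta_asy_split ** 2)

def two_level_eigenvalues(h11, h22, h12):
    """Eigenvalues of the real symmetric matrix [[h11, h12], [h12, h22]]."""
    mean = 0.5 * (h11 + h22)
    half_gap = math.sqrt((0.5 * (h11 - h22)) ** 2 + h12 ** 2)
    return mean - half_gap, mean + half_gap

if __name__ == "__main__":
    # assumed illustrative magnitudes, in arbitrary energy units
    for dk, dasy in ((1.0, 0.01), (0.5, 1.0), (0.01, 1.0)):
        print(dk, dasy, two_level_eigenvalues(0.5 * dasy, -0.5 * dasy, dk))
```

When δ_K dominates, the roots approach ±δ_K and the symmetric rotor functions are the natural basis; when δ_asy′ dominates, they approach ±δ_asy′/2 with a small correction of order δ_K²/δ_asy′.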

6. General case. The most general type of molecule,
where the internal rotor is arbitrarily located with respect
to the principal axes, has been considered by
Pitzer and Gwinn, Hecht and Dennison,20 and Burkhard.
The method follows in an analogous manner,
only with added complications. As an aid in the com- 
putation it is usually easier to determine the molecular 
parameters such as 9 and 7 by the PAM and then con- 
vert them to the IAM. 

7. Intermediate barrier case—I{ the barrier hin- 
dering internal rotation becomes too low for the high 
barrier approximation to be valid, the matrix elements 
off-diagonal in v would have appreciable effect on the 
energy levels. The whole matrix can then be diagonal- 
ized either exactly or approximately by using modified 
perturbation procedure. 

Methyl alcohol, methyl amine, and methyl mercaptan
are examples of molecules with intermediate barriers,
i.e., low s values. Considerable work has been done on
these molecules and this is discussed in Sec. 8. The
low s values in these molecules are caused in part by
the small reduced moment of the methyl group. With
the light asymmetric frames, these molecules have only
a slight asymmetry. All the off-diagonal matrix elements
due to asymmetry and internal rotation are small and
their effect can be calculated by perturbation theory.
For the main Q branch series ($\Delta J=0$, $\Delta K=\pm1$)
reported, the transition positions are determined essentially
by the limiting symmetric top rotational transition
plus the internal rotational transition. The off-diagonal
elements contribute small terms which can be
expanded in a power series of $J(J+1)$. The coefficients
of these power series are the perturbation correction
terms including centrifugal distortion. Usually some
parallel transitions are needed in addition to aid in the
structural determination.

Should the asymmetry be large, the solution of the
low reduced barrier problem is quite difficult. With the
matrix elements given in Eq. (3-49) a diagonalization
of the low J energy matrices could be made. The $(v|v')$
connections probably could be treated to sufficient
accuracy by perturbation calculations. Evaluation of
the torsional functions needed for calculating the matrix
elements is discussed in Appendix 2.


4. LOW BARRIER APPROXIMATIONS 


The barriers of most molecules determined from the 
microwave spectra are in the category of a “high 
barrier,” i.e., of the order of 0.7-5.0 kcal. Here the 
“high barrier approximation” (Sec. 3) can be used, in 
which both a rotational and a torsional equation are 
solved, and any interaction removed by a transforma- 
tion. In the low barrier approximation the free internal 
rotational problem is solved exactly and then the 
barrier is added as a perturbation. So far two molecules 
have been treated by this method; they are nitromethane
(CH₃NO₂)¹⁶,¹⁷ and methyl difluoroboron
(CH₃BF₂).⁷⁴,⁷⁵

The possession of a sixfold symmetry in these two
molecules is believed to be the cause for the low value
of the barrier. Consider the case of CH₃NO₂. The
interaction of a single oxygen with the methyl group gives
rise to a barrier to internal rotation in the form

$$\tfrac12 V_3(1-\cos 3\alpha) + \tfrac12 V_6(1-\cos 6\alpha) + \tfrac12 V_9(1-\cos 9\alpha) + \cdots.$$

The second oxygen contributes another potential barrier
which is 180° out of phase with the one from the
first oxygen. The potential due to both oxygen atoms
is then

$$V_6(1-\cos 6\alpha) + V_{12}(1-\cos 12\alpha) + \cdots.$$


The present discussion is therefore confined primarily 
to molecules with sixfold symmetry. The treatment 
given can, in principle, be extended to the more general 
case though the actual analysis might become quite 
complicated. 
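
The cancellation of the threefold terms described above can be checked numerically. The following is a minimal sketch, not part of the original treatment: the coefficient values are hypothetical, the series is truncated at the ninefold term, and the function names are introduced here only for illustration. It shows that two single-oxygen potentials displaced by 180° add up to a pure sixfold term plus a constant, which is the additive constant dropped in the text.

```python
import numpy as np

def single_oxygen_potential(alpha, v3, v6, v9):
    """Barrier from one oxygen: (1/2) * sum of V_n (1 - cos n*alpha), n = 3, 6, 9."""
    return 0.5 * (v3 * (1 - np.cos(3 * alpha))
                  + v6 * (1 - np.cos(6 * alpha))
                  + v9 * (1 - np.cos(9 * alpha)))

def two_oxygen_potential(alpha, v3, v6, v9):
    """Sum of two single-oxygen barriers, the second shifted by 180 degrees."""
    return (single_oxygen_potential(alpha, v3, v6, v9)
            + single_oxygen_potential(alpha + np.pi, v3, v6, v9))

# Hypothetical coefficients in arbitrary energy units (assumed for illustration).
V3, V6, V9 = 1.0, 0.05, 0.01
alpha = np.linspace(0.0, 2.0 * np.pi, 721)

total = two_oxygen_potential(alpha, V3, V6, V9)
sixfold = V6 * (1 - np.cos(6 * alpha))   # surviving angular dependence
constant_offset = V3 + V9                # odd multiples of 3 leave only a constant
residual = np.max(np.abs(total - (sixfold + constant_offset)))
```

With the series carried further, the twelvefold term would survive in the same way as the sixfold one; the truncation at n = 9 is only a convenience of this sketch.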


I. Free Rotation 
A. Symmetric Molecules 


For symmetric molecules the difference between the 
solution by the principal axes method and the internal 
axis method is not as distinct as in the high barrier 
problems because the torsional equation with its dif- 
ferent boundary conditions is now eliminated. 

In the PAM, the Hamiltonian is

$$H = A_x(P_x^2+P_y^2) + C_zP_z^2 + Fp^2 - 2C_z\,pP_z, \tag{4-1}$$

where

$$A_x = \hbar^2/2I_x = \hbar^2/2I_y = A, \tag{4-2a}$$
$$C_z = \hbar^2/[2(I_z-I_\alpha)], \tag{4-2b}$$
$$F = \hbar^2/2rI_\alpha = \hbar^2 I_z/[2I_\alpha(I_z-I_\alpha)]. \tag{4-2c}$$

This equation can be rearranged to Eq. (4-3),

$$H = A(P_x^2+P_y^2) + CP_z^2 + F[p-(I_\alpha/I_z)P_z]^2. \tag{4-3}$$

But this is just Eq. (2-9) in the IAM. The eigenfunctions
of these two equations are

$$S_{JKM}(\theta,\varphi)\,e^{iK\chi}\,e^{im\alpha}, \tag{4-4}$$

where m is an integer. The eigenvalues corresponding
to Eq. (4-1) and Eq. (4-3) are then, respectively,

$$E = AJ(J+1) + (C_z-A)K^2 + Fm^2 - 2C_zKm, \tag{4-5}$$
$$E = AJ(J+1) + (C-A)K^2 + F[m-(I_\alpha/I_z)K]^2. \tag{4-6}$$


However, if the IAM Hamiltonian Eq. (2-20) is
used in which $\chi'$ and $\alpha'$ are the variables, then the
eigenfunctions are

$$S_{JKM}(\theta,\varphi)\,e^{iK\chi'}\,e^{im\alpha'}. \tag{4-7}$$

The choice of m is governed by the boundary conditions,
i.e., the wave function must be invariant under
the transformation

$$\chi_1 \to \chi_1 + 2\pi n_1, \qquad \chi_2 \to \chi_2 + 2\pi n_2. \tag{4-8}$$

With Eq. (2-15) and following the same method as
used in Eq. (3-20) we obtain the result

$$m = l - (I_\alpha/I_z)K, \tag{4-9}$$

where l is an integer. Consequently the eigenvalues
given in Eq. (4-10) are obtained:

$$E = AJ(J+1) + (C-A)K^2 + F[l-(I_\alpha/I_z)K]^2. \tag{4-10}$$

Clearly, if we identify l with m, Eq. (4-10) is identical
to Eq. (4-6).

The selection rules for the microwave spectra can
be stated as $\Delta J=\pm1$, $\Delta K=0$, $\Delta m=0$ (see Sec. 6) (or
$\Delta l=0$). The frequencies of the rotational transitions are


$$\nu_{J\to J+1} = 2A(J+1), \tag{4-11}$$


which are exactly the same as those for a rigid sym- 
metrical rotor. Therefore, the effect of completely free 
rotation in symmetric tops cannot be detected from the 
microwave spectra. 
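
The equivalence of Eqs. (4-5) and (4-6), and the result of Eq. (4-11), can be illustrated with a short numerical check. This is a sketch under stated assumptions: units are chosen so that $\hbar^2/2 = 1$, the moments of inertia are hypothetical, and the function names are introduced here for illustration only.

```python
def rotational_parameters(i_x, i_z, i_alpha):
    """Return (A, C, C_z, F) from Eqs. (4-2), in units where hbar^2/2 = 1."""
    a = 1.0 / i_x
    c = 1.0 / i_z
    c_z = 1.0 / (i_z - i_alpha)
    f = i_z / (i_alpha * (i_z - i_alpha))
    return a, c, c_z, f

def energy_pam(j, k, m, i_x, i_z, i_alpha):
    """Free-rotor eigenvalue in the PAM form, Eq. (4-5)."""
    a, _, c_z, f = rotational_parameters(i_x, i_z, i_alpha)
    return a * j * (j + 1) + (c_z - a) * k * k + f * m * m - 2.0 * c_z * k * m

def energy_iam(j, k, m, i_x, i_z, i_alpha):
    """Free-rotor eigenvalue in the IAM form, Eq. (4-6)."""
    a, c, _, f = rotational_parameters(i_x, i_z, i_alpha)
    return a * j * (j + 1) + (c - a) * k * k + f * (m - (i_alpha / i_z) * k) ** 2

def transition_frequency(j, k, m, i_x, i_z, i_alpha):
    """Frequency of J+1 <- J with K and m unchanged."""
    return (energy_pam(j + 1, k, m, i_x, i_z, i_alpha)
            - energy_pam(j, k, m, i_x, i_z, i_alpha))

# Hypothetical moments of inertia (arbitrary units), chosen only for illustration.
I_X, I_Z, I_ALPHA = 50.0, 8.0, 3.0
A_CONST = rotational_parameters(I_X, I_Z, I_ALPHA)[0]
```

For any K and m the frequency returned by `transition_frequency` equals $2A(J+1)$, which is why free internal rotation leaves no trace in the spectrum of a symmetric top.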


B. Asymmetric Molecules (CH₃NO₂ type)


If the methyl group of CH₃NO₂ were allowed to execute
free rotation (zero hindering barrier), the microwave
spectrum would be considerably different from that of a
rigid molecule. The Hamiltonian may be split into that
of a symmetric internal rotor plus the asymmetry
term, i.e.,

$$H = H_0 + H_1,$$

where

$$H_0 = \tfrac12(A+B)(P_x^2+P_y^2) + C_zP_z^2 + Fp^2 - 2C_z\,pP_z, \tag{4-12a}$$
$$H_1 = \tfrac12(A-B)(P_x^2-P_y^2). \tag{4-12b}$$
By using the basis functions Eq. (4-4) which diagonalize
$H_0$, we obtain the matrix elements

$$
\begin{aligned}
(JKm|H|JKm) &= \tfrac12(A+B)J(J+1) + [C_z-\tfrac12(A+B)]K^2 + Fm^2 - 2C_zKm,\\
(JKm|H|J,K\pm2,m) &= -\tfrac14(A-B)\{[J(J+1)-K(K\pm1)][J(J+1)-(K\pm1)(K\pm2)]\}^{1/2}.
\end{aligned}
\tag{4-13}
$$

The Hamiltonian is diagonal in J, m, and M and the
energy matrix consists of submatrices of the order $2J+1$
for a given set of J, m, and M. As different values of K
are connected only in steps of two by the off-diagonal
elements, each of the submatrices can be factored into
even and odd K blocks. When m is equal to zero, the
secular equation degenerates into one identical to that
of a rigid asymmetric rotor with the exception that $C_z$
is the rotational constant of the framework alone rather
than the entire molecule. For $m\neq0$, the Wang transformation
does not lead to further factoring of the
secular equation because of the presence of the elements
linear in K. As examples, the secular determinants with
odd values of K for J=1 and J=2 are given below. The
terms independent of K in the diagonal elements have
been omitted since they can be regarded as additive
constants for a given J and m.

$$
\begin{vmatrix}
[C_z-\tfrac12(A+B)]+2mC_z-\lambda & -\tfrac12(A-B)\\
-\tfrac12(A-B) & [C_z-\tfrac12(A+B)]-2mC_z-\lambda
\end{vmatrix}=0, \tag{4-14}
$$

$$
\begin{vmatrix}
[C_z-\tfrac12(A+B)]+2mC_z-\lambda & -\tfrac32(A-B)\\
-\tfrac32(A-B) & [C_z-\tfrac12(A+B)]-2mC_z-\lambda
\end{vmatrix}=0. \tag{4-15}
$$

For each m there exists an identical secular equation
corresponding to −m and hence the energy levels (with
$m\neq0$) are all doubly degenerate. The selection rules
(see Sec. 6) for dipole radiation are identical to those
of rigid asymmetric tops with the additional restriction
of $\Delta m=0$. If the dipole moment is along the z axis, the
parity of K cannot change.

From the secular equations (4-14) and (4-15) the
frequencies of the $J=2\leftarrow1$, odd K, parallel transitions
($\Delta K=0$) are given by

$$
\begin{aligned}
\nu_{2\leftarrow1}^{\text{odd }K} &= 2(A+B)+\lambda_2-\lambda_1\\
&= 2(A+B)\pm\{[4m^2C_z^2+(9/4)(A-B)^2]^{1/2}-[4m^2C_z^2+\tfrac14(A-B)^2]^{1/2}\}.
\end{aligned}
\tag{4-16}
$$

For each value of m, two lines are predicted which are
symmetrically spaced with respect to the midfrequency
of $2(A+B)$. As m increases, the separation between the
pair becomes smaller and finally the transitions converge
to the band head of $\nu=2(A+B)$. This effect has
been observed on CH₃NO₂ by Tannenbaum, Myers,
and Gwinn and on CH₃BF₂ by Naylor and
Wilson.⁷⁴,⁷⁵ The intensities of the higher members
would tend to decrease on account of the unfavorable
Boltzmann factors.

The other $J=2\leftarrow1$ transition, which is even in K,
can be treated in a similar manner. As Naylor and
Wilson⁷⁴,⁷⁵ have pointed out, the J=1, K=0 eigenvalue
is zero and the J=2, K=0 eigenvalue is positive,
decreasing with increasing m. This group of lines therefore
converges toward $2(A+B)$ from the high-frequency side
as m increases. For large values of m the energy matrix
Eq. (4-13) may be solved by second-order perturbation
theory which leads to

$$
\begin{aligned}
E ={}& \tfrac12(A+B)J(J+1)+[C_z-\tfrac12(A+B)]K^2+Fm^2-2C_zKm\\
&+\frac{(A-B)^2}{64\{[C_z-\tfrac12(A+B)](K-1)-C_zm\}}\,[J(J+1)-K(K-1)][J(J+1)-(K-1)(K-2)]\\
&-\frac{(A-B)^2}{64\{[C_z-\tfrac12(A+B)](K+1)-C_zm\}}\,[J(J+1)-K(K+1)][J(J+1)-(K+1)(K+2)]+\cdots.
\end{aligned}
\tag{4-17}
$$

The transitions of the type $\Delta J=\pm1$, $\Delta K=0$,§§§§ $\Delta m=0$
again give rise to a band-like structure for each given J.
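
The pair of lines predicted by Eq. (4-16) can be reproduced by diagonalizing the odd-K blocks directly. The sketch below is illustrative only: the rotational constants are hypothetical values in arbitrary frequency units, the $Fm^2$ term is left out because it cancels for $\Delta m=0$, and the pairing of the upper level with the upper level (and lower with lower) is assumed because it is the pairing that reproduces the two signs in Eq. (4-16).

```python
import numpy as np

def odd_k_block(j, m, a_plus_b, a_minus_b, c_z):
    """2x2 energy block in the basis K = -1, +1 for J = 1 or J = 2.

    Built from the matrix elements of Eq. (4-13); F*m**2 is omitted.
    """
    diag = 0.5 * a_plus_b * j * (j + 1) + (c_z - 0.5 * a_plus_b)
    # For K = -1 -> +1 both square-root factors reduce to J(J+1).
    off = -0.25 * a_minus_b * j * (j + 1)
    return np.array([[diag + 2.0 * m * c_z, off],
                     [off, diag - 2.0 * m * c_z]])

def line_pair_numeric(m, a_plus_b, a_minus_b, c_z):
    """Frequencies (lower-lower, upper-upper) of J = 2 <- 1 from the eigenvalues."""
    low1, up1 = np.linalg.eigvalsh(odd_k_block(1, m, a_plus_b, a_minus_b, c_z))
    low2, up2 = np.linalg.eigvalsh(odd_k_block(2, m, a_plus_b, a_minus_b, c_z))
    return low2 - low1, up2 - up1

def line_pair_closed_form(m, a_plus_b, a_minus_b, c_z):
    """The same two frequencies from the closed form of Eq. (4-16)."""
    root2 = np.sqrt(4.0 * m**2 * c_z**2 + 2.25 * a_minus_b**2)
    root1 = np.sqrt(4.0 * m**2 * c_z**2 + 0.25 * a_minus_b**2)
    shift = root2 - root1
    return 2.0 * a_plus_b - shift, 2.0 * a_plus_b + shift

# Hypothetical constants (arbitrary frequency units), for illustration only.
A_PLUS_B, A_MINUS_B, C_Z = 20.0, 2.0, 13.0
separations = [
    np.diff(line_pair_closed_form(m, A_PLUS_B, A_MINUS_B, C_Z))[0]
    for m in range(1, 8)
]
```

The list `separations` shrinks monotonically with m, which is the convergence toward the band head at $2(A+B)$ described in the text.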
II. A Low Sixfold Barrier 

The sixfold potential barrier introduces to the Hamiltonian
an additional term of the form

$$V(\alpha) = (V_6/2)(1-\cos6\alpha) = (V_6/4)(2-e^{6i\alpha}-e^{-6i\alpha}). \tag{4-18}$$

The first term $\tfrac12V_6$ merely represents a common additive
constant and can be safely dropped. The energy
matrix analogous to Eq. (4-13) is

$$
\begin{aligned}
(JKm|H|JKm) &= \tfrac12(A+B)J(J+1)+[C_z-\tfrac12(A+B)]K^2+Fm^2-2mC_zK,\\
(JKm|H|J,K\pm2,m) &= -\tfrac14(A-B)\{[J(J+1)-K(K\pm1)][J(J+1)-(K\pm1)(K\pm2)]\}^{1/2},\\
(JKm|H|JK,m\pm6) &= -\tfrac14V_6.
\end{aligned}
\tag{4-19}
$$

§§§§ In the present case K is nearly a good quantum number.

When $V_6$ is small, the nondiagonality in m may be removed
by a Van Vleck transformation. To the second-order
approximation the matrix elements reduce to
those given in Eq. (4-13) with the additional diagonal
term

$$
\frac{V_6^2}{192[C_zK-F(m+3)]}-\frac{V_6^2}{192[C_zK-F(m-3)]}
=\frac{FV_6^2}{32[(C_zK-Fm)^2-9F^2]}. \tag{4-20}
$$

The submatrices for m and −m are still identical and
the energy levels remain doubly degenerate. Calculations
show that the two correction terms from the Van
Vleck transformation in Eq. (4-20) have very little
effect on the frequency provided $V_6\ll F$. The spectrum
is therefore almost identical to that for free rotation.
The validity of Eq. (4-20) depends, of course, on the
relative magnitude of $V_6$ and F. When the potential
barrier becomes larger it is necessary to consider the
third- and fourth-order terms from the Van Vleck
transformation, but it has been indicated that even
when the effect of the higher terms may be quite
appreciable as far as the positions of the energy levels
are concerned, the effect on the frequency may be
negligible.

The foregoing treatment is not directly applicable
to the |m|=3 levels because a near degeneracy exists
and it becomes necessary to consider the m=−3 and
m=3 block as a single secular equation. The matrix
elements connecting 3 with 9 and −3 with −9 can,
however, still be treated by the Van Vleck procedure.
An example is furnished by the block corresponding to
J=1, K odd:

$$
\begin{array}{c|cccc}
(m,K) & (-3,-1) & (-3,1) & (3,-1) & (3,1)\\ \hline
(-3,-1) & a & b & -\tfrac14V_6 & 0\\
(-3,1) & b & a' & 0 & -\tfrac14V_6\\
(3,-1) & -\tfrac14V_6 & 0 & a' & b\\
(3,1) & 0 & -\tfrac14V_6 & b & a
\end{array}
\tag{4-21}
$$

where

$$
\begin{aligned}
a &= \tfrac12(A+B)+C_z+9F-6C_z-\frac{V_6^2}{192(6F-C_z)},\\
a' &= \tfrac12(A+B)+C_z+9F+6C_z-\frac{V_6^2}{192(6F+C_z)},\\
b &= -\tfrac12(A-B).
\end{aligned}
$$

The symmetry about the antidiagonal allows this
matrix to be factored through a transformation which
is equivalent to choosing wave functions of the type

$$\frac{1}{\sqrt2}[\psi(K,m=3)\pm\psi(-K,m=-3)]. \tag{4-22}$$

The two resulting matrices are

$$\begin{pmatrix} a' & b+\tfrac14V_6\\ b+\tfrac14V_6 & a\end{pmatrix} \tag{4-23a}$$

and

$$\begin{pmatrix} a' & b-\tfrac14V_6\\ b-\tfrac14V_6 & a\end{pmatrix}. \tag{4-23b}$$


The double degeneracy discussed in the previous paragraph
and in the free-rotor case is now split by the
barrier when |m|=3. In fact the energy levels corresponding
to m equal to a whole multiple of three are all
nondegenerate (see Fig. 2). The higher the multiple of
three, the higher the order of perturbation required to
demonstrate the splitting. The method for calculating
these splittings is very similar to that used for the
Λ doubling in diatomic molecules.

There are now four lines for the $J=2\leftarrow1$ transition
with odd K and |m|=3. These four lines appear as
two pairs, both symmetrically spaced about the band
head $\nu=2(A+B)$, and the splittings of these two pairs
of lines are extremely sensitive to the barrier height.
In CH₃BF₂ the splitting of one of these pairs, which
would be practically coincident at the limit of zero
barrier, amounts to about 4000 Mc for a barrier of 14
cal/mole. These splittings are what make it possible to
determine the barrier height with a high degree of
accuracy from the microwave spectra.
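
The factoring of the |m|=3 block can be verified numerically. The following is a sketch rather than the authors' procedure: all constants are hypothetical and in arbitrary units, the function names are introduced here, and the matrix is the J=1, odd K block written with the elements a, a′, b, and $-\tfrac14V_6$ defined in the text. It checks that the four eigenvalues of the full block coincide with those of the two 2×2 matrices obtained from the symmetric and antisymmetric combinations, and that the levels are pairwise degenerate when the barrier vanishes.

```python
import numpy as np

def block_elements(a_plus_b, a_minus_b, c_z, f, v6):
    """Return (a, a_prime, b) for the J = 1, odd K, |m| = 3 block."""
    common = 0.5 * a_plus_b + c_z + 9.0 * f
    a = common - 6.0 * c_z - v6**2 / (192.0 * (6.0 * f - c_z))
    a_prime = common + 6.0 * c_z - v6**2 / (192.0 * (6.0 * f + c_z))
    b = -0.5 * a_minus_b
    return a, a_prime, b

def full_block(a, a_prime, b, v6):
    """4x4 matrix in the basis (m, K) = (-3,-1), (-3,1), (3,-1), (3,1)."""
    q = -0.25 * v6
    return np.array([[a, b, q, 0.0],
                     [b, a_prime, 0.0, q],
                     [q, 0.0, a_prime, b],
                     [0.0, q, b, a]])

def factored_blocks(a, a_prime, b, v6):
    """The two 2x2 matrices obtained after the symmetrizing transformation."""
    plus = np.array([[a_prime, b + 0.25 * v6], [b + 0.25 * v6, a]])
    minus = np.array([[a_prime, b - 0.25 * v6], [b - 0.25 * v6, a]])
    return plus, minus

def levels(a_plus_b, a_minus_b, c_z, f, v6):
    """Sorted eigenvalues from the full block and from the factored blocks."""
    a, a_prime, b = block_elements(a_plus_b, a_minus_b, c_z, f, v6)
    full = np.linalg.eigvalsh(full_block(a, a_prime, b, v6))
    plus, minus = factored_blocks(a, a_prime, b, v6)
    split = np.sort(np.concatenate([np.linalg.eigvalsh(plus),
                                    np.linalg.eigvalsh(minus)]))
    return full, split

# Hypothetical constants in arbitrary units, chosen only for illustration.
full_levels, split_levels = levels(20.0, 2.0, 13.0, 170.0, 40.0)
free_levels, _ = levels(20.0, 2.0, 13.0, 170.0, 0.0)
```

Comparing `full_levels` with `free_levels` shows the lifting of the double degeneracy by the barrier; the size of that splitting is what carries the information on $V_6$.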

The selection rules are given approximately by 


$$\Delta J=0,\pm1;\qquad \Delta K=0;\qquad \Delta m=0.$$


Exact selection rules may be obtained by the group 
theoretical treatment (Sec. 6) or by an explicit evalua- 
tion of the transition moments. 

In principle three sets of lines should be observed.
The first set, which has the structure of the ordinary
rigid asymmetric top spectrum, corresponds to m=0.
Analysis of these lines yields the two principal moments
of inertia perpendicular to the axis connecting the
internally rotating groups (the C−N or C−B bond) || || || ||
and also the moment of inertia of the framework about
this axis. The second set consists of all the lines with
$m\neq0$ and $|m|\neq3$. A band structure is observed if a
sufficient number of high m transitions is detected.
The |m|=3 lines constitute the third set and the
splittings of these lines are used to determine the
barrier height accurately.


III. A Low Threefold Barrier 


The problem of a low threefold potential barrier can
be handled in essentially the same fashion. The paper
by Wilson, Lin, and Lide gives a treatment for a
general N-fold barrier. Here, however, there are no
degenerate levels like the |m|=3 for the sixfold barrier
which are connected by matrix elements of the barrier.
Therefore, the barrier height cannot be as accurately
determined. Recently CH₃C≡CCH₂Cl has been studied
and an estimate of the barrier height has been given.

|| || || || Nuclear spin symmetry (see Sec. 6) in CH... only those
energy levels with K−m even so that the m... also needed to
determine the moments of inertia.


5. CALCULATIONS OF BARRIER HEIGHTS 


After the rather lengthy discussions of the high 
barrier approximation and the low barrier approxima- 
tion for the IAM and PAM, the casual reader may be a 
little perplexed as to when the approximations are 
valid and which representation to use. Consequently, 
it seems desirable to give a few suggestions in regard 
to the analyses of microwave spectra, characteristics of 
the spectra, and methods of calculation. 


I. Analyses of Microwave Spectra 


As pointed out previously, the effects of rigid internal 
rotation are not observed in a symmetric top molecule. 
As a result, all the applications of internal rotational 
theory have been done on asymmetric molecules. The 
methods for determining the barrier heights for the 
symmetric top molecules from the microwave spectra 
are discussed in Sec. 7. 

For convenience the effect of internal rotation in an
asymmetric molecule can be subdivided into the following
groups according to the magnitude of the parameter
s as defined in Eqs. (3-4):

(1) The pseudo-rigid rotor case (s>30). With a
relatively high barrier and large asymmetry splittings,
i.e., the asymmetry splittings are larger than the
internal rotation splittings, both members of a transition
follow a rigid-rotor spectrum. This is especially true of
the low K energy levels with the possible exception of
the K=1 levels.

The splittings due to the internal torsion may be 
ascribed to the Ax term [Eqs. (3-57) ] in the IAM or 
the AW term [cf. Eq. (3-38)] in the PAM. Acetyl 
fluoride™ is an example of this class of molecules. 

For the ground torsional state, the frequencies of the
A transitions are generally somewhat higher than those
of the corresponding lines of the rigid molecule while
the E transitions usually have lower frequencies. The
shift of the A line from the “unperturbed position” is
approximately twice that of the E line. This follows
from the fact that according to Eqs. (3-57) and (3-28), 
2(Exn+Ex,1) is approximately equal to —3£xvo or 
from Eq. (3-33) and Table II, wo is very close to 
—woo®. Many exceptions to this rule can occur, 
especially for the K=1 energy levels, but generally it is 
a useful first approximation. 

(2) A departure of the E levels from a pseudo-rigid
rotor pattern (15<s<30). The $\delta_K$ term in the IAM
or the $w^{(1)}$ term in the PAM causes the E energy levels
to depart from a rigid-rotor pattern and for a given K
the two $\Delta K=0$ transitions tend toward one another so
that the spectrum becomes similar to that of a slightly
asymmetric top. In other words, the internal rotation
cancels out part of the asymmetry. This $\delta_K$ term is
especially important for the K=1 energy levels because
both the asymmetry and this “nonrigid-rotor term”
connect the same pair of levels. If either the asymmetry
term or the internal rotation term dominates to a large
degree, the other can be treated as a perturbation. If
the two terms are of the same order of magnitude, then
a 2×2 matrix must be solved (cf. Sec. V).

FIG. 5. High K transitions in propylene oxide (scale bar: 1.0 Mc/sec).
The two E lines are superimposed on the left of the first transition but split in the ...
In the third transition the E line is now under the A line and finally comes out the other side on the ...

(3) High K energy levels. The high K energy levels
of an asymmetric rotor are nearly degenerate even for
very large values of asymmetry, so the internal rotational
splittings often dominate the asymmetry splittings.
These high J, high K energy levels are very
similar to symmetric rotor levels except for a shift due
to the asymmetry. For the parallel type transitions
($\Delta K=0$) very little splitting is observed, but for
the perpendicular transitions ($\Delta K=\pm1$) a triplet is
observed consisting of an A symmetry line and two E
symmetry lines. This is explained in Sec. 6 II. B.
Propylene oxide is an example of this group and some
of these triplets are shown in Fig. 5.

(4) Intermediate barrier heights (s<15).—When the 
internal rotational splittings are of the order of the 
spacings between different rotational levels, then the 
various approximations used in the high barrier case 
are not generally valid. This is true especially for the 
perpendicular transitions. Consequently, a more de- 
tailed calculation is needed. The IAM is usually recom- 
mended for the intermediate barrier problem, espe- 
cially for the slightly asymmetric molecules, e.g., methyl 
alcohol. As a guide to the first analysis, the low J 
parallel transitions should be used to obtain the ap- 
proximate barrier height and molecular structure since 
these lines are relatively barrier insensitive. A more 
detailed analysis of the perpendicular transitions is then 
made to determine the barrier height more accurately. 
For molecules with large asymmetry (such as acetic
acid) the PAM is often more convenient.

(5) Low barrier heights (s<2). For barrier heights
in this region the low barrier approximation given in 
Sec. 4 is recommended. A considerable number of lines 
will be observed with all the possible m values. Naturally 
the Boltzmann factor eventually decreases the intensity 
of the high m lines. 

Generally, the m=0 lines should be analyzed first, 
followed by the low m (except |m| =3) transitions which 
are relatively barrier insensitive. Then an accurate 
barrier height can be obtained from the |m|=3 levels 
which are widely split by the sixfold potential energy 
term. 
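
The classification by the reduced barrier height s given in cases (1), (2), (4), and (5) can be summarized in a small helper. This is only a sketch: the function name is hypothetical, the treatment of the boundary values (exactly 2, 15, or 30) is an assumption, and case (3) is not represented because it depends on K rather than on s.

```python
def barrier_regime(s):
    """Suggest which approximation applies for a reduced barrier height s.

    Thresholds follow the ranges quoted in the text; how the exact boundary
    values are assigned is an assumption of this sketch.
    """
    if s <= 0:
        raise ValueError("the reduced barrier height s must be positive")
    if s > 30:
        return "pseudo-rigid rotor"
    if s > 15:
        return "E levels depart from rigid-rotor pattern"
    if s >= 2:
        return "intermediate barrier"
    return "low barrier"

# Values of s near 23, as quoted later for acetaldehyde, fall in case (2).
acetaldehyde_case = barrier_regime(23.5)
```

The returned label only points to the relevant discussion; the choice between the IAM and the PAM still depends on the asymmetry and on the size of J and K, as described above.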

Finally, in the analysis of the spectra, regardless of 
the s values, the choice of the axis of quantization is 
gauged by the axis of internal rotation in both the IAM 
and PAM. This follows directly for the IAM. In the 
PAM the principal axes closest to the top axes are 
used as the z axes even if this does not conform to the 
best choice for the solution of asymmetric rotor energy 
levels.*! 


FIG. 6. A rotational line in the ground state and excited vibrational
(or torsional) states. (a) The rotational lines of the ground
state and first two excited vibrational levels (or torsional levels
without internal rotational splitting). (b) The same pattern including
internal rotational splitting into the A and E lines, with observed
and predicted spectra. The observed spectrum is from propylene oxide.
Horizontal axis: frequency, megacycles/second.


II. General Characteristics of the Spectra


While the most intense part of the spectrum for 
molecules with a relatively high barrier arises from the 
transitions in the ground torsional state, the rotational 
transitions in the excited torsional states are usually 
observable. For example, at room temperature the in- 
tensities of the spectral lines from the first excited state 
could be reduced by a factor of two or three and those 
from the second excited states by a factor of about ten. 
These weaker lines (known as satellites) are shifted 
from the main line associated with the ground torsional 
state because of the influence of the vibration of the 
molecule. This total effect is taken into account by 
using effective rotational constants $B_v$ which are related
to the actual rotational constants $B_e$ as
$B_v=B_e-\sum_i(v_i+\tfrac12)\alpha_i$. The splittings between the A and E
satellite lines in the “heavy” molecules are, however,
rather insensitive to the effect of molecular vibrations
and, to a first approximation, are caused by the internal
rotation only. The general formulas for the A and E
splittings developed in Sec. 3 are applicable to the
rotational lines in the excited torsional states provided
that such torsional levels are far below the top of the
potential barrier. As the internal rotation splittings become
magnified in the excited torsional states, the potential
barrier can often be determined from the
splittings of the satellites. In propylene oxide the
same barrier can account for the splittings of the rotational
lines in the three lowest torsional levels (see
Fig. 6). In the lighter molecules, such as methyl alcohol
and methyl silane, the splittings of the satellite rotational
lines cannot be completely ascribed to the internal
rotation. Vibrations contribute to the splittings as
well as to the frequency shifts of the rotational lines in the
excited states.


In the identification of microwave transitions, a rigid
model of the molecule can be used in order to first
locate the approximate positions of the lines. The
Stark effect can be used in a similar manner as in a
rigid asymmetric molecule. The A symmetry levels can
be thought of as those of a rigid molecule possessing
torsional oscillations. The E symmetry levels, on the
other hand, can exhibit some first-order Stark effect
(see Sec. III), but this is usually quite small for cases
of high barrier and large asymmetry. However, in other
cases the E transition can be distinguished from the A
transition by the Stark effect. The nuclear spin weights
can also aid in the identification. For the normal methyl
group, the intensity ratio of A and E is 1:1; but in a
deuterated methyl group, the ratio is 11:16 with the E
level stronger (see Sec. 6).
With the identification of the various transitions,
approximate unperturbed rotational constants can be
determined by using either the average position of the
A and E doublet or else two-thirds the doublet separation
from the A line. The latter method is a better
approximation except for the K=1 levels. With the 
approximate rotational constants one can determine the 
structure of the molecule which along with the A-E 
splittings may be used to obtain the barrier height. 
Successive iterations lead to more accurate results. 
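
The two estimates of the unperturbed line position mentioned above can be written out explicitly. This is a minimal sketch with hypothetical function names. The sample frequencies are one A and E doublet as read from the scanned Table VI (38 512.3 and 38 506.4 Mc, with 38 508.4 Mc listed as the estimated unperturbed position); the assignment of these three numbers to the same row is an assumption, since the table is only partly legible.

```python
def unperturbed_two_thirds(nu_a, nu_e):
    """Unperturbed position taken two-thirds of the doublet separation from the A line.

    This corresponds to an A-line shift that is twice the E-line shift and
    opposite in sign.
    """
    return nu_a - (2.0 / 3.0) * (nu_a - nu_e)

def unperturbed_average(nu_a, nu_e):
    """Cruder estimate: the midpoint of the A and E lines."""
    return 0.5 * (nu_a + nu_e)

# Frequencies in Mc as read from the scanned table (row assignment assumed).
NU_A, NU_E = 38512.3, 38506.4
two_thirds_estimate = unperturbed_two_thirds(NU_A, NU_E)
average_estimate = unperturbed_average(NU_A, NU_E)
```

The two estimates differ by one-sixth of the doublet separation, so the choice matters only when the splitting is comparable to the accuracy wanted in the rotational constants.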


III. Stark Effect


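For the case of free rotation, the lines with m=0
show a second-order Stark effect (first order if accidental
degeneracy exists) which is a characteristic of a
rigid-rotor spectrum. When m is not equal to zero the
situation is different. As stated before, the Wang combinations
of the K and −K symmetric top functions
are no longer the proper zeroth-order wave functions
because of the presence of the linear K terms in the
secular equation. For example, an eigenfunction for
J=1 and odd values of K may be written as

$$\psi = a\,\psi(K=1)+b\,\psi(K=-1), \tag{5-1}$$

where a and b are not identical. The expectation value
of the direction cosine between the electric field and the
body-fixed z axis is then

$$\int\psi^*\cos\theta\,\psi\,d\tau=\frac{(a^2-b^2)KM}{J(J+1)}. \tag{5-2}$$

This leads us to the conclusion that, in general, a first-order
Stark effect is observed provided m, K, and
M are all different from zero. For example, with
$J=1\leftarrow0$ only $\Delta K=0$ and $\Delta M=0$ are allowed and the
transitions have a second-order Stark lobe while the
Stark effect of the $J=2\leftarrow1$ transitions (with odd K)
is first order for $M=\pm1$ and second order for M=0.

When the barrier is low the Stark patterns of the
lines with $|m|\neq3$ are practically identical to those for
free rotation as the Van Vleck transformation offers
only minor modifications to the wave functions. However,
for the case of |m|=3, the transformation [see
Eq. (4-22)] eliminates the diagonal elements of $\cos\theta$.
The resulting off-diagonal elements then give rise to a
second-order Stark effect with a large Stark coefficient
analogous to the case of near degeneracy in rigid asymmetric
tops. This serves as a means for identifying the
|m|=3 lines.

When the barrier becomes higher, the asymmetry has
a larger influence. The coefficients a and b in Eq. (5-1)
depend on $\delta_K$ and $\delta_{\text{asy}}$ (defined in Sec. 3) as

$$a^2-b^2=\pm2\delta_K(4\delta_K^2+\delta_{\text{asy}}^2)^{-1/2}. \tag{5-3}$$

Since $\delta_K$ is very barrier sensitive, the first-order Stark
effect rapidly vanishes as the barrier height is increased.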


IV. High Barrier Case 


In the symmetric top case, the energy associated
with the internal rotation for the IAM is expanded in
a Fourier series [Eq. (3-27)] while for the PAM it is
expanded in a power series [Eq. (3-13)] as

$$E_{Kv\sigma}=F\Big[a_0+a_1\cos\frac{2\pi}{3}(\rho K-\sigma)+a_2\cos\frac{4\pi}{3}(\rho K-\sigma)+\cdots\Big], \tag{5-4}$$

$$E_{Kv\sigma}=F\big[w_{v\sigma}^{(0)}+w_{v\sigma}^{(1)}\rho K+w_{v\sigma}^{(2)}(\rho K)^2+\cdots\big]. \tag{5-5}$$

In Fig. 7, the various Fourier coefficients are plotted
from s=10 to s=100 on a semilogarithmic scale, and
in Fig. 8, the coefficients of the power series in Eq.
(5-5) are given. From Eqs. (5-4) and (5-5) the connecting
relations between these two types of coefficients
can easily be established. They are given in Table III.
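
The connecting relations follow from expanding Eq. (5-4) in powers of $\rho K$ and comparing with Eq. (5-5). The sketch below carries this out numerically through second order. It is illustrative only: the Fourier coefficients are hypothetical, energies are in units of F, and the function names are introduced here.

```python
import math

def fourier_energy(x, sigma, a):
    """E/F from the Fourier series of Eq. (5-4), with x = rho*K."""
    return sum(a_n * math.cos(2.0 * math.pi * n * (x - sigma) / 3.0)
               for n, a_n in enumerate(a))

def power_series_coefficients(sigma, a):
    """Taylor coefficients (w0, w1, w2) of Eq. (5-4) about rho*K = 0."""
    w0 = w1 = w2 = 0.0
    for n, a_n in enumerate(a):
        k = 2.0 * math.pi * n / 3.0
        w0 += a_n * math.cos(k * sigma)
        w1 += a_n * k * math.sin(k * sigma)
        w2 -= 0.5 * a_n * k * k * math.cos(k * sigma)
    return w0, w1, w2

# Hypothetical Fourier coefficients a_0 ... a_5 (assumed for illustration).
A_COEFFS = [5.0, -2.0e-2, 1.0e-4, -1.0e-6, 0.0, 0.0]

w0_a, w1_a, w2_a = power_series_coefficients(0, A_COEFFS)   # sigma = 0
w0_e, w1_e, w2_e = power_series_coefficients(1, A_COEFFS)   # sigma = 1
```

For $\sigma=0$ the second-order coefficient reproduces the relation $-9w^{(2)}/2\pi^2=a_1+4a_2+9a_3+16a_4+25a_5$ quoted with Table III, and for $\sigma=1$ the zeroth-order coefficient reproduces $a_0-\tfrac12a_1-\tfrac12a_2+a_3-\tfrac12a_4-\tfrac12a_5$.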


TABLE III. The relations between the Fourier coefficients and
the power series coefficients [see Eqs. (5-4) and (5-5)].


b. R Factor N2/4 ao a a: a3 a as 

U=ay(K=1)-+b)(K=—1), (Gl) 35 SS - 

where a and b are not identical. The expectation values 20 elem : =i í a 

of the direction cosines between the electric field and the ae z E 5 m = 

aati a 2 
body-fixed z axis is then b(6n) 1 n i eee k 
2 72 N?/4)b (ax) aota, cos(27/x) + a2 cos (42/2) 
(2—B)KM ( 
$ aL Gofleystoon 
fe cosð¥dr= TJ+1) Á (5-2) Threefold case (N =3) eaa 

2Lb (r) —b (37) J 1 1 0 1 1 

This leads us to the conclusion that, in general, a first- (9/8) [b (mr) —b (2x) J 1 0 1 0 1 

order Stark effect is observed provided m, K, and RESEND Ys 1 —-2 0 4 =§ 

M are all different from zero. For example, with FLIR A 1 4 9 16 25 

J=1<0 only AK=0 and AM=0 are allowed and the ee ise $ ; mr ig as 

transitions have a second-order Stark lobe while the ae o4- eo) /(3n2) i 6 a D 6 

Stark effect of the J=2 — 1 transitions (with odd K) ~27NBt043 © /2r poe aes a) GS 

a rst order for M=+1 and second order for M=0. 243w, /2n4 1 16 81 256 625 
is fi 

When the barrier is low” the Stark patterns of the  — 2432, /m4 ie <DEL 625 


Jines with |»|3 are practically identical to those for 
free rotation as the Van Vleck transformation offers 
I only minor modifications to the wave functions. How- 
e.g., $(N^2/4)\,b(3\pi)=a_0-\tfrac12a_1-\tfrac12a_2+a_3-\tfrac12a_4-\tfrac12a_5$ and
$-[9w_{v0}^{(2)}/2\pi^2]=a_1+4a_2+9a_3+16a_4+25a_5+\cdots$.

FIG. 7. The Fourier coefficients as a function of the reduced barrier height s [see Eq. (5-4)].

Although these formulas were derived for the symmetric
top molecules, they are also applicable to the asymmetric
molecules with the necessary modification of
F and ρ.

Each of the coefficients (except $a_0$) is very barrier
sensitive, as shown in Figs. 7 and 8. They approximate ... coefficients
FIG. 8. The power series coefficients as a function of the reduced barrier height s for the various torsional levels [see Eq. (5-5)].
In order to distinguish the various perturbation terms the following notation is used: w =—2¢, w® =n, w®=¢, w) =r. The first 


subscript denotes the torsional state v, and the second denotes the sublevel o. ——, v=0; —.—, v=1; -~--, v=2. 


nance of a is seen in Table IV by the fact that the 
empirical constants B and C have all about the same 
value for v=0. The small deviations are caused by the 


low s values. 


For low values of J, K, and ρ, the agreement between
the IAM and PAM is very good. The splittings of the
J=1 energy levels are given as a comparison in Table V.
These results become identical if ρK is small and the
barrier is high. Under such circumstances Eq. (5-4) can
be approximated by the first two terms and $\sin(2\pi/3)\rho K$
may be replaced by $(2\pi/3)\rho K$. A similar agreement
between the IAM and PAM is found for the higher J
levels.

These connecting relations also indicate the limitations
of the two methods: the IAM is restricted to low
J because of computational difficulties, and the PAM
is restricted to low values of ρK. If J and ρK are large
and the asymmetry is small, a method of solution
similar to that used by Burkhard and Dennison on
methyl alcohol probably would be the best. If J, ρK,
and the asymmetry are all large, both the PAM and
IAM involve a considerable amount of calculation. As
the barrier is lowered, more expansion terms in Eqs.
(5-4) and (5-5) are needed to determine the energy
levels.


V. Sample Calculation on the Spectrum 
of Acetaldehyde (CH;CHO) 


Some of the points previously discussed in regard to
the determination of the barrier height from the observed
spectrum are illustrated by considering acetaldehyde
as an example. The values of the parameters
obtained in this section differ slightly from the values
reported by Kilb et al.³⁸ because our calculation is not
performed to the same refinement as that of reference 38.

Four of the transitions reported in reference 38 for
the normal isotopic species are given in Table VI. The
estimated unperturbed frequency given in Table VI
was taken as ⅔ of the doublet separation from the A
line. For the $2_{12}\leftarrow1_{11}$ transition and the $2_{11}\leftarrow1_{10}$
transition, this procedure is not valid because of the
$\delta_1$ term. Here the unperturbed position was estimated
from the A line alone. For the $2_{12}\leftarrow1_{11}$ transition
two-thirds of the $1_{01}\leftarrow0_{00}$ splitting was subtracted from the
A line; for the $2_{11}\leftarrow1_{10}$ transition three times this
amount was subtracted from the A line. This procedure
follows directly from the PAM (see Table I).

In order to evaluate the barrier height from the in- 
ternal rotation splittings, the bond lengths and the bond 
angles of the molecule must be known. These can best 


TABLE IV. Approximate formulas (threefold case).
$\log X=\log A+B\log s-C\,s^{1/2}+Ds$.


Range 


xX v of s Error A B (0 D 


=a 0 20-80 0.1% 0.84469 0.92376 0.89751 8.46 10-4 
2 =a, 1 36-100 0.1% 1.21631 1.89036 0.95594 20.5510 

=i 2 60-100 0.1% 0.66776 3.45716 1.09452 50.19X10~ 

—Ab 0 20-80 0.3% 0.68248 0.85539 0.87703 

—4w 10 20-80 0.3% 0.83388 0.83347 0.87430 

wo) 0 20-80 0.3% 1.18418 0.87348 0.87918 


F —wor®) 0 20-80 0.3% 0.89533 0.85530 0.87658 
Aw 0 20-80 0.3% 1.36463 0.86703 0.87825 
wor 0 20-80 0.3% 1.02871 0.79882 0.87022 


TABLE V. Comparison of the splitting formulas for an
asymmetric molecule with a plane of symmetry.


PAM 
AE(G o1) =AB = Fp Aw, ”. 
(O21)? 67 
AE (11) =AC——————= Fp Aw. 
A—B A—B 
(Q:wr1 (1))2 62 
AE (110) =AB+ACH—— = FPAw, + z 
A—B A—B 
IAM 
2r? 
AF (1o1) = Ao+8?(A1—Ao) eee 
ô? 2r? ô? 
AF (111) =A, (A — Ao) — = Ap ——p2Ao— T 
A—B A—B 
ô? 2r? 62 
AE (110) =Ai+ = A,——p*Ao+ - 
A-—B 9 A—B 
Connecting relations 
px” p? 
<=, e=—, 
p? p 
FAw,® =FAb=F (ba—be) = (4/9) Ao= 3 Fa, 


Qn? T 


—FAw,°?) =—Ao=— Fa, 
9 3 


2rp 2r? 
A,—Ao=Ao cos———1 |= —Ao-—p’, 
3 9 
ao 2r Qa T 
ôı = +— sin—p ~ +—p:Ao=—p:Far=Fp.tu™, 
v3 3v3 v3 
= Qty ™ = 20:Pev- 


be determined from the rotational constants of a number 
of isotopic species of the molecule.** In this sample 
calculation, however, we use an approximate assumed 
structure for acetaldehyde. The various molecular pa- 
rameters calculated from this assumed structure are 
shown in Table VI. 


A. IAM Calculations on Acetaldehyde 


From Eqs. (3-56), (3-57), (3-28), and (3-4) we can 
obtain the expressions for the splitting of the various 
transitions. For the J=11 < Ooo transition a splitting 
of +3.0 Mc was measured. The approximate formula 
for this transition is as follows: 


$$\nu_A - \nu_E = \Delta\nu = \beta^2(\Delta_1 - \Delta_0) = -0.22134\,\Delta_0\beta^2 = -(9/4)F\,\Delta b\,\beta^2\,(0.22134) = -678.5\,\Delta b \text{ Mc}.$$


By equating this to 3.0 Mc, we obtain a value of
$-0.004422$ for $\Delta b$. From the formulas in Table IV, from
the curve in Fig. 7 for $a_1$ ($\Delta b = +2a_1/3$), or from
references 3b and 104, the corresponding $s$ value can
be determined. A value of 22.943 was obtained, which
corresponds to a barrier height of 40… cm$^{-1}$ (…
cal/mole). The calculations for the $2_{02} \leftarrow 1_{01}$
transition follow in a similar manner.


TABLE VI. Parameters for acetaldehyde.

1. Spectrum
   $1_{01} \leftarrow 0_{00}$     …                 19 263.3
   $2_{12} \leftarrow 1_{11}$     …
   $2_{02} \leftarrow 1_{01}$     38 512.3 (A)      38 508.4
                                  38 506.4 (E)
   $2_{11} \leftarrow 1_{10}$     …

2. Rotational constants
   C (A)    56.9 kMc
   B (B)    10 163.2 Mc
   A (C)    9100.0 Mc
   The rotational constant notation in parentheses refers to reference 38.

3. Assumed molecular structure
   $d_{\mathrm{CH}}$  1.09 Å (methyl group)      ∠HCH  108°15′
   $d_{\mathrm{CC}}$  1.54 Å                     ∠CCH  120°
   $d_{\mathrm{CO}}$  1.21 Å                     ∠CCO  120°
   $d_{\mathrm{CH}}$  1.09 Å (aldehyde group)

4. Moments of inertia
   …  55.553 amu Å$^2$
   …
   …  3.079
   …  14.563

5. Internal rotation parameters
   $F$ = 236.9 kMc
   …  0.3910
   $r$  0.6931
   $\beta^2$  0.00583
   …  0.31941
   $\Delta_2 = 0.21262\,\Delta_0$
   $(2\pi/3)\rho K = 38.86^\circ K$

In the calculations of the barrier height from the
$2_{12} \leftarrow 1_{11}$ transition and the $2_{11} \leftarrow 1_{10}$ transition, the $\delta_1$
term must also be considered. For the $2_{12}$ and $2_{11}$
energy levels we have the following $2 \times 2$ matrix (formed
by the $K$ and $-K$ state) with the eigenvalues $\pm\lambda_2$:

$$\begin{pmatrix} 190\,400\,\Delta b & 1594.8 \\ 1594.8 & -190\,400\,\Delta b \end{pmatrix}$$

where

$$\delta_1 = (1/\sqrt{3})\,\Delta_0 \sin(2\pi/3)\rho K = 190\,400\,\Delta b \text{ Mc}$$

from Eq. (3-58);

$$\mathrm{asy} = -\tfrac{1}{4}(A - B)J(J+1) = \tfrac{1}{4}(1063.2)J(J+1),$$

from Eq. (3-59); and for the $1_{10}$ and $1_{11}$ energy levels we have the following
matrix with eigenvalues $\pm\lambda_1$:

$$\begin{pmatrix} 190\,400\,\Delta b & 531.6 \\ 531.6 & -190\,400\,\Delta b \end{pmatrix}$$

Here, terms on the diagonal which are common to both
states have been omitted. The expression for the splitting
in $2_{12} \leftarrow 1_{11}$ transitions from Eq. (3-56) is then

$$\Delta\nu = \beta^2(\Delta_2 + 2\Delta_0 - 3\Delta_1) + \delta_1 \text{ terms} = -378.01\,\Delta b + (\lambda_2 - 1594.8) - (\lambda_1 - 531.6).$$

By equating this to the observed splitting of $-222.6$
Mc we obtain a $\Delta b$ of $-0.003993$ after successive
approximations. This leads to a value of 23.56 for $s$ and
418.7 cm$^{-1}$ (1197 cal/mole) for the barrier height. In a
similar manner the expression for the $2_{11} \leftarrow 1_{10}$
transition is

$$\Delta\nu = \beta^2(\Delta_2 - \Delta_1) + \delta_1 \text{ terms} = -1759.0\,\Delta b - (\lambda_2 - 1594.8) + (\lambda_1 - 531.6).$$

By equating this to the observed splitting of +232.4
Mc we obtain a $\Delta b$ of $-0.00405$. A value of 23.48 is
obtained for $s$ and 417.2 cm$^{-1}$ (1193 cal/mole) for the
barrier height. These results are compared in Table VII
with the results obtained in reference 38 and by the
PAM.

TABLE VII. Barrier height in acetaldehyde by various methods.

   Transition                     IAM       PAM       KLW (38)
   $1_{01} \leftarrow 0_{00}$     40…       407.8     …
   $2_{02} \leftarrow 1_{01}$     409.7     410.1     b
   $2_{12} \leftarrow 1_{11}$     418.7     418.0     417.3
   $2_{11} \leftarrow 1_{10}$     417.2     418.0     419.0
   $\delta_1$                     … Mc      −763 Mc a   760 Mc c
   …                                                  14 cm$^{-1}$

   a No third-order correction included.
   b No individual value was reported for this transition in reference 38.
   c A sixfold potential energy term was included. See the footnote at the
     end of Sec. 3 II A … (b).


B. PAM Calculations on Acetaldehyde


From Table II and Eq. (3-33) we can obtain the
expressions for the splittings of the various transitions.
The equation for the $1_{01} \leftarrow 0_{00}$ transition is

$$\Delta\nu = \ldots = 138.74\,\Delta w^{(2)}.$$

By using the approximate formulas in Table IV or the
curves in Fig. 8 we obtain an $s$ value of 22.95 and a
barrier height of 407.8 cm$^{-1}$ (1166 cal/mole). The
transition $2_{02} \leftarrow 1_{01}$ follows in a similar manner.
In order to obtain the splittings for the $2_{12} \leftarrow 1_{11}$
transition and the $2_{11} \leftarrow 1_{10}$ transition it is necessary
to diagonalize two $2 \times 2$ matrices because of the linear
terms in $K$. These matrices are

$$\begin{pmatrix} -151\,040\,w_{01}^{(1)} + 7680\,w_{01}^{(3)} & 1594.8 \\ 1594.8 & +151\,040\,w_{01}^{(1)} - 7680\,w_{01}^{(3)} \end{pmatrix}$$

and

$$\begin{pmatrix} -151\,040\,w_{01}^{(1)} + 7680\,w_{01}^{(3)} & 531.6 \\ 531.6 & +151\,040\,w_{01}^{(1)} - 7680\,w_{01}^{(3)} \end{pmatrix}$$
= states have been omitted. The exp 
Here the third-order terms have been included. The
expressions for the $2_{12} \leftarrow 1_{11}$ transition and for the
$2_{11} \leftarrow 1_{10}$ transition from Table II are

$$\Delta\nu = \Delta B + (\lambda_2 - 1594.8) - (\lambda_1 - 531.6) \quad (2_{12} \leftarrow 1_{11}),$$
$$\Delta\nu = 3\Delta B - (\lambda_2 - 1594.8) + (\lambda_1 - 531.6) \quad (2_{11} \leftarrow 1_{10}).$$

By equating these to the observed splittings we
obtain the following parameters after successive
approximations:


$w_{01}^{(1)} = 5.46 \times 10^{-3}$
$\Delta w^{(2)} = 1.92 \times 10^{-2}$
$w_{01}^{(3)} = 8.10 \times 10^{-3}$


QO.) = — 824.7 Mc 
AB= 2 
F(Q:/Fyw®= 62. 


From the curves in Fig. 8 or the approximate formulas
in Table IV, a value of 23.52 is obtained for $s$ which
corresponds to a barrier height of 418.0 cm$^{-1}$ (1195
cal/mole). The results are summarized in Table VII.
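The "successive approximations" used in the IAM treatment of the $2_{12} \leftarrow 1_{11}$ splitting can be illustrated numerically. The sketch below is not the authors' procedure; it simply solves the splitting expression for $\Delta b$ by bisection. It assumes that $\lambda_J$ is the positive eigenvalue of the $2 \times 2$ matrix with diagonal elements $\pm 190\,400\,\Delta b$ and off-diagonal element 531.6 Mc ($J=1$) or 1594.8 Mc ($J=2$), and that the linear term enters as $-378.01\,\Delta b$. The function names and the bracketing interval are illustrative choices.

```python
import math

DIAG_COEFF = 190_400.0   # Mc per unit Δb: diagonal element of the 2x2 matrices
ASY_J1 = 531.6           # Mc, off-diagonal element for the 1_10 / 1_11 pair
ASY_J2 = 1594.8          # Mc, off-diagonal element for the 2_12 / 2_11 pair
LINEAR_COEFF = -378.01   # Mc per unit Δb (sign assumed, see the lead-in)


def eigenvalue(delta_b, asy):
    """Positive eigenvalue of [[d, asy], [asy, -d]] with d = DIAG_COEFF * Δb."""
    return math.hypot(DIAG_COEFF * delta_b, asy)


def splitting_212_111(delta_b):
    """A-E splitting (Mc) of the 2_12 <- 1_11 transition for a given Δb."""
    lam1 = eigenvalue(delta_b, ASY_J1)
    lam2 = eigenvalue(delta_b, ASY_J2)
    return LINEAR_COEFF * delta_b + (lam2 - ASY_J2) - (lam1 - ASY_J1)


def solve_delta_b(observed, lo=-0.01, hi=0.0, tol=1e-10):
    """Bisection for the negative root of splitting(Δb) = observed."""
    f_lo = splitting_212_111(lo) - observed
    f_hi = splitting_212_111(hi) - observed
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = splitting_212_111(mid) - observed
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


delta_b = solve_delta_b(-222.6)
print(f"Δb ≈ {delta_b:.6f}")
```

With these constants the root lies close to the value $\Delta b = -0.003993$ quoted for this transition.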


6. SELECTION RULES AND NUCLEAR 
SPIN STATISTICS 


I. Selection Rule for Symmetric Molecules 


The selection rules for symmetric molecules can be 
derived easily from the wave function of the internal 
rotor. Consider, for instance, the wave function ob- 
tained from the IAM 


Sax (0, e)e* xe? Ke P x, (a). 


Since the orientation of the dipole moment is inde- 
pendent of the angle a, the usual selection rules for the 
dipole radiation of a rigid symmetric rotor are appli- 
cable, i.e., 


$$\Delta J = 0, \pm 1; \quad \Delta K = 0.$$


In addition, it may be shown from Eq. (3-25) that the
transition moments are different from zero only when
$\Delta\sigma = 0$.


II. Selection Rules for Asymmetric Molecules 


The selection rule for an asymmetric molecule in
the quantum number $J$ is $\Delta J = 0, \pm 1$. This follows
from the fact that the total angular momentum for a
rotating molecule remains a good quantum number
even when internal rotation is present. The selection
rules governing the changes of the other quantum
numbers are more complicated. In general, these rules
can be derived by group-theoretical methods.

The general selection rules are first derived from
symmetry considerations. Their application to the
cases of high barrier and low barrier is then discussed
in detail.


A. General Selection Rules 


The Hamiltonian of a rigid asymmetric rotor is invariant
under the operations

$$\begin{aligned}
E&: \; P_x \to P_x, \; P_y \to P_y, \; P_z \to P_z \\
C_z&: \; P_x \to -P_x, \; P_y \to -P_y, \; P_z \to P_z \\
C_x&: \; P_x \to P_x, \; P_y \to -P_y, \; P_z \to -P_z \\
C_y&: \; P_x \to -P_x, \; P_y \to P_y, \; P_z \to -P_z
\end{aligned} \tag{6-1}$$

and therefore has the symmetry of the four-group† $V$.

When the effect of internal rotation is included, the
Hamiltonian is no longer invariant under the operations
given in (6-1) because of the cross terms between the
$P$'s and $p$ [e.g., Eq. (2-25) or Eq. (2-37)] or between the
$P$'s and functions of the angle $\alpha$ [e.g., Eq. (2-29) or Eq.
(2-31)]. The group properties of the various types of
molecules are discussed in the following sections.

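1. Molecules with a planar frame. The Hamiltonian
Eq. (2-31) or Eq. (2-37) (with $Q_x = 0$) is invariant
under the following operations:

$$\begin{aligned}
E&: \; P_{x,y,z} \to P_{x,y,z}, \; \alpha \to \alpha \; (p \to p), \\
C_x&: \; P_x \to P_x, \; P_{y,z} \to -P_{y,z}, \; \alpha \to -\alpha \; (p \to -p), \\
C_3&: \; P_{x,y,z} \to P_{x,y,z}, \; \alpha \to \alpha + (2\pi/3) \; (p \to p), \\
C_3^2&: \; P_{x,y,z} \to P_{x,y,z}, \; \alpha \to \alpha + (4\pi/3) = \alpha - (2\pi/3) \; (p \to p).
\end{aligned}$$

(The operations $C_xC_3$ and $C_xC_3^2$ follow directly from
these operations.) These symmetry operations satisfy
the properties of the group $D_3$, and therefore the wave
functions have the symmetry of the species $A_1$, $A_2$, or $E$.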

For this type of molecule the dipole moment lies on
the plane of the framework and has the transformation
properties of $A_2$. The selection rules for the dipole
transitions are then

$$E \leftrightarrow E, \qquad A_1 \leftrightarrow A_2. \tag{6-2}$$
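The rule (6-2) can be checked mechanically from the characters of $D_3$: a dipole transition is allowed when the triple direct product of the initial species, the dipole species, and the final species contains the totally symmetric species $A_1$. The following is a minimal sketch of that check; the table is the standard $D_3$ character table, and the function names are illustrative.

```python
# Character table of D3 over the classes (E, 2C3, 3C2).
CLASS_SIZES = (1, 2, 3)
GROUP_ORDER = sum(CLASS_SIZES)
D3 = {
    "A1": (1, 1, 1),
    "A2": (1, 1, -1),
    "E": (2, -1, 0),
}


def count_a1(*species):
    """Number of times A1 occurs in the direct product of the given species."""
    total = 0
    for k, size in enumerate(CLASS_SIZES):
        character = 1
        for name in species:
            character *= D3[name][k]
        total += size * character
    return total // GROUP_ORDER


def allowed(initial, final, dipole="A2"):
    """True if the dipole matrix element between the two species can be nonzero."""
    return count_a1(initial, dipole, final) > 0


for initial in D3:
    for final in D3:
        tag = "allowed" if allowed(initial, final) else "forbidden"
        print(f"{initial:>2} -> {final:<2} {tag}")
```

Only the pairs $A_1 \leftrightarrow A_2$ and $E \leftrightarrow E$ survive, in agreement with (6-2).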


2. Molecules with no symmetry in their framework.
For this class of molecules only the operations $E$, $C_3$, $C_3^2$
leave the Hamiltonian unaltered. The energy states then
fall into the species $A$ or $E$ of the $C_3$ group. Since the
dipole moment has the symmetry property of $A$, the
selection rules can be summarized as $A \leftrightarrow A$, $E \leftrightarrow E$.

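3. Molecules with a frame of $C_{2v}$ symmetry (CH$_3$NO$_2$
type). The group properties are of particular interest
because the cross term between $P$ and $p$ appears only
as $pP_z$. The Hamiltonian is now invariant under the
following operations:

$$\begin{aligned}
E&: \; P_{x,y,z} \to P_{x,y,z}, \; \alpha \to \alpha \; (p \to p), \\
C_x&: \; P_x \to P_x, \; P_{y,z} \to -P_{y,z}, \; \alpha \to -\alpha \; (p \to -p), \\
C_y&: \; P_y \to P_y, \; P_{x,z} \to -P_{x,z}, \; \alpha \to -\alpha \; (p \to -p), \\
C_z&: \; P_z \to P_z, \; P_{x,y} \to -P_{x,y}, \; \alpha \to \alpha \; (p \to p).
\end{aligned} \tag{6-3}$$

Because of the sixfold symmetry, the Hamiltonian is
also invariant under the operation $\alpha \to \alpha + (2n\pi/6)$.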


† The operation $P_x \to -P_x$, $P_y \to P_y$, $P_z \to P_z$ does not
alter the Hamiltonian but does change the commutation relations
between the three components of the angular momentum; it is therefore
not considered to be a symmetry operation.



TABLE VIII. Character table for the point group $D_{6h}$.

           E   2C3  3CxC3   Cz  2CzC3  3CyC3   C2   2C6  3CxC6  CzC2  2CzC6  3CyC6

   A_e     1    1     1      1     1      1     1     1     1      1     1      1
   B_ze    1    1    -1      1     1     -1     1     1    -1      1     1     -1
   E_1e    2   -1     0      2    -1      0     2    -1     0      2    -1      0
   B_xe    1    1     1     -1    -1     -1     1     1     1     -1    -1     -1
   B_ye    1    1    -1     -1    -1      1     1     1    -1     -1    -1      1
   E_2e    2   -1     0     -2     1      0     2    -1     0     -2     1      0
   A_o     1    1     1      1     1      1    -1    -1    -1     -1    -1     -1
   B_zo    1    1    -1      1     1     -1    -1    -1     1     -1    -1      1
   E_1o    2   -1     0      2    -1      0    -2     1     0     -2     1      0
   B_xo    1    1     1     -1    -1     -1    -1    -1    -1      1     1      1
   B_yo    1    1    -1     -1    -1      1    -1    -1     1      1     1     -1
   E_2o    2   -1     0     -2     1      0    -2     1     0      2    -1      0


These six operations are denoted by $E$, $C_6$, $C_6^2$ ($C_3$),
$C_6^3$ ($C_2$), $C_6^4$ ($C_3^2$), and $C_6^5$, where the operation $C_6$
represents $\alpha \to \alpha + (\pi/3)$. These six operations along with the
“four-group” operations form a new group of twenty-four
elements which is isomorphous to the point group
$D_{6h}$. The character table of this group is given in
Table VIII.

The dipole moment for this model would ordinarily
be along the $z$ axis. If this is the case, the transition
moment will have symmetry $B_{ze}$. The selection rules
for dipole absorption will then be $A_e \leftrightarrow B_{ze}$, $B_{xe} \leftrightarrow B_{ye}$,
$E_{1e} \leftrightarrow E_{1e}$, $E_{2e} \leftrightarrow E_{2e}$, $A_o \leftrightarrow B_{zo}$, $B_{xo} \leftrightarrow B_{yo}$, $E_{1o} \leftrightarrow E_{1o}$,
$E_{2o} \leftrightarrow E_{2o}$.


B. Selection Rules for High Barrier 


Molecules with a planar frame may be considered as
an example. The limiting rigid-rotor Hamiltonian belongs
to the four-group. As the operation $C_x$ is common
to both the $D_3$ group (of the torsional rotor) and the
four-group, the $A_1$ species of the $D_3$ group corresponds
to $A$ and $B_x$ in the four-group while the $A_2$ levels of the
hindered rotor reduce to the $B_y$ or $B_z$ levels of the
limiting rigid molecule. As shown before, the selection
rules for the rotational states associated with the
nondegenerate torsional levels are $A_1 \leftrightarrow A_2$. In terms of the
transitions for the limiting rigid rotor these correspond
to $A, B_x \leftrightarrow B_y, B_z$, which are indeed the selection rules
for a rigid asymmetric top molecule with dipole moment
on the $yz$ plane. Hence, for the internal rotor with
high barriers the selection rules governing the transitions
between the rotational levels belonging to the
nondegenerate torsional levels (the pseudo-rigid rotor
lines) may simply be taken as those for the corresponding
rigid molecules.
In the case of degenerate torsional levels, according
to the general symmetry selection rule $E \leftrightarrow E$, transitions
between all the $E$ levels are, in general, possible.
This can be understood by considering the effective
rotational Hamiltonian for a given torsional level. For
the $E$ levels the nonrigid-rotor terms, $\delta_1$, tend to eliminate
the asymmetry and the transitions approach those
of a symmetric rotor. The eigenfunctions no longer
have the symmetry of the four-group, and the symmetry
selection rules for the wave functions associated
with these symmetry species are not applicable. For the
case of high barriers and low $K$, it may be expected
that the transitions in the hindered rotor which are
forbidden for the limiting rigid rotor are usually weak.
However, if transitions involving higher $K$ values are
observed, the effect of the internal rotation can
exceed the asymmetry splitting even in the case of
relatively high barriers, and the spectrum becomes very
interesting. Consider the perpendicular type transition
($\Delta K = \pm 1$) between two groups of four energy levels
corresponding to the $A$ and $E$ levels of a $K$ doublet.
At low values of $K$ four lines are expected (two doublets)
similar to those shown in Fig. 4. As $K$ increases, all
four $E$ transitions are possible and six lines appear (two
$A$ and four $E$). At high values of $K$ the wave function
may be approximated by the symmetric top wave
function. When this is the case the two $A$ levels of a
given $K$ become nearly degenerate and the two $A$ lines
merge into a single line. The two $E$ levels, however,
remain separated because of the linear term $P_z$ in the
effective Hamiltonian Eq. (3-33), and two of the four
$E$ lines now become forbidden so that for large values
of $K$ one observes a group of three spectral lines (one
$A$ and two $E$). Such a transition from four lines (low $K$)
to three lines (high $K$) has indeed been found by
Herschbach and Swalen. In the transition region between
these two patterns all four $E$ transitions are
observed. Those transitions forbidden in the limit of a
rigid rotor can be seen to increase in intensity at the
expense of the rigid-rotor lines until finally the triplet
pattern results (see Fig. 5).


C. Selection Rules for Low Barrier 


So far the only molecules which have been found by
the methods of microwave spectroscopy to have low
potential barriers to internal rotation are CH$_3$NO$_2$ and
CH$_3$BF$_2$;***** our present discussion is confined mainly
to this class of molecules. The symmetry selection rules
have been given in Sec. A3. In order to apply these rules
to the analysis of the microwave spectrum it is necessary
to establish some correlation between the symmetry
properties and the rotational quantum numbers.

1. Limiting free rotator. The Hamiltonian and the
energy levels were discussed in Sec. 4. Since the quantum
number $m$ is diagonal, the wave function can be
expressed as a product of $e^{im\alpha}$ and a function of $\theta$, $\varphi$,
and $\chi$. This function may be determined from the
transformation matrix which diagonalizes the energy
matrix in Eq. (4-13). An inspection of the structure of
this matrix reveals the fact that a typical wave function
has one of the following forms:

$$\psi_{\text{even } K} = e^{im\alpha} \sum_{\text{even } K} a_K S_{JKM}(\theta, \varphi)\, e^{iK\chi}, \tag{6-4}$$

$$\psi_{\text{odd } K} = e^{im\alpha} \sum_{\text{odd } K} a_K S_{JKM}(\theta, \varphi)\, e^{iK\chi}, \tag{6-5}$$

***** Recently CH$_3$C≡CCH$_2$Cl has been studied and
reported to have a low barrier.
where $S_{JKM}(\theta, \varphi)e^{iK\chi}$ denotes the symmetric top wave
function and the summation in $K$ in Eqs. (6-4) and
(6-5) extends only from $-J$ to $J$. As explained in Sec. 4,
each energy level with $m \neq 0$ is doubly degenerate. This
double degeneracy occurs because the energy matrix
Eq. (4-13) remains unchanged when $m$ and $K$ are replaced
by $-m$ and $-K$. Now those energy levels where
$m$ is a multiple of three (and not zero) are associated
with the pairs of symmetry species: $A_e$, $B_{ze}$; $A_o$, $B_{zo}$;
$B_{xe}$, $B_{ye}$; or $B_{xo}$, $B_{yo}$. Here each pair constitutes a
double degeneracy which would be split by a barrier.
Those energy levels where $m$ is not a multiple of three
are associated with the symmetry species: $E_{1e}$, $E_{2e}$, $E_{1o}$,
$E_{2o}$. Here the degeneracy is not split by the barrier.
Consideration of the $C_z$ operation ($\chi \to \chi + \pi$) shows
that the wave function Eq. (6-4) remains unchanged,
but the wave function Eq. (6-5) changes sign. For the
$C_6^3$ operation ($\alpha \to \alpha + \pi$) both Eq. (6-4) and Eq. (6-5)
remain unchanged if $m$ is even but change sign if $m$ is
odd. With the aid of the character table, Table VIII, the
symmetry properties of the energy levels characterized by
different $m$ and $K$ may be summarized as follows:


   Species                          m (m ≠ 0)    K
   $A_e$, $B_{ze}$, $E_{1e}$        even         even
   $B_{xe}$, $B_{ye}$, $E_{2e}$     even         odd
   $A_o$, $B_{zo}$, $E_{1o}$        odd          even
   $B_{xo}$, $B_{yo}$, $E_{2o}$     odd          odd


The selection rules for $m$ and $K$ are such that the parity
is not changed. For the quantum number $m$ the selection
rule is even more restricted in the case of free rotation.
Since the dipole moment is independent of $\alpha$ and neither
wave function Eq. (6-4) nor Eq. (6-5) involves states of
different $m$, we have the selection rule $\Delta m = 0$. For the
$K$ quantum number, the selection rule is $\Delta K = 0, \pm 2,
\pm 4$, etc. Here $K$ is nearly a good quantum number and
the transitions with $\Delta K = 0$ will in general be stronger
than those with $\Delta K = \pm 2, \pm 4$, etc. These latter types of
transitions result from a mixing of the $K$ levels in
Eqs. (6-4) and (6-5). Note, however, that the Wang
functions are no longer the appropriate zero-order wave
functions because of the linear term in $K$ on the diagonal
of the energy matrix.

The discussion presented above pertains to the energy
levels with $m \neq 0$ only. When $m = 0$, the exponential
factor $e^{im\alpha}$ in Eqs. (6-4) and (6-5) and also the linear
term in $K$ in the energy matrix disappear. The secular
equation degenerates into that of a rigid rotor. Since
the expressions in Eqs. (6-4) and (6-5) with $m = 0$ are
invariant under the operations $C_6$ and $C_3$, the energy
levels with $m = 0$ have the symmetry properties of $A_e$,
$B_{xe}$, $B_{ye}$, or $B_{ze}$. The selection rules for the transitions
within this series of energy levels are identical to those
for a rigid asymmetric top, i.e., $A_e \leftrightarrow B_{ze}$ and $B_{xe} \leftrightarrow B_{ye}$.

2. Low barrier. As explained in Sec. 4, the addition
of a small barrier term $\tfrac{1}{2}V_6(1 - \cos 6\alpha)$ to the Hamiltonian
of a free internal rotor does not greatly alter

ones with $m$ equal to whole multiples of three. The
nondiagonal elements in $m$ may be removed by a Van
Vleck transformation so that the secular equation can
be factored approximately into blocks corresponding to
different values of $m$. The basis functions for this effective
energy matrix (for $m \neq 3n$) are then

$$\left(e^{im\alpha} + a\,e^{i(m+6)\alpha} + b\,e^{i(m-6)\alpha} + \cdots\right) S_{JKM}(\theta, \varphi)\, e^{iK\chi}. \tag{6-6}$$


The coefficients $a$ and $b$ do not depend critically on $K$.
The final composite wave functions are taken as linear
combinations of such basis functions. The structure of
the secular Eq. (4-19) again is such that the basis functions
of odd and even $K$ do not combine in forming the
final wave function. Furthermore, since the coefficients
$a$, $b$ are usually very small, one may still use $m$ as the
labeling index for the energy states. The selection rule
for $m$ is that $\Delta m$ equals a whole multiple of 6. Since the
$\Delta m \neq 0$ transitions are weak and usually beyond the
microwave region, the selection rules for the dipole
transitions may be given approximately as $\Delta m = 0$,
$\Delta K = 0$, $\Delta J = 0, \pm 1$. Here again it is assumed that $K$ is
nearly a good quantum number. Remembering that
each energy level characterized by $(m, K)$ is degenerate
with the one characterized by $(-m, -K)$, one may easily show that
the selection rules $\Delta m = 6n$ ($n$ being an integer) are
equivalent to those derived from the symmetry
consideration.

For $m = 3$ (or a whole multiple of three) the solution
of the secular equation becomes more complicated, as
discussed in Sec. 4 II. As an example the block in the
secular equation corresponding to $J = 1$, $m = 3$, odd $K$
was given in Eq. (4-21). This secular equation can be
factored by choosing as the basis functions the type of
linear combination given in Eq. (4-22), i.e.,

$$\psi(K, m = 3) \pm \psi(-K, m = -3). \tag{6-7}$$

To the zeroth-order approximation these functions may
be expressed as

$$e^{3i\alpha} S_{JKM}(\theta, \varphi)\, e^{iK\chi} \pm e^{-3i\alpha} S_{J,-K,M}(\theta, \varphi)\, e^{-iK\chi}. \tag{6-8}$$


These two linear combinations have the symmetry
property of the $A_o$ and $B_{zo}$ when $K$ is even; and $B_{xo}$ and
$B_{yo}$ for odd $K$. In the case of the example in Eq. (4-21)
the energy eigenfunctions arising from the secular Eq.
(4-23a) have the symmetry of $B_{xo}$ while those corresponding
to the determinant Eq. (4-23b) belong to the $B_{yo}$
species. The approximate wave functions for these four
states are



Y4+=a(1,3)+y(—1, —3)] 


Y+-= — b (1,3) (—1, — 


Here the first and second number inside the parentheses
of the functions refer to the values of $K$ and $m$,
respectively. For CH$_3$NO$_2$ the difference between the two
diagonal elements in Eq. (4-23a) and Eq. (4-23b) is
roughly ten times greater than the off-diagonal terms;
so the ratios of the coefficients $a$, $b$, $c$, and $d$ in Eq.
(6-9) may be estimated by perturbation methods as
$|a/b| \approx |c/d| \approx 10$. One may construct four approximate
wave functions similar to Eq. (6-9) for the states
with $J = 2$, $m = 3$, and odd $K$. The coefficients $a$, $b$, $c$, etc.,
here are somewhat different from those in Eq. (6-9),
but the ratios of these coefficients are still of the same
order of magnitude as those for $J = 1$. In accordance
with the symmetry selection rules $A \leftrightarrow B_z$, $B_x \leftrightarrow B_y$,
$o \leftrightarrow o$, $e \leftrightarrow e$, derived previously, there are eight possible
transitions for $J = 2 \leftarrow 1$, $m = 3$ with odd $K$. Among
these eight lines, four are much more intense than the
others, and indeed only the four strong lines are observed
in the microwave spectrum. These transitions are

Similar results are also found in the microwave spectrum
of CH$_3$BF$_2$.

The selection rules for the transitions involving the
energy levels with $m = 0$ are $A_e \leftrightarrow B_{ze}$ and $B_{xe} \leftrightarrow B_{ye}$.


III. Nuclear Spin Weight


To obtain the relative intensities of the components 
of the internal rotation multiplets (such as the doublet 
for the high barrier case), the correct combinations of 
nuclear spin functions y, with the internal torsional 
function y, must be found. The method for finding such 
appropriate nuclear spin functions has been given by 
Wilson.!!8 This procedure will be applied to the various 
types of nonsymmetric molecules. 


A. Molecules with No Symmetry 


By following Wilson, we observe that for the protons
in the methyl radical, the group of permutation
operations which are equivalent to the internal rotations
of the molecules is isomorphous with the group
$C_3$, as is the group derived from the internal rotation
operations. For CH$_3$ there are eight nuclear spin functions
which can be schematically represented by specifying
the quantum number $m_I$ for each nucleus as
follows:

   $+\tfrac12$   $+\tfrac12$   $+\tfrac12$          (one)
   $+\tfrac12$   $+\tfrac12$   $-\tfrac12$   etc.   (three)
   $+\tfrac12$   $-\tfrac12$   $-\tfrac12$   etc.   (three)
   $-\tfrac12$   $-\tfrac12$   $-\tfrac12$          (one)

These eight spin functions form a reducible representation
of the $C_3$ group. The characters of this representation
are $\chi(E) = 8$, $\chi(C_3) = 2$, $\chi(C_3^2) = 2$. This reducible
representation then belongs to the species $4A + 2E_1
+ 2E_2$. The over-all wave function must be antisymmetric
to the interchange of protons, i.e., of species
$A$ with respect to the operations of $C_3$. Since the
nondegenerate torsional function is of species $A$, the
symmetry of the composite wave functions is $A
\cdot (4A + 2E_1 + 2E_2) = 4A + 2E_1 + 2E_2$; therefore, the
statistical weight is four. For the degenerate torsional
sublevel the reducible representation formed by the
total wave function is $(E_1 + E_2) \cdot (4A + 2E_1 + 2E_2) = 4A
+ 6E_1 + 6E_2$, so that the statistical weight is again four.
Consequently the members of a torsional
doublet (in the case of high barriers) of a particular
rotational transition should have equal intensity.

The twenty-seven nuclear spin states for a CD$_3$
group form a representation of species $11A + 8E_1 + 8E_2$.
Thus the nondegenerate levels yield $A \cdot (11A + 8E_1 + 8E_2)
= 11A + 8E_1 + 8E_2$, whereas the degenerate levels give
rise to $(E_1 + E_2) \cdot (11A + 8E_1 + 8E_2) = 16A + 19E_1 + 19E_2$.
The relative intensity of the two members of a doublet
is therefore 11/16.

This discussion is also applicable to the molecules 
with a planar frame such as acetaldehyde. 
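The bookkeeping behind these statistical weights is easy to reproduce. The sketch below enumerates the product spin functions of three equivalent nuclei, forms the characters under the cyclic permutations, reduces them in $C_3$, and multiplies by the torsional species. It assumes, as in the text, that only the resulting $A$ species counts toward the weight; the function names and the integer labelling of species ($0 = A$, $1 = E_1$, $2 = E_2$) are illustrative conventions.

```python
from itertools import product


def spin_characters(n_projections):
    """Characters (chi(E), chi(C3), chi(C3^2)) of the spin representation of
    three equivalent nuclei, each with n_projections possible values of m_I."""
    configs = list(product(range(n_projections), repeat=3))
    chi_e = len(configs)
    # A cyclic permutation leaves a product spin function unchanged only if
    # all three nuclei carry the same m_I.
    chi_c3 = sum(1 for a, b, c in configs if (c, a, b) == (a, b, c))
    return chi_e, chi_c3, chi_c3


def decompose(chi):
    """Multiplicities (n_A, n_E1, n_E2) in C3 for characters with
    chi(C3) == chi(C3^2), as is the case for the spin representation."""
    chi_e, chi_c3, _ = chi
    n_a = (chi_e + 2 * chi_c3) // 3
    n_e = (chi_e - chi_c3) // 3
    return n_a, n_e, n_e


def direct_product(torsional, spin):
    """Multiply a list of torsional species by a spin representation.
    Species are indexed 0 = A, 1 = E1, 2 = E2; indices add modulo 3."""
    out = [0, 0, 0]
    for kt in torsional:
        for ks, n in enumerate(spin):
            out[(kt + ks) % 3] += n
    return tuple(out)


for label, n_proj in (("CH3", 2), ("CD3", 3)):
    spin = decompose(spin_characters(n_proj))
    nondegenerate = direct_product([0], spin)   # A torsional sublevel
    degenerate = direct_product([1, 2], spin)   # E1 + E2 torsional sublevel
    print(label, spin, "weights (A : E) =", nondegenerate[0], ":", degenerate[0])
```

For CH$_3$ this gives $4A + 2E_1 + 2E_2$ and equal weights; for CD$_3$ it gives $11A + 8E_1 + 8E_2$ and the ratio 11/16.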


B. CH3NO2 Type Molecules 


The method outlined in the previous section may also 
be applied to find the statistical weights of the de- 
generate and the nondegenerate levels for this type of 
molecule. When the barrier is low the separation be- 
tween the nondegenerate level and a degenerate level— 
say between a level with m=0 and one with m=1—is 
of the order 2Fm which may be comparable to kT. 
Hence the relative intensity of the various members of 
the multiplet arising from the internal rotation is not 
simply equal to the ratio of the statistical weights. 
Rather, the Boltzmann factor must be taken into 
consideration. 

In addition, for the case of CH$_3$NO$_2^{16}$ the wave
function must be even with respect to the interchange
of the two oxygen nuclei ($I = 0$). This operation is $C_zC_6^3$. Thus,
only the levels with symmetry properties of $A_e$, $B_{ze}$,
$E_{1e}$, $B_{xo}$, $B_{yo}$, $E_{2o}$ can exist in CH$_3$NO$_2^{16}$. For CH$_3$BF$_2$
these species will have one-third the statistical weight
of the others because of the fluorine nuclear spin ($I = \tfrac12$)
and the requirement of antisymmetry for exchange of
the two fluorine nuclei.


7. VIBRATION-TORSION-ROTATION INTERACTION 


The methods for the determination of the potential 
barrier described in Secs. 3 and 5 utilize the splittings 
between the same rotational transitions corresponding 
to the two ground torsional levels. These methods, 
however, fail when the splittings of the rotational lines 
vanish as in the case of symmetric tops (see Fig. 4) or 
fall below the limit of resolution of the spectrograph for 
very high barriers. The latter difficulty may be over- 
come, at least partially, by measuring the splittings of 
the rotational lines in the excited torsional levels 
(satellites) which are considerably larger than those in 
the ground torsional state. Because of the influence of 
the molecular vibrations on the internal torsion and the
over-all rotation, these satellites are shifted from the
“main lines” (the lines associated with the ground 
torsional state). Furthermore, the splittings of the 
satellites sometimes contain sizeable contributions from 
the effect of molecular vibration (cf. Sec. 5 II) as 
well as the internal rotation. With a careful analysis 
of the interaction of vibration, torsion, and over-all 
rotation, it is possible to determine the barrier height 
either from the splittings of the satellites or from the 
shifts of the satellites from the main line. The latter 
procedure is particularly significant because it is 
applicable to symmetric top molecules. 

Besides the frequency shifts, the intensities of the 
satellites are decreased by the Boltzmann factor. From 
the intensity ratios between the satellites and the main 
line, one can calculate the energy differences between 
the ground and excited torsional states from which the 
barrier height may be determined. 
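As a rough illustration of the intensity method, the sketch below inverts the Boltzmann factor to obtain the torsional energy difference from a satellite-to-main-line intensity ratio. It assumes that the two lines have equal statistical weights and line strengths, so that the ratio is the bare Boltzmann factor; the intensity ratio and temperature in the example are hypothetical values, not data from the text.

```python
import math

K_B_CM = 0.695035  # Boltzmann constant in cm^-1 per kelvin


def torsional_gap(intensity_ratio, temperature_k):
    """Energy difference (cm^-1) implied by I_satellite / I_main = exp(-dE / kT)."""
    if not 0.0 < intensity_ratio <= 1.0:
        raise ValueError("ratio must lie in (0, 1] for an excited-state satellite")
    return -K_B_CM * temperature_k * math.log(intensity_ratio)


def boltzmann_ratio(gap_cm, temperature_k):
    """Inverse relation: intensity ratio expected for a given energy gap."""
    return math.exp(-gap_cm / (K_B_CM * temperature_k))


# Hypothetical measurement: satellite 40% as strong as the main line at 300 K.
gap = torsional_gap(0.40, 300.0)
print(f"torsional energy difference ≈ {gap:.0f} cm^-1")
```

Because the gap depends only logarithmically on the ratio, a modest error in the measured intensities translates into a comparable relative error in the energy difference, which is consistent with the lower accuracy attributed to this method.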

In the present section two methods—the satellite 
frequency pattern and the relative intensities—for de- 
termining potential barriers are presented. Neither of 
these methods gives as accurate results as the “splitting 
method” described in the previous sections. Therefore, 
these methods are used only when the splitting method 
fails to yield the barrier height or when an analysis of 
the torsional satellites is desired to study torsion-rota- 
tion interaction. 


I. Satellite Frequency Pattern 


In the discussion of the theory of internal rotation 
in Sec. 2 the effects of molecular vibration have not 
been considered. This is justified if the torsional fre- 
quency is well below the frequencies of the other 
vibrations. Since the moments of inertia do not depend 
directly on the torsional normal coordinate, the only 
way the torsional mode can affect the moments of 
inertia is by means of higher-order interactions via the 
other vibrations of the molecule. When the nonrigidity 
of the two halves of the molecule is taken into account, 
some interesting results are obtained. Consider first the 
symmetric molecules such as methyl silane, CH$_3$SiH$_3$.
Classically, as the CH$_3$ group oscillates about its equilibrium
position, the centrifugal force produces a slight
change in the molecular dimensions. Because both the
angular velocity and the potential energy are functions
of the internal angle $\alpha$, the instantaneous configuration
of the molecule depends also on $\alpha$. The rotational constant
$B$ determined from the microwave spectra is
actually related to the average value of the reciprocal 
of the moment of inertia for a given torsional and 
vibrational state. One would expect that the rotational 
constants and thus the rotational lines associated with 
the different excited torsional states would be shifted 
by different amounts (even for the case of a symmetric 
top). From the pattern of these lines one can then determine
the height of the potential barrier. Furthermore,
this type of internal torsion-vibration interaction
also gives rise to an additional splitting of the pairs of
lines associated with the $A$ and $E$ sublevels of a given
torsional level; but as shown by Kivelson, such a
splitting for the ground torsional state of CH$_3$SiH$_3$ is
less than 0.05 Mc, which is experimentally unobservable.

The quantum-mechanical treatment can be carried
out in this way: The Hamiltonian, including the rotation,
internal torsion, and vibration,† according to
Kivelson, is written as

$$H = \tfrac{1}{2} \sum_{ij} \mu_{ij} P_i P_j + H_T + H_v, \tag{7-1}$$


where $P_i$ is the component of the total angular momentum
along the $i$th principal axis, $\mu_{ij}$ is the $ij$ component
of the inverse of the instantaneous inertia
tensor, and $H_T$ and $H_v$ represent the parts of the Hamiltonian
connected with the internal torsion and vibration,
respectively. A Van Vleck transformation similar
to that used by Howard and Wilson is applied to
remove the nondiagonality in the vibrational quantum
number $v$. The results may be expressed approximately as

$$H = W_v + \tfrac{1}{2} \sum_{ij} (v|\mu_{ij}|v) P_i P_j + (v|H_T|v) + \sum_{v'} \ldots \tag{7-2}$$


The first term $W_v$, the vibrational energy, represents
merely an additive constant and can be safely dropped.
If only the splittings of the rotational lines in various
torsional states are desired, then the terms independent
of the rotational quantum numbers can be neglected.
Finally, the terms corresponding to the ordinary centrifugal
distortion effect can usually be ignored. By
using Eq. (7-2) as the Hamiltonian, Kivelson calculated
the effective rotational constant $B$ for a symmetric
rotor and arrived at the equation

$$B_{v\sigma} = B_v + F_v (Kv\sigma|1 - \cos 3\alpha|Kv\sigma) + G_v (Kv\sigma|p'^2|Kv\sigma) + L_v K (Kv\sigma|p'|Kv\sigma), \tag{7-3}$$‡


where $p'$ is the angular momentum associated with the
torsional motion as defined in Eq. (2-8), $B_v$ is the effective
rotational constant averaged over the vibrational
functions, and $F_v$, $G_v$, and $L_v$ are empirical constants
which depend on the interaction of the vibrations
and the internal torsion. It is now possible to
modify Eq. (7-3) slightly in order to see clearly the
difference between this expression and the familiar one
for ordinary vibrations. The torsional equation in
reduced form is

$$\left[p'^2 + \frac{9s}{8}(1 - \cos 3\alpha)\right] \psi_{Kv\sigma} = \frac{9}{4}\, b_{v\sigma}\, \psi_{Kv\sigma}. \tag{7-4}$$

† Here the word “vibration” means all the internal modes
of motion other than the hindered rotation.

‡ Kivelson used the quantum number $m$ for $K$.
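In this equation the kinetic and potential energy may be
identified

$$\langle T \rangle = (Kv\sigma|p'^2|Kv\sigma), \qquad \langle V \rangle = \frac{9s}{8}(Kv\sigma|1 - \cos 3\alpha|Kv\sigma).$$

For $K = 0$ Eq. (7-3) becomes

$$B_{v\sigma} = B_v + F_v'\langle V \rangle + G_v\langle T \rangle, \tag{7-5}$$

where

$$F_v' = 8F_v/9s;$$

or

$$B_{v\sigma} = B_v - \beta_1\langle H \rangle - \beta_2[\langle T \rangle - \langle V \rangle], \tag{7-6a}$$
$$\phantom{B_{v\sigma}} = B_v - (9/4)\beta_1 b_{v\sigma} - \beta_2 X_{v\sigma}, \tag{7-6b}$$

where

$$\beta_1 = -\tfrac{1}{2}(G_v + F_v'), \qquad X_{v\sigma} = \langle T \rangle - \langle V \rangle,$$
$$\beta_2 = \tfrac{1}{2}(F_v' - G_v), \qquad \langle H \rangle = \langle T \rangle + \langle V \rangle = (9/4)\, b_{v\sigma}.$$

Now for harmonic vibrations

$$\langle T \rangle = \langle V \rangle,$$

and the asymptotic form for $b_v$ as $s \to \infty$ is $b_v \sim 2(s)^{1/2}(v + \tfrac{1}{2})$.
It follows that

$$B_{v\sigma} = B_v - \alpha_T (v + \tfrac{1}{2}), \tag{7-7}$$

where

$$\alpha_T = (9/2)(s)^{1/2}\beta_1.$$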
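The step from Eq. (7-5) to Eq. (7-6a) is only a regrouping of terms into a sum and a difference of $\langle T \rangle$ and $\langle V \rangle$. Spelled out, with $\beta_1 = -\tfrac{1}{2}(G_v + F_v')$ and $\beta_2 = \tfrac{1}{2}(F_v' - G_v)$, it reads as follows; the intermediate line is supplied here for clarity and is not part of the original text.

```latex
\begin{align*}
F_v'\langle V\rangle + G_v\langle T\rangle
  &= \tfrac{1}{2}(G_v + F_v')\bigl(\langle T\rangle + \langle V\rangle\bigr)
   + \tfrac{1}{2}(G_v - F_v')\bigl(\langle T\rangle - \langle V\rangle\bigr) \\
  &= -\beta_1\langle H\rangle - \beta_2 X_{v\sigma}.
\end{align*}
```

In the harmonic limit $X_{v\sigma} = 0$, so that only the $\beta_1$ term survives and Eq. (7-7) follows on inserting $\langle H \rangle = (9/4)\, b_v$ with $b_v \sim 2(s)^{1/2}(v + \tfrac{1}{2})$.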
For the $J = 1 \leftarrow 0$ transition of CH$_3$SiH$_3$ Kivelson found


F,=—65.322 Mc and G,=—0.8824 Mc, 
s=30.22 (a=17.0=;%;), 


and therefore $F_v' = -1.9212$ Mc, $\beta_1 = +1.4018$ Mc, $\beta_2
= -0.5194$ Mc, and $\alpha_T = +34.68$ Mc. For an ordinary
vibration, the constant $\alpha$ is of the order $B^2/\omega$, which
is ~22 Mc. Consequently, we see that this analysis is
very similar to that for ordinary vibration-rotation
interaction; only the extreme departure from harmonicity
must be corrected by the $\beta_2$ term.
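The derived constants quoted for CH$_3$SiH$_3$ can be checked directly from $F_v$, $G_v$, and $s$. The short sketch below does this arithmetic; it takes $\beta_2 = \tfrac{1}{2}(F_v' - G_v)$, which is the form implied by Eqs. (7-5) and (7-6a), and the variable names are illustrative.

```python
import math

# Kivelson's constants for the J = 1 <- 0 transition of CH3SiH3 (Mc).
F_v = -65.322
G_v = -0.8824
s = 30.22          # reduced barrier height (dimensionless)

F_v_prime = 8.0 * F_v / (9.0 * s)          # F_v' = 8 F_v / 9s
beta_1 = -0.5 * (G_v + F_v_prime)          # coefficient of <H> in Eq. (7-6a)
beta_2 = 0.5 * (F_v_prime - G_v)           # coefficient of <T> - <V> (inferred form)
alpha_T = 4.5 * math.sqrt(s) * beta_1      # harmonic-limit constant, Eq. (7-7)

print(f"F_v'    = {F_v_prime:+.4f} Mc")
print(f"beta_1  = {beta_1:+.4f} Mc")
print(f"beta_2  = {beta_2:+.4f} Mc")
print(f"alpha_T = {alpha_T:+.2f} Mc")
```

The results agree with the quoted values to within rounding in the last digit.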

Kivelson's method of analysis has also been extended
to include asymmetric molecules and the
Stark effect. Application of this theory has been
made in the study of the vibration-torsion interaction
of methyl alcohol, methyl mercaptan, and propylene
oxide (cf. Fig. 6).

Hecht and Dennison have also investigated the
nonrigidity effects in methyl alcohol. In addition to the
centrifugal stretching, these authors point out that the
effect of Coriolis force (from the internal motion) on
the distortion of the equilibrium configuration of the
molecule should also be considered. While the OH
group executes the internal torsion, the Coriolis force
resulting from the end-over-end rotation of the entire
molecule causes the hydrogen atom to rock back and
forth on the COH plane. The effect is to produce a
change in the average moments of inertia (over one
external revolution) which, analogous to the distortion
by centrifugal force described in the beginning of
this section, depends on the torsional state. Hecht and
Dennison also take into account the fact that the
height of the barrier is a function of the molecular
dimensions. Inasmuch as the origin of the barrier is not
clearly understood, the dependence of the barrier height
on the displacements of the atoms cannot be calculated
in any simple way; it can best be described in a parametric
form as


$$V = V_0\left[1 + \sum_i a_i\,\delta\alpha_i\right], \tag{7-8}$$


where the $\delta\alpha_i$ are the displacements of the atoms. A
rigorous theoretical treatment which takes into con- 
sideration the effects of centrifugal and Coriolis forces 
on the average rotational constants as well as the change 
of potential barrier height could become very lengthy 
because of the large number of normal vibrations which could affect the moments of inertia through Coriolis interaction. Hecht and Dennison carried out the calculation in two steps. They first examined the interaction
between the rotation and the normal vibration in which 
the O—H bar moves in its own plane relative to the 
methyl group. In other words, they used a relatively 
simple model for CH;OH which has five degrees of 
freedom—three for over-all rotation, one for internal 
torsion, and one for the O—H rocking. All the other 
modes of vibration were considered to be frozen. In 
the second step the remaining normal vibrations were 
introduced as perturbation and the effects of these 
additional terms on the energy levels were investigated 
with the aid of appropriate approximations. As the 
vibrational frequencies of methyl alcohol can be ob- 
tained from the vibrational spectrum, and the potential 
barrier height has been determined from the rotational 
transitions in the ground torsional states, the only 
parameter that needs to be introduced is the one which 
describes the variation of the potential barrier height 
with respect to the vibrational coordinate. Hecht and Dennison have been able to explain quantitatively some thirty splittings of lines of normal methyl alcohol and five other isotopic species using only six empirical parameters.

This treatment can, in principle, be applied to deter- 
mine the potential barrier of symmetric top molecules 
for which the methods described previously (PAM and 
IAM) fail. However, the force constants for most of 
the symmetric top molecules are usually not known to 
a sufficient degree of accuracy and therefore it becomes 
necessary to introduce several empirically determined 
parameters. A reliable result on the barrier height is 
obtained only if a sufficiently large number of lines in 
the excited torsional levels can be observed. 

Recently, Swan and Strandberg reanalyzed the results on methyl alcohol, considering vibration-rotation interaction more completely. By using the observed infrared vibrational spectrum for the normal frequencies they diagonalized the vibration-rotation-internal rotation matrix, i.e., they attempted to calculate the empirical constants $F_v$, $G_v$, etc., from the vibrational wave functions. As Hecht and Dennison point out, this procedure can, at present, only be used to verify constants but not to calculate good values for them. This is borne out in Swan and Strandberg's results, in which the quantitative fit of the calculated frequencies to the observed ones is only fair. Nevertheless, this type of calculation is valuable, and with improved wave functions and energy values a better quantitative agreement might be obtainable.


II. Intensity Method 


The ratio of the intensities ($I_1/I_0$) of a given rotational transition in the first excited torsional state and in the ground torsional state is equal to the relative population of the two torsional states, i.e.,

$$I_1/I_0 = (g_1/g_0)\,e^{-\Delta E_t/kT}, \tag{7-9}$$

where $g_0$ and $g_1$ are the statistical weights of the ground and the excited torsional states and $\Delta E_t$ is the difference in energy between these two states. Since the ratio of $g_1$ and $g_0$ can be determined from group-theoretical considerations, a measurement of the relative intensities of a rotational line and its satellite gives the torsional frequency which, with the aid of the Mathieu tables, leads to a determination of the potential barrier. Where there are uncertainties as to the ratio of the statistical weights of the two levels, the relative intensity can be measured at several different temperatures.
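
As an illustration of how Eq. (7-9) is used, the sketch below inverts the Boltzmann ratio for $\Delta E_t$ and also shows the two-temperature variant, in which the statistical-weight ratio cancels. The function names and the sample numbers are hypothetical and are not taken from any of the measurements cited here.

```python
import math

K_B_CM = 0.695035  # Boltzmann constant in cm^-1 per kelvin

def intensity_ratio(delta_e, temperature, g1_over_g0=1.0):
    """Eq. (7-9): I1/I0 for a torsional spacing delta_e (cm^-1) at T (K)."""
    return g1_over_g0 * math.exp(-delta_e / (K_B_CM * temperature))

def torsional_energy(ratio, temperature, g1_over_g0=1.0):
    """Invert Eq. (7-9) for the torsional spacing (cm^-1)."""
    return -K_B_CM * temperature * math.log(ratio / g1_over_g0)

def torsional_energy_two_temps(ratio_a, temp_a, ratio_b, temp_b):
    """Spacing from ratios at two temperatures; g1/g0 drops out."""
    # ln(ratio) = ln(g1/g0) - delta_e/(k T): the slope against 1/T is -delta_e/k.
    slope = (math.log(ratio_a) - math.log(ratio_b)) / (1.0 / temp_a - 1.0 / temp_b)
    return -K_B_CM * slope

# Hypothetical example: a 200 cm^-1 torsional spacing observed at 200 K and 300 K.
r200 = intensity_ratio(200.0, 200.0, g1_over_g0=2.0)
r300 = intensity_ratio(200.0, 300.0, g1_over_g0=2.0)
print(torsional_energy(r300, 300.0, g1_over_g0=2.0))
print(torsional_energy_two_temps(r200, 200.0, r300, 300.0))
```

Because the spacing depends only logarithmically on the ratio, a 10 to 20% error in the measured intensity ratio translates into a comparable fractional error in $\Delta E_t$ only when the ratio is near $1/e$; the conversion from $\Delta E_t$ to a barrier height still requires the Mathieu tables.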

One of the first molecules studied by this technique was CH3CF3 (references 14 and 72). The barrier heights of a number of other molecules obtained by this procedure are summarized in Table X. The main virtue of the intensity method lies in its simplicity and its applicability to the case of a very high barrier, in which the splittings between the A and E lines are undetectably small. However, the experimental error involved in the relative intensity measurement is rather large, usually 10 to 20% or even higher. It is therefore not surprising that some of the resulting barrier heights represent little more than estimates.

Baird and Bird studied the problem of the measurement of relative intensities in the microwave region. They propose a procedure by which one can obtain an improvement in accuracy and reproducibility of the intensity ratio. Unfortunately, their technique or a comparable one has generally not been used in the work reported in Table X. Recently, Verdier and Wilson (reference 111) obtained more refined measurements of the relative intensities for the spectral lines of CH3CHO and CH3CH2F using a Stark-modulated cavity. Their results on the barrier height compare favorably with those obtained by the "splitting methods" (see Table IX). They point out some limitations of the intensity method.

First, the quantity measured is usually the ratio of the peak absorption heights. The absorption coefficient of a line at its center for a Van Vleck-Weisskopf line shape is given in Eq. (7-10):

$$\alpha(\nu_0) = \frac{16\pi^{2}\, n_0\,|\mu_{mn}|^{2}\,\nu_0^{2}\,\tau}{3kcT}, \tag{7-10}$$


where $n_0$ is the number of molecules per unit volume in the lower level of the transition, $\mu_{mn}$ is the matrix element of the dipole moment between the upper and the lower level, and $\tau$ is the mean interval between collisions. In order to compare the peak absorption coefficients of two lines, $\nu_0$, $\mu_{mn}$, $\tau$, and $T$ must be the same; otherwise any difference will contribute a correction term to the intensity ratio. The temperature $T$ will undoubtedly be the same for the two lines. The correction for the difference in $\nu_0$ can be made easily; however, the two frequencies are usually so nearly equal that this correction is negligible. The transition moment $\mu_{mn}$ may be different, because the asymmetry parameter could change from the ground to the excited state. For certain transitions in nearly symmetric tops, $\mu_{mn}$ may be quite sensitive to a small change in the asymmetry parameter, and in such a case an appropriate correction in Eq. (7-10) should be applied. The variation in $\tau$ between torsional levels should normally be within the limit of experimental accuracy.

The second difficulty which might arise is that the A and E lines are not resolved, and yet the separation between these lines is of about the same order of magnitude as the line width; the measured peak intensity then does not correspond to Eq. (7-10). For example, it may happen that the A and E lines are resolved in the excited torsional state but overlap (with only one peak) in the ground state. In this case serious error would result if a correction were not applied to the measured intensity ratio. In order to correct for this effect, one can measure the integrated intensity of these lines instead of the peak value. The integrated intensity of the resolved A or E line is then one-half of that of the composite line. Alternatively, if the splitting of the unresolved doublet is calculated from theory, then a relation between the actual and apparent peak absorption can be derived by adding two Van Vleck-Weisskopf shaped lines separated by a frequency equal to the splitting of the doublet.
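
The last correction can be sketched numerically. The example below adds two equal lines and reports the apparent peak height relative to the peak of a single line. As a simplifying assumption it uses a Lorentzian profile, which is close to the Van Vleck-Weisskopf shape near resonance when the line width is much smaller than the line frequency; the function names are illustrative.

```python
import numpy as np

def lorentzian(nu, nu0, half_width):
    """Unit-peak Lorentzian, used here as a stand-in for the true line shape."""
    return 1.0 / (1.0 + ((nu - nu0) / half_width) ** 2)

def apparent_peak_ratio(splitting, half_width, n_points=20001):
    """Peak of two equal overlapping lines divided by the peak of one line."""
    span = abs(splitting) / 2.0 + 10.0 * half_width
    nu = np.linspace(-span, span, n_points)
    total = (lorentzian(nu, -splitting / 2.0, half_width)
             + lorentzian(nu, +splitting / 2.0, half_width))
    return float(total.max())

# Splittings expressed in units of the half-width of a single line.
for splitting in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(splitting, round(apparent_peak_ratio(splitting, 1.0), 3))
```

The ratio falls from 2 for a fully unresolved doublet toward 1 for well separated lines, so the measured peak of a partly resolved doublet must be divided by this factor before it is compared with a resolved A or E line.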


8. EXPERIMENTAL RESULTS 


Table IX gives the experimental values for the barrier heights determined from frequency measurements of the microwave spectra. If more than one value is available in the literature, the best value according to our opinion is reported.§§§§§


§§§§§ The major references on each molecule are given; the one which gave the best value for the barrier as shown is underlined.
TABLE IX. Barriers by the frequency method.

Compound                  Formula              V (cm⁻¹)      V (cal/mole)   Method        References

Threefold barriers
Methyl alcohol            CH3OH                374.8         1070           IAM           ...
                                                                            PAM           98
                                                                            nonrigidity   21, 77, 103
                                                                            experimental  10, 12, 28, 30, 109, 110
Methyl amine              CH3NH2               691.1         1976           IAM           31, 78, 79
                                                                            experimental  28, 56-58, 88-91
                          CD3NH2               684.7 ± 2     1958           IAM           59, 46
Methyl mercaptan          CH3SH                444 ± 10      1270           IAM           49, 50, 35
Acetic acid               CH3COOH              169           483 ± 25       PAM           105
Acetaldehyde              CH3CHO               406           1150 ± 30      PAM           38
Acetyl fluoride           CH3COF               378           1080           PAM           80
Acetyl chloride           CH3COCl              472           1350           PAM           93
Acetyl cyanide            CH3COCN              444           1270 ± 30      PAM           52
Methyl formate            HCOOCH3              416           1190 ± 40      PAM           11b
Ethyl bromide             CH3CH2Br             1248          3567 ± 30      IAM 1st       66a
Ethyl chloride            CH3CH2Cl             1245          3560 ± 12      IAM 1st       66a
Ethyl fluoride            CH3CH2F              1158          3306           PAM 1st       22
1,1-Difluoroethane        CH3CHF2              1112          3180           PAM 1st       22
Methyl silane             CH3SiH3              558 ± 17      1595           nonrigidity   42
                                               595           1700 ± 100     PAM           39
Methyl fluorosilane       CH3SiH2F             545           1560           PAM           81a
Methyl difluorosilane     CH3SiHF2             439           1255           PAM           102
Methyl germane            CH3GeH3              433           1239 ± 25      PAM           53a
Methyl allene             CH3CH=C=CH2          556 ± 2       1589 ± 6       IAM           63
Propylene                 CH3CH=CH2            692 ± 6       1978 ± 17      IAM           62
1-Fluoropropylene         CH3CH=CHF            752           2150 ± 150     PAM           92
2-Fluoropropylene         CH3CF=CH2            848           2424 ± 25      PAM           81c
Propylene oxide           CH3CH-CH2 (O bridge) 895 ± 25      2560 ± 70      PAM           101, 27
                                                                            IAM           ...
1-Chlorobutyne-2          CH3C≡CCH2Cl          <30           <100           ...           53b
Vinyl silane              CH2CHSiH3            523 ± 18      1495 ± 50      PAM           81d
Ethyl iodide              CH3CH2I              839           2400           PAM           33c

Sixfold barriers
Nitromethane              CH3NO2               2.11          6.03 ± 0.03    low barrier   106, 107
                          CD3NO2               1.82          5.19 ± 0.03    low barrier   10
Methyl boron difluoride   CH3BF2               4.82          13.77 ± 0.03   low barrier   75

Two threefold barriers
Acetone                   (CH3)2CO             266 ± 20      760            PAM           100
Dimethyl ether            (CH3)2O              950 ± 50      2720           PAM           33b
Dimethyl silane           (CH3)2SiH2           582           1665           PAM           81b

Twofold barriers
Hydrogen peroxide         H2O2                 113           323            IAM           70, 71
Phenol                    C6H5OH               1100 ± 100    3140           ...           50b
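
The two barrier columns of Table IX are related by a fixed conversion factor. The sketch below computes it from the physical constants, assuming the thermochemical calorie (4.184 J); the helper names are illustrative. It is useful for spotting entries in which one of the two columns has been misread.

```python
H_PLANCK = 6.62607015e-34   # J s
C_LIGHT_CM = 2.99792458e10  # cm/s
N_AVOGADRO = 6.02214076e23  # 1/mol
JOULE_PER_CAL = 4.184       # thermochemical calorie (assumed)

# Energy of 1 cm^-1 per mole, expressed in cal/mole (about 2.859).
CAL_PER_MOLE_PER_CM = H_PLANCK * C_LIGHT_CM * N_AVOGADRO / JOULE_PER_CAL

def cm_to_cal_per_mole(v_cm):
    """Convert a barrier height from cm^-1 to cal/mole."""
    return v_cm * CAL_PER_MOLE_PER_CM

def cal_per_mole_to_cm(v_cal):
    """Convert a barrier height from cal/mole to cm^-1."""
    return v_cal / CAL_PER_MOLE_PER_CM

# Two entries from Table IX: methyl alcohol and methyl amine.
print(round(cm_to_cal_per_mole(374.8)))
print(round(cm_to_cal_per_mole(691.1)))
```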


As discussed in Secs. 3, 4, and 5, the ratio $V_3/F$ is determined from the analysis of a microwave spectrum. In general, this ratio can be determined to an accuracy of two percent or better. Uncertainties in the molecular structure and limitations in the approximations are the two main sources of error. The spectra of molecules with lower s values are influenced to a much larger extent by the internal rotation, and usually these can be analyzed to give somewhat more accurate values for the ratio $V_3/F$. Structural errors, in regard to the reduced moment of inertia of the methyl group, cause a large share of the error in the barrier. In our opinion, the barrier heights reported in Table IX are accurate to somewhere between 2% and 5%.

The spectra of methyl alcohol and methyl amine have been studied very thoroughly in the microwave region. Furthermore, a considerable amount of work has been done on their theoretical interpretation using the IAM given in Sec. 3. To a lesser extent methyl mercaptan has also been studied. These three molecules belong to the class of intermediate-barrier molecules with slight asymmetry. Mainly the perpendicular ($\Delta K = \pm 1$) Q branch lines ($\Delta J = 0$) have been analyzed for the barrier height. The effects of the asymmetry and the off-diagonal internal rotational matrix elements are calculated by second-order perturbation. Methyl amine has the additional complication of inversion similar to ammonia.

Also the J=1←0, K=0 transition has been analyzed for methyl alcohol (references 21, 77, and 103), including the effects of nonrigidity. The splittings of the internal rotation doublets are caused by the rigid internal torsion as well as by the molecular vibration, but the division between these two effects is not known accurately since it depends rather critically on the structure of the molecule. Slight shifts in the methyl group axis affect this division. With six empirical constants a good agreement between theory and experiment was obtained for the frequency shifts of the excited state transitions for a number of isotopic species of methyl alcohol. Nishikawa obtained a good fit of the spectrum with only the three symmetric top constants. Attempts have been made to calculate these "empirical constants" from the vibrational wave functions, and qualitative agreement between the calculated and observed values has been obtained.

The barrier heights of methyl alcohol and methyl amine obtained by various methods are all in good agreement. However, for methyl mercaptan the situation is somewhat confusing. Intensity work by Solimene and Dailey (reference 95) gave a value of 371 cm⁻¹. Kilb (reference 35) determined a value of 246 cm⁻¹ by rigid hindered theory (PAM to second order) from the J=1←0 transitions of a large number of isotopic species measured by Solimene and Dailey. Kojima and Nishikawa first obtained a barrier of 280 cm⁻¹ (reference 49), but subsequently they revised their value to 444 cm⁻¹ (reference 50). Also the J=1←0 lines of the
isotopic species C?H;SH, CH;SH, C?D;SH, and 
C"H;SD were analyzed by the symmetric top non- 
rigidity theory. According to Kojima and Nishikawa 
the J=1<—0 of C?H;SD was incorrectly assigned by 
Solimene and Dailey. The results of Kojima and Nishikawa are probably more accurate because their analysis was carried out to a higher order of approximation.

The molecules acetaldehyde (CH3CHO), acetyl fluoride (CH3COF), acetyl chloride (CH3COCl), acetyl cyanide (CH3COCN), propylene (CH3CH=CH2), 1-fluoropropene (CH3CH=CHF), methyl allene (CH3CH=C=CH2), methyl silane (CH3SiH3), methyl fluorosilane (CH3SiFH2), and methyl difluorosilane (CH3SiF2H) are examples of molecules with relatively high barriers. A pseudo-rigid rotor splitting was used to determine the barrier height. In some cases transitions up to J=12 were analyzed using the PAM. Also in the near symmetric top molecules the first-order (nonrigid rotor) terms produced a large effect. This was especially true in acetaldehyde (see Sec. 5) and in methyl silane and its mono- and di-deuterated species. Methyl silane has also been analyzed by nonrigidity theory, and all of the determined barrier heights agree very well with one another.

The barrier heights in ethyl fluoride and 1,1-difluoroethane (reference 22) and in ethyl chloride and ethyl bromide (reference 66a) were determined from the splittings in the first excited torsional state, since the ground state splittings were too small to observe. To a good approximation the splittings of the satellites of the first excited torsional state are caused entirely by the hindered rotation (and not by the molecular vibrations) because of the high barrier.

In propylene oxide the barrier was determined from the splittings in the ground and first two excited torsional states. The agreement is exceedingly good, indicating the validity of using excited state splittings for the determination of barrier heights in molecules with high barriers. No effects of a $V_6$ potential were observed.

Acetic acid is a molecule with an intermediate 
barrier and large asymmetry. Unusual difficulty was 
encountered in the analysis, but, by using the PAM 
to fourth order, Tabor was able to assign a number of 
higher J lines. The assignment was one of the more 
difficult aspects of this problem. The IAM would have 
been very difficult to use in this particular case because 
of the relatively high J transitions necessary for its 
analysis. 

Low barriers have been observed in nitromethane and methyl boron difluoride. Here the m=3 lines give an accurate determination of the barrier height. In the low barrier problem the initial assignment is quite difficult and an extensive search for the lines is needed. The accuracy of these barrier heights is exceedingly good and generally better than that in the high barrier case. This results from the wide splittings, of the order of 1000 Mc, as compared with the small splittings of a few Mc in the high barrier problems. The very low barriers obtained in these two cases for a sixfold potential seem to indicate that the ratio of $V_3$ to $V_6$ is generally of the order of 100. 1-Chlorobutyne-2 is another molecule with a low barrier which has been recently studied (reference 53b). Only an upper limit of the barrier height has been obtained.

The barriers have been determined for a number of molecules by the intensity method of Sec. 7. These are listed in Table X. Generally, an error of 10 to 20% can be assigned to these barrier heights. It is interesting to compare some of the results in Table X with the more accurate values of Table IX. Ethyl bromide is 21% low; ethyl chloride, 5% low; ethyl fluoride is in agreement; and 1,1-difluoroethane is 12% high. By taking into account a
values in Table IX, we can see that 
the values of Table X is quite r 

It is interesting to compare th
heights obtained by microwave 
from other means, particul: 
methods (see Sec. 1). V 


fa 
TABLE X. Barriers by the intensity method.

Compound                  Formula          V (cm⁻¹)      V (cal/mole)   References
Acetaldehyde              CH3CHO           386           1103 ± 60      111
Ethyl bromide             CH3CH2Br         980           2800 ± 500     115, 116
Ethyl chloride            CH3CH2Cl         1190          3400 ± 600     112, 114
Ethyl cyanide             CH3CH2CN         1150 ± 100    3280 ± 290     54
Ethyl fluoride            CH3CH2F          1160          3310 ± 210     111, 51
1,1-Difluoroethane        CH3CHF2          1250 ± 200    3570 ± 580     94
1,1,1-Trifluoroethane     CH3CF3           1200          3480           14, 72
Methyl mercaptan          CH3SH            371 ± 43      1060 ± 120     95
Methyl silane             CH3SiH3          400 ± 80      1315           60
Methyl trifluorosilane    CH3SiF3          420           1200           ..., 87
1-Trifluorobutyne-2       CH3-C≡C-CF3      <100          <300           3, 4a, 61
Isobutane                 (CH3)3CH         1360          3900           65
t-Butyl fluoride          (CH3)3CF         1500          4300           65
Trimethyl phosphine       (CH3)3P          910           2600           65
Trimethyl amine           (CH3)3N          1530          4400           64
Methyl germane            CH3GeH3          585           1673           4b
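
The comparison between the intensity values of Table X and the frequency values of Table IX can be reproduced with a few lines. The sketch below uses the cm⁻¹ entries of the two tables for the four ethane derivatives that appear in both; the dictionary and function names are illustrative.

```python
# Barrier heights in cm^-1 for molecules listed in both tables.
intensity = {   # Table X (intensity method)
    "ethyl bromide": 980.0,
    "ethyl chloride": 1190.0,
    "ethyl fluoride": 1160.0,
    "1,1-difluoroethane": 1250.0,
}
frequency = {   # Table IX (frequency method)
    "ethyl bromide": 1248.0,
    "ethyl chloride": 1245.0,
    "ethyl fluoride": 1158.0,
    "1,1-difluoroethane": 1112.0,
}

def percent_deviation(v_intensity, v_frequency):
    """Signed deviation of the intensity value from the frequency value."""
    return 100.0 * (v_intensity - v_frequency) / v_frequency

deviations = {name: percent_deviation(intensity[name], frequency[name])
              for name in intensity}

for name, dev in deviations.items():
    print(f"{name:20s} {dev:+6.1f}%")
```

All four deviations lie within the 10 to 20% uncertainty quoted for the intensity method, apart from ethyl bromide, which falls just outside it.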


the thermodynamic results are beyond the scope of 
this paper, it may be stated that the barrier heights 
determined by the splitting methods are usually more 
accurate than those obtained from the thermodynamic 
properties of the molecules. On the other hand, the 
intensity and thermodynamical methods yield results 
of comparable accuracy (ten to twenty percent uncer- 
tainty). With careful experimental measurements and 
in favorable cases, this error may be somewhat lower 
for both of these methods. In the absence of any fre- 
quency work on a given molecule, a careful investigation 
of the specific experimental difficulties is required in 
order to decide whether the thermodynamic results or 
the microwave intensity results are more reliable. 

In addition to the barrier heights, the equilibrium 
configuration has been obtained for a number of mole- 
cules listed in Table XI. These equilibrium configura- 
tions are valuable because they supply additional in- 
formation which, besides the barrier heights, any theory 

of the origin of barriers must explain. Ethyl chloride 


TABLE XI. Equilibrium configuration.

Compound                Formula                   Configuration                                          Reference
Acetaldehyde            CH2DCHO, CHD2CHO          Methyl group eclipses O and staggers H                 38
Acetyl chloride         CH2DCOCl, CHD2COCl        Methyl group eclipses O and staggers Cl                93
Acetyl cyanide          CH2DCOCN, CHD2COCN        Methyl group eclipses O and staggers CN                52
Acetyl fluoride         CH2DCOF, CHD2COF          Methyl group eclipses O and staggers F                 80
Propylene               CH2DCH=CH2                Methyl group eclipses the double bond and staggers H   26
Ethyl chloride          CH2D-CH2Cl                staggered                                              113
Methyl silane           CH2D-SiH2D, CH2D-SiHD2    staggered                                              39
Methyl fluorosilane     CH2D-SiH2F                staggered                                              ...
Methyl difluorosilane   CH2DSiHF2, CHD2SiHF2      staggered                                              25c


monodeuterated (reference 113) was the first molecule in which the staggered configuration was confirmed by microwave spectroscopy. The tunneling rate is much slower than the measurement time in the microwave region, so two different configurations are observed: the deuterium gauche and the deuterium trans to the chlorine atom. Each of these was observed. Even with only an approximate structure, the observed rotational constants would be inconsistent with an eclipsed configuration.

Finally, a few molecules with two internal rotors have been studied recently. Although the theory, an extension of the work of Sec. 3, is not covered here [see references 84, 40, 100, and 33(b)], the results on acetone, dimethyl ether, and dimethyl silane are given in Table IX. The theory of internal rotation for molecules with an asymmetric internal rotor is not covered in this review, but two molecules, hydrogen peroxide and phenol, have been studied and analyzed approximately as a symmetric top. References 6, 9, and 86 give some of the theoretical work.


9. CONCLUSIONS 


After the various discussions on the interactions of 
internal rotation and over-all rotation, one naturally 
questions what causes these potential barriers and 
whether the values reported in Sec. 8 can be explained 
quantitatively in terms of the electronic structure of 
the molecules. Furthermore, can unknown barrier 
heights be predicted satisfactorily? Numerous attempts 
have been made (see Bibliography—Theoretical and 
Empirical Methods of Barrier Height Estimation) to 
provide answers to these questions, but the results do 
not seem very encouraging. This is understandable 
since the potential barrier represents only a very small 
fraction of the binding energy of the molecule. It seems 
highly impractical at this time to make any serious attempt to evaluate the barrier heights by the direct and rigorous approach, i.e., from the difference in energy of the entire molecule for the two extreme configurations corresponding to the eclipsed form ($\alpha = 60°$) and the staggered form ($\alpha = 0°$).

In order to offer a simple picture for the nature of the hindering potential, various mechanisms such as Van der Waals forces, direct electrostatic interactions, overlap, and exchange forces have been proposed for the intramolecular interaction. Recently, Wilson has made a critical examination of these various hypotheses and found that none of the existing theories can account for the experimental results on the barrier heights of the various molecules. He suggests that the potential barrier could be an inherent property of the axial bond itself and that it is not due, to any great extent, to the direct forces between the attached atoms or to the electronic distribution in the attached bonds which are at any considerable distance. Pauling, on the other hand, hypothesizes that exchange interactions of the electrons in the attached bonds, as determined by the overlap of these attached bonds which extend in the direction of the axial bond toward one another, are responsible for the potential barrier. By the use of higher orbitals, d and f, he makes a rough estimate of this nonbonding interaction which is of the right order of magnitude for the observed potential barriers. Although the absolute values are only fortuitous, a number of interesting relative values are presented. From his hypothesis, unshared electron pairs, which do not hybridize with the f orbitals, would not contribute to the potential barrier, i.e., methyl alcohol would have a barrier approximately one-third of that in ethane and methyl amine would have a barrier approximately two-thirds of that of ethane. Another interesting point is the existence of the very low barriers in molecules such as nitromethane. A sixfold barrier would require i orbitals with an azimuthal quantum number l=6. Because of the large promotional energy these probably would not contribute much to the hybrid bond orbitals and hence we would predict a low barrier.

Which of these various mechanisms, or combination 
of mechanisms, correctly accounts for the potential 
barriers measured in molecules is still unanswered. It is 
hoped that this review will stimulate further research 
toward the understanding of the origin of the potential 
energy hindering internal rotation. Nevertheless, from 
the viewpoint of molecular dynamics, the problem of 
internal rotation in molecules, i.e., the effect of the 
internal rotation on the energy levels, is quite well 
understood. The agreement between the theory and the 
observed spectra, including all the fine details, is par- 
ticularly outstanding. Furthermore, the study of the 
microwave spectra offers a method, which is by far the 
Most accurate at the present time, for r et th 
APPENDIX 1


Derivation of Kinetic Energy and 
Angular Momentum 


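In Fig. 9, O and C represent, respectively, the center of mass of the entire molecule and of the top alone. Two vectors $\mathbf{r}_j$ and $\boldsymbol{\rho}_j$ are drawn to a typical atom in the top from the respective centers of mass O and C. The position vector (with O as origin) for an atom in the framework will be denoted by $\mathbf{r}_i$. If $\boldsymbol{\rho}$ denotes the vector from O to C, one can readily see that

$$\mathbf{r}_j = \boldsymbol{\rho} + \boldsymbol{\rho}_j. \tag{A1-1}$$

The kinetic energy can be written as

$$T = \tfrac12\sum_i m_i\,(\partial\mathbf{r}_i/\partial t)\cdot(\partial\mathbf{r}_i/\partial t) + \tfrac12\sum_j m_j\,(\partial\mathbf{r}_j/\partial t)\cdot(\partial\mathbf{r}_j/\partial t). \tag{A1-2}$$

By remembering that

$$\partial\mathbf{r}_i/\partial t = \boldsymbol{\omega}\times\mathbf{r}_i, \qquad \partial\mathbf{r}_j/\partial t = \boldsymbol{\omega}\times\mathbf{r}_j + (\partial\boldsymbol{\alpha}/\partial t)\times\boldsymbol{\rho}_j,$$

where $\boldsymbol{\omega}$ is the angular velocity of the over-all rotation of the molecule, one can rewrite (A1-2) in the form of

$$\begin{aligned}
T &= \tfrac12\sum_i m_i\,(\boldsymbol{\omega}\times\mathbf{r}_i)^2 + \tfrac12\sum_j m_j\left[\boldsymbol{\omega}\times\mathbf{r}_j + (\partial\boldsymbol{\alpha}/\partial t)\times\boldsymbol{\rho}_j\right]^2 \\
&= \tfrac12\sum_i m_i\left[r_i^2\omega^2 - (\boldsymbol{\omega}\cdot\mathbf{r}_i)^2\right] + \tfrac12\sum_j m_j\left[r_j^2\omega^2 - (\boldsymbol{\omega}\cdot\mathbf{r}_j)^2\right] \\
&\quad + \tfrac12\sum_j m_j\left[\rho_j^2\,(\partial\boldsymbol{\alpha}/\partial t)^2 - (\boldsymbol{\rho}_j\cdot\partial\boldsymbol{\alpha}/\partial t)^2\right] \\
&\quad + \sum_j m_j\,(\boldsymbol{\omega}\times\mathbf{r}_j)\cdot\left[(\partial\boldsymbol{\alpha}/\partial t)\times\boldsymbol{\rho}_j\right].
\end{aligned} \tag{A1-3}$$

The first two terms obviously represent the kinetic energy of the over-all rigid rotation of the molecule with angular velocity $\boldsymbol{\omega}$, and can be written as $\tfrac12\,\boldsymbol{\omega}^{\dagger}\cdot\mathbf{I}\cdot\boldsymbol{\omega}$, where $\mathbf{I}$ is the usual inertia tensor of the entire molecule. The third term may be expressed as $\tfrac12 I_\alpha\dot{\alpha}^2$, where $I_\alpha$ is the moment of inertia of the top about its own axis of symmetry. The last term may be simplified as

$$\begin{aligned}
\sum_j m_j\,(\boldsymbol{\omega}\times\mathbf{r}_j)\cdot\left[(\partial\boldsymbol{\alpha}/\partial t)\times\boldsymbol{\rho}_j\right]
&= \sum_j m_j\left\{\left[(\boldsymbol{\rho}+\boldsymbol{\rho}_j)\cdot\boldsymbol{\rho}_j\right](\boldsymbol{\omega}\cdot\partial\boldsymbol{\alpha}/\partial t) - (\boldsymbol{\omega}\cdot\boldsymbol{\rho}_j)\left[(\boldsymbol{\rho}+\boldsymbol{\rho}_j)\cdot(\partial\boldsymbol{\alpha}/\partial t)\right]\right\} \\
&= \sum_j m_j\left[\rho_j^2\,(\boldsymbol{\omega}\cdot\partial\boldsymbol{\alpha}/\partial t) - (\boldsymbol{\omega}\cdot\boldsymbol{\rho}_j)(\boldsymbol{\rho}_j\cdot\partial\boldsymbol{\alpha}/\partial t)\right].
\end{aligned} \tag{A1-4}$$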

FIG. 9. The position vectors of typical atoms in a molecule with an attached internal rotor. The O position and C position represent the centers of mass of the whole molecule and the internal rotor, respectively. The vectors $\mathbf{r}_i$ and $\mathbf{r}_j$ are vectors from the center of mass O to typical atoms in the

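The last step follows from the relation

$$\sum_j m_j\,\boldsymbol{\rho}_j = 0.$$

By using a coordinate system attached to the top with the z axis along the axis of symmetry, and the origin located at the center of mass of the top, one has

$$\boldsymbol{\rho}_j = x_j\mathbf{i} + y_j\mathbf{j} + z_j\mathbf{k},$$

and

$$\sum_j m_j x_j z_j = \sum_j m_j y_j z_j = 0.$$

Equation (A1-4) then takes the form of

$$\begin{aligned}
&\sum_j m_j\left[(x_j^2+y_j^2+z_j^2)(\boldsymbol{\omega}\cdot\partial\boldsymbol{\alpha}/\partial t) - (\omega_x x_j + \omega_y y_j + \omega_z z_j)\,z_j\,|\partial\boldsymbol{\alpha}/\partial t|\right] \\
&\quad = \sum_j m_j\left[(x_j^2+y_j^2+z_j^2)(\boldsymbol{\omega}\cdot\partial\boldsymbol{\alpha}/\partial t) - \omega_z z_j^2\,|\partial\boldsymbol{\alpha}/\partial t|\right] \\
&\quad = \sum_j m_j\,(x_j^2+y_j^2)(\boldsymbol{\omega}\cdot\partial\boldsymbol{\alpha}/\partial t) = I_\alpha\,(\boldsymbol{\omega}\cdot\partial\boldsymbol{\alpha}/\partial t).
\end{aligned}$$

The kinetic energy finally becomes

$$T = \tfrac12\,\boldsymbol{\omega}^{\dagger}\cdot\mathbf{I}\cdot\boldsymbol{\omega} + \tfrac12 I_\alpha\dot{\alpha}^2 + I_\alpha\,(\boldsymbol{\omega}\cdot\partial\boldsymbol{\alpha}/\partial t). \tag{A1-5}$$

The classical angular momentum is given by

$$\begin{aligned}
\mathbf{P} &= \sum_i m_i\,(\mathbf{r}_i\times\partial\mathbf{r}_i/\partial t) + \sum_j m_j\,(\mathbf{r}_j\times\partial\mathbf{r}_j/\partial t) \\
&= \sum_i m_i\left[\mathbf{r}_i\times(\boldsymbol{\omega}\times\mathbf{r}_i)\right] + \sum_j m_j\left[\mathbf{r}_j\times\left\{(\boldsymbol{\omega}\times\mathbf{r}_j) + (\partial\boldsymbol{\alpha}/\partial t\times\boldsymbol{\rho}_j)\right\}\right] \\
&= \sum_i m_i\left[\mathbf{r}_i\times(\boldsymbol{\omega}\times\mathbf{r}_i)\right] + \sum_j m_j\left[\mathbf{r}_j\times(\boldsymbol{\omega}\times\mathbf{r}_j)\right] + \sum_j m_j\left[\mathbf{r}_j\times(\partial\boldsymbol{\alpha}/\partial t\times\boldsymbol{\rho}_j)\right].
\end{aligned} \tag{A1-6}$$

The first two terms stand for the angular momentum of the over-all rigid rotation of the entire molecule, which can be written as $\mathbf{I}\cdot\boldsymbol{\omega}$. In order to simplify the last term, use is made of the fact that $\boldsymbol{\rho}_j$ is a vector measured from the center of mass of the top and that the axis of the top is a principal axis of the top. It follows that

$$\begin{aligned}
\sum_j m_j\left[\mathbf{r}_j\times(\partial\boldsymbol{\alpha}/\partial t\times\boldsymbol{\rho}_j)\right]
&= \sum_j m_j\left[(\boldsymbol{\rho}+\boldsymbol{\rho}_j)\times(\partial\boldsymbol{\alpha}/\partial t\times\boldsymbol{\rho}_j)\right] \\
&= \sum_j m_j\left[\boldsymbol{\rho}_j\times(\partial\boldsymbol{\alpha}/\partial t\times\boldsymbol{\rho}_j)\right] \\
&= \sum_j m_j\left[\rho_j^2\,\partial\boldsymbol{\alpha}/\partial t - (\boldsymbol{\rho}_j\cdot\partial\boldsymbol{\alpha}/\partial t)\,\boldsymbol{\rho}_j\right] \\
&= \sum_j m_j\left[\rho_j^2\,\mathbf{k}\,|\partial\boldsymbol{\alpha}/\partial t| - z_j^2\,\mathbf{k}\,|\partial\boldsymbol{\alpha}/\partial t|\right] = I_\alpha\,\partial\boldsymbol{\alpha}/\partial t.
\end{aligned}$$

Equation (A1-6) then reduces to

$$\mathbf{P} = \mathbf{I}\cdot\boldsymbol{\omega} + I_\alpha\,\partial\boldsymbol{\alpha}/\partial t.$$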
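
The two simplifications used in this appendix, the reduction of the cross term in the kinetic energy to $I_\alpha(\boldsymbol{\omega}\cdot\partial\boldsymbol{\alpha}/\partial t)$ and of the top contribution to the angular momentum to $I_\alpha\,\partial\boldsymbol{\alpha}/\partial t$, can be checked numerically. The sketch below builds a hypothetical symmetric top (three unit masses on a circle and a heavier atom on the axis, placed so that the center of mass is at the origin C) and picks the remaining vectors at random; none of these numbers come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical symmetric top, coordinates relative to its center of mass C,
# with the z axis along the symmetry axis.
radius, height = 1.0, 0.4
angles = np.array([0.0, 2.0 * np.pi / 3.0, 4.0 * np.pi / 3.0])
masses = np.array([1.0, 1.0, 1.0, 12.0])
rho_j = np.vstack([
    np.column_stack([radius * np.cos(angles), radius * np.sin(angles),
                     np.full(3, -height)]),
    [0.0, 0.0, 3.0 * height / 12.0],   # axial atom balances the three others
])

rho = rng.normal(size=3)               # vector from O to C (arbitrary)
omega = rng.normal(size=3)             # over-all angular velocity (arbitrary)
alpha_dot = np.array([0.0, 0.0, 0.7])  # internal rotation about the top axis
r_j = rho + rho_j                      # Eq. (A1-1)

# Moment of inertia of the top about its symmetry axis.
I_alpha = np.sum(masses * (rho_j[:, 0] ** 2 + rho_j[:, 1] ** 2))

# Last term of Eq. (A1-3).
cross_term = np.sum(masses * np.einsum(
    "ij,ij->i", np.cross(omega, r_j), np.cross(alpha_dot, rho_j)))

# Last term of Eq. (A1-6).
p_top = np.sum(masses[:, None] * np.cross(r_j, np.cross(alpha_dot, rho_j)),
               axis=0)

print(cross_term, I_alpha * omega @ alpha_dot)
print(p_top, I_alpha * alpha_dot)
```

Both identities hold for any choice of $\boldsymbol{\rho}$ and $\boldsymbol{\omega}$, provided the origin of $\boldsymbol{\rho}_j$ is the center of mass of the top and the top axis is a principal axis.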


APPENDIX 2 
Continued Fraction 
The substitution of Eq. (3-9) or Eq. (3-23) into Eq. (3-3) or Eq. (3-16), followed by multiplication by $e^{-i(3k+\sigma)\alpha}$ and integration, leads to a recursion formula among the A's:

$$A_{3(k+1)+\sigma} + (M_k - \lambda)\,A_{3k+\sigma} + A_{3(k-1)+\sigma} = 0, \tag{A2-1}$$

where

$$\lambda = -\frac{4}{s}\left(b - \frac{s}{2}\right) \quad\text{and}\quad M_k = -\frac{16}{9s}\,(3k+\sigma)^2.$$

This defines an infinite set of homogeneous equations. In order to have a nontrivial solution it is necessary that the determinant of the coefficients vanish. This infinite determinant can be factored into 3 different subdeterminants, one for each $\sigma$ value. Koehler and Dennison have shown that Eq. (A2-1) leads to a continued fraction. This can easily be seen by dividing the recursion relation Eq. (A2-1) by $A_{3k+\sigma}$:

$$\frac{A_{3(k+1)+\sigma}}{A_{3k+\sigma}} = -(M_k - \lambda) - \frac{A_{3(k-1)+\sigma}}{A_{3k+\sigma}}, \tag{A2-2}$$

and inverting

$$\frac{A_{3k+\sigma}}{A_{3(k+1)+\sigma}} = \frac{-1}{M_k - \lambda + \dfrac{A_{3(k-1)+\sigma}}{A_{3k+\sigma}}} = \frac{-1}{M_k - \lambda - \dfrac{1}{M_{k-1} - \lambda - \cdots}} = -G_k^{-},$$

also

$$\frac{A_{3k+\sigma}}{A_{3(k-1)+\sigma}} = \frac{-1}{M_k - \lambda - \dfrac{1}{M_{k+1} - \lambda - \cdots}} = -G_k^{+}.$$

By substituting back into Eq. (A2-1), one obtains a continued fraction equation,

$$\lambda = M_k - G_{k+1}^{+} - G_{k-1}^{-}. \tag{A2-3}$$

A trial value of $\lambda$ is substituted into $G^{+}$ and $G^{-}$, leading to a first approximation of $\lambda$. By successive iteration a consistent value of the eigenvalue, $\lambda$, is obtained. The continued fraction does not always converge very rapidly and can even diverge. The use of a Newton-Raphson approximation is recommended to avoid this difficulty. By letting

$$f(\lambda) = M_k - \lambda - G_{k+1}^{+}(\lambda) - G_{k-1}^{-}(\lambda) = 0,$$

and expanding at the nth approximation of the eigenvalue (denoted by $\lambda^{n}$), one obtains

$$f(\lambda) = f(\lambda^{n}) + (\lambda - \lambda^{n})\,f'(\lambda^{n}) = 0.$$

Then

$$\lambda^{n+1} - \lambda^{n} = -\frac{f(\lambda^{n})}{f'(\lambda^{n})}, \tag{A2-4}$$

where

$$-f'(\lambda^{n}) = 1 + (G_{k+1}^{+})^2\left[1 + (G_{k+2}^{+})^2\,(1 + \cdots)\right] + (G_{k-1}^{-})^2\left[1 + (G_{k-2}^{-})^2\,(1 + \cdots)\right].$$

Since the G's are determined in the calculation of the eigenvalue, the derivative $f'(\lambda^{n})$ is readily calculated. Care must also be taken to insure that the initial approximation is not separated from the desired eigenvalue by a pole. A pole occurs between each pair of adjacent eigenvalues. If one exists between the first approximation and the desired root, the continued fraction converges to the adjacent root.

With the desired eigenvalue determined, a substitution into Eq. (A2-1) leads to a recursion relation between the A's. By solving these relations and normalizing, one obtains the eigenfunctions. These eigenfunctions are used to determine the matrix elements and integrals for the IAM and PAM if the approximations given are not sufficiently accurate. The eigenfunctions are tabulated for the periodic solutions. For the nonperiodic solutions $\sigma$ is replaced by $\sigma + \rho K$ in all the previous expressions.
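
The Newton-Raphson procedure of Eqs. (A2-3) and (A2-4) can be sketched as follows. The code assumes the definitions $M_k = -(16/9s)(3k+\sigma)^2$ and $\lambda = -(4/s)(b - s/2)$, truncates both continued fractions after a fixed number of terms, and cross-checks the result against a direct diagonalization of the tridiagonal matrix implied by Eq. (A2-1). The function names, the truncation depth, and the use of the harmonic value $b \approx s^{1/2}$ as the trial are choices made for this illustration.

```python
import numpy as np

def M(k, sigma, s):
    """Diagonal element M_k of the recursion (A2-1)."""
    return -(16.0 / (9.0 * s)) * (3 * k + sigma) ** 2

def f_and_derivative(lam, k, sigma, s, depth=40):
    """f(lambda) of Eq. (A2-3) and its derivative, with truncated fractions."""
    g_plus, dg_plus = 0.0, 0.0
    for j in range(k + depth, k, -1):      # builds G^+_{k+1} from the outside in
        g_plus = 1.0 / (M(j, sigma, s) - lam - g_plus)
        dg_plus = g_plus ** 2 * (1.0 + dg_plus)
    g_minus, dg_minus = 0.0, 0.0
    for j in range(k - depth, k):          # builds G^-_{k-1} from the outside in
        g_minus = 1.0 / (M(j, sigma, s) - lam - g_minus)
        dg_minus = g_minus ** 2 * (1.0 + dg_minus)
    f = M(k, sigma, s) - lam - g_plus - g_minus
    df = -1.0 - dg_plus - dg_minus
    return f, df

def newton_eigenvalue(trial, k=0, sigma=0, s=30.22, tol=1e-12, max_iter=50):
    """Iterate Eq. (A2-4) starting from a trial eigenvalue."""
    lam = trial
    for _ in range(max_iter):
        f, df = f_and_derivative(lam, k, sigma, s)
        step = f / df
        lam -= step
        if abs(step) < tol:
            break
    return lam

def matrix_eigenvalues(sigma, s, kmax=40):
    """Eigenvalues of the truncated tridiagonal matrix behind Eq. (A2-1)."""
    ks = np.arange(-kmax, kmax + 1)
    off = np.ones(2 * kmax)
    T = np.diag(M(ks, sigma, s)) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(T)

s = 30.22
trial = 2.0 - 4.0 / np.sqrt(s)     # harmonic estimate b ~ s^(1/2) for v = 0
lam0 = newton_eigenvalue(trial, k=0, sigma=0, s=s)
b0 = 0.25 * s * (2.0 - lam0)       # invert lambda = -(4/s)(b - s/2)
print(lam0, matrix_eigenvalues(0, s).max(), b0)
```

For the lowest level the trial value lies above every pole of $f(\lambda)$, so the iteration converges to the desired root; for excited levels the trial must be chosen with the pole structure described above in mind.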
symmetry species in C3 group
rotational constants 

wave function expansion coefficients 
Angstrom unit 

IAM “b” coordinate axis 

expansion coefficients 

low barrier off-diagonal element || 
=3 

Mathieu eigenvalues 

rotational constants 

equilibrium rotational constants 


B, Bz, By, Bz symmetry species 
APPENDIX 3 c IAM “‘c” coordinate axis 
Tabulations Related to Mathieu Equations Ck Fourier coefficient of Mathieu’s func- 
tion 
The most recent and complete tabulations are listed z 
PES s (C5 Cry Cog, Cz rotational constants 
in Table XII. References 36 and 104 are the most useful ME ge ate : 
C2, C3, Ce, C rotational symmetry operations 
TABLE XII. Mathieu function tabulations. d reducing factor, r, times (I,J) 
dr Fourier coefficient of Mathieu’s func- 
Period Range - 
Reference in x in V Range in s Accuracy tion 
“Mathieu Functions%e =,2m O(1)§ 0(1)100b 1x103 dxy interatomic distance from atom X to 
9(1)14  0(2)100 : 
Kilbe. d,o 3x ots 2 (1)30(2)52(4)100 1 x10710 atom Y 
Stejskal ‘37 0(1)4 100(1 X107 . . 
Bigech and Rhodese. 2 oe ONIS 7008 1X10-= Die, Di Hamiltonian cross terms between 
Pextoni 4dr 0(1)9 1(1)40 1X107¢ 
pran = components of the total angular mo- 
a See reference 104. EY i mentum 
b A close tabulation is given for the low s values. a 2 K 
e BicenidactonS are also tabulated. x De, Do Fourier coefficients of Mathieu’s 
4 Copies are available from E. B. Wilson, Jr., Department of Chemistry, = z 
Harvard University. Herschbach'’s tables are also available. tunction 
e See reference 36. J o x A 
f See reference 96. DI (B)KpK rotational transformation matrix ele- 
& > 3. 
b Tabulated in terms of £ 0(0.002)0.1 where s =1/#?. ment 
i Tabulated in Li and Pitzer reference 55. Notation: 49 =s, 60 =b —}s. > 
E symmetry species, degenerate 
and complete for the threefold potential case. Also, by Æ energy 
the use of Eq. (3-29) and these tables, the eigenvalues E(x) reduced rigid rotor energy 
for s>30 for other periodicities can readily be deter- Ras rotational energy 
mined. A number of older references are given in È 
aaee A Eso, Exe torsional energy: PAM and IAM ; 
Recently, Herschbach?” tabulated the matrix elements E(JKMvo) total energy : rotation and torsion 
b, £, 4°, cos6a, the perturbation sums to fourth order f Mathieu parameter (Floquet’s theo- 
for use in the PAM, and the Fourier coefficients for rem) È 
use in the TAM. HS eer Nes OED mete f(x) IAM approximate integral function * 
duction explaining the use of the tables to solve the fA) continued fraction function of the X 2 
internal rotation problem. These tables are of great eigenvalue l 
assistance in the computation for barrier problems. F z : 
reduced rotational constant of the 
APPENDIX 4 internal rotor 
X F, > nonrigidit: 
Notation v gidity potential energy expan- 
sion coefficient 
2 ` k EN 
a IAM “a” coordinate axis go, g1 statistical weight factors 
4, AK expansion coefficients gg correction terms in the PAM rot: 
a, a’ low barrier diagonal elements || =3 tional energy 
ay Fourier coefficient in potential energy Ge nonrigidity kinetic energy expans 
Qn Fourier coefficient in IAM internal coefficient 
energy Gxt, GE continued fraction terms »" 
amu atomic mass units h Planck’s constant divided by 
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Hamiltonian 
coupling term in Hamiltonian 
rotational Hamiltonian 
torsional Hamiltonian 
vibration Hamiltonian, harmonic os- 
cillator function 
asymmetry term in Hamiltonian 
rotational Hamiltonian for the vo 
torsion state 
unit vector 
moment of inertia tensor 
moments of inertia 

Tx 
moment of inertia of the internal 
rotor about its symmetry axis 
products of inertia 
unit vector 
quantum number of the total angular 
momentum 
Boltzmann’s constant 
unit vector 
correction terms in the PAM rota- 
tional energy 


quantum number of the total angular 
momentum along the molecular z axis 


free rotation quantum number 
nonrigidity angular momentum ex- 
pansion coefficient 

free rotation quantum number 

mass of the ith atom 

quantum number of the total angular 
momentum along the space axis 
IAM nonperiodic torsional function
Mathieu function 
megacycles/second 

continued fraction term 

periodicity of the hindering potential 
internal angular momentum 


PAM matrix element of the internal 
angular momentum 


(REP. Pe, Pi, Pk: P zPu, P=) the gth components of the 


P” 


angular momentum 

the gth component of the total angu- 
Jar momentum with the independent 
variables 7 and 7 constant 

IAM torsional function for a sym-
metric molecule

PAM Hamiltonian term 

IAM torsional function for an asym-
metric molecule


reducing factor 
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ri, rj 




position vector of the zth (7) atom 


Yay Tb) Vc) Fz, Ty, Tz position vector components 


s 

Sı 

S2 

Se, So 


Syxu (0, ¢)e*x 


1h 
T 
Urola) 


v 


V(q@), Vu, Vs, Ve, 


op) (n) 


W, W, 
v 


Ax 

AEJK 

AE(J K1K41) 
OAsy 


ox 


Ag, (AzsAysAz) 


Hmn 
y 


TI 


reduced barrier height 

rotational transformation matrix 
asymmetric transformation matrix
Mathieu functions 

symmetric rotor function 

kinetic energy 

temperature 

PAM torsional function 

torsional quantum number 

Vi2 potential energy (subscript de- 
notes periodicity) 

PAM perturbation terms 

rigid rotor energy 

x principal axis 

nonrigidity term 

y principal axis 

z principal axis 

internal angle 

direction cosine of ρ to z principal
axis

internal angular velocity 

first term in the vibration rotation ex- 
pansion of the rotational constant 
direction cosine of ρ to y principal
axis

nonrigidity expansion terms 

IAM splitting of internal sublevels 
for K=0 

IAM splitting of internal sublevels 
for Kth level 

internal rotational splitting of the 
J, K rotational level 

internal rotational splitting of the 
J k_1K,1 rotational level 

asymmetry splitting of K and —K 
rotational levels 

nonrigid-rotor internal rotor split- 
ting 

IAM integrals 

displacement coordinate 

Eulerian angle 

asymmetry parameter 

eigenvalue of a secular determinant 
direction cosine of the internal rotor 
to the g principal axis 

dipole matrix element 

frequency 

dot product of the total angular mo- 
mentum vector and the vector p 
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o (a) direction of the internal axis such 
that the internal angular momentum 
vanishes 


(b) position vector of the center of 
mass of the top from the center of 
mass of the whole molecule (Ap- 
pendix 1) 


Po component of ọ along the g axis 


torsional sublevel quantum number 


APPENDIX 5 


Comparison of the Notation of Various Authors (Grouped Where Possible) 


oj position vector of a top atom fro 
the center of mass of the top (i 
ternal rotor) 


T mean life time 


p, & Eulerian angles 
X Xp X% X Eulerian angles 
Y, y wave function 


© total angular velocity vector 
wg (Wa, Wb, We, Wz, Wy, wz) Component of the total angular 
velocity along the g axis 


ae : E 

Io, Ii; A,B, C, Ci, D Izz, Tyy, Lez A,B, C, Ci, Ijk Iz, Iy, I: : 
Dee C2 Iı C2 Ia 
p (C2/C)* Sos ` (o) 

(BC,—D*) (BC,—D*) 
r == seca rd J 
(BC— D?) (BC— D?) 

o T° H o(å,E), K 
m k ky DOG m 

Wro!) sae —2pev 

Wyo 3i T 1+4Fp 

Exve, Live Ekin EKun Evo 
AE(JK1K41) (JK1K41), K 0.00 AA, AW 
P, P, Fic P, P, 
p be Li P $ 
Q x o x a 
py/p sing ~f Nn/A 
s H a s 
M Kvo (a), Poe (a), 
Qkre(a), Uve (a) Pxen(x) QKun U vo (a) 

v n n v 


a Dennison, Koehler, Burkhard, Ivash, Hecht, Lide, and Mann. 
b Grinn Myers, and Tannenbaum. 
im! e Itoh, 
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Foundations of the Optical Model for Nuclei 
and Direct Interaction 
G. E. BROWN 


Department of Mathematical Physics, University of Birmingham, Birmingham, England


I. INTRODUCTION 


THIS paper gives a description of the scattering of
particles of energies in the kev and Mev range 
by complex nuclei. Starting from exact many-level 
formulas for the scattering amplitude, we show that 
several phenomena which were inexplicable in the old 
compound-nucleus theory arise from phase relations 
between the nuclear levels, whereas terms in which 
phase relations would not seem to be important corre- 
spond to the scattering predicted by the old compound- 
nucleus picture, supplemented by statistical assumptions. 
This paper might be described as a treatment of the 
theory of nuclear reactions in the region in which many 
nuclear levels participate in the scattering. Thus, it is 
complementary to the comprehensive article of Lane 
and Thomas¹ where the treatment is particularly appro-
priate to the case in which few levels contribute. These
authors employed the R-matrix theory of Wigner and
Eisenbud² which is especially well adapted to the low-
energy case because it makes the energy dependence of
all expressions as explicit as possible. We employ here
the formalism of Kapur and Peierls³ which has the
great advantage for the many-level case that the sum 
over levels enters linearly into the scattering amplitude. 
The formalism employed is only an intermediate step 
to the final results, and, therefore, we choose the one 
that seems simplest for the particular development here. 
Surveying first the old compound-nucleus picture, we 
note the points at which it must be revised. It was 
realized long ago that this was not a complete descrip-
tion; in fact N. Bohr⁴ already suggested in 1938 that a
particle, upon entering the nucleus, might go directly 
to a final state without forming a compound nucleus. 
However, most calculations were carried out with the 
extreme form of the model, and most physical pictures 
tended to follow in this way. By extreme form we mean 
the form in which it is assumed that the phases of con- 
tributions from different compound levels are random, 
and that they therefore do not interfere. Such assump- 
tions are generally implicit in statistical calculations. 
The compound-nucleus picture formulated by N. 
Bohr⁵ explained the very narrow resonances observed in


¹ A. M. Lane and R. G. Thomas, Revs. Modern Phys. 30, 257 (1958).

² E. P. Wigner and L. Eisenbud, Phys. Rev. 72, 29 (1947).
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the elastic scattering of slow neutrons by assuming that 
the incident neutron, once inside the nucleus, shares its 
energy with many other nucleons through its strong 
interaction with them; the resulting compound nucleus 
then lasts a long time, until one of the nucleons acquires 
sufficient energy to escape from the nucleus. Because 
of the long lifetime of the state, the uncertainty prin- 
ciple shows that its energy can be well determined, and 
hence, that its width is small. The observed resonances 
were of the order of electron volts wide, about a million 
times narrower than the single-particle levels formed in 
the scattering of particles by the potential wells earlier 
assumed to represent the nucleus. 

Since the observed compound-nucleus resonances 
were so long-lived, it was usually assumed that their 
characteristics are independent of the way in which 
they are formed, aside from restrictions resulting from 
conservation laws, and are also independent of the 
neighboring states. It seemed natural then to introduce 
statistical assumptions which neglected interference 
between the scattering from different levels. Then, in 
the neighborhood of a compound state, one could write 
the cross section $\sigma_{\alpha\beta,j}$ for a particle incident in "channel"
$\alpha$ to emerge in "channel" $\beta$ (we give precise definitions
of all terms later), leaving the residual nucleus in excited
state $j$, as a product of factors

$$\sigma_{\alpha\beta,j} = \frac{\pi}{k^{2}}\,\frac{\gamma_{p\alpha}\,\gamma_{p\beta j}}{(\epsilon_p - E)^{2} + \tfrac{1}{4}\omega_p^{2}},$$

where $p$ labels the compound state, $\gamma_{p\alpha}$ is the partial
width⁶ for formation of it by a particle in channel $\alpha$,
$\gamma_{p\beta j}$ is the partial width for its decay, $\omega_p$ is the total
width of the state, and $\epsilon_p$ is the real part of its energy.
This type of formula, valid in the region of a single
isolated level, was given by Breit and Wigner⁷ and
bears their names. The above expression for $\sigma$ as a
product of factors implies the independence of the
processes of formation and decay of the level both from
each other and from the properties of other levels.
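
The single-level formula can be evaluated directly, as in the following minimal sketch. The function name and the numerical values are hypothetical, all energies and widths are in the same arbitrary units, and the wave number is held fixed across the level, which is an assumption made only to exhibit the line shape.

```python
import math


def breit_wigner(energy, eps_p, gamma_in, gamma_out, total_width, k):
    """Single-level cross section (pi/k^2) g_in g_out / ((eps_p - E)^2 + w^2/4)."""
    denominator = (eps_p - energy) ** 2 + 0.25 * total_width ** 2
    return (math.pi / k ** 2) * gamma_in * gamma_out / denominator


# Hypothetical level: position 10, total width 2, partial widths 0.5 and 1.5.
eps_p, width = 10.0, 2.0
peak = breit_wigner(eps_p, eps_p, 0.5, 1.5, width, k=1.0)
half = breit_wigner(eps_p + width / 2, eps_p, 0.5, 1.5, width, k=1.0)
```

The cross section falls to half its peak value at a distance of half the total width from the level position, which is the sense in which the total width is the width of the resonance.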
It is clear that the number of compound states per
unit energy interval increases rapidly with the energy
of the incident particle. Already at energies in the ...
range, the density of levels observed in the scattering
of slow neutrons by medium and heavy nuclei ...


⁶ Usually the widths are denoted by capital letters.
However, we reserve these for the widths of the
levels in a potential well.


⁷ G. Breit and E. P. Wigner, Phys. Rev. 49, ...
FIG. 1. The increasing
density of nuclear levels 
with excitation energy is 
sketched on the left. Pre- 
dictions of the contin- 
uum model for the be- 
havior of the total cross 
section are shown on the 
right. 


to 10° levels per Mev. Therefore, in the scattering of 
beams of neutrons which generally includes a spread in 
neutron energies of several Mev because of the ex- 
perimental difficulty of producing monoenergetic neu- 
trons of arbitrary energy, one might expect the cross 
section to approach that of a completely absorbing 
system with increasing neutron energy. A specific model 
incorporating these features was formulated by Fesh-
bach, Peaslee, and Weisskopf⁸ and is known as the
continuum model. Here, the wave function of the 
neutron incident on the nucleus is subjected to the 
requirement that it have only an incoming part at the 
nuclear radius, i.e., that 


$$\frac{d}{dr}\bigl[r\psi(r)\bigr]\Big|_{r=R} = -iKR\,\psi(R),$$


where $\psi(r)$ is the wave function for the incident neutron,
R is the nuclear radius, and K is an “internal” wave 
number that the neutron is supposed to possess inside 
the nucleus as a result of its interaction with the other 
nucleons. The predictions of such a theory for the total 
cross section are illustrated in Fig. 1 which also gives an 
indication of how the number of compound states 
increases with increasing excitation energy. The nega- 
tive energy states of the nucleus are shown schematically 
as bound states in a potential well. At high energies the 
total cross section approaches $2\pi R^{2}$. The rise in the
curve at lower energies is due to the fact that, in this
model, the particle “feels” the nucleus already at
distance $R + ƛ$, where $ƛ = \lambda/2\pi$ and $\lambda$ is the wavelength,
because of quantum-mechanical effects.
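
The size of this effect can be estimated with a short calculation. The sketch below assumes the usual strong-absorption estimate, total cross section $\approx 2\pi(R+ƛ)^{2}$, which is not quoted in the text but has the stated limit $2\pi R^{2}$ at high energy. It also assumes nonrelativistic kinematics and neglects target recoil; the function names are hypothetical.

```python
import math

HBAR_C = 197.327        # MeV fm
NEUTRON_MC2 = 939.565   # MeV


def reduced_wavelength_fm(energy_mev):
    """Reduced wavelength hbar / sqrt(2 M E) of a neutron, in fm."""
    return HBAR_C / math.sqrt(2.0 * NEUTRON_MC2 * energy_mev)


def black_nucleus_cross_section_fm2(energy_mev, radius_fm):
    """Assumed estimate 2 pi (R + reduced wavelength)^2, in fm^2."""
    return 2.0 * math.pi * (radius_fm + reduced_wavelength_fm(energy_mev)) ** 2


radius = 6.73  # fm, illustrative radius of a medium-weight nucleus
geometric = 2.0 * math.pi * radius ** 2
for energy in (0.1, 1.0, 10.0, 100.0):
    ratio = black_nucleus_cross_section_fm2(energy, radius) / geometric
    print(f"E = {energy:6.1f} MeV   cross section / (2 pi R^2) = {ratio:.2f}")
```

At 1 MeV the reduced wavelength is comparable to the nuclear radius, so the estimated cross section lies well above $2\pi R^{2}$; the ratio approaches unity as the energy increases.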

Contrary to expectations, the experimental cross 
sections, for beams in which the neutrons had a spread 
of energies, showed giant maxima of widths of the order 
of one or two Mev, as indicated schematically in Fig. 2. 
These were reproduced theoretically in the later theory 
of Feshbach, Porter, and Weisskopf⁹ by the scattering
from a complex potential well in which the real part 
represents some average interaction of the incoming 
nucleon with the nuclear particles, and the imaginary 
part, the disappearance of particles out of the incident 


FIG. 2. Sketch of a giant
resonance of the type ob-
served in the scattering of
neutrons by complex nuclei.
⁸ Feshbach, Peaslee, and Weisskopf, Phys. Rev. 71, 145 (1947).
⁹ Feshbach, Porter, and Weisskopf, Phys. Rev. 96, 448 (1954).
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beam into compound states. There is no natural ex- 
planation for such giant resonances in the extreme 
compound-nucleus picture, and the representation of 
the nucleus by a complex well indicates a major modi- 
fication. 

Other phenomena which demanded modification of 
the extreme compound-nucleus picture were observed 
in the inelastic scattering of nucleons by complex 
nuclei. The predictions of the compound-nucleus model, 
supplemented by statistical assumptions, are shown as 
the solid line in Fig. 3. In this picture, the probability 
of leaving the nucleus in the various excited levels is 
assumed the same for all levels, aside from kinematical 
factors. The curve therefore rises as the energy taken 
off by the scattered particle decreases (and the energy 
left to the nucleus increases) and is cut off only when 
the wave number of the escaping nucleon (or nucleons) 
is so small that it will be reflected at the nuclear surface. 
Experimentally, an anomalously large number of fast 
particles was observed, as indicated by the dotted line 
in Fig. 3. Processes responsible for the high-energy 
particles were qualitatively well described by the 
direct interaction formalism of Austern, Butler, and 


FIG. 3. The solid line shows typical predictions of the statistical
model for the spectrum of inelastically-scattered neutrons. The
energy of the emitted neutron is denoted by E′. The dashed line
indicates the type of spectrum observed experimentally.


McManus.¹⁰ Here, the incident particle is assumed to
“chip” off particles from the nuclear surface without 
formation of a compound state. For example, the transi-
tion element for the nucleus to go from state i to state
f is assumed to be


2M p° 
Mi= A xsl Ee EV (r— E)e** ty E d Ed’r, 
(i ro 


where $\exp(i\mathbf{k}_i\cdot\mathbf{r})$ and $\exp(i\mathbf{k}_f\cdot\mathbf{r})$ are the wave functions of
the incident and scattered nucleon, and $\chi_0(\xi)$ and $\chi_f(\xi)$
those of the initial and residual nucleus. Here $\mathbf{r}$ represents
the coordinate of the incident nucleon and $\xi$ the
totality of nuclear coordinates, conventions that are
used throughout. The radius $r_0$ is supposed to define
the point at which the “surface” begins and was chosen 
so as to fit the experimental angular distributions. 
The form of this theory was suggested by the earlier 
description by Butler" of deuteron stripping which is 
also a form of “direct interaction.” However, for sim- 


¹⁰ Austern, Butler, and McManus, Phys. Rev. 92, 350 (1953).
¹¹ S. T. Butler, Proc. Roy. Soc. (London) A208, 559 (1951).
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plicity, we discuss the scattering of nucleons since com- 
plications are introduced by the composite structure of 
deuterons or α-particles.

Many experiments have shown that the high-energy 
particles in these inelastic processes have angular dis-
tributions, usually peaked towards small angles,
whereas the low-energy nucleons have comparatively
structureless, symmetrical angular distributions, as 
would be predicted by the statistical theory, and this is 
a further indication of the direct nature of the high- 
energy processes. 

A further phenomenon which violates the early 
compound-nucleus picture occurs in the dipole photo- 
effect. Here, the absorption cross section for all complex 
nuclei shows a giant maximum in the region of 15 Mev
for heavy nuclei. This maximum was explained by
Goldhaber and Teller¹² and Steinwedel and Jensen¹³ in
terms of a collective oscillation of proton density. How- 


[Figure: two panels showing dσ/dΩ plotted against θ from 0° to 180°.]


FIG. 4. In the upper figure, the angular distribution of the high-
energy part of the inelastic spectrum, Fig. 3, is sketched; in the
lower, that of the low-energy part of the spectrum.


ever, the decay of such a system, in which the energy 
is almost evenly distributed over all particles, ought to 
be adequately represented by the statistical theory. In 
the case of medium and heavy nuclei, the number of 
high-energy protons observed is far in excess of that 
predicted by this theory, often by several orders of 
magnitude. Such high-energy protons are predicted 
naturally by Wilkinson’s picture of the giant dipole 
photoeffect, in which the γ-ray is absorbed by a nucleon
which then goes into a single-particle state in a complex
well, and consequently has an appreciable chance of
escaping with the full energy of the γ-ray before being
absorbed into compound states. (This absorption is 
again described through the imaginary part of the well.) 

Thus, although the old compound-nucleus picture 


¹² M. Goldhaber and E. Teller, Phys. Rev. 74, 1046 (1948).
¹³ H. Steinwedel and J. H. D. Jensen, Z. Naturforsch. 5a, 413 (1950).
¹⁴ D. H. Wilkinson, Physica 22, 1039 (1956).

FIG. 5. The absorption cross section for γ-rays showing the giant dipole
resonance. [Figure: cross section plotted against E, with the maximum near 15 Mev.]


had to be modified, the new phenomena could be repro- 
duced by simple physical models which employed a 
complex well. However, conceptually, the description 
was not complete because it was known from cases 
where the cross sections could be investigated in detail 
that the giant resonances actually consist of thousands 
or millions of compound levels, and the relation between 
the detailed behavior—which often could not be inves- 
tigated experimentally, but could be inferred—and the 
average behavior as predicted by the complex well had 
to be clarified. Further, the connection of the param- 
eters of the optical well with more fundamental quan- 
tities such as the nucleon-nucleon force had to be made. 

A unified description of the above phenomena begins 
from an exact description in terms of nuclear dispersion 
theory. The scattering amplitude is separated into 
terms corresponding to direct interactions and com- 
pound-nucleus processes. Before going into such a 
description, however, we review the dispersion theory 
of Kapur and Peierls,³ which is used in the later develop-
ment, and present a simplified picture illustrating the
physical assumptions employed later in the more 
formal arguments. 


II. KAPUR-PEIERLS DISPERSION FORMALISM 
1. Scattering by a Potential Well 


We begin by treating the scattering of an S-wave
neutron by a potential well; this simple example illustrates
the main features of the theory. The Schrödinger
equation for this case is

$$\frac{\hbar^{2}}{2M}\frac{d^{2}\phi}{dr^{2}} + \bigl[E - V(r)\bigr]\phi = 0, \tag{1}$$

where $V(r)$ is the potential and $\phi(r) = r\psi(r)$ with $\psi$ the
wave function of the neutron. For radii $r$ greater than
some radius $R$, beyond which the potential $V(r)$ is zero,

$$\frac{d^{2}\phi}{dr^{2}} + k^{2}\phi = 0, \qquad r > R, \tag{2}$$

with

$$k^{2} = 2ME/\hbar^{2}. \tag{2.1}$$

The solution of Eq. (2) is

$$\phi = \frac{\sin kr}{k} + S e^{ikr}, \tag{2.2}$$

where the normalization of $\phi$ has been chosen so that
the first term on the right-hand side corresponds to the
S-wave part of a plane wave of unit amplitude, i.e.,
$\exp(i\mathbf{k}\cdot\mathbf{r}) \cong \sin kr/kr$ for $kr \ll 1$; $S$ is the amplitude of the
scattered (outgoing) wave.




In the internal region $r<R$, $\phi$ can be obtained as

$$\phi(r) = \sum_m a_m\phi_m(r), \qquad r<R, \tag{3}$$

where the $\phi_m$ are eigenfunctions satisfying the equation

$$\frac{\hbar^{2}}{2M}\frac{d^{2}\phi_m}{dr^{2}} + (E_m - V)\phi_m = 0, \tag{3.1}$$

and the boundary condition

$$\frac{d\phi_m}{dr}\Big|_{r=R} = ik\,\phi_m(R). \tag{3.2}$$

The eigenvalues $E_m$ are complex because of this imaginary
boundary condition, the boundary condition
depending explicitly on the wave number and hence,
the energy of the incident neutron. We discuss the consequences
of this later. These eigenfunctions form a
complete set. Orthogonality of the $\phi_m$ is easily established
by considering the equation for $\phi_n$,

$$\frac{\hbar^{2}}{2M}\frac{d^{2}\phi_n}{dr^{2}} + (E_n - V)\phi_n = 0, \tag{3.3}$$

multiplying Eq. (3.1) on the left by $\phi_n$ and Eq. (3.3)
on the left by $\phi_m$, and subtracting. Thus,

$$\frac{\hbar^{2}}{2M}\left(\phi_n\frac{d^{2}\phi_m}{dr^{2}} - \phi_m\frac{d^{2}\phi_n}{dr^{2}}\right) + (E_m - E_n)\phi_n\phi_m = 0. \tag{3.4}$$

By integrating this equation from 0 to $R$ and using
Green’s theorem, one finds that

$$\frac{\hbar^{2}}{2M}\left[\phi_n\frac{d\phi_m}{dr} - \phi_m\frac{d\phi_n}{dr}\right]_{r=R} = (E_n - E_m)\int_0^R \phi_m\phi_n\,dr. \tag{3.5}$$

From Eq. (3.2) it can be seen that the left-hand side
vanishes and hence

$$\int_0^R \phi_m(r)\phi_n(r)\,dr = 0, \qquad E_m \neq E_n. \tag{3.6}$$

In general, $E_m \neq E_n$ implies $m \neq n$; the exceptional case
$E_m = E_n$ for $m \neq n$ can be handled by special methods,
but it does not occur for the type of potentials we
consider. For $m=n$, we choose the normalization of the
$\phi_m$'s so that the integral is equal to unity, i.e.,

$$\int_0^R \phi_m^{2}(r)\,dr = 1. \tag{3.7}$$
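
The orthogonality relation (3.6) can be checked numerically. The sketch below assumes a constant potential inside $R$, so that each eigenfunction is proportional to $\sin(zr)$ with a complex $z$ fixed by the boundary condition (3.2), written as $z\cos zR = ik\sin zR$. The function names, the starting guess, and the numerical values are illustrative assumptions.

```python
import cmath


def kapur_peierls_root(index, k, R, tol=1e-14, max_iter=50):
    """Newton solution of z cos(zR) = i k sin(zR), started at (index + 1/2) pi / R."""
    z = complex((index + 0.5) * cmath.pi / R)
    for _ in range(max_iter):
        f = z * cmath.cos(z * R) - 1j * k * cmath.sin(z * R)
        df = (cmath.cos(z * R) - z * R * cmath.sin(z * R)
              - 1j * k * R * cmath.cos(z * R))
        step = f / df
        z -= step
        if abs(step) < tol:
            break
    return z


def overlap(a, b, R):
    """Closed form of the integral of sin(a r) sin(b r) from 0 to R, for a != b."""
    return 0.5 * (cmath.sin((a - b) * R) / (a - b)
                  - cmath.sin((a + b) * R) / (a + b))


R, k = 6.73, 0.2  # fm and fm^-1, illustrative values
z2 = kapur_peierls_root(2, k, R)
z3 = kapur_peierls_root(3, k, R)

plain = overlap(z2, z3, R)                  # integral of phi_m phi_n
hermitian = overlap(z2.conjugate(), z3, R)  # integral of phi_m* phi_n
print(abs(plain), abs(hermitian))
```

The integral of the plain product vanishes to rounding error, whereas the integral containing a complex conjugate does not.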


Two important features are first, the orthogonality
is between $\phi_n$ and $\phi_m$, not $\phi_n^{*}$ and $\phi_m$ as is usually the
case, and secondly, the orthogonality depends essentially
on the fact that $\phi_m$ and $\phi_n$ obey the same boundary
condition.

By using the orthogonality of the $\phi_m$ we can obtain
the $a_m$ of Eq. (3) by again using Green’s theorem.

$$\int_0^R (\phi_m L\phi - \phi L\phi_m)\,dr = \frac{\hbar^{2}}{2M}\left(\phi_m\frac{d\phi}{dr} - \phi\frac{d\phi_m}{dr}\right)_{r=R}, \tag{4}$$

with

$$L = \frac{\hbar^{2}}{2M}\frac{d^{2}}{dr^{2}}. \tag{4.1}$$

From Eqs. (1) and (3.1), one obtains

$$(E - E_m)a_m = -\frac{\hbar^{2}}{2M}\phi_m(R)\left(\frac{d\phi}{dr} - ik\phi\right)_{R} = -\frac{\hbar^{2}}{2M}e^{-ikR}\phi_m(R), \tag{4.2}$$

giving

$$a_m = \frac{\hbar^{2}}{2M}\,\frac{e^{-ikR}}{E_m - E}\,\phi_m(R). \tag{4.3}$$

The joining of the inside and outside solutions at $r=R$
gives

$$\sum_m a_m\phi_m(R) = \frac{\sin kR}{k} + S e^{ikR}, \tag{5}$$

giving

$$S = \frac{1}{k e^{ikR}}\left\{\sum_m \frac{k\hbar^{2}}{2M}\,\frac{[\phi_m(R)]^{2}}{E_m - E}\,e^{-ikR} - \sin kR\right\}. \tag{6}$$

This can be put into familiar form by defining the width

$$\Gamma_m = \frac{\hbar^{2}k}{M}\,[\phi_m(R)]^{2}. \tag{6.1}$$

Then

$$S = \frac{e^{-2ikR}}{2k}\sum_m \frac{\Gamma_m}{E_m - E} - \frac{\sin kR}{k}\,e^{-ikR}, \tag{7}$$

and

$$\sigma = 4\pi|S|^{2}. \tag{7.1}$$

Because of the imaginary boundary condition, the $\phi_m$
and hence the $\Gamma_m$ are complex. For low bombarding
energies, the imaginary part is small as appears later.

The imaginary part of $E_m$ can be found by the development

$$\int_0^R (\phi_m^{*}L\phi_m - \phi_m L\phi_m^{*})\,dr = (E_m^{*} - E_m)\int_0^R \phi_m^{*}\phi_m\,dr. \tag{8}$$

Peierls, Proc. Cambridge Phil. Soc. 44, 242 (1948).

From Green’s theorem, the left-hand side is

$$(ik\hbar^{2}/M)\,\phi_m^{*}\phi_m(R),$$

and if the imaginary part of $E_m$ is denoted by $-\beta_m/2$,
i.e.,

$$E_m = \epsilon_m - i\beta_m/2, \tag{8.1}$$

then from Eq. (8) we find that

$$\beta_m = \frac{\hbar^{2}k}{M}\,\frac{|\phi_m(R)|^{2}}{\int_0^R |\phi_m(r)|^{2}\,dr}. \tag{8.2}$$

This has a simple interpretation. The numerator is
proportional to the escape velocity of the particle $k\hbar/M$
multiplied by the probability of the particle being at
the surface, whereas the denominator represents the
probability of the particle being in the nucleus. At low
energies where the $\phi_m$ are mainly real, as already indicated,
the denominator is nearly unity [see Eq. (3.7)]
and comparing Eqs. (8.2) and (6.1), we see that

$$\beta_m \cong \Gamma_m \quad \text{(low energies)}. \tag{8.3}$$
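
Equations (7) and (7.1) translate directly into a short numerical sketch. The function names and the level parameters below are hypothetical, and the wave number and energy are passed as independent arguments for simplicity. With no levels in the sum the expression reduces to $4\pi\sin^{2}kR/k^{2}$, which is the S-wave cross section of an impenetrable sphere of radius $R$ and serves as a check.

```python
import cmath
import math


def s_wave_amplitude(k, R, levels, energy):
    """Eq. (7); `levels` is a list of (E_m, Gamma_m) pairs, possibly complex."""
    level_sum = sum(gamma / (e_m - energy) for e_m, gamma in levels)
    resonant = cmath.exp(-2j * k * R) / (2 * k) * level_sum
    background = math.sin(k * R) / k * cmath.exp(-1j * k * R)
    return resonant - background


def cross_section(k, R, levels, energy):
    """Eq. (7.1): sigma = 4 pi |S|^2."""
    return 4 * math.pi * abs(s_wave_amplitude(k, R, levels, energy)) ** 2


k, R = 0.3, 5.0
hard_sphere = cross_section(k, R, [], energy=1.0)

# One hypothetical level with E_m = 1.2 - 0.05i and Gamma_m = 0.1.
with_level = cross_section(k, R, [(1.2 - 0.05j, 0.1)], energy=1.0)
print(hard_sphere, with_level)
```

Because the level sum enters the amplitude linearly, further levels are added simply by extending the list.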


To obtain a better idea of the sizes of the various quantities,
it is useful to illustrate the formalism by evaluating
them for a square well. We choose a well $V=-U$,
$r<R_0$; $V=0$, $r>R_0$; with $U=42$ Mev and $R_0=1.45\,A^{1/3}\times10^{-13}$ cm,
where $A$ is the mass number; these are
the parameters of the well employed by Feshbach,
Porter, and Weisskopf,⁹ aside from the imaginary part
employed by them which is introduced later. Consequently,
these parameters are typical of those considered
later. The solutions of Eqs. (3.1) and (3.2) are
compared later with $\phi_m^{(0)}$ defined by

$$\frac{\hbar^{2}}{2M}\frac{d^{2}\phi_m^{(0)}(r)}{dr^{2}} + \bigl(E_m^{(0)} - V\bigr)\phi_m^{(0)}(r) = 0, \tag{9}$$

$$\frac{d\phi_m^{(0)}}{dr}\Big|_{r=R} = 0. \tag{9.1}$$

This real boundary condition is that of the Wigner-Eisenbud
theory, and the solutions $\phi_m^{(0)}$ are real. The
functions $\phi_m^{(0)}$ are considered because it is easier to
calculate these real functions and, at low energies, they
are a good zero-order approximation to $\phi_m$ as we show.
In the square well $\phi_m$ is equal to

$$\phi_m = A_m\sin(K_m - i\kappa_m)r, \tag{9.2}$$

with $K_m$ and $\kappa_m$ defined by

$$(K_m - i\kappa_m)^{2} = 2M(E_m + U)/\hbar^{2}. \tag{9.3}$$

For simplicity, the joining radius $R$ in Eq. (3.2) has
been chosen to be the edge of the well, $R_0$. It can be
chosen differently and is generally chosen somewhat
larger, but this is immaterial for the qualitative features
of the results derived here. The boundary condition,
Eq. (3.2), can now be written

$$(K_m - i\kappa_m)\cos(K_m - i\kappa_m)R = ik\sin(K_m - i\kappa_m)R. \tag{9.4}$$

Now $\kappa_mR$ is shown to be small at low energies;
Eq. (9.4) is expanded to first order in this quantity.
Then

$$(K_m - i\kappa_m)(\cos K_mR + i\kappa_mR\sin K_mR) \cong ik(\sin K_mR - i\kappa_mR\cos K_mR). \tag{9.5}$$

This gives

$$\kappa_m \cong k\sin K_mR/(K_mR\sin K_mR - \cos K_mR). \tag{9.6}$$

Since $\cos K_mR \ll K_mR\sin K_mR$ [see Eq. (9.8)], it follows
that

$$\kappa_m \cong k/K_mR. \tag{9.7}$$

By using this value for $\kappa_m$, one obtains from Eq. (9.5)

$$\cos K_mR \cong -(kR)^{2}\sin K_mR/(K_mR)^{3} \tag{9.8}$$

to lowest order in $\kappa_mR$. Now $K_mR$ is large, of the order
of 10 even for medium $A$ and zero incident energy, and
so $\kappa_m$ and $\cos K_mR$ are small. Equation (9.7) shows that
$\kappa_mR$ is small, which is necessary for the rapid convergence
of the expansion above, as long as $E \ll U$,
that is, as long as the bombarding energy is much less
than the depth of the well.

We now let $K_mR = K_m^{(0)}R + \delta K_mR$, where $\phi_m^{(0)} = (2/R)^{1/2}\sin K_m^{(0)}r$,
and $K_m^{(0)}R = (n+\tfrac{1}{2})\pi$ gives a solution
to Eqs. (9) and (9.1). To lowest order in $\kappa_mR$, Eq.
(9.8) shows that

$$\delta K_m = k[kR/(K_mR)^{3}]. \tag{9.9}$$

Although this is a second-order correction (in the sense
that $\delta K_mR$ is of second order in $kR$), it is given correctly
to this order by our first-order expression Eq. (9.5),
since further terms enter only in third order. We find,
further, that

$$\delta\epsilon_m = \epsilon_m - E_m^{(0)} = (\hbar^{2}/2M)(2K_m\,\delta K_m - \kappa_m^{2}) = [1/(K_mR)^{2}]\,\epsilon_m, \tag{9.10}$$

where the energies are measured from the bottom of
the well. For a typical value of $K_mR$ of ~10, this means
that calculation of the real part of the resonance energy
from the boundary condition Eq. (9.1) gives results
accurate to ~1%.
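
The orders of magnitude quoted above can be reproduced with a few lines of arithmetic. The sketch below uses the well parameters stated in the text ($U = 42$ Mev, $R_0 = 1.45\,A^{1/3}\times10^{-13}$ cm); the function name is hypothetical, and the internal wave number is approximated by its value at the bombarding energy rather than at a particular resonance, which is an assumption made for illustration.

```python
import math

HBAR_C = 197.327        # MeV fm
NUCLEON_MC2 = 939.565   # MeV, neutron rest energy


def square_well_estimates(mass_number, bombarding_energy_mev, depth_mev=42.0):
    """Rough sizes of K R, kappa R, delta K and the relative energy shift."""
    R = 1.45 * mass_number ** (1.0 / 3.0)                        # fm
    k = math.sqrt(2 * NUCLEON_MC2 * bombarding_energy_mev) / HBAR_C
    K = math.sqrt(2 * NUCLEON_MC2 * (bombarding_energy_mev + depth_mev)) / HBAR_C
    kappa = k / (K * R)                                          # Eq. (9.7)
    delta_K = k * (k * R) / (K * R) ** 3                         # Eq. (9.9)
    return {
        "KR": K * R,
        "kappaR": kappa * R,
        "delta_K": delta_K,
        "relative_shift": 1.0 / (K * R) ** 2,                    # factor in Eq. (9.10)
    }


est = square_well_estimates(mass_number=100, bombarding_energy_mev=1.0)
for name, value in est.items():
    print(f"{name:15s} {value:.4g}")
```

For $A = 100$ and a bombarding energy of 1 Mev this gives $K_mR$ close to 10, $\kappa_mR$ of about 0.15, and a relative shift of about 1%, in line with the estimates in the text.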

The imaginary part $\beta_m$ of $E_m$ is given to first order
in $\kappa_mR$ by $(\hbar^{2}k/M)[\phi_m^{(0)}(R)]^{2}$, which is equal to

$$\beta_m = 2\hbar^{2}k/MR \cong \Gamma_m,$$

since

$$\phi_m^{(0)}(r) = (2/R)^{1/2}\sin K_m^{(0)}r.$$

Thus, to lowest order in $\kappa_mR$, the $\Gamma_m$ are real. The
dependence of the real parts of the resonance energies
$\epsilon_m$ on the bombarding energy enters only in the second
order of this quantity.
Generalization of these results to the
complex square well is trivial. With the introduction of
$\mathcal{V}(r) = -U - iW$ (we use $\mathcal{V}$ to denote a complex well;
eigenfunctions in this complex well are denoted by
$\tilde{\phi}_m(r)$, etc.), $W$ simply shifts the resonance energy
by the constant amount $-iW$. Hence,

$$\mathcal{E}_m = E_m - iW = \epsilon_m - i\beta_m/2 - iW. \tag{9.13}$$

Consequently, introduction of an imaginary part
broadens the resonance levels. The eigenfunctions are
unchanged, i.e., $\tilde{\phi}_m(r) = \phi_m(r)$ in the special case of the
square well. The development for the case in which the 
real and imaginary parts vary with r is easily carried 
out. This development can also be carried out for 
complex velocity-dependent potentials¹⁶; this is im-
portant since the parameters of the optical model wells
occurring in practice depend on the bombarding energy. 
It is convenient to take the depth of the well to be 
different for the different eigenenergies at a given 
bombarding energy. In discussing a velocity-dependent 
well V(r), one can consider V to be a nonlocal operator 
V(r',r), which is equivalent to a velocity-dependent 
potential. The wave equation is 


$$\frac{\hbar^{2}}{2M}\frac{d^{2}\phi_m(r')}{dr'^{2}} + E_m\phi_m(r') - \int V(r',r)\,\phi_m(r)\,dr = 0. \tag{10}$$

Orthogonality of the $\phi_m$ requires $V(r',r) = V(r,r')$;
potentials not obeying this relation can be shown to be
physically unreasonable.¹⁶ Proof of orthogonality assumes
$V(r,r')$ to be zero unless both $r$ and $r'$ are less
than $R$.

The resonance treatment for a neutron in the complex 
potential has been developed in some detail, not only to 
illustrate the formalism, but also because knowledge of 
the positions, widths, and spacings of the levels is often 
useful in theoretical estimates. 
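
As an illustration of such estimates, the following sketch lists zero-order S-wave level positions and widths for the complex square well. It assumes the zero-order condition $K_m^{(0)}R = (n+\tfrac{1}{2})\pi$, the first-order width $\beta_m = 2\hbar^{2}k/MR$ with $k$ evaluated at the level position, and a total width $\beta_m + 2W$ as implied by Eq. (9.13). The function name and the choice $A = 100$ are hypothetical.

```python
import math

HBAR_C = 197.327        # MeV fm
NUCLEON_MC2 = 939.565   # MeV


def zero_order_levels(mass_number, depth_mev=42.0, absorption_mev=0.0, max_levels=6):
    """Return (n, position, width) in MeV for unbound zero-order S-wave levels."""
    R = 1.45 * mass_number ** (1.0 / 3.0)            # fm
    levels = []
    for n in range(max_levels):
        K0 = (n + 0.5) * math.pi / R                 # zero-order internal wave number
        position = (HBAR_C * K0) ** 2 / (2 * NUCLEON_MC2) - depth_mev
        if position <= 0.0:
            continue                                 # bound level: no escape width
        k = math.sqrt(2 * NUCLEON_MC2 * position) / HBAR_C
        beta = 2 * HBAR_C ** 2 * k / (NUCLEON_MC2 * R)
        levels.append((n, position, beta + 2 * absorption_mev))
    return levels


for n, position, width in zero_order_levels(100, absorption_mev=3.0):
    print(f"n = {n}:  position = {position:6.2f} MeV   width = {width:6.2f} MeV")
```

Changing `absorption_mev` shows how the imaginary part of the well broadens every level by the same amount without moving it.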

Generalization of these results to the case of nonzero
angular momentum is easy. In this case we express the
function, regular at the origin, $\Psi_\alpha(\mathbf{r})$ as

$$\Psi_\alpha(\mathbf{r}) = \Theta_\alpha(\theta,\varphi)\,\phi_\alpha(r), \tag{11}$$

where $\Theta_\alpha(\theta,\varphi)$ is the normalized function of angles, e.g.,
in the case of spinless particles, $\Theta_\alpha(\theta,\varphi)$ is equal to
$Y_l^{m}(\theta,\varphi)$, where $Y_l^{m}$ is the normalized spherical harmonic,
and $\phi_\alpha$ is the radial function in the potential
well, regular at the origin and asymptotic to

$$e^{i\delta_l}\sin[kr - (l\pi/2) + \delta_l],$$

where $\delta_l$ is the phase shift for the $l$th wave. The lower
index $\alpha$ labels all angular momentum quantum numbers,
i.e., in the terminology of Wigner and co-workers, it is
the “channel” index. Further, solutions asymptotic to
outgoing waves are denoted by $\Psi_\alpha^{+}(\mathbf{r})$, i.e.,

$$\Psi_\alpha^{+}(\mathbf{r}) = \Theta_\alpha(\theta,\varphi)\,\phi_\alpha^{+}(r), \tag{11.1}$$


an ae I, BRIA and C. T. De Dominicis, 
_ See Bppendiz C or A A72, 70 (1958). 
ee 
where $\phi_\alpha^+(r)$ is asymptotic to $\exp(ikr-il\pi/2)$ in the case
of neutrons. Generalization to the case of charged
particles is easily made. Here $\phi_\alpha^+(r)$ is asymptotic to
$\exp(ikr-i\eta\ln 2kr-il\pi/2+i\sigma_l)$, where $\eta=Ze^2/\hbar v$, with $v$
the velocity of the proton, and $\sigma_l$ is the Coulomb phase
shift. The solution of the radial Schrödinger equation can
be expressed for $r>R$ as

$$\phi_\alpha(r) = [\phi_\alpha^+(r)-\phi_\alpha^-(r)]/2ik + S_\alpha\phi_\alpha^+(r), \tag{12}$$

with $\phi_\alpha^- = (\phi_\alpha^+)^*$, the solution asymptotic to incoming
waves. The eigenfunctions $\phi_m$ (these have the same
angular dependence as the $\phi_\alpha(r)$, so we do not carry the
subscript $\alpha$ on them) are then determined by the
boundary condition

$$\left[\frac{d\phi_m}{dr} = f_\alpha^+(r)\,\phi_m(r)\right]_{r=R}, \tag{12.1}$$

where

$$f_\alpha^+(r) = \frac{1}{\phi_\alpha^+(r)}\frac{d\phi_\alpha^+(r)}{dr} \tag{12.2}$$


is the logarithmic derivative for an outgoing wave at 
the joining radius. Again, $\phi_\alpha(r)$ can be expanded in
terms of the $\phi_m$,

$$\phi_\alpha(r) = \sum_m A_m\phi_m(r), \qquad r<R, \tag{12.3}$$


and the development analogous to that of Eqs. (4) to 
(4.3) now gives 


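$$(E-E_m)A_m = -\frac{\hbar^2}{2M}\left[\phi_m\left(\frac{d}{dr}-f_\alpha^+\right)\phi_\alpha\right]_{r=R}. \tag{12.4}$$

The use of Eq. (12) for $\phi_\alpha$ leaves only the term in $\phi_\alpha^-$
on the right-hand side of Eq. (12.4), and we obtain

$$(E-E_m)A_m = \frac{\hbar^2}{2M}\frac{1}{2ik}\left[(f_\alpha^- - f_\alpha^+)\,\phi_\alpha^-\phi_m\right]_{r=R}. \tag{12.5}$$

The Wronskian is

$$[f_\alpha^+(r)-f_\alpha^-(r)]\,\phi_\alpha^+(r)\,\phi_\alpha^-(r) = 2ik, \tag{12.6}$$

so that we obtain

$$A_m = \frac{\hbar^2}{2M}\frac{1}{\phi_\alpha^+(R)}\frac{\phi_m(R)}{E_m-E}. \tag{12.7}$$

The joining of the inside and outside solutions at $r=R$
now gives

$$\frac{1}{\phi_\alpha^+(R)}\sum_m\frac{\hbar^2}{2M}\frac{\phi_m^2(R)}{E_m-E} = \frac{\phi_\alpha^+(R)-\phi_\alpha^-(R)}{2ik} + S_\alpha\phi_\alpha^+(R). \tag{12.8}$$

We define

$$\Gamma_m = \frac{k\hbar^2}{M}P(R)\,\phi_m^2(R), \tag{12.9}$$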




where P is the penetrability, 


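$$P(R) = \frac{1}{\phi_\alpha^+(R)\,\phi_\alpha^-(R)} = \frac{1}{|\phi_\alpha^+(R)|^2}. \tag{12.10}$$

We then have

$$S_\alpha = \frac{1}{2k}\frac{\phi_\alpha^-(R)}{\phi_\alpha^+(R)}\sum_m\frac{\Gamma_m}{E_m-E} - \frac{1}{2ik}\left(1-\frac{\phi_\alpha^-(R)}{\phi_\alpha^+(R)}\right). \tag{12.11}$$

The cross section is given by

$$\sigma_\alpha = 4\pi|S_\alpha|^2 = \frac{\pi}{k^2}\left|\frac{\phi_\alpha^-(R)}{\phi_\alpha^+(R)}\sum_m\frac{\Gamma_m}{E_m-E} + i\left(1-\frac{\phi_\alpha^-(R)}{\phi_\alpha^+(R)}\right)\right|^2. \tag{12.12}$$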
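The step from the matching condition, Eq. (12.8), to the width form, Eq. (12.11), can be checked numerically. The following is a minimal sketch, assuming S-wave neutrons with $\phi_\alpha^\pm(R)=\exp(\pm ikR)$ and units in which $\hbar^2/2M=1$; the function names and the sample values of $\phi_m(R)$ and $E_m$ are hypothetical and serve only as an illustration.

```python
import cmath

def s_from_matching(k, R, phi_m_R, E_m, E):
    """Solve Eq. (12.8) for S (units hbar^2/2M = 1, S-wave neutrons assumed)."""
    phi_plus = cmath.exp(1j * k * R)      # outgoing wave at the joining radius
    phi_minus = phi_plus.conjugate()      # incoming wave, Eq. (12)
    inside = sum(p * p / (Em - E) for p, Em in zip(phi_m_R, E_m)) / phi_plus
    return (inside - (phi_plus - phi_minus) / (2j * k)) / phi_plus

def s_from_widths(k, R, phi_m_R, E_m, E):
    """Evaluate Eq. (12.11) with Gamma_m = (k hbar^2 / M) P phi_m(R)^2 = 2 k P phi_m(R)^2."""
    phi_plus = cmath.exp(1j * k * R)
    phi_minus = phi_plus.conjugate()
    P = 1.0 / abs(phi_plus) ** 2          # penetrability, Eq. (12.10)
    ratio = phi_minus / phi_plus
    resonances = sum(2.0 * k * P * p * p / (Em - E) for p, Em in zip(phi_m_R, E_m))
    return ratio * resonances / (2.0 * k) - (1.0 - ratio) / (2j * k)

# Illustrative (made-up) complex-well eigenvalues and boundary values.
k, R, E = 0.7, 3.0, 1.3
phi_m_R = [0.4, -0.3]
E_m = [1.0 - 0.2j, 2.5 - 0.1j]
s_a = s_from_matching(k, R, phi_m_R, E_m, E)
s_b = s_from_widths(k, R, phi_m_R, E_m, E)
```

Both routes give the same amplitude to rounding error, which is just the statement that Eq. (12.9) absorbs the factor $\hbar^2/2M$ and the penetrability into $\Gamma_m$.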


2. The Many-Body Case 


We consider the case in which an incident nucleon is 
scattered either elastically or inelastically by a nucleus 
of $A$ particles.¹⁷ Once this particle is inside the radius $R$
(which is chosen so that the interaction between the
incident particle and the nucleus vanishes outside this
radius) the wave function can again be expanded in
compound states $\Phi^{(p)}(\mathbf r,\xi)$. Here we use $\mathbf r$ for the coordinate
of the incident nucleon and $\xi$ to label the totality
of coordinates of the $A$ particles inside the nucleus.
For the moment the incident particle is treated as dis- 
tinguishable from those in the nucleus, but later the 
generalization to the case where it is identical with 
particles in the nucleus is indicated. 

The compound states $\Phi^{(p)}(\mathbf r,\xi)$ obey the equations

$$H\Phi^{(p)}(\mathbf r,\xi) = W_p\Phi^{(p)}(\mathbf r,\xi), \qquad H = H_\xi + T(\mathbf r) + V(\mathbf r,\xi). \tag{13}$$

$H_\xi$ is the Hamiltonian of the $A$ nuclear particles, $T(\mathbf r)$
is the kinetic energy of the incident particle, and
$V(\mathbf r,\xi)$ is the potential interaction between the $A$ nuclear
particles and the incident nucleon. It is a sum of
nucleon-nucleon potentials,

$$V(\mathbf r,\xi) = \sum_{i=1}^{A}V(\mathbf r,\mathbf r_i),$$

the nucleon to which the coordinate $\mathbf r$ refers being the
incident particle. It is assumed that $V(\mathbf r,\xi)$ is a
well-behaved potential, i.e., a potential without a strong
repulsive core and other singularities; generalization to 
the case that it is not can be made using techniques 
developed by Watson and Brueckner, but this only 
introduces nonessential complications for the points 
considered here. 


17 The emission of composite particles is dropped from the 
dispersion-theory description later on when one makes the 
assumption that the interaction between the incident particle and 
all of the A particles vanishes beyond the joining radius R. These 
processes can presumably be included by modifying the formalism. 


States $\Phi^{(p)}$ and $\Phi^{(q)}$ are not now orthogonal, but
rather $\bar\Phi^{(p)}$ and $\Phi^{(q)}$, where $\bar\Phi^{(p)}$ is obtained from $\Phi^{(p)}$
by taking the complex conjugate of all functions of
angles, in particular, of the $\Theta_\alpha(\theta,\varphi)$ occurring in the
single-particle function. Equivalently, to within an
arbitrary phase factor, $\bar\Phi^{(p)}$ is obtained from $\Phi^{(p)}$ by
rotating the wave function so that the total angular
momentum component $M$ is changed into $-M$ without
taking the complex conjugate of the function of intrinsic
structure (e.g., the radial wave function in the
single-particle case). We choose

$$\int_0^R\bar\Phi^{(p)}\Phi^{(q)}\,d^3\xi\,d\mathbf r = 1, \qquad (p=q), \tag{13.1}$$

and the integral is zero for $p\neq q$. In using the orthogonality
later on, we do not indicate the bar over the
$\Phi$ on the left since the change in the angular momentum
functions is a trivial one to introduce once we introduce 
the expansion, Eq. (14), and all integrals over angles 
are easy to carry out. 

We introduce a complete set of states $\chi_j(\xi)$ for the
A particles. Boundary conditions for these states could 
be chosen in various ways. However, the tightly bound 
states close to the ground state in which we are mainly 
interested are insensitive to the precise boundary con- 
dition, e.g., if we choose the joining radius R fairly far 
out, then both the wave function and its derivative 
are small for a bound state defined in any sensible 
way, and it would not matter much whether we chose 
the former or the latter to be zero, or to have some small 
finite value at that radius as our boundary condition. 
Difficulties enter when the excitation energy $\epsilon_j$, which
we measure from $\epsilon_0$ as origin, of the state $\chi_j$ is sufficiently
large so that one of the $A$ particles can escape
from the nucleus. However, since the only property of 
these highly excited states that we use is that they, 
together with those of low excitation, form a complete 
set, we believe our results to be independent of the way 
in which these boundary conditions are chosen. 

In order to make the connection with the optical 
model later, it is convenient to employ states $\psi_m(\mathbf r)$ in
a complex well of the type discussed in the last section
for the incident particle. We can then expand

$$\Phi^{(p)}(\mathbf r,\xi) = \sum_{jm}a_{jm}^p\,\chi_j(\xi)\,\psi_m^j(\mathbf r). \tag{14}$$

The boundary condition on $\Phi^{(p)}$ at $r=R$ can now be
simply stated in terms of the boundary conditions on
the $\psi_m^j$. It is convenient, in analogy with Eq. (11), to
introduce radial functions $\phi_m^j$ through

$$r\psi_m^j(\mathbf r) = \Theta_\alpha(\theta,\varphi)\,\phi_m^j(r). \tag{14.1}$$

The boundary condition is then

$$\left[\frac{d\phi_m^j}{dr} = f_\alpha^+(r,E-\epsilon_j)\,\phi_m^j(r)\right]_{r=R}. \tag{14.2}$$
The $E-\epsilon_j$ in the argument of $f_\alpha^+$ indicates that it is
to be taken for that energy. The boundary condition
is chosen to be that of an outgoing wave at the reduced
energy available to the outgoing particle after leaving
the residual nucleus in excited state $j$. The channel index
$\alpha$ on the $\psi_m$ is again suppressed. Because of the dependence
of $f_\alpha^+$ on $\epsilon_j$, the $\psi_m$'s also depend on $j$, as
indicated; however, this dependence is weak. The fact
that the boundary condition on the $\psi$'s depends on $j$
does not upset the orthogonality of the $\Phi^{(p)}$'s as long
as one always integrates over the $\xi$ coordinates first.

We can express the wave function $\Psi(\mathbf r,\xi)$, which is
the solution to the Schrödinger equation,

$$H\Psi = E\Psi, \tag{15}$$

as

$$\Psi(\mathbf r,\xi) = \sum_p a_p\Phi^{(p)}(\mathbf r,\xi), \qquad r<R, \tag{15.1}$$

and as

$$\Psi(\mathbf r,\xi) = I_\alpha\left\{\frac{\psi_\alpha^+(\mathbf r)-\psi_\alpha^-(\mathbf r)}{2ik}\right\}\chi_0(\xi) + S_\alpha\psi_\alpha^+(\mathbf r)\chi_0(\xi) + \sum_{\alpha',j}S_{\alpha\alpha',j}\,\psi_{\alpha'}^{+j}(\mathbf r)\,\chi_j(\xi) \quad\text{for } r>R, \tag{15.2}$$

where $S_\alpha$ is the amplitude for elastic scattering in
channel $\alpha$ and $S_{\alpha\alpha',j}$ is the amplitude for inelastic scattering
from channel $\alpha$ into channel $\alpha'$, leaving the
nucleus excited in state $\chi_j$. The $\psi_{\alpha'}^{+j}$ represents an
outgoing wave at energy $E-\epsilon_j$ in channel $\alpha'$. The coefficient
$I_\alpha$ represents the amplitude for channel $\alpha$ in the
plane wave. In the case of spinless particles,

$$e^{i\mathbf k\cdot\mathbf r} = (4\pi)^{1/2}\sum_l i^l(2l+1)^{1/2}\,Y_l^0(\theta)\,j_l(kr). \tag{15.3}$$

If we set $\alpha=l$, i.e., identify $Y_l^0$ with $\Theta_\alpha$ for this case,
and use

$$j_l(kr) \to \frac{\sin(kr-l\pi/2)}{kr}, \tag{15.4}$$

we choose

$$I_l = i^l(2l+1)^{1/2} \tag{15.5}$$

for this special case. The $(4\pi)^{1/2}$ is taken into account at
a later stage by multiplying by $4\pi$ to obtain the cross
section, as in Eqs. (7.1) and (12.12); this convention
makes our general treatment correspond to our S-wave
case when $l=0$.

We can again find the $a_p$ by Green's theorem,

$$\int_0^R(\Phi^{(p)}H\Psi-\Psi H\Phi^{(p)})\,d^3\xi\,d\mathbf r = (E-W_p)\,a_p = -\frac{\hbar^2}{2M}\sum_{jm}a_{jm}^p\int d^3\xi\,\chi_j(\xi)\int d\Omega\left[r\psi_m^j\frac{d}{dr}(r\Psi) - r\Psi\frac{d}{dr}(r\psi_m^j)\right]_{r=R}. \tag{16}$$

With the use of Eq. (14.2), we find

$$a_p = \frac{\hbar^2}{2M(W_p-E)}\sum_{jm}a_{jm}^p\int d^3\xi\,\chi_j(\xi)\int d\Omega\left[r\psi_m^j(\mathbf r)\left\{\frac{d}{dr}-f_\alpha^+(r,E-\epsilon_j)\right\}r\Psi\right]_{r=R}, \tag{16.1}$$

where we have indicated the energy dependence of the
$f_\alpha^+$ explicitly. The operator in [ ]'s in Eq. (16.1) gives
zero operating on any "outgoing" part of $\Psi$, and the
only term that contributes in Eq. (15.2) is the term
in $\psi_\alpha^-(\mathbf r)\chi_0(\xi)$. Hence, using the orthogonality of the $\chi$'s,

$$a_p = -\frac{\hbar^2}{2M(W_p-E)}\,\frac{I_\alpha}{2ik}\sum_m a_{0m}^p\,[f_\alpha^-(R)-f_\alpha^+(R)]\int d\Omega\,R\psi_m(R)\,R\psi_\alpha^-(R). \tag{16.2}$$

The integral over $\Omega$ guarantees that the angular dependence
of the $\psi_m$ is the same as that of the incident
channel $\alpha$. With this restriction understood, we can
write $a_p$ in terms of the radial part of the wave functions,

$$a_p = -I_\alpha\frac{\hbar^2}{2M(W_p-E)}\,\frac{1}{2ik}\,[f_\alpha^-(R)-f_\alpha^+(R)]\,\phi_\alpha^-(R)\sum_m a_{0m}^p\,\phi_m(R). \tag{16.3}$$

We can simplify Eq. (16.3) by using the Wronskian,
Eq. (12.6), obtaining

$$a_p = I_\alpha\frac{\hbar^2}{2M(W_p-E)}\,\frac{1}{\phi_\alpha^+(R)}\sum_m a_{0m}^p\,\phi_m(R). \tag{16.4}$$

By having $a_p$, we can now obtain $S_\alpha$ and $S_{\alpha\alpha',j}$ by
matching the internal and external wave functions,
Eqs. (15.1) and (15.2), at the joining radius. If both
equations are multiplied by $\chi_0(\xi)$ and $R\Theta_\alpha$, and integrated
over $d^3\xi$ and $d\Omega$, one obtains

$$S_\alpha\phi_\alpha^+(R) = \sum_p a_p\sum_m a_{0m}^p\,\phi_m(R) - I_\alpha\,[\phi_\alpha^+(R)-\phi_\alpha^-(R)]/2ik. \tag{17}$$

If we introduce the width,

$$\gamma_p = \frac{k\hbar^2}{M}\frac{1}{|\phi_\alpha^+(R)|^2}\sum_{m,m'}a_{0m}^p a_{0m'}^p\,\phi_m(R)\,\phi_{m'}(R), \tag{17.1}$$

we can write

$$S_\alpha = I_\alpha\left\{\frac{1}{2k}\frac{\phi_\alpha^-(R)}{\phi_\alpha^+(R)}\sum_p\frac{\gamma_p}{W_p-E} - \frac{1}{2ik}\left(1-\frac{\phi_\alpha^-(R)}{\phi_\alpha^+(R)}\right)\right\}, \tag{18}$$

which is of the same form as Eq. (12.11). The cross
section $\sigma_\alpha$ for elastic scattering is given by

$$\sigma_\alpha = 4\pi|S_\alpha|^2. \tag{18.1}$$


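For spinless particles, $\alpha$ can be replaced by the label $l$,
and with the value of $I_l$ given by Eq. (15.5), one obtains

$$\sigma_l = (2l+1)\frac{\pi}{k^2}\left|\frac{\phi_l^-(R)}{\phi_l^+(R)}\sum_p\frac{\gamma_p}{W_p-E} + i\left(1-\frac{\phi_l^-(R)}{\phi_l^+(R)}\right)\right|^2 \tag{18.2}$$

for the total scattering through channel $l$. Since the
incident particle is described by a plane wave, Eq.
(15.3), the differential cross section is given as the square
of a sum of terms involving different $l$,

$$\frac{d\sigma(\theta)}{d\Omega} = 4\pi\left|\sum_l i^{-l}S_l\,Y_l^0(\theta)\right|^2. \tag{18.3}$$

Similarly, the amplitude for inelastic scattering is

$$S_{\alpha\alpha',j} = \frac{1}{\phi_{\alpha'}^{+j}(R)}\sum_p a_p\sum_m a_{jm}^p\,\phi_m^j(R). \tag{19}$$

In the inelastic case, the widths do not occur naturally
in the amplitude, but if quantities $u_p$ and $u_{pj}$ are defined
by

$$u_p = \left(\frac{k\hbar^2}{M}\right)^{1/2}\frac{1}{|\phi_\alpha^+(R)|}\sum_m a_{0m}^p\,\phi_m(R), \tag{19.1}$$

and

$$u_{pj} = \left(\frac{k_j\hbar^2}{M}\right)^{1/2}\frac{1}{|\phi_{\alpha'}^{+j}(R)|}\sum_m a_{jm}^p\,\phi_m^j(R),$$

then

$$\sigma_{\alpha\alpha',j} = \frac{4\pi k_j}{k}|S_{\alpha\alpha',j}|^2 = \frac{\pi}{k^2}|I_\alpha|^2\left|\sum_p\frac{u_p u_{pj}}{W_p-E}\right|^2. \tag{19.2}$$

The ratio $k_j/k$, with $k_j$ defined by $k_j^2=2M(E-\epsilon_j)/\hbar^2$,
is the ratio of the velocity of the outgoing particle to
that of the incident one, which enters into the cross
section.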

18 The factor $i^{-l}$ enters because $S_l$ is the coefficient of
$\exp(ikr-il\pi/2)$, so that the coefficient of $\exp ikr$ is
$S_l\exp(-il\pi/2)=S_l\,i^{-l}$.

In case one term, $p$, in the sum in Eq. (19.2) contributes
the main part of the sum, the inelastic cross
section can be put into the form of the Breit-Wigner
formula

$$\sigma_{\alpha\alpha',j} = \frac{\pi}{k^2}|I_\alpha|^2\frac{\gamma_p\gamma_{pj}}{|W_p-E|^2}. \tag{19.3}$$

(See the Breit-Wigner formula given in the introduction;
there we indicated the channel index on the $\gamma$'s
explicitly.) In Eq. (19.3)

$$\gamma_p = u_p^2,$$

as previously, and

$$\gamma_{pj} = u_{pj}^2 \tag{19.4}$$

is the width for the inelastic process.

Now $W_p=\epsilon_p-i\alpha_p/2$, where $\epsilon_p$ and $\alpha_p$ are real, and
we are interested in the width $\alpha_p$. This can be obtained
from the original and conjugate equations,

$$(H-W_p)\Phi^{(p)} = 0, \qquad (H-W_p^*)\Phi^{(p)*} = 0. \tag{20}$$

If the first equation is multiplied on the left by $\Phi^{(p)*}$
and the second by $\Phi^{(p)}$, one obtains, after subtracting
and integrating,

$$i\alpha_p\int_0^R\Phi^{(p)*}\Phi^{(p)}\,d^3\xi\,d\mathbf r = -\frac{\hbar^2}{2M}\int d^3\xi\int d\Omega\left[r\Phi^{(p)}\frac{d}{dr}(r\Phi^{(p)*}) - r\Phi^{(p)*}\frac{d}{dr}(r\Phi^{(p)})\right]_{r=R}. \tag{21}$$

By using the expansion Eq. (14) and the boundary
conditions Eq. (14.2), this becomes

$$i\alpha_p\int_0^R\Phi^{(p)*}\Phi^{(p)}\,d^3\xi\,d\mathbf r = -\frac{\hbar^2}{2M}\sum_{j,m,m'}a_{jm}^{p*}a_{jm'}^p\,\phi_m^{j*}(R)\,\phi_{m'}^j(R)\,[f_{\alpha'}^{+*}(E-\epsilon_j;R)-f_{\alpha'}^{+}(E-\epsilon_j;R)].$$

Use of the Wronskian, Eq. (12.6), results in

$$\alpha_p = \frac{\dfrac{\hbar^2}{M}\displaystyle\sum_{j,m,m'}k_j\,a_{jm}^{p*}a_{jm'}^p\,\phi_m^{j*}(R)\,\phi_{m'}^j(R)\big/\phi_{\alpha'}^{+j}(R)\,\phi_{\alpha'}^{+j*}(R)}{\displaystyle\int_0^R\Phi^{(p)*}\Phi^{(p)}\,d^3\xi\,d\mathbf r}, \tag{22.1}$$

where the sum over $j$ includes the term $j=0$. The right-hand
side of Eq. (22.1) has a simple interpretation
similar to that of Eq. (8.2). In the numerator the $j$th
term in the sum is proportional to the [...]
the particle being at the radius $R$, mu[ltiplied by the]
escape velocity and the relevant [...]. The integral
in the denominator represents [...]
the particle in compound st[ate ...].
(The normalization is su[ch that this is not]
necessarily unity.) In the low-energy region where all
of the quantities on the right-hand side of Eq. (22.1)
are essentially real, as is shown later, this reduces to

$$\alpha_p = \sum_j\gamma_{pj}. \tag{22.2}$$

This is interpreted as the total width being equal to the
sum of partial widths for the various processes.

Equation (18) is similar to a many-level Breit-Wigner 
formula. This relationship between the scattering am- 
plitude and the sum of resonance terms is linear, and 
this is why it is often simpler to employ in the many- 
level case than the Wigner-Eisenbud expressions. How- 
ever, before it can be applied to the case in which only 
one or two levels contribute, as in the application to 
low-energy reactions with light nuclei, it is sometimes 
necessary to make the energy dependence of the $\gamma_p$'s
(which results from the dependence of the boundary
conditions on $E$) explicit, and to determine the imaginary
part [the $\gamma_p$'s defined by Eq. (17.1) are complex].
The determination of these is equivalent to relating
the formalism back to the Wigner-Eisenbud one in
which the widths are real and independent of energy.
The approximate relationship was developed in perturbation
theory by Kapur and Peierls, but we prefer
to use a procedure developed by A. M. Lane.¹⁹
States $\psi_m^{0j}(\mathbf r)$ are defined for which the radial part,
multiplied by $r$, obeys the energy-independent boundary
condition

$$\left[\frac{d}{dr}\phi_m^{0j}(r) = 0\right]_{r=R}, \tag{23}$$

in analogy with Eq. (9.1). By building up $\Phi_0^{(q)}$ as

$$\Phi_0^{(q)}(\mathbf r,\xi) = \sum_{jm}a_{jm}^{0q}\,\chi_j(\xi)\,\psi_m^{0j}(\mathbf r), \tag{23.1}$$

we define energy-independent compound states. The
overlap between such states and our energy-dependent
compound states can be found by employing Green's
theorem again as in Eq. (16). This gives

$$(W_p-W_q^{(0)})\int_0^R\Phi_0^{(q)}\Phi^{(p)}\,d^3\xi\,d\mathbf r = -\frac{\hbar^2}{2M}\int d^3\xi\int d\Omega\,R\,\Phi_0^{(q)}(R,\xi)\left[\frac{d}{dr}(r\Phi^{(p)})\right]_{r=R}. \tag{23.2}$$

For the case in which the resonances are well separated
and only S-wave scattering can occur, we have

$$(W_p-W_q^{(0)})\int_0^R\Phi_0^{(q)}\Phi^{(p)}\,d^3\xi\,d\mathbf r = -\frac{i\hbar^2}{2M}\sum_{j,m,n}k_j\,a_{jm}^p a_{jn}^{0q}\,\phi_m^j(R)\,\phi_n^{0j}(R). \tag{24}$$
2M imn e: 


19 A. M. Lane (private communication).
This equation is exact, but we now use perturbation
theory starting with the zero-order approximation
$\Phi^{(p)}=\Phi_0^{(p)}$. To obtain the first-order correction to the
energy, we let $p=q$ and set $\Phi_0^{(p)}$ equal to $\Phi^{(p)}$ under the
integral. This gives

$$W_p-W_p^{(0)} \cong -i\alpha_p/2,$$

where we have used Eq. (22.1) and remembered that
the $a_{jm}^p$ and $\phi_m^j$ are real in this approximation. Hence,
the first-order correction to the energy simply adds an
imaginary part which is just the width of the state $\Phi^{(p)}$
and which is independent of energy in this order. We
can obtain the first-order correction to the wave function
by expanding

$$\Phi^{(p)} = \Phi_0^{(p)} + \sum_q c_q^p\,\Phi_0^{(q)}, \tag{24.1}$$

and we obtain

$$(W_p-W_q^{(0)})\,c_q^p = -\frac{i\hbar^2}{2M}\sum_{j,m,n}k_j\,a_{jm}^{0p}a_{jn}^{0q}\,\phi_m^j(R)\,\phi_n^j(R), \qquad (p\neq q). \tag{24.2}$$

The right-hand side is of order $-i\alpha_p/2$, its approximate
size for $p=q$, or less. (The signs of the $a_{jm}^{0p}$ and the
$a_{jn}^{0q}$ can be both positive and negative, and this
tends to make the right-hand side smaller than $-i\alpha_p/2$.
If a random-phase approximation were applicable, the
right-hand side would be zero.) Therefore, for order of
magnitude we have

$$c_q^p \sim -i\alpha_p/(W_p-W_q^{(0)}). \tag{24.3}$$

Thus in the case of well-separated levels, when

$$|W_p-W_q| \gg \alpha_p, \tag{24.4}$$

the $c_q^p$ and, consequently, the imaginary part of $\Phi^{(p)}$
are small, and the latter is of the order of the ratio of
the width to the spacing of the levels. Hence, to a good
approximation, the energy dependence of the various
parameters and the imaginary parts of the widths are
absent in the region of well-separated resonances, so
that, except for effects especially sensitive to these
features such as the "Thomas Shift," one can usually
ignore them.

The arguments in the next section do not neglect the 
energy dependence or the imaginary parts of the widths; 
these are included correctly, although we may often use 
the case of well-separated levels, which is especially 
simple, as an illustration. These features do not essen- 
tially complicate the arguments. 


II. THE OPTICAL MODEL AND 
DIRECT INTERACTION 


1. The Physical Picture 


The physical picture underlying the optical model was
developed chiefly by Feshbach, Porter, and Weisskopf
and Friedman and Weisskopf.²⁰ These authors split
the scattering amplitude $S_\alpha$ into

$$S_\alpha = \langle S_\alpha\rangle_{av} + (S_\alpha-\langle S_\alpha\rangle_{av}), \tag{25}$$

where $\langle S_\alpha\rangle_{av}$ is the value of $S_\alpha$ averaged over an energy
interval $I$. The averaging process is defined more
precisely later. The average elastic cross section can
now be expressed as

$$\langle\sigma_\alpha\rangle_{av} = |\langle S_\alpha\rangle_{av}|^2 + \langle|S_\alpha-\langle S_\alpha\rangle_{av}|^2\rangle_{av}, \tag{26}$$

since the cross term between $\langle S_\alpha\rangle_{av}$ and $S_\alpha-\langle S_\alpha\rangle_{av}$
averages out. Although both terms on the right-hand
side represent elastic scattering, they are physically
quite different in nature.

Friedman and Weisskopf,²⁰ who used a time-dependent
treatment, showed that if one constructs a wave
packet of width $\Delta E$ in energy, then the scattered particles
corresponding to the average amplitude $\langle S_\alpha\rangle_{av}$,
i.e., those given by the first term on the right-hand side
of Eq. (26), pass over the nucleus in time $\tau=\hbar/\Delta E=\hbar/I$.
This is, in fact, clearly allowed by the uncertainty
principle, because one has, by averaging, constructed
a scattering amplitude $\langle S_\alpha\rangle_{av}$ which varies
appreciably only over an energy interval $I$; since the
energy of particles associated with this scattering is
undefined within this interval, the time $\tau$ which the
particle spends in the nucleus can be defined to within
$\hbar/I$. Hence, if the energy interval $I$ is very large, the
particles corresponding to the average amplitude go
over the nucleus very quickly. It is therefore reasonable
that for sufficiently large intervals $I$, the particles make
only a few interactions with the nuclear particles (even
though the nucleon-nucleon potentials are relatively
strong) and that we can describe them in the weak-coupling
type picture indicated by the optical model.
This scattering, which corresponds to the average phase,
is termed "shape-elastic." Later we find precisely how
large the interval $I$ must be for various weak-coupling
type descriptions of the average scattering to be valid.

On the other hand, the fluctuation scattering (often
called compound-elastic scattering), described by the
second term on the right-hand side of Eq. (26), varies
over energy intervals of the order of the width of the
compound nuclear states (about 1 ev).
Hence, the corresponding particles stay in the nucleus
the order of a million times longer than those of the
shape-elastic scattering that would correspond to a
wave packet of the order of 1 Mev wide. If the particles
corresponding to the shape-elastic scattering have time
for one or two collisions, then those corresponding to
the fluctuation scattering stay in the nucleus long
enough to make the order of a million collisions. The
latter particles can be regarded as those forming
compound states of the type envisaged in the old
compound-nucleus picture, whereas the shape-elastic
scattering corresponds to that from the complex well,
the imaginary part of which describes the disappearance
of particles out of the incident beam into the long-lived
component.

We define the complex well $V(r)$ so that it reproduces
the average scattering phase, i.e., so that $\bar S_\alpha=\langle S_\alpha\rangle_{av}$.
The physical picture indicates that this is the most
reasonable procedure. We then go on to discuss the
characteristics of this well $V(r)$ and relate the parameters
back to nucleon-nucleon forces.


2. The Picture of Lane, Thomas, and Wigner


Possibly the most striking feature of experimental
data on elastic neutron scattering was the appearance
of the giant resonances in the total cross section in the
energy region ~0 to 3 Mev, as discussed in Sec. I. In
order to understand these, we develop the picture
of Lane, Thomas, and Wigner.²¹ In this development
we assume that the well $V(r)$, which reproduces the
average scattering amplitude ($\bar S_\alpha=\langle S_\alpha\rangle_{av}$), has the
characteristics of the optical model potential of Feshbach,
Porter, and Weisskopf; that is, if $V=-U-iW$
then $U=42$ Mev, $W$ lies in the range 1 to 2 Mev, and
the radius of the potential is $R\approx 1.45\,A^{1/3}\times10^{-13}$ cm.
These parameters are used only for order-of-magnitude
estimates; knowledge of their precise values is not
necessary for understanding the conceptual points
considered here. Later in this section we show how to
relate these parameters back to nucleon-nucleon forces.

In the picture by Lane, Thomas, and Wigner,²¹ for
$W_p\approx\epsilon_n$, i.e., for energies of the compound states in the
neighborhood of the single-particle energy $\epsilon_n$, only
terms $m=m'=n$ in the width $\gamma_p$ of Eq. (17.1) are
important, i.e., only the $(a_{0n}^p)^2$ are large if $W_p$ lies in
the neighborhood of $\epsilon_n$. We now make these criteria
more quantitative. For simplicity, we consider the
scattering of S-wave neutrons in the region of well-separated
levels where the $\gamma_p$ can be taken to be real.

We write the amplitude for compound-elastic scattering,
defined by

$$S_\alpha^{ce} = S_\alpha-\bar S_\alpha, \tag{27}$$

in dispersion theory, using the fact that we choose $V$ so
that $\bar S_\alpha=\langle S_\alpha\rangle_{av}$. Then,

$$S_\alpha^{ce} = S_\alpha-\bar S_\alpha = \frac{e^{-2ikR}}{2k}\left[\sum_p\frac{\gamma_p}{W_p-E} - \sum_m\frac{\Gamma_m}{E_m-E}\right], \tag{27.1}$$

where the second term in brackets gives the scattering
amplitude in the complex well, and $E_m$ is the eigenvalue
in the complex well, which may be taken to have the
form, Eq. (9.13), $E_m=\epsilon_m-i\beta_m/2-iW$. (We are considering
the complex well to be a square well,
although generalization to the case where it varies with
$r$ is easy.) From our definition, $\langle S_\alpha^{ce}\rangle_{av}$, the average of
$S_\alpha^{ce}$, is zero. Hence, the average of the quantity in
brackets must vanish. The intervals over which we
average are always small compared with the widths of
the single-particle resonances, so the second term in
brackets need not be averaged. We can write


20 F. L. Friedman and V. F. Weisskopf, Niels Bohr and the
Development of Physics (Pergamon Press, London, 1955), p. 134.

21 Lane, Thomas, and Wigner, Phys. Rev. 98, 693 (1955).


$$\left\langle\sum_p\frac{\gamma_p}{W_p-E}\right\rangle_{av} = \sum_m\frac{\Gamma_m}{E_m-E} = P(E)+iQ(E), \tag{28}$$

where $P$ and $Q$ are real. We find

$$P(E) = \sum_m\frac{\Gamma_m(\epsilon_m-E)}{(\epsilon_m-E)^2+(W+\beta_m/2)^2}$$

and

$$Q(E) = \sum_m\frac{\Gamma_m(W+\beta_m/2)}{(\epsilon_m-E)^2+(W+\beta_m/2)^2}, \tag{28.1}$$

where $\Gamma_m$ is assumed to be real, which implies specialization
to the low-energy region. We cannot express
$P(E)$ more simply in terms of $\gamma_p$, but the imaginary
part of the sum $Q(E)$ can be put into a more useful
form.

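The average $\langle F(E)\rangle_{av}$ of a quantity $F(E)$ is defined by

$$\langle F(E)\rangle_{av} = \int\rho(E-E')\,F(E')\,dE'. \tag{29}$$

The weighting function $\rho$ employed by Feshbach,
Porter, and Weisskopf was a square one,

$$\rho(x) = \begin{cases}0, & x<-I/2,\\ 1/I, & -I/2<x<I/2,\\ 0, & x>I/2,\end{cases} \tag{29.1}$$

so that

$$\langle F(E)\rangle_{av} = \frac{1}{I}\int_{E-I/2}^{E+I/2}F(E')\,dE'.$$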

With such integrals, one has end effects coming from
resonances where $\epsilon_p$ lies near either $E-I/2$ or $E+I/2$,
and these must be disposed of, somewhat inelegantly.
Physical results must be independent of the precise
form of weighting function employed, as long as it is
not an unreasonable one, and we find it convenient to
use

$$\rho(E-E') = \frac{I}{\pi}\frac{1}{(E-E')^2+I^2},$$


22 In the [...] region, where the $k$-dependent factors
[in the] penetrabilities in the widths vary rapidly
with bombarding energy, these factors should be divided out
before the average is taken. This is done explicitly in the next
[...].


in which case

$$\left\langle\sum_p\frac{\gamma_p}{W_p-E}\right\rangle_{av} = \frac{I}{\pi}\int\sum_p\frac{\gamma_p}{W_p-E'}\,\frac{dE'}{(E-E')^2+I^2} = \sum_p\frac{\gamma_p}{W_p-E-iI}, \tag{30}$$

where the integral can be evaluated by contour integration.
Quite generally, averaging with this weighting
function a quantity $F(E)$ which has poles only in the
lower half of the complex $E$ plane (such functions are
often called $R$ functions following Wigner), one finds

$$\langle F(E)\rangle_{av} = F(E+iI), \tag{31}$$

where $I$ corresponds to the interval over which the
average is carried out. We always assume this interval
to contain many resonances.
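The averaging rule of Eq. (31) can be checked by direct numerical integration. The sketch below is only an illustration: it assumes a single pole term $F(E)=\gamma/(W-E)$ with $W$ in the lower half-plane, uses made-up parameter values, and replaces the integral over all energies by a simple sum over a wide but finite grid (the helper name `lorentz_average` is hypothetical).

```python
import numpy as np

def lorentz_average(F, E, I, half_range=500.0, step=0.005):
    """Average F with the Lorentzian weight rho(E - E') = (I/pi) / ((E - E')^2 + I^2).

    The integral over E' is approximated by a rectangle-rule sum on
    [E - half_range, E + half_range]; the step must be small compared
    with both I and the distance of the poles of F from the real axis.
    """
    x = np.arange(-half_range, half_range + step, step)   # x = E' - E
    rho = (I / np.pi) / (x**2 + I**2)
    return np.sum(rho * F(E + x)) * step

# One resonance term with its pole at W = eps - i*alpha/2 (illustrative values).
gamma, W = 0.3, 1.0 - 0.1j
F = lambda E: gamma / (W - E)

E, I = 0.4, 1.0
numeric = lorentz_average(F, E, I)
closed_form = F(E + 1j * I)        # right-hand side of Eq. (31)
```

With these values the two results agree to well within the truncation error of the finite grid, as expected when all poles of $F$ lie below the real axis.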

We find, then,

$$Q(E) = \operatorname{Im}\sum_p\frac{\gamma_p}{W_p-E-iI}. \tag{32}$$

By defining the average density of the levels in the
region of interest by $1/D$ (where $1/D=N$ gives the
number of levels per unit energy, which is assumed to
be large), one can convert the sum in Eq. (32) into an
integral, obtaining

$$Q(E) = \frac{1}{D}\int\frac{\bar\gamma\,I}{(\epsilon_p-E)^2+I^2}\,d\epsilon_p = \frac{\pi\bar\gamma}{D}, \tag{32.1}$$

where we have neglected $\alpha_p$ compared with $I$, and where
$\bar\gamma$ is the average width. The function $\pi\bar\gamma/D$ is called the
"strength function." Our assumption of well-separated
levels is equivalent to $\pi\bar\gamma/D\ll1$.
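The replacement of the sum in Eq. (32) by the integral in Eq. (32.1) can be illustrated numerically. This is a rough sketch under simplifying assumptions that are not in the text: the levels are taken to be equally spaced with spacing $D$, all widths are set equal to $\bar\gamma$, and the level widths $\alpha_p$ are neglected; the function name `q_sum` and the parameter values are hypothetical.

```python
import numpy as np

def q_sum(E, I, D, gamma, n_levels=100_000):
    """Im of sum_p gamma / (eps_p - E - i I) for equally spaced real levels eps_p = p * D."""
    eps = D * np.arange(-n_levels, n_levels + 1)
    return np.sum(gamma / (eps - E - 1j * I)).imag

# Averaging interval I covering several level spacings, narrow levels (gamma << D).
D, gamma, I, E = 1.0, 0.01, 5.0, 0.37
q = q_sum(E, I, D, gamma)
strength = np.pi * gamma / D       # right-hand side of Eq. (32.1)
```

For $I$ equal to a few level spacings the discrete sum already reproduces $\pi\bar\gamma/D$ closely, and the result does not depend on where $E$ sits between levels.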
Finally, we find

$$\pi\frac{\bar\gamma}{D} = \operatorname{Im}\sum_m\frac{\Gamma_m}{E_m-E}, \tag{32.2}$$

when we make the average value of the imaginary part
of the quantity in brackets, Eq. (27.1), vanish, using
$\langle S_\alpha^{ce}\rangle_{av}=0$. In the special case where the energy $E$ is in
the neighborhood of $\epsilon_n$, the main contribution to the
sum on the right-hand side comes from the $n$th term
provided that the single-particle resonances are sufficiently
far apart. Then,

$$\pi\frac{\bar\gamma}{D} = \frac{\Gamma_n(W+\beta_n/2)}{(\epsilon_n-E)^2+(W+\beta_n/2)^2}, \qquad E\approx\epsilon_n. \tag{33}$$

The strength function, for constant $W$, $\Gamma_n$, and $\beta_n$, is
thus of the Lorentz form. The contribution of the other
levels is at least of order $W/\Delta\epsilon$ smaller,²³ where $\Delta\epsilon$ is
the distance between single-particle levels of the same
angular momentum, i.e., of the same channel $\alpha$. This
distance is of the order of $U$, the real part of $V$.


23 Since the sum in Eq. (32.2) relates only to the single-particle
well, one can carry out the calculation of the sum explicitly for
special cases, such as a square well, and verify that if $E\approx\epsilon_n$, the
sum of terms $m\neq n$ contributes in order $W/\Delta\epsilon$ (see Appendix A,
reference 16).

The validity of the picture depends on $W\ll U$, which
is also one of the criteria for the existence of marked
resonances in $V$. In the range of separated compound-nucleus
resonances, the condition $kR\ll1$ is fulfilled, as
appears later, and then the maxima in the strength
function of resonance shape given by Eq. (33) are
simply related to the total cross section $\sigma_T$, as can be
seen by using the optical relation

$$\sigma_T = (4\pi/k)\operatorname{Im}S_\alpha. \tag{34}$$

Hence

$$\langle\sigma_T\rangle_{av} = \frac{4\pi}{k}\operatorname{Im}\langle S_\alpha\rangle_{av} = \frac{4\pi}{k}\operatorname{Im}\bar S_\alpha \cong \frac{2\pi}{k^2}\,\pi\frac{\bar\gamma}{D}, \tag{34.1}$$

which we have obtained from Eq. (32.2).

The situation just discussed is illustrated in Fig. 6,
where the distance between levels $\epsilon_{n'}-\epsilon_n$ is assumed to
be much greater than $W$, so that the picture of Lane,
Thomas, and Wigner is valid. The 4s and 5s single-particle
levels are shown on the left-hand side, and it
is illustrated on the right-hand side how, although
each single-particle level is split up into many compound
states, there is an appreciable probability of finding
them only within an energy region of width ~$W$
bracketing the single-particle energy.²⁴ Hence, although
the interaction $V(\mathbf r,\xi)$ is strong enough to change the
wave function completely so as to split the single-particle
level $\psi_n$ into thousands or millions of compound
states, it is not strong enough to mix the single-particle
level $\psi_{n'}$, which is a distance of order $\epsilon_{n'}-\epsilon_n$ away, into
the compound state in the neighborhood of $\epsilon_n$ with
appreciable probability.

We might ask how well the condition $(\epsilon_{n'}-\epsilon_n)\gg W$
is fulfilled in the actual physical case, taking the
parameters of Feshbach, Porter, and Weisskopf for $V$.


FIG. 6. On the left, S-wave single-particle resonances in the
complex well are shown; on the right, the behavior of the square
of the expansion coefficients. Behavior of the strength function
is essentially the same as this.


24 We have made arguments only for the case of separated
levels where the $a_{0n}^p$ are essentially real, and therefore our arguments
do not apply in detail to the higher single-particle resonance
$n'$. However, it is clear from them that the $(a_{0n'}^p)^2$ has a
spread of order $W$, where $W$ is the imaginary part of $V$ (which is
velocity-dependent) at the excitation energy $\epsilon_{n'}$.
By way of example, consider the nucleus $A=160$ where
the 4s single-particle resonance occurs near zero energy
experimentally, as in Fig. 6. The occurrence of the 4s
resonance requires $K_nR=(7/2)\pi$, where $K_n$ is the wave
number for a particle of zero energy, measured from the
bottom of the well. The 5s resonance occurs for
$K_{n'}R=(9/2)\pi$. Hence, $\epsilon_{n'}/\epsilon_n=81/49$, where $\epsilon_{n'}$ and $\epsilon_n$
are now measured from the bottom of the well. For a
well depth of 42 Mev, $\epsilon_{n'}-\epsilon_n=\Delta\epsilon=27$ Mev. Thus there
is no doubt that $W\ll\Delta\epsilon$ in the low-energy region, where
$W$ is of order 1-2 Mev.²⁵
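The arithmetic behind the 27-Mev spacing can be spelled out in a few lines. This is a small sketch using only the numbers quoted above (a 42-Mev well, resonance conditions $KR=(7/2)\pi$ and $(9/2)\pi$, and energies proportional to $K^2$); the function name is hypothetical.

```python
def level_spacing(depth_mev, lower_halfwaves=7, upper_halfwaves=9):
    """Spacing between the 4s and 5s levels, with energies measured from the well bottom.

    The 4s level sits at zero energy, i.e. at depth_mev above the bottom, and the
    energies scale as K^2, so eps_5s / eps_4s = (9/7)^2 = 81/49.
    """
    ratio = (upper_halfwaves / lower_halfwaves) ** 2
    return depth_mev * (ratio - 1.0)

delta_eps = level_spacing(42.0)      # 42 * 32/49, about 27.4 Mev
w_over_spacing = 2.0 / delta_eps     # W / delta_eps for W = 2 Mev, the upper end of the quoted range
```

Even at the upper end of the quoted range, $W/\Delta\epsilon$ is below one tenth, which is the sense in which the condition $W\ll\Delta\epsilon$ holds.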

Although the single-particle width $\Gamma_n$ is split up
among thousands or millions of compound states, some
features of the single-particle resonance remain. In
particular, since to a very good approximation,

$$\gamma_p \cong (a_{0n}^p)^2\,\Gamma_n, \qquad E\approx\epsilon_n,$$

and since it follows from the completeness of our two
representations connected by the coefficients $a_{jm}^p$ that
$\sum_p(a_{0n}^p)^2=1$, then

$$\sum_p{}'\,\gamma_p \cong \Gamma_n,$$

where the prime on the sum indicates that it is extended
only over compound states in the neighborhood of $\epsilon_n$.
In other words, the single-particle width is split up
among many levels, but the sum of widths of these many
levels is just equal to the single-particle width.
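The sum rule can be made concrete with a toy distribution of the coefficients. The sketch below is an illustration only and rests on assumptions not made in the text: the $(a_{0n}^p)^2$ are given a Lorentzian envelope of half-width $W$ about $\epsilon_n$ on an equally spaced grid of compound levels, and all numerical values and names are made up.

```python
import numpy as np

def split_widths(gamma_n, eps_n, W, D, half_range):
    """Distribute a single-particle width gamma_n over equally spaced compound levels.

    The squared coefficients (a_0n^p)^2 are assumed to follow a Lorentzian of
    half-width W about eps_n and are normalized so that they sum to one.
    """
    eps_p = np.arange(eps_n - half_range, eps_n + half_range + D, D)
    a_sq = 1.0 / ((eps_p - eps_n) ** 2 + W**2)
    a_sq /= a_sq.sum()                 # completeness: sum_p (a_0n^p)^2 = 1
    return eps_p, a_sq * gamma_n       # gamma_p = (a_0n^p)^2 * Gamma_n

# Illustrative values in Mev: level spacing of 1 kev, spread W = 1 Mev.
eps_p, gamma_p = split_widths(gamma_n=1.0, eps_n=0.0, W=1.0, D=1e-3, half_range=50.0)
total_width = gamma_p.sum()
largest_width = gamma_p.max()
```

Each individual $\gamma_p$ comes out far smaller than $\Gamma_n$, while the widths add up to $\Gamma_n$, which is the content of the statement above.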

At this stage we have an understanding of why the
compound resonances $p$ are so narrow without resorting
to the classical picture which explains this in terms of
the energy being shared among all of the nucleons. The
interaction $V(\mathbf r,\xi)$ is sufficiently strong so as to mix
many states $\chi_j(\xi)\psi_m^j(\mathbf r)$ into the state $\Phi^{(p)}$. Since

$$\sum_{j,m}(a_{jm}^p)^2 = 1,$$

following from the normalization of the $\Phi^{(p)}$, and since
there are many terms which contribute in this sum,
then $(a_{0n}^p)^2\ll1$ for any given $p$ and the width of the
compound resonance is much less than that of the
single-particle one.

Nonetheless, the interaction $V(\mathbf r,\xi)$ is not strong
enough to break down the underlying single-particle
structure completely and, in particular, it does not mix
different single-particle levels $n$ and $n'$ into the same
$\Phi^{(p)}$, but only spreads the single-particle resonance
locally. Consequently, it is not surprising that some of
the single-particle features remain in experiments carried
out with wide beams.

For completeness, we remark that the width of the
strength function is given essentially by $W$, as shown
in Eq. (33) (although this equation is a reasonable
approximation only when $E$ is in the neighborhood of
$\epsilon_n$), and not by the square root of the second moment
of the perturbing potential, as Lane, Thomas, and
Wigner indicated might be the case. Bloch²⁶ has
pointed out, using some rough quantitative estimates,
that one should expect the root-mean-square moment
to be much larger than the width owing to large contributions
from the wings of the strength function which
are weighted heavily. Use of the form Eq. (33) for all
energies would lead to an infinite second moment which
would contradict the finite expression for it in terms
of the potential,²⁷ so that this form is clearly not valid
far from the resonance energy.


25 The discussion of the contribution of terms $m\neq n$ in Eq. [...]
was carried out taking the same $W$ for all terms $m$, contra[ry to]
the spirit of the other development where w[e ...]
velocity-dependent potential, or, eq[uivalently, ...]
$V(r',r)$, in which case the $W$'s enteri[ng ...]
single-particle levels are quite differ[ent ...]
terms $m\neq n$ must be the same in [...].


3. Low-Energy Behavior of the Scattering²⁸


At low energies, where $(\pi\bar\gamma/D)\ll1$ and $kR\ll1$, the
interpretation of average quantities is especially simple.
Equation (34.1) shows that the average cross section is
proportional to the strength function in this region. The
maxima in the strength function indicated by Eq. (33)
would be difficult to observe experimentally by measuring
the strength function at various energies for a
given nucleus because the single-particle resonance is
broad, of width $2W+\beta_n$, and even if the same experimental
technique could be used over this broad energy
interval, the interpretation would become difficult, both
because waves of higher angular momenta would be
mixed into the experiments and because our simple
consideration (which applied only to the region of
well-separated resonances where $kR\ll1$) would no
longer be valid. Consequently, the strength function is
usually measured at low energies for different nuclei
and plotted as a function of $A$. In this way a variation
in the quantity $E-\epsilon_n$ is obtained which depends on $KR$,
where $K$ is the wave number measured from the bottom
of the well. Some experimental data, plotted in this 


Fig. 7. Experimental data for the strength function. The solid
line indicates typical theoretical predictions for a complex-well
model. A measured width $\gamma_p$ is reduced to a $\gamma_p^{(0)}$ by using
$\gamma_p^{(0)} = \gamma_p (k_0/k)$, where $k_0$ is taken to be the wave number for a
neutron of energy 1 ev. A more refined theory” takes into account 
nuclear deformation, and the resulting agreement between theory 


and experiment is then better. 


26 C. Bloch, Nuclear Phys. 3, 137 (1957).
27 The second moment is clearly finite for well-behaved potentials such as those used here.
28 The results in this section were first derived in the R-matrix formalism by R. G. Thomas, Phys. Rev. 97, 224 (1955).
Fig. 8. Experimental data compared with theoretical
predictions for $R'$ (solid line) from the complex-well model. A more
refined theory³¹ takes into account nuclear deformations, and the
resulting agreement between theory and experiment is then
better.
way, are shown in Fig. 7.²⁹ The theoretical curve is not
computed here from dispersion formalism but is
obtained from

$$\pi\frac{\bar\gamma}{D} = 2k\,\mathrm{Im}\,\bar S_a, \qquad (34.2)$$

[see Eq. (34.1)], where $\bar S_a$ is calculated directly by
integrating the Schrödinger equation with potential $\bar V$.
Nevertheless, our formulas are useful in understanding
the qualitative behavior in the resonance region.

Historically the first striking observations of the
nonmonotonic behavior of average cross sections with
changing $A$ referred to measurement³⁰ of a quantity
sometimes called the “potential scattering.” To obtain
this quantity, we expand in $k$, considering $k$ to be small,
but must carry the expansion one step further, i.e., we
express the average cross section $\langle\sigma_T\rangle_{av}$ as

$$\langle\sigma_T\rangle_{av} = (a/k) + b, \qquad (35)$$

which can be done at low energies where terms of higher
order in $k$ are negligible. The first term on the right-hand
side is the familiar $1/v$ term in the cross section,
which is related to the strength function and which
was already discussed in Eqs. (34.1) and (34.2), and
the second term is often expressed as

$$b = 4\pi (R')^2, \qquad (35.1)$$

where $R'$ is interpreted as a radius. The observed
behavior of $R'$ with atomic number $A$ is shown in
Fig. 8.³¹
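
As an illustration of how $a$ and $R'$ might be extracted in practice, the following sketch fits the form $a/k + b$ of Eq. (35) by ordinary least squares in the variable $1/k$ and then inverts $b = 4\pi(R')^2$. The numerical values are synthetic and chosen only for illustration; they are not experimental data, and the function names are our own.

```python
import math

def fit_a_b(ks, sigmas):
    """Least-squares fit of sigma = a/k + b, linear in x = 1/k."""
    xs = [1.0 / k for k in ks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(sigmas) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, sigmas))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def r_prime(b):
    """Invert b = 4*pi*R'^2 for the effective radius R'."""
    return math.sqrt(b / (4.0 * math.pi))

# Synthetic cross sections (arbitrary but consistent units), generated from
# assumed values a_true and R'_true purely to exercise the fit.
a_true, r_true = 50.0, 6.0
b_true = 4.0 * math.pi * r_true ** 2
ks = [0.01, 0.02, 0.05, 0.1, 0.2]
sigmas = [a_true / k + b_true for k in ks]

a_fit, b_fit = fit_a_b(ks, sigmas)
print(a_fit, r_prime(b_fit))
```

With real data the fit is meaningful only in the energy range where terms of higher order in $k$ can be neglected.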

29 Hughes, Zimmerman, and Chrien, Phys. Rev. Letters 1, 461 (1958).

30 Fields, Russell, Sachs, and Wattenberg, Phys. Rev. 71, 308 (1947).

31 Seth, Hughes, Zimmerman, and Garth, Phys. Rev. 110, 692 (1958).

A simple description of $b$ is difficult to give in the
Kapur-Peierls theory in so far as the imaginary parts of
the width $\gamma_p$ [see Eq. (23) ff.] are of order $k$ compared
with the real parts, and these must therefore be taken
into account. It is consequently more straightforward
to give this description in the Wigner-Eisenbud
formalism. We sketch this briefly, following the treatment
of Lane and Lynn,³² and refer to the review article of
Lane and Thomas! for a detailed discussion. 

The expansion is actually in powers of $\rho = kR$, as in
Eq. (9.5) ff., because $\rho$ is a good expansion parameter
at the energies of interest. To make this explicit, we
write

$$2ik\langle S_a\rangle_{av} = e^{-2i\rho} - 1 + 2i\rho\,e^{-2i\rho}\langle R_{KP}\rangle_{av}, \qquad (36)$$

where

$$R_{KP} = \sum_p x_p/(W_p - E), \qquad (36.1)$$

$R_{KP}$ being the Kapur-Peierls $R$ function. (It has poles
only in the lower half of the complex plane.) $x_p$ is the
reduced width,

$$x_p = (2\rho)^{-1}\gamma_p. \qquad (36.2)$$

The relation between $R_{KP}$ and the Wigner $R$ function
$R_W$ is easily obtained by looking at the scattering
matrix in the two theories. It is

$$R_{KP} = R_W/(1 - i\rho R_W). \qquad (37)$$

In averaging $R_{KP}$ by adding $iI$ to $E$, we find that

$$\langle R_{KP}\rangle_{av} = \bar R_W/(1 - i\rho\bar R_W), \qquad (37.1)$$

(see pp. 306-309 of Lane and Thomas), where

$$\bar R_W = R_W^\infty + i\pi s_W, \qquad (37.2)$$

with $R_W^\infty$ and $s_W$ now real, and given by

$$s_W = \frac{\bar x}{D}, \qquad R_W^\infty = P\!\int \frac{s_W(E')}{E' - E}\,dE'. \qquad (37.3)$$

Here $\bar x$ is the average of $x_p^{(W)}$, the width with the
Wigner-Eisenbud boundary conditions [see Eq. (23)].
The expansion to any required order is now easily made.


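$$\frac{4\pi}{k}\,\mathrm{Im}\langle S_a\rangle_{av} = \frac{4\pi\rho}{k^2}(\pi s_W) + \frac{4\pi\rho^2}{k^2}\left[(1 - R_W^\infty)^2 - (\pi s_W)^2\right] + O(\rho^3). \qquad (38)$$

The left-hand side of Eq. (38) is equal to $\langle\sigma_T\rangle_{av}$ by the
relation Eq. (34.1). We now can identify easily the $a$
and $b$ of Eq. (35); thus,

$$\frac{a}{k} = \frac{4\pi\rho}{k^2}(\pi s_W)$$

and

$$b = \frac{4\pi\rho^2}{k^2}\left[(1 - R_W^\infty)^2 - (\pi s_W)^2\right]. \qquad (38.1)$$

To lowest order in $\rho$, $2\rho\pi s_W = \pi\bar\gamma/D$ and, retaining only
$a$, we see that Eq. (38) reduces to our earlier expression,
Eq. (34.1).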

This derivation is independent of any particular 
model, although the model introduced in the last section 


2 A. M. Lane and J. E. Lynn, “The widths and spaci 
resonance levels,” Harwell Report T/R 2210 (1957, Pabn aI 
is helpful in understanding the qualitative behavior of
$s_W$ and $R_W^\infty$ in the actual physical case. We have
already indicated the resonance-type behavior in the
strength function in Eq. (33). We see immediately from
Eq. (37.3) that to the extent that the strength function
is symmetrical about its maximum, $R_W^\infty$ is zero at
this maximum, which occurs at the real part of the
single-particle resonance energy. Further, $R_W^\infty$ becomes
negative on one side of the maximum and positive on
the other so that one can understand the qualitative
behavior of $R'$ shown in Fig. 8.

Theoretical curves, such as those shown by the solid
lines in Figs. 7 and 8, are obtained by calculating $\bar S_a$
and then using $\langle S_a\rangle_{av} = \bar S_a$.

In past work the term $(\pi s_W)^2$ has often been neglected.
However, near the single-particle resonance at $A=50$, $s_W$
is appreciable, as R. K. Seth has pointed out,³³ and it is
vital to include the term $(\pi s_W)^2$ in analyzing data near
A=50. Otherwise (1—Rw*®) comes out to be too large. 

This treatment illustrates the type of problem in 
which the Wigner-Eisenbud formalism is useful. It is 
advantageous here to deal with the real quantities of 
this theory in which the energy dependence is explicit. 

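Reverting to the Kapur-Peierls formalism, we can
show the relation between the average cross section for
forming a compound nucleus and the $W$ in this energy
region, to lowest order in $k$. The former is defined as

$$\langle\sigma_c\rangle_{av} = \langle\sigma_T\rangle_{av} - \sigma_{se},$$

where $\sigma_{se}$ is the cross section for shape-elastic scattering,

$$\sigma_{se} = 4\pi|\langle S_a\rangle_{av}|^2 = 4\pi|\bar S_a|^2.$$

Use of the relation between $\langle\sigma_T\rangle_{av}$ and $\mathrm{Im}\,\bar S_a$ given in
Eq. (34.1) results in

$$\langle\sigma_c\rangle_{av} = 4\pi\left[\frac{1}{k}\,\mathrm{Im}\,\bar S_a - |\bar S_a|^2\right]. \qquad (38.2)$$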


By using the resonance expansion for Ša [obtained 
from Eq. (7) by replacing Em by Em] and remembering 
that $\beta_m$ is of order $kR$, we find, to lowest order in $kR$,

$$\langle\sigma_c\rangle_{av} = \frac{4\pi}{k^2}\sum_m \frac{W\beta_m/2}{(E - \epsilon_m)^2 + W^2}, \qquad (38.3)$$

where we see that the shape-elastic scattering does not
contribute in this order. To interpret this formula we
consider $\psi_a(r)$, the solution of the Schrödinger equation
in the complex well. Since we are dealing with S waves
we can expand $\phi_a$, the radial part of $\psi_a$ multiplied by $r$,
in terms of eigenstates $m$. The coefficient has already
been derived in Eq. (4.3). Thus


bent cann OR 
2M m B P 


33 R. K. Seth, Optical Model Conference, Tallahassee, Florida, March, 1959.
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a= x "i 7 
Fig. 9. Illustration of the position of the neutron energy
with respect to the single-particle resonances in our
“simple case of scattering.”


This formula clearly shows the resonance behavior of 
the probability 


=f o= Ge = ae = A 


of finding a particle inside the nucleus whose wave 
function has unit amplitude asymptotically. Here we 
have dropped m which is of order kR smaller than W. 


p= It is clear that 
te.) Sri 7 (40) 
Oc)w=——CaW. ‘ 
4 hk 


_ In other words, the average cross section for compound- 
nucleus formation just defined is proportional to the 
absorption W times the probability @. of finding the 
nucleon inside the nucleus. In a somewhat classical way, 
one might interpret W as the absorption per unit nucleon 
Es in the nucleus. 


4. A Simple Case of Scattering 


Much of the relationship of our detailed picture,
where we make a complete description in terms of the
$\Phi^{(p)}$, to the simplified complex-well model can be
understood by considering an especially simple case³⁴,³⁵
in which the energy of the incident particle falls
between single-particle resonance energies. Let us
suppose, referring to Fig. 6, that the energy $E$ of the
incident particle lies midway between $\epsilon_n$ and $\epsilon_{n'}$, as in
Fig. 9. The amplitude for elastic scattering coming from
the lower group of levels is


Caen?) Uy SORR 
yee, (41) 
2k » W,—-E k 


ee the lower suffix n on Sq indicates that this is the 
Sa that would result if only the lower group’ of levels 
sisted: We have here used the model developed in the 
St section and dropped the doy” in the widths yp 
ring to the set of compound-nucleus levels lying 


(Sa)n= 


Phys. Soc. (London) A70, 515 (1957).
y , Proc Phys Soe, (Eondon) ATO Proc. Phys. Soc.

The $W_p$ are spread a distance $W$ about $\epsilon_n$.
Consequently, if $E - \epsilon_n \gg W$, then a good first approximation
is to set $E - W_p$ equal to $E - \epsilon_n$ in the denominator of
Eq. (41). Evaluation of the sum over $p$ gives, in this
approximation,


er?*wR Ta sinkR 


NZ GS 92 


(Sa)n= Cin (42) 


which is just the contribution to $\bar S_a$ from the $n$th
resonance in the complex-well picture if we neglect $W$
compared with $E - \epsilon_n$ in the denominator.

Therefore, if the spread of the $\gamma_p$ about the
single-particle resonance energy is small, then the whole set
of them acts together just like the single-particle
resonance. This is related to the principle of spectroscopic
stability, from which it follows that, although the wave
functions may be drastically changed by the
perturbation, certain simple features come out just as in the
unperturbed problem.
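
The statement that a narrow group of levels acts like one single-particle resonance can be illustrated numerically. The sketch below is a toy model under stated assumptions: the compound levels are taken to be evenly spaced within $\pm W$ of $\epsilon_n$, each carries an equal share of a total width $\beta_n$ (i.e., we assume the $\gamma_p$ sum to $\beta_n$), and each has a small imaginary part. None of these choices comes from the text; they serve only to show how the error of the single-level form falls off as $E$ moves away from $\epsilon_n$.

```python
def levels_about(eps_n, W, n_levels):
    """Evenly spaced level positions within +/- W of eps_n (illustrative choice)."""
    return [eps_n + W * (2.0 * p / (n_levels - 1) - 1.0) for p in range(n_levels)]

def sum_over_levels(E, positions, gammas, half_width):
    """Sum of gamma_p / (W_p - E) with W_p = position - i*half_width."""
    return sum(g / (complex(e, -half_width) - E) for e, g in zip(positions, gammas))

def single_level(E, eps_n, beta_n):
    """Single-particle form beta_n / (eps_n - E), with W neglected in the denominator."""
    return beta_n / (eps_n - E)

def relative_error(detuning, eps_n=0.0, W=1.0, beta_n=1.0, n_levels=21):
    """Relative error of the single-level form at E = eps_n - detuning."""
    E = eps_n - detuning
    positions = levels_about(eps_n, W, n_levels)
    gammas = [beta_n / n_levels] * n_levels  # assumed equal sharing of beta_n
    exact = sum_over_levels(E, positions, gammas, half_width=1e-3 * W)
    approx = single_level(E, eps_n, beta_n)
    return abs(exact - approx) / abs(exact)

for detuning in (2.0, 5.0, 10.0):
    print(detuning, relative_error(detuning))
```

In this toy model the error decreases roughly as $(W/|E-\epsilon_n|)^2$, which is the sense in which the approximation requires $E - \epsilon_n \gg W$.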

The main oversimplification is in the neglect of the
widths $\gamma_p$ which lie close to $E$, i.e., the $\gamma_p$ are not so
well concentrated around $\epsilon_n$ and $\epsilon_{n'}$ as we have shown.
However, our approximations are good in so far as the
“shape-elastic” scattering is concerned. Our neglect of
the near levels has simply resulted in our dropping the
compound-elastic scattering. We now make the
development more precise in such a way that the near
levels can also be treated. Such a treatment is clearly
necessary in order to consider general $E$ which may be
in the region of a single-particle resonance.


5. Expression of the Scattering in the 
Green’s Function Formalism 


In defining $\bar V(r)$, it is convenient to express the scattering
amplitude in terms of the Green’s function, as was done by Bloch.³⁶
Actually, Bloch’s elegant formalism is much more
general than that we use, and by specialization to
specific representations he can obtain either the
Kapur-Peierls or Wigner-Eisenbud expression. However, our
simpler formalism is adequate for illustrating the points
considered here. We again specialize our consideration
to the scattering of S-wave neutrons; generalization of
our results to other cases is obvious.

We use the notation 


B(r,E)=|p)= (|, xilë) Inr) = 


and 


R 
| jm)= f BO) (1,2) xj(E m (drd E= aj. 
0 


| jm)=(jm|, (43) 


No distinction is made between |p) and (p| since ex- 
pansion and orthogonality conditions in this theory 
involve only the wave function and not its complex 
conjugate, aside from the trivial changes in functions 
of angles mentioned in part 2 of this section. 


36 C, Bloch, Nuclear Phys. 4, 503 (1957). 
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We then have 


en 2ikR Yp int. 
SCS {> -5 = | 
2k l2? W,-E m Em—E 
~ e218 T [Elomo] om p 
=—— g? ik om om — 
att E Zlom) om’ | =) 


1 
— (om = 
H-E 


i 
=— g kk Y |< om 
2M m,m’ H 


a (om 


where H is obtained from H by replacing V(r,£) by V, 
i-es 


on!) (nn (R), (44) 


on!) 
E 


— f 


1 


1 
H—-E 


om!) bm(R)bm'(R), 


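$$H = \bar H + [V(r,\xi) - \bar V(r)] = \bar H + \delta V. \qquad (44.1)$$

By using the identity

$$\frac{1}{H-E} = \frac{1}{\bar H-E} - \frac{1}{\bar H-E}(V-\bar V)\frac{1}{\bar H-E} + \frac{1}{\bar H-E}(V-\bar V)\frac{1}{H-E}(V-\bar V)\frac{1}{\bar H-E}, \qquad (45)$$

which is easily checked by multiplying both sides of
the equation by $\bar H - E$, we obtain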


h 1 1 
Donec r A SS. = - 
2M mm En—E Bm — 
Z AAI 
Xi lom|V— V| onl)—{ om (V—V)—— 
HZB 


xv- Ý) (46) 


om! ) | bm(R) Pn (R). 


At this stage, by using Eq. (38.4) we obtain S.** in a 
convenient form, 


2M & 
Sat = -= (oa| V— V| oa) 
h? 


T wy, an 


where Ya is the scattering state defined by Eq. (11) and 
the discussion following it. Further expansion of 
1/(H—E) by iteration with Eq. (45) would give the 
Born expansion for Sa’, i.e., the expansion in successive 
powers of the perturbing potential ôV. Equation (47) 


ENN 5 
(= eae V) 
can be derived directly from scattering theory without
recourse to dispersion theory.
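
A resolvent identity of the kind used here can be verified on a small matrix model. The sketch below assumes the twice-iterated form $G = G_0 - G_0\,\delta V\,G_0 + G_0\,\delta V\,G\,\delta V\,G_0$, with $G = (H-E)^{-1}$, $G_0 = (\bar H-E)^{-1}$, and $H = \bar H + \delta V$; the $2\times 2$ matrices are arbitrary illustrative numbers, not quantities from the text. Dropping the last term gives the first Born approximation, whose error is of second order in $\delta V$.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(a, b, sign=1):
    """Elementwise a + sign*b for 2x2 matrices."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def resolvent(h, energy):
    """Green's function (h - energy)^(-1)."""
    shifted = [[h[i][j] - (energy if i == j else 0.0) for j in range(2)]
               for i in range(2)]
    return inv2(shifted)

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Illustrative model: a "complex-well" Hamiltonian with eigenvalues in the
# lower half plane, plus a real symmetric perturbation (assumed values).
H_bar = [[1.0 - 0.3j, 0.0], [0.0, 4.0 - 0.3j]]
delta_V = [[0.2, 0.5], [0.5, -0.1]]
H = matadd(H_bar, delta_V)
E = 2.0

G = resolvent(H, E)
G0 = resolvent(H_bar, E)
first_order = matmul(matmul(G0, delta_V), G0)
second_order = matmul(matmul(matmul(matmul(G0, delta_V), G), delta_V), G0)

identity_rhs = matadd(matadd(G0, first_order, sign=-1), second_order)
identity_error = max_abs_diff(G, identity_rhs)   # rounding error only
born_error = max_abs_diff(G, matadd(G0, first_order, sign=-1))
print(identity_error, born_error)
```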

Insertion of the unit operator $\sum_p |p)(p|$ in the second
term on the right-hand sides of Eqs. (46) and (47)
allows replacement of $(H-E)^{-1}$ by $(W_p-E)^{-1}$,
indicating that all of the poles of this term lie in the lower
half of the complex energy plane. Thus we can average
this term with respect to energy by replacing $E$ by
$E+iI$, following the procedure of Eqs. (30) and (31).


(Sa®*(E) w= Sa (E+il). (48) > 
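
The replacement of $E$ by $E+iI$ can be checked numerically for a single pole term. The sketch below assumes that the energy average is taken with a Lorentzian weight of half-width $I$ (our assumption about the averaging function, made because it yields the shift exactly) and that the pole $W_p$ lies in the lower half plane; the numbers and function names are illustrative only. The substitution $E' = E + I\tan\theta$ turns the weighted integral into a uniform average over $\theta$.

```python
import math

def lorentz_average(f, E, I, n=20_000):
    """Average of f(E') with weight (I/pi)/((E'-E)^2 + I^2), via E' = E + I*tan(theta)."""
    h = math.pi / n
    total = 0.0 + 0.0j
    for i in range(n):
        theta = -0.5 * math.pi + (i + 0.5) * h
        total += f(E + I * math.tan(theta))
    return total * h / math.pi

def pole_term(W_p, gamma_p=1.0):
    """Resonance term gamma_p / (W_p - E) as a function of the energy E."""
    return lambda energy: gamma_p / (W_p - energy)

W_p = 1.0 - 0.05j          # pole in the lower half plane (assumed value)
E, I = 1.2, 0.5

averaged = lorentz_average(pole_term(W_p), E, I)
shifted = pole_term(W_p)(E + 1j * I)
print(averaged, shifted)
```

The agreement depends on the pole lying below the real axis; a term with a pole in the upper half plane would not be reproduced by the same shift.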


The complex potential V(r) is defined by the condition 
Sa(E)=(Sa(E))m, or by (Sa%*(E))w=0. This can be 
accomplished by requiring 


$$(o|V-\bar V|o) - \Big(o\Big|(V-\bar V)\frac{1}{H-E-iI}(V-\bar V)\Big|o\Big) = 0. \qquad (49)$$

One can therefore define an optical potential $\bar V(r)$
which reproduces the average scattering phase. At this
stage, not very much has been proved because the
formalism does not make it clear that $\bar V$ does not vary
rapidly and possibly nonmonotonically with energy; its
usefulness empirically comes from the fact that it varies
slowly and regularly with energy and with atomic
number $A$. We now go as far as we can towards
demonstrating that $\bar V$ has these characteristics. In order to
do this, we make the connection between the parameters
of $\bar V$ and the nucleon-nucleon potentials.


6. Perturbation Theory of the First Kind 


In the perturbation theory to be developed, we will
have to carry out an expansion in $\delta V = V(r,\xi) - \bar V(r)$,
since we wish to obtain expressions for quantities which
do not contain the complicated $\Phi^{(p)}$. The smallness of
the widths of the $\Phi^{(p)}$ compared with those of the
single-particle states $\psi_m$ which, together with $\chi_j$, we
employ as unperturbed wave functions, indicates that
a large number of terms are necessary in the expansion
Eq. (14) of $\Phi^{(p)}$ in terms of $\chi_j(\xi)\psi_m(r)$. Thus, an
expansion of $\Phi^{(p)}$ in perturbation theory, starting from
$\chi_0(\xi)\psi_m(r)$ as zero-order function, would converge very

convergence than those for the wave functions, as might
be suspected from the physical arguments of the first
part of this section. We now formulate these considerations
following the development of Brown, De Dominicis,
and Langer.³⁷
W is defined as 


where 


37 Brown, De Dominicis,-and La 
209 (1959). bate 


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Our defining equation, Eq. (49), for V becomes 


lo 0)=0. 


(49.3) 
Equation (49.3) can be explicitly solved for W.** For 
brevity, 


1 
V—V+w)}| 1-_———— 
i | H—E—il 


(V— 7+) | 


e=H—E-il, (49.4) 
and > 
é=H—E-il=e—(V—Y). 
Then Eq. (49.3) can be written 
1 
(o (e— wji —-(e— 2+) | o} =0, 
e 
1 
=(o (e—é+W)-(€—-W) 0, 
e 
(49.5) 


=(o 


1 1 
(e—@)-é (e—é)- 
e e 


0%- (o 
+00 oX. 


Since H;|O)=0, the factor (2—W) in the final term 
commutes with |o) and can be taken out to the right. 
Equations (45)-(49) imply 

0, 


1 
e= 
e 
and from this we see that the final term in Eq. (49.5) 


is equal to W. We can then solve the equation for W, 
obtaining 


1 
1-(o (V— n- ys 


og 


1 
—(é—W 
e 


1 
ée—W 


(49.6) 


-A el 
(V-¥)-(V-T) | 0, 
e 
(49.7) 


where we have added to the final factor the term 


(e—é)-e 
e 


lo 


0, 


= for symmetry. This term is zero, since (0|e-é|0)=0. 
= Equation (49.7) is equivalent to 


o), (50) 


(50.1) 


e—A(V—V) 


A= [0X0], 


De Dominicis, J. phys. radium 19, 1 (1958), 


ke * 
i ? žy 


AZ “as nl Ber $ pS ar 
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as can be checked by expanding in powers of (V—V)1/e. 
One can derive this expression without recourse to per- 
turbation theory as an intermediate step. For pur- 
poses of perturbation theory it is more useful to think 
of the denominator of Eq. (50) as é+(1—A.)(V—D), 
since the expansion is in powers of (1—A,)(V—V). 

In working with Eq. (50) it is advantageous to use 
a representation for the extra particle which is diagonal 
in T+V rather than in T+. The importance of con- 
sidering W comes from the fact that it contains the 
entire imaginary part of V; its contribution to the real 
part of V would be important only in case one was 
trying to calculate this quantity fairly accurately. For 
the simple case in which V is a square well, changing the
imaginary part of it by iW does not change the
eigenfunctions but only shifts the resonance energies by iW.


Our new energy eigenvalues are, therefore,

$$\bar E_m = \epsilon_m - i\beta_m/2. \qquad (50.2)$$


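We now investigate the eigenfunctions and eigenvalues
of the operator

$$\mathcal{H} = H - \Lambda_o(V - \bar V). \qquad (51)$$

We label them by $\Omega_p$ and $W_p'$, i.e.,

$$\mathcal{H}\,\Omega_p = W_p'\,\Omega_p. \qquad (51.1)$$

Now, $-\Lambda_o(V - \bar V)$ can be considered as a small
perturbation since the $\Omega_p$ are not very different from the
$\Phi^{(p)}$. Thus, we choose

$$\Omega_p = \Phi^{(p)} + \delta\Phi^{(p)}, \qquad (52)$$

and

$$\delta\Phi^{(p)} = \sum_n a_p^n\,\Phi^{(n)}.$$

To lowest order

$$W_p' = W_p. \qquad (52.1)$$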

The use of Green’s theorem as in Eq. (16) and the 
assumption“! that 9, satisfies the same boundary condi- 
tion as P), gives the result 


R 
f (E KA — Qp RP) Edr=0 
0 


=(Wr—Wn)ap" tE (p| (V—V)| om)(om|n), 


™ 


(53) 


where $W_p' = W_p$ and $\Omega_p = \Phi^{(p)}$ on the right-hand side,
since we wish to compute $a_p^n$ only to lowest order. By
employing

$$(p|V - \bar V|om) = (p|H - \bar H|om) = (W_p - \epsilon_m + i\beta_m/2)(p|om), \qquad (54)$$

39 J. S. Langer, thesis, University of Birmingham, 1958.

40 H. Feshbach, Ann. Rev. Nuclear Sci. 8, 44 (1958).

41 Actually, the functions $\Omega_p$ obey slightly different boundary
conditions from the $\Phi^{(p)}$, because, as is clear from Eq. (51), the
$\Omega_p$ contain no components $\chi_0$, these being projected out by the
operator $\Lambda_o$. However, such components of $\Phi^{(n)}$ do not
contribute to the left-hand side of Eq. (53) because they are
orthogonal to $\Omega_p$, and the equality given there holds.
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we obtain 
(Wp— Em +18 m/2) 
= W.-W, 


Up Pr (p|om)(om|n). (55) 


For order-of-magnitude estimates,

$$\sum_m (p|om)(om|n)\,\beta_m \sim (p|om)^2\beta_m \sim \gamma_p,$$

and on the giant resonance

$$W_p - \epsilon_m \lesssim W.$$

As a result the $a_p^n$ are large only if $(W_p - W_n) \sim W\gamma_p/\beta_m$
and, therefore, because of the smallness of $\gamma_p$, only a
few of the neighboring $\Phi$’s are mixed into each $\Omega$ by
the perturbation.

We can expand the $\Omega$’s in terms of the $\chi$’s and the
$\psi$’s just as with the $\Phi$’s. Then

$$\Omega_p = \sum_{j\neq 0,\,m} b_{jm}^{p}\,\chi_j(\xi)\,\psi_m(r). \qquad (55.1)$$

One important difference between the right-hand side
of Eq. (55.1) and that of the expansion of the $\Phi^{(p)}$, Eq.
(14), is that the right-hand side of Eq. (55.1) contains
no terms with $j=0$. These have been projected out by
the operator $\Lambda_o$, so that the eigenfunctions of $\mathcal{H}$ form
a complete set in the space orthogonal to $\chi_0$.

Since the eigenvalues and eigenfunctions of $\mathcal{H}$ differ
only slightly from those of $H$, aside from the fact that
the former do not contain components in $|o)$, our
arguments about the distribution of the $a_{jm}^{p}$ apply equally
well to the distribution of the $b_{jm}^{p}$. However, it is
important to retain the $\Lambda_o$ in the denominator of Eq.
(50), i.e., not to approximate the $\Omega_p$ by the $\Phi^{(p)}$,
because this would allow states $|o)$ to occur as
intermediate states in the expansion, which would give large
spurious contributions. Retaining the $\Lambda_o$ is related to
the need to eliminate unlinked clusters in expansions of
the Brueckner type.

We are now ready to discuss the criteria for
perturbation theory. We can express the matrix element as

$$(on|W(E)|on) = \sum_{p'} (on|V - \bar V|p')\,\frac{1}{W_p' - E - iI}\,(p'|V - \bar V|on), \qquad (56)$$

where we label the eigenfunctions of $\mathcal{H}$ by $|p')$. In a
perturbation expansion of the type


w(E)=(o Ura A= 
x| dan (V-P=——] (V—-T) o}, (57) 


“This was pointed out to the author by Dr. A. M. Lane. The 
partial summation of Bloch*® takes this into account properly. 
Fig. 10. Positions of poles in the complex-energy plane. The
scattering amplitude, averaged over energy interval $I$, is obtained by
taking its value for the complex energy $E+iI$.


the lowest-order term of the expansion can be expressed
as

$$(on|W(E)|on) = \sum_{j,m} \frac{(on|V - \bar V|jm)(jm|V - \bar V|on)}{E_{jm} - E - iI}. \qquad (58)$$

The same result can be obtained by replacing $|p')$ by
$\sum_{jm}|jm)(jm|p')$ in Eq. (56) and then approximating
$W_p'$ by $E_{jm}$. This approximation is justified if $I$ is large,
because the $(jm|p')$ are large only if $W_p' - \epsilon_{jm} \lesssim W$ by
the arguments of the preceding section. The situation
is illustrated in Fig. 10. In the complex energy plane,
the state $|jm)$ has an appreciable probability of being
found in states $|p')$ distributed over a region of
width $W$. In going from Eq. (56) to the approximation
Eq. (58), we are approximating the distances $E+iI-W_p'$
in the energy plane by the common distance
$E+iI-E_{jm}$. This is justified if $|E+iI-E_{jm}| \gg W$, $W$
giving the spread of the $W_p'$ as indicated in Fig. 10. In
the worst case for the fulfillment of this criterion, in
which $E = \epsilon_{jm}$, which is illustrated in Fig. 10, we can
satisfy this criterion by making $I$ sufficiently large, so
we must have $I \gg W$.

This could be phrased in the following way:
Calculating only the average phase means that we can
evaluate it for the complex energy $\mathcal{E} = E + iI$. The
approximation of taking only the first term in the
expansion is clearly better, the larger $|\mathcal{E} - E_{jm}|$. In the simple
case of scattering considered earlier (where we actually
approximated at a different stage), a large value for the
distance between $\mathcal{E}$ and the single-particle resonance
was obtained by taking $E$, the real part of $\mathcal{E}$, to be far
away. The same result can be achieved by pushing $\mathcal{E}$
up in the complex plane, which means averaging over
larger intervals.

We have thus formulated mathematically the physical
considerations of part 1 of this section. There it was
seen that the time spent in the nucleus by the particles
corresponding to the average phase decreased as the
interval $I$, over which the average was carried out,
increased. This development shows that the larger $I$ is,
the faster the series Eq. (57) converges. This series
in powers of $\delta V$ can be interpreted as an expansion in
the number of collisions the incident particle makes
with the target nucleons. The larger $I$ is, the more
quickly the particle passes over the nucleus and the
fewer the collisions it makes.

The expansion of $W$ in this way has been carried
out by Bloch. After expansion, he makes assumptions
about the randomness of signs of the $(kl|\delta V|jm)$. We
do not, however, make such assumptions, and show
later that they are not justified. The expansion is,
however, convergent without these assumptions
provided $I$ is sufficiently large.

Fig. 11. Variation of absorption $W$ with energy of the
single-particle excitation.

It might appear that the condition $I \gg W$ is difficult
to achieve. The decisive point here is that the $W$ to be
employed is the $W$ appropriate to the state $|jm)$, which
corresponds to the nucleus excited by energy $\epsilon_j$ and a
single particle of energy roughly $E - \epsilon_j$ in the well. This
is illustrated in Fig. 11. From our earlier developments,
we can assume energy conservation to within the width
of the excitation $|jm)$. Thus $E \approx \epsilon_j + \epsilon_m$. The width of
the state $|j)$, which is a true compound state, can be
neglected relative to that of the single-particle
excitation $|m)$. But the $W$ corresponding to state $|m)$ is much
smaller than that corresponding to state $|n)$ because
the former is far down in the well, and $W$ is a rapidly
increasing function of energy, as shown in Fig. 11.

Taking the quadratic dependence of $W$ given by simple
theories, as discussed later, we have

$$W_m = \left(\frac{E - \epsilon_j}{E}\right)^2 W_n, \qquad (59)$$

where we have put a lower suffix $m$ on the $W$ on the
left-hand side to indicate that it refers to the state $|m)$,
and we now denote the $W$ referring to state $|n)$ by $W_n$.

ngle-particle excitation decays is less than the width 
f the original single-particle excitation. 

Usually, the state $j$ is a highly excited one, since the
number of states per Mev available increases
exponentially with excitation,* and in this case $W_m \ll W_n$. Of
course, the more highly excited states also tend to have
more complicated structure, so that the matrix
elements for excitation of the $A$ particles become smaller.
In the next part of this section it is shown that the ratio
$(E - \epsilon_j)/E$ tends to be $\sim 1/3$ or $W_m/W_n \sim 1/9$.
However, in the case of easily deformable nuclei, there are
matrix elements to low-lying states $j$, the collective

From the foregoing, we see that the inequalities

$$W_m < I < W_n \qquad (60)$$


of excitation is limited since the initially excited 
at te already occupied. This is incon- 
tis ramet ion, but this does not 
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can usually be satisfied, although Wm is smaller than 
Wn only by a numerical factor and not by orders of 
magnitude. When these inequalities are satisfied, the 
average can be carried out over an interval which is 
large enough for the convergence of perturbation 
theory but small enough so that one can still obtain 
information about the shape of the single-particle reso- 
nance of width ~W,. In fact, neglect of the variation 
in the |) and other quantities in the various averages 
constrains us to average over a distance < Wn. 

The same arguments apply, with slight modification, 
to the case of inelastic processes going into a low-excited 
final state. Here the transition amplitude is 


2M 
-= (ja | V— V| oc) 
ike 


(ie 


h? J| 1 1 


= — gil kj) R 


2M m,m’ E,,.- E JD =R 


Sac’ j= 


(V—V) 


a=) 


1 
Ly 


= 


N4 | (jm'|V—V|om)— Cin’ 


= WH 


z 


i 
Mi= 
H 


x(V- Ŷ) 


om ) lön (R)m (R) ? 


where j is assumed to be a low-excited state. Perturba- 
tion theory can be used to calculate the (Saa, jav) Since 
the states that # and m’ decay into are, in general, of 
substantially lower energy and consequently have a 
smaller width. This gives the result, in the first approxi- 
mation, 


2M 5 
Sao (10 | Ha y| oa), (62) 
l 


which has been used extensively in calculating direct 
interaction processes. This is just the same result as 
follows from lowest-order Born approximation, i.e., 
keeping only the lowest term in ôV with initial and final 
states distorted by the complex well. Such a weak 
interaction picture is only applicable to the average 
phase and then only if it is averaged over a large energy 
interval. 


7. Perturbation Theory of the Second Kind 


Relation (62) is an important one in that it justifies 
the use of perturbation theory—in the usual termi- 
nology, Born approximation with distorted waves—in 
direct interaction calculations. We found also, in part 6, 
that the average elastic scattering amplitude was 
reproduced by V=V—W with W given to lowest 
order by Eq. (58). This latter relation is interesting but 
not well adapted for the calculation of W since it still 
involves the highly complicated nuclear states $\chi_j(\xi)$.
We therefore develop a perturbation expansion of the
type used by Brueckner, Eden, and Francis, which
essentially relates quantities back to shell-model states.
This expansion gives an expression for $W$ which is
practical for calculation; however, to satisfy the
criterion for its validity, one must average over a large
energy interval, as we shall see.
In this treatment the unperturbed Hamiltonian is

$$H = \sum_{i=0}^{A}(T_i + V_i) - \tfrac{1}{2}\sum_{i,j=0}^{A}\bar V_{ij}, \qquad (63)$$

with eigenstates $|q)$ such that

$$H|q) = E_q|q). \qquad (63.1)$$

Here $V_i$ is the self-consistent potential defined in the
state $|q)$ felt by the $i$th particle, as in the Hartree-Fock
theory, and

$$\bar V_{ij} = (q|V_{ij}|q). \qquad (63.2)$$

The summations cover the incident particle ($i=0$) as
well as the $A$ particles in the nucleus. Thus, the $A+1$
particles are here treated symmetrically. The c number
$\tfrac{1}{2}\sum_{i,j}\bar V_{ij}$ has been subtracted to ensure that $(q|H|q)$
is equal to the energy in the Hartree-Fock
approximation. The crucial question in using the $|q)$ as zero-order
functions relates to the extent of the spread of the
strength function $(q|p)$ in energy. This can be obtained


from 
| 
Ay 


1 
= m files 1), (64) 
Ja 


where Hf is a comparison Hamiltonian which has been 
introduced as an artifice to calculate this strength 
function.*® Its significance is seen later. It is given by 


H=H—w, (64.1) 
where W is then defined by 


on (ahs) 
— (a (V— Pw- V+) lay (64.2) 


Sal P)*}0#=Im | (a 


HE 


— h1 


which guarantees satisfaction of Eq. (64). [See the 
similar development, Eqs. (44)-(49).] This equation is 


“ Brueckner, Eden, and Francis, Phys. Rev. 100, 891 (1955). 
See also M. Cini and S. Fubini, Nuovo cimento 2, 75 (1955) and 
A. M. Lane and C. F. Wandel, Phys. Rev. 98, 1524 (1955). 

46 We use the symbols H and W because, although they stand 
now for different quantities than previously, there is a close 
analogy between these and the previous ones. 
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formally similar to Eq. (56), and by the same procedure 
as employed to solve the latter, we can find that Eq. 
(64.2) can be satisfied by choosing W to be diagonal in 
the |¢) representation, with matrix elements 


(|W |g) 


Ke 


where 


(H—H) 


—A)\¢q 5 
H-V- Enid Alay, g> 
A= | gäl. (65.1) 


In evaluating (¢|W|q) by lowest-order perturbation 
theory which now means replacing 


{H—Aq(H—H)—E-—il}4 


by 
2 |8){2s—E—i XSI, 
šq 
we find that 
A 
W= 5 Wi, (65.3) 
i<j=0 


with 


1 
alw: D= lVi] 2 el Valg). (65.4) 


s—E—i 


The half-width of the strength function m{ (g| pY} }a/D 
is given by 


A 
2 mlwyd, 


i<j=0 


(65.5) 


if we neglect the natural width of the state |ġ}. 

If we identify $|q)$ with the state in which $A+1$
particles form a state wherein the $A$ levels from the bottom
of the well are filled except for one hole in the state we
label by $k'$, and in which two particles are excited in
states labeled by $k$ and $l$, then it is easy to identify the
various parts of the width:

1. $\mathrm{Im}\sum_j (q|W_{kj}|q)$ and $\mathrm{Im}\sum_i (q|W_{il}|q)$
correspond to the widths for the particles $k$ and $l$ to
interact with the other particles so that the two-particle
excitation, in perturbation theory language, decays into
a three-particle excitation. (One should omit $j=k'$ and
$i=k'$ in the sums.)

2. $\mathrm{Im}\sum_j' (q|W_{k'j}|q)$ corresponds to the width for the
decay of the shell-model state with a hole in it, i.e.,
the width for absorption of the hole. (The prime
indicates that the term $j=k'$ should be omitted. But $j=k$
and $l$ should, for completeness, be included in the sum.)

By using the approximate symmetry of holes and particles⁴⁶ near the Fermi surface $E_F$, we find that the total width of $|q\rangle$ is

$$W_q = W_k + W_l + W_{k'} = \frac{(\epsilon_k - E_F)^2 + (\epsilon_l - E_F)^2 + (E_F - \epsilon_{k'})^2}{(\epsilon_n - E_F)^2}\, W_n, \tag{66}$$

from the simple theories referred to previously. We label the width of the original single-particle excitation $|n\rangle$ by $W_n$ as previously, and we employ the $W$'s to denote the imaginary parts of the $\mathcal{W}_{ij}$.

46 Calculations by E. P. Pendlebury, analogous to those of Lane and Wandel, indicate that this symmetry holds only near the Fermi surface; the absorption for holes then ris ... than ...

The width of our original single-particle excitation is given in the lowest approximation by

$$W_n = \mathrm{Im} \sum_q \frac{\langle 0n|\sum_j V_{0j}|q\rangle \langle q|\sum_i V_{0i}|0n\rangle}{\epsilon_q - E - iI}. \tag{67}$$


The largest phase space is available when the energy is equally distributed among the particles and the hole, so that $W_k$, $W_l$, and $W_{k'}$ are each $\sim \tfrac{1}{9} W_n$. Since $W_k + W_l + W_{k'} < W_n$, we can again satisfy

$$W_q < I < W_n, \tag{68}$$

and employ perturbation theory, although all three of these quantities are of the same order of magnitude, so that the average must be carried out over an interval $I$ that is a good fraction of the width of the single-particle resonance.
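
As a quick numerical check of Eq. (66) and of the estimate of 1/9 for each partial width, the following sketch evaluates $W_q/W_n$ for a given sharing of the excitation energy among the two particles and the hole. It assumes, as in the simple theories, that each width scales as the square of the distance from the Fermi surface; the function name and the sample energies are illustrative and do not come from the original text.

```python
def width_ratio(e_k, e_l, e_hole, e_n, e_fermi=0.0):
    """Return W_q / W_n from Eq. (66), assuming W ~ (energy - E_F)**2.

    e_k, e_l : energies of the two excited particles
    e_hole   : energy of the hole state k' (below the Fermi surface)
    e_n      : energy of the original single-particle excitation
    """
    numerator = (
        (e_k - e_fermi) ** 2
        + (e_l - e_fermi) ** 2
        + (e_fermi - e_hole) ** 2
    )
    return numerator / (e_n - e_fermi) ** 2


# Hypothetical example: 9 units of excitation energy shared equally,
# 3 units to each particle and 3 units to the hole.
equal_split = width_ratio(e_k=3.0, e_l=3.0, e_hole=-3.0, e_n=9.0)
print(equal_split)  # each of the three terms contributes 1/9
```

With equal sharing the three terms add to one third of $W_n$, which is why an interval $I$ between $W_q$ and $W_n$ exists but cannot be very small compared with $W_n$.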
In case |o) can be described as a shell-model state, 
i.e., as an antisymmetrized wave function of inde- 
pendent particles moving in a well, or can be obtained 
from such a state by successive application of the two- 
body interaction, the A+1 particles can be treated 
symmetrically, and the identity of the particles can be 
taken into account easily. By using such an iteration in 
which the evaluation is carried out starting from the 
Fermi sea, which is assumed to be a reasonable approxi- 
mation for the interior of a large nucleus, Brueckner*’ 
finds that the value of W is increased considerably over 
that given by the simple theory and introduces both a 
term linear in energy and a constant term in the dependence of W. This would mean that the criterion, Eq. (68), might not be satisfied by any $I$ since the sum $W_k + W_l + W_{k'}$ might be as large as $W_n$. Whereas
Brueckner’s calculation brings out the sensitivity of W 
to correlations in the ground-state wave function near 
the Fermi surface, there are good indications that these 
cannot be calculated in perturbation theory, i.e., that 
the ground-state wave function $\chi_0(\xi)$ cannot be obtained in the region of the Fermi surface by iterating on a shell-model state.” In fact, the parts of these correlations responsible for the pairing forces tend to make the Fermi surface stiffer against perturbations, working in the opposite direction from the effects calculated by Brueckner.*” Consequently, a consistent evaluation of the corrections to the lowest-order theory undoubtedly gives results smaller than Brueckner’s calculation.

BEE p, 2 (1956).
= aK. A. Brueckner, Phys. Rev. 103, 17 O 936 (1958).

In the discussion following Eq. (59) we promised to 
return to the question of easily deformable nuclei where 
there are large matrix elements from $\chi_0$ to low-lying
rotational states so that the initial single-particle 
excitation has a high probability of decaying into an 
excitation where the initial particle has only slightly 
less energy. This state, therefore, has a width almost 
as large as the initial width. In this case, one should 
separate out the strong transitions to the low-lying 
collective states which, together with the initial channel, 
we term the “chosen channels” and treat them sepa- 
rately in a system of coupled equations as has been 
done by Sano, Yoshida, and Terasawa,” (references to 
extensive earlier work by Yoshida and others are given 
in this article) and by Chase, Wilets, and Edmonds.” 

In fact, the preceding development can be generalized so that $V-\mathcal{V}$ is a matrix between chosen channels, with $|0\rangle$ becoming the space of the chosen channels and $\Lambda_0$ excluding these from occurring in intermediate states; this essentially reproduces Yoshida’s formalism. This
generalization is within the spirit of the optical model 
where the distortion provided by the central potential 
is supposed to represent the average effects from the 
great number of channels and correspondingly large 
number of degrees of freedom which cannot be con- 
veniently treated in detail. 


8. Special Models 


It would not be possible to evaluate W from either Eq. (56) or Eq. (58) since the matrix elements $\langle 0n|V-\mathcal{V}|p\rangle$ or $\langle 0n|V-\mathcal{V}|jm\rangle$ cannot be calculated owing to the complexity of the $\Phi^{(p)}(\mathbf{r},\xi)$ and the $\chi_j(\xi)$.
It would be difficult to evaluate W even from the 
equations of the last section since these would involve 
using wave functions in a complex well of finite extent. 
Consequently, somewhat idealized models have been 
introduced to carry out this evaluation and we now 
discuss their relationship to the development here. 

The interior of a large nucleus has been represented 
as a Fermi gas. In this case, the absorption, Eq. (67), 
is approximated by 


Wa=— {on |X Vee VO CO) 


where the Fermi sea $|0\rangle$ is taken as an antisymmetrized product function of plane waves, $|n\rangle$ is a plane wave of energy above the top of the Fermi sea, and $|q\rangle$ is a state of the same energy as $|0n\rangle$ in which a second particle has been lifted above the sea.

49 Sano, Yoshida, and Terasawa, Nuclear Phys. 6, 20 (1958).
50 Chase, Wilets, and Edmonds, Phys. Rev. 110, 1080 (1958).

In such a model it is immaterial how large the interval 
I over which the average is taken in calculating W or W. 
No features are left in the model capable of giving rapid 
variations in the quantities calculated, so that the 
scattering amplitude is a smoothly varying function of 
energy and the average phase, for reasonable intervals, 
is very close to the actual phase. Consequently, one does not need to carry the $iI$ in the energy denominator in the actual evaluations. However, in establishing the validity of Eq. (67), it was necessary to carry the $iI$ in the compound-state picture.

A second model, which is not so useful in obtaining 
numerical values, but is instructive, has been used by 
Wigner.’ He assumes that levels $|jm\rangle$ are evenly spaced in energy and that the matrix elements $\langle jm|\delta V|kl\rangle$, where $\delta V$ is the perturbation, are all equal in absolute value but of random sign as long as $\epsilon_{jm} - \epsilon_{kl}$ is less than a certain quantity (in the Wigner formalism
the energy eigenvalues are real) and zero for greater 
energy differences. He then finds, under certain con- 
ditions which are probably satisfied in the nuclear case, 
a Lorentzian form, i.e., the shape Eq. (33), for the 
strength function in the region of the single-particle 
resonance. This implies, according to Eq. (33), that 
W is constant with respect to energy. 
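
The appearance of a Lorentzian can be illustrated with a much simpler model than Wigner's. The sketch below is not his banded matrix with random signs: it is the "picket-fence" simplification in which a single state at energy zero is coupled with equal strength `v` to a finite set of evenly spaced levels (for one special state the signs of the couplings can be absorbed into the phases of the basis states). The eigenvalues are found from the secular equation by bisection, and the strength of the special state in each eigenstate is summed over an interval of one half-width on either side of the centre. For a Lorentzian of half-width $\pi v^2/D$ this sum should be close to one half. All names and numerical values are assumptions made for the illustration.

```python
import math


def picket_fence_strengths(v, spacing, n_half):
    """Strengths |<0|p>|^2 of one state at energy 0 in the eigenstates |p>.

    The state is coupled with equal strength v to 2*n_half levels at
    energies spacing*(k + 1/2), k = -n_half .. n_half-1.  Only the
    eigenvalues lying between two background levels are returned; the two
    outermost eigenvalues are omitted because they carry little strength.
    """
    levels = [spacing * (k + 0.5) for k in range(-n_half, n_half)]

    def secular(energy):
        # Zeros of this function are the eigenvalues of the coupled system.
        return energy - sum(v * v / (energy - e) for e in levels)

    results = []
    for lower, upper in zip(levels, levels[1:]):
        lo = lower + 1e-9 * spacing
        hi = upper - 1e-9 * spacing
        for _ in range(60):  # bisection; secular() rises monotonically here
            mid = 0.5 * (lo + hi)
            if secular(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        root = 0.5 * (lo + hi)
        strength = 1.0 / (1.0 + sum((v / (root - e)) ** 2 for e in levels))
        results.append((root, strength))
    return results


v, spacing = 2.0, 1.0
half_width = math.pi * v * v / spacing  # half-width expected for this model
pairs = picket_fence_strengths(v, spacing, n_half=100)
inside = sum(s for e, s in pairs if abs(e) < half_width)
print(f"strength within one half-width of the centre: {inside:.3f}")
```

Because the band of levels is finite, the agreement with one half is only approximate; widening the band (larger `n_half`) at fixed `v` and `spacing` improves it.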

A third model, used by Bloch,’ is similar to Wigner’s 
except that the point at which he makes his assumption 
of random signs and equality of matrix elements is in 
the perturbation expansion, Eq. (57). This means that 
only the even powers of the expansion contribute. His 
analysis shows that the largest number of nonvanishing 
terms comes from the lowest-order term, corresponding 
to Eq. (58). However, such assumptions of random signs 
are unjustified in the actual physical case, and for the 
case of a large nucleus, when approximated by a Fermi 
sea, it is found that the third term in the series,” which 
would be zero under the above assumptions, actually 
contributes about as much to the imaginary part of the 
potential as the second term. Furthermore, the second 
term appears to be much less predominant when the 
exclusion principle is taken into account since the 
second-order processes are severely inhibited. 

Once one has adopted a simplified model, any 
mechanism that might be present in the actual nucleus 
to give rapid variations in the parameters has been 
dropped, and consequently, the result is insensitive to 
the interval over which one averages since the actual 
phase is very nearly the same as the average phase. 
However, the inclusion of $iI$ when the compound-state energies $W_p$ were present was necessary to establish the relevance of these models to the calculation of the average phase.

5 E. P. Wigner, Ann. Math. 62, 548 (1955).
8 L. Verlet and J. Gavoret, Nuovo cimento 10, 505 (1958).

It seems reasonable that at high bombarding energies, where the widths of the compound states are much
greater than their spacings, the actual scattering am- 
plitude varies smoothly with energy and that fluctua- 
tions are small. Consequently, one might expect the 
actual amplitude to be the same as the average one 
here, and again, the calculated amplitude would be 
insensitive to the distances over which one averaged. 
This would explain why our formulas for the average 
amplitude have the same form as those of Watson and 
collaborators for higher energy scattering even though 
no average over energy is carried out in their work. 

However, in the intermediate region where fluctuations may still be important and in the low-energy region, it is necessary to carry out the averages over energy as we have done.

We have not discussed the fluctuation scattering in 
much detail. This is because it is not possible to evaluate 
this directly; since the corresponding particles stay in 
the nucleus a long time, any expansion in terms of suc- 
cessive collisions is bound to fail. In fact, because of 
their long duration in the nucleus, these particles would 
seem to correspond to those described by the original 
extreme compound-nucleus model. Consequently, it is 
reasonable to neglect phase relations here on the aver- 
age, and this results in angular distributions sym- 
metrical about 90°. There is always some ambiguity 
about adding in the fluctuation scattering, but if one 
can calculate the average scattering amplitude so as to 
obtain the first term on the right-hand side of Eq. (26), 
one obtains a lower limit on the cross section for the 
relevant process. This is especially useful in inelastic 
scattering where similar considerations apply. An appli- 
cation of this is indicated in the next section. Further- 
more, if the above assumptions are justified, the fluc- 
tuation scattering is rather structureless and simply 
tends to make the minima in the shape-elastic scattering 
less pronounced without altering the general picture. 


IV. DIRECT INTERACTION IN THE 
DIPOLE PHOTOEFFECT AND 
IN RADIATIVE CAPTURE 


We treat one case in which the incident and emerging 
particles are different, namely, that in which one of the 
particles is a photon. As discussed in Sec. I, a giant 
resonance is observed in the absorption of photons by 
complex nuclei in the region of ~15 Mev for medium 
weight nuclei. Collective models give a natural ex- 
planation of the absorption mechanism but not of the 
number of fast protons emerging which is orders of 
magnitude in excess of the predictions of the statistical 
theory. Direct interactions are necessary to account 
for them. This does not mean that emission of fast 
protons is the dominant decay process; this constitutes 
only ~1% of the total decay processes in heavy nuclei 
where such decays are strongly inhibited by the 
Coulomb barrier. However, this is several orders of magnitude above the predictions of $10^{-4}$ or $10^{-5}$ for the probability given by the statistical theory. The inverse (p,γ) processes violate the predictions of the statistical theory in just as striking a fashion.*®*' Whereas this theory would predict a strong decrease in the cross section with increasing energy or increasing atomic number A, the experimental results of Cohen®® show no such decreases for proton energies in excess of 5 Mev. This is an added indication that direct processes play an important role.
Wilkinson" has proposed a model for the (γ,p) reaction which incorporates the physical features that follow from the detailed theory. He views the absorption of the γ-ray by one of the nucleons as leading first to an excited state of the single particle of width W in the complex well. This is just the single-particle excitation discussed earlier. One then associates with this level the natural width Γ for escape from the nucleus and the width 2W for absorption into compound states. Consequently, the proportion Γ/(Γ+2W) escape, carrying off essentially the full energy of the γ-ray. This reproduces the observed order of magnitude of fast protons when averaged over the relevant single-particle levels.
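
The branching argument of Wilkinson's model amounts to a single ratio, sketched below. The function name and the numerical widths are hypothetical; they are chosen only to show that an escape width small compared with the absorption width gives a direct fraction of the order of one percent.

```python
def direct_escape_fraction(escape_width, absorption_w):
    """Fraction Gamma / (Gamma + 2W) of excited nucleons that escape directly."""
    return escape_width / (escape_width + 2.0 * absorption_w)


# Hypothetical widths in Mev, chosen only to illustrate the arithmetic.
fraction = direct_escape_fraction(escape_width=0.05, absorption_w=2.5)
print(f"direct fraction = {fraction:.4f}")
```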
We now extend the formalism to treat the absorption of radiation following Brown and Levinger.*® This extension is similar to that carried out in the framework of the continuum theory of nuclear reactions by Peaslee.⁵⁷ The wave function of the system in which γ-rays are absorbed consists only of outgoing waves and is described by


$$\Psi(\mathbf{r},\xi) = \sum_{\alpha',j} S_{\gamma\alpha',j}\,\chi_j(\xi)\,\psi^{+}_{\alpha'}(\mathbf{r}), \qquad r>R, \tag{70}$$

where $S_{\gamma\alpha',j}$ is the amplitude for γ-ray absorption with emission of the nucleon into channel $\alpha'$, leaving the residual nucleus in state $j$. For $r<R$ we have

$$\Psi = \Psi_0(\mathbf{r},\xi) + \sum_p a_p\,\Phi^{(p)}(\mathbf{r},\xi), \tag{71}$$

where the $\Phi^{(p)}$ are the compound-state functions, Eq. (13). Here $\Psi_0$ is the wave function of the initial nucleus (it being understood that a γ-ray is present). It can be adequately represented as a shell-model wave
beled by r whereas those of all the other A−1
on restricting the consideration to the dipole interaction.
Since the coupling to the radiation field is weak, the $a_p$'s are needed only to first order, which can easily be done by requiring $\Psi$ to satisfy the Schrödinger equation to that order. We have

$$H_r \Psi_0 = (E-H)\sum_p a_p \Phi^{(p)} = \sum_p a_p (E - W_p)\,\Phi^{(p)}. \tag{73}$$

By multiplying on the left by $\Phi^{(p)}$ and integrating, we find

$$a_p = \langle p|H_r|0\rangle/(E - W_p), \tag{73.1}$$

as expected from perturbation theory. Here we use $|0\rangle$ for $\Psi_0$; $E$ is the full energy of the system. We now measure it from the ground state of the initial nucleus, $E=\hbar\omega$, where $\hbar\omega$ is the γ-ray energy.⁵⁸

We easily obtain $S_{\gamma\alpha',j}$ by equating $\Psi$ from Eqs. (71) and (73) at $r=R$, multiplying by $\chi_j(\xi)\Theta_{\alpha'}(\theta,\phi)$ and integrating over $d\xi$ and $d\Omega$.


1 R 
Seen = d f db”) (g R)x; 
mam tt ERE 
p| H 
ele = (74) 
yee) 


From the expansion, Eq. (14), we find 


1 pas ] Hy 0 
DT sR) 2 dm? (R)( jm | ae (74.1) 
This gives a cross section 
hk; 
Tee |S Ted (75) 


Once again we are concerned with the average scattering amplitude $(S_{\gamma\alpha',j})_{\rm av}$, which can be obtained from Eq. (74.1) by replacing $\hbar\omega$ by $\hbar\omega + iI$ in the denominator, i.e.,


(p| Hr] 0) 


1 2 
(Sya’,j)v= D Pmil(R) jm] p) m) 
mens 


partt (R) Pm I 


which in the equivalent Green’s function description is 


1 ) 


1 
(Sya i)n = 2 Pm (R) (m 
H 
(76.1) 


barFt(R) mm 1 — Bil 
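
The replacement of $\hbar\omega$ by $\hbar\omega + iI$ can be checked numerically for a single pole term. The sketch below assumes that the energy average is taken with a Lorentzian weight of half-width $I$ (the weight for which this replacement is exact when the compound-state energy has a negative imaginary part); the function name and all numerical values are hypothetical.

```python
import math


def lorentzian_average(func, energy, half_width, n_points=4000):
    """Average func(E') with weight (I/pi) / ((E' - E)**2 + I**2).

    The substitution E' = E + I*tan(theta) turns the weight into a uniform
    distribution in theta on (-pi/2, pi/2), sampled here by the midpoint rule.
    """
    total = 0j
    for i in range(n_points):
        theta = -0.5 * math.pi + (i + 0.5) * math.pi / n_points
        total += func(energy + half_width * math.tan(theta))
    return total / n_points


# Hypothetical complex compound-state energy W_p (Im W_p < 0) and interval I.
w_p = 0.3 - 0.2j
photon_energy, interval = 0.0, 0.5

averaged = lorentzian_average(lambda e: 1.0 / (w_p - e), photon_energy, interval)
shifted = 1.0 / (w_p - photon_energy - 1j * interval)
print(averaged, shifted)  # the two numbers should agree closely
```

The agreement reflects the fact that the pole term is analytic in the upper half of the complex energy plane, so the Lorentzian average simply evaluates it at the shifted energy.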

The equivalence of Eqs. (76) and (76.1) is easily shown by inserting the unit operator $\sum_p |p\rangle\langle p|$ to the left of $H_r$ in the latter. By employing an expansion similar to that, Eq. (45), for $(W_p - \hbar\omega - iI)^{-1}$, one has


58 This means that the energy of the excited particle, in terms of single-particle excitations, is measured from the energy of the ground state of the original nucleus, whereas earlier, the E associated with the single particle in our discussion of elastic scattering
y (jm|Hr| 0) 
(Sya, i)a = D Pmi (R) Ee 
pa H(R) m Ejm—ho—il 
x| mi V—V|kl)


_Cim|V— V1 p)0lV— a 
W p—hħhw—il 


x(HlHrlo)}, (77) 


where the label $i$ refers to the excited states of the A−1 particles as does $j$. We now limit consideration to the case in which $\epsilon_{jm} \approx E$, i.e., the single particle takes off approximately the full energy of the γ-ray. The other case, in which $\epsilon_{jm} - E > W_m$, where $W_m$ is the absorption relating to the single-particle state $|m\rangle$, can be treated just as our simple case of scattering, Sec. III, 4.

Perturbation theory here amounts to expanding the $(W_p - \hbar\omega - iI)^{-1}$, and the first term is obtained by replacing $\sum_p |p\rangle (W_p - \hbar\omega - iI)^{-1} \langle p|$ by

$$\sum_{i,n} |in\rangle (\epsilon_{in} - \hbar\omega - iI)^{-1} \langle in|.$$

As discussed in Sec. III, 6, this is a good approximation because the states $|n\rangle$ are mainly at a lower excitation than the original $|l\rangle$, which are the states reached by the single particle absorbing approximately the full γ-ray energy, and consequently the spread of the $\langle p|in\rangle^2$ is small compared with $W_l$, the absorption pertaining to the state $|l\rangle$.
We can expect the first term on the right-hand side of Eq. (77) to be a good approximation to $(S_{\gamma\alpha',j})_{\rm av}$. We consider first the case $|jm\rangle = |kl\rangle$. In this case, it is useful to consider both terms in square brackets together. If it is remembered that the overlap between $|j\rangle$ and the part of $\Psi_0$ relating to the ξ-particles is large only if the states of the ξ-particles are similar in both (the final factor $\langle kl|H_r|0\rangle$ ensures this will be as shown), and if $\mathcal{V}$ is chosen so as to satisfy Eq. (49), then the term in square brackets here very nearly vanishes, any remaining effects coming from the small differences between $|j\rangle$ and the ξ-variable part of $\Psi_0$. Since the two terms in the square brackets are separately of order $W_m$, and since the denominators in front never become smaller than $W_l + I$, then if the two terms cancel to a good approximation, their contribution can be neglected. Estimates of the first term in square brackets for the case $|jm\rangle \neq |kl\rangle$ indicate that it is of order $(V/A)/(W_l + I)$ relative to the first term. Corrections for this term may have to be made in some cases. We do not pursue this further but assume that the first term gives a sufficiently good approximation. In any case, we have indicated that the series expansion converges for large enough $I$ so that we could calculate further terms if necessary for good numerical results.

The first term on the right-hand side of Eq. (77) looks very much like the matrix element that would describe the photonuclear process for a particle in a complex well, since the states $|p\rangle$ have disappeared. Remnants of the fact that the process is really not a single-particle one remain, however, in that states $|j\rangle$ are employed rather than the shell-model excited states, which are just the states in which particles fill all but one of the A levels from the bottom of the well. To show how we can dispose of the details of the states $|j\rangle$, let us assume that the initial state $\Psi_0$ is well represented by the shell-model state so that we can write

$$\Psi_0(\mathbf{r},\xi) = \Omega(\xi)\,\psi_{m'}(\mathbf{r}), \tag{78}$$

where $\psi_{m'}(\mathbf{r})$ is the bound state occupied initially by the r-particle. Then we can split the matrix element coming into the first term of Eq. (77) into

$$\langle jm|H_r|0\rangle = \langle m|H_r|m'\rangle \int \Omega(\xi)\chi_j(\xi)\,d\xi = \langle m|H_r|m'\rangle\,\Omega_{0j}, \tag{78.1}$$

defining $\Omega_{0j}$. Let us assume that only one single-particle level contributes essentially to a given channel, which is often a good approximation. The cross section arising from the first term on the right-hand side of Eq. (77) is


2E ee (Sya' ida P= Pa Qa kai K (79) 
i Mc i he|Ejm—hw—il 
where 
hk; 1 2 ; 
ree a 


is the full single-particle escape width taken at energy $\hbar\omega - \epsilon_j$, where $\epsilon_j$ is the excitation energy of the state $j$. From completeness, the probability of finding the state $\Omega$ somewhere in the excited states $|j\rangle$ is unity, i.e.,

$$\sum_j \Omega_{0j}^2 = 1. \tag{79.2}$$

From the development in Sec. III, 7, the width of the distribution of $\epsilon_j$ about $|\epsilon_{m'}|$ is $W_{m'}$, where $W_{m'}$ can be interpreted as the width of the hole in state $m'$. Consequently, we see that the width of the total distribution of emitted fast particles is the sum of the width for the excited particle and of the width of the hole in the shell-model states.

These arguments do not add any new physical features to Wilkinson’s model but show how this simple description relates to the many-body description. Similar equations can be used to estimate the cross section for direct radiative capture of neutrons and protons. An extensive description of this is contained in work by Lane and Lynn.*!

Fig. 12. Relation of the (γ,p) to (p,γ) reactions.

The cross sections for (γ,p) and (p,γ) reactions can of course easily be related. We have the (γ,p) cross section by summing over all members of the configuration of the final A−1 nucleons (see Fig. 12). The (γ,p) reaction goes from the ground state to all final states $j$. The (p,γ) reaction may go from the ground state $\chi_0(\xi)$ to the various final states [see Fig. 12(b)] which contain an appreciable component of $\chi_0(\xi)$. (The wave function for the ξ-particles must remain the same since $H_r$ is a function only of r.) Further, the final state must contain an appreciable component of $\psi_{m'}(\mathbf{r})$, the relevant single-particle state in the well. From arguments similar to those already used, the probability of finding $\chi_0\psi_{m'}$ somewhere in the ground or low-excited states is essentially unity. Thus, the cross section for the (p,γ) reaction, summed over the ground and low-excited states, is given by Eq. (79) with two minor changes: (i) we multiply by the usual factor $k_\gamma^2/k_p^2$ from detailed balance or dispersion theory, where $k_\gamma$ and $k_p$ are γ-ray and proton wave numbers; (ii) we use $\Gamma_{\alpha' j}$ without the averaging implied in the sum over $j$ in Eq. (79), since the incident particle now has a unique energy.
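
The size of the detailed-balance factor in change (i) can be estimated as follows. This is only a sketch: the proton is treated nonrelativistically with its free mass (no reduced-mass or recoil correction), spin and statistical weights are left out, and the constants and sample energies are modern values and illustrative choices, not numbers taken from the article.

```python
import math

HBAR_C = 197.327        # Mev fm
PROTON_MASS = 938.272   # Mev (rest energy)


def detailed_balance_factor(photon_energy, proton_energy):
    """Return k_gamma**2 / k_p**2 for energies given in Mev.

    The proton is treated nonrelativistically; spin and statistical
    weight factors are not included.
    """
    k_gamma = photon_energy / HBAR_C
    k_proton = math.sqrt(2.0 * PROTON_MASS * proton_energy) / HBAR_C
    return (k_gamma / k_proton) ** 2


# Hypothetical example: a 15-Mev photon and a 7-Mev proton.
factor = detailed_balance_factor(15.0, 7.0)
print(f"k_gamma^2 / k_p^2 = {factor:.4f}")
```

The factor is small because the photon wave number at these energies is much smaller than that of a proton of comparable energy.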


V. DISCUSSION OF OTHER FORMALISMS 


Recently, a unified theory of nuclear reactions has been formulated by Feshbach,®® and many of the relations between compound-nucleus parameters and those of the optical model have been derived by him, although he does not develop the perturbation theory described in Sec. III. Conceptually, his formalism has the advantage that it avoids the introduction of a joining radius. However, his formalism does not seem to be as convenient for making many of the arguments of Sec. … the one … here even though … appear quite different at first sight. We briefly demonstrate some of the relationship, trying to preserve most of Feshbach’s notation so that one can easily compare our formulas


Feshbach employs functions $\Phi_n(\mathbf{r},\xi)$ which are eigenfunctions of the operator $\mathcal{H}$, Eq. (51), and which are, consequently, just the $Q_n$ of Eq. (51). The $\Lambda_0$ in Eq. (51) effectively projects off the incident channel so that the $\Phi_n$ form a complete set of states in the space orthogonal to that of the incident channel. Hence, one can write the solution of the Schrödinger equation as

$$\Psi(\mathbf{r},\xi) = \chi_0(\xi)u_0(\mathbf{r}) + \sum_n a^{(n)}\Phi_n(\mathbf{r},\xi), \tag{80}$$

and this equation holds for all values of r since the boundary conditions on the $\Phi_n$ are taken to be outgoing waves at infinity in case the particle can escape by a channel other than the incident one. In this case, the $\Phi_n$'s form a continuum, and the sum in Eq. (80) is to be understood in a generalized sense to include both a sum over discrete states and an integral over the continuum states. Since the boundary conditions are imposed at infinity, introduction of a channel radius is not necessary.

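The $a^{(n)}$ are determined by requiring $\Psi$ to satisfy the Schrödinger equation $H\Psi = E\Psi$, and one finds

$$a^{(n)} = \frac{\int \Phi_n V(\mathbf{r},\xi)\,\chi_0(\xi)u_0(\mathbf{r})\,d\xi\,d^3r}{E - \mathcal{E}_n}, \tag{81}$$

where we have used Feshbach’s notation in denoting the eigenvalues of $\Phi_n$ by $\mathcal{E}_n$. Our $a^{(n)}$ are the same as his $A_n$. One also finds, by multiplying the Schrödinger equation on the left by $\chi_0(\xi)$ and integrating over $d\xi$, that

$$(T + V(\mathbf{r}) - E)\,u_0(\mathbf{r}) + \sum_n a^{(n)} \int \chi_0(\xi)V(\mathbf{r},\xi)\Phi_n\,d\xi = 0. \tag{82}$$

By substituting the $a^{(n)}$ from Eq. (81), one finds that $u_0(\mathbf{r})$ obeys the equation

$$(T + \mathcal{U} - E)\,u_0(\mathbf{r}) = 0, \tag{83}$$

where $\mathcal{U}$ is Feshbach’s generalized optical potential

$$\mathcal{U}u_0(\mathbf{r}) = V(\mathbf{r})u_0(\mathbf{r}) + \sum_n \frac{\int \chi_0(\xi)V(\mathbf{r},\xi)\Phi_n(\mathbf{r},\xi)\,d\xi \int \Phi_n(\mathbf{r}',\xi')V(\mathbf{r}',\xi')\chi_0(\xi')u_0(\mathbf{r}')\,d\xi'\,d^3r'}{E - \mathcal{E}_n}, \tag{83.1}$$

and $\mathcal{U}$ is an integral (nonlocal) operator. One can easily make further connections between our formalism and his by noting that his matrix $H_{ij}$ is just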


Goe 
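
The elimination that leads from Eqs. (81) and (82) to the energy-dependent potential of Eq. (83) can be illustrated with a two-level toy problem. In the sketch below a single number `h00` stands in for $T+V$ in the incident channel, `e1` for one eigenvalue $\mathcal{E}_n$, and `coupling` for the matrix element connecting them; these names and values are hypothetical. The check is that the exact eigenvalues of the full two-by-two matrix satisfy $E = \mathcal{U}(E)$.

```python
import math


def effective_potential(energy, h00, coupling, e1):
    """U(E) = h00 + coupling**2 / (E - e1): the closed channel eliminated."""
    return h00 + coupling ** 2 / (energy - e1)


def exact_eigenvalues(h00, coupling, e1):
    """Eigenvalues of the full 2x2 matrix [[h00, coupling], [coupling, e1]]."""
    mean = 0.5 * (h00 + e1)
    root = math.sqrt((0.5 * (h00 - e1)) ** 2 + coupling ** 2)
    return mean - root, mean + root


h00, coupling, e1 = 1.0, 0.4, 3.0   # hypothetical numbers
residuals = []
for energy in exact_eigenvalues(h00, coupling, e1):
    residual = energy - effective_potential(energy, h00, coupling, e1)
    residuals.append(residual)
    print(f"E = {energy:.6f}, E - U(E) = {residual:.2e}")
```

The price of working in the reduced space is the explicit energy dependence of the effective potential, which in the continuum case becomes the nonlocal, complex operator discussed in the text.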
In the very low energy region in which only elastic scattering is possible, the main difference between the two formalisms results from the fact that the incident channel occurs in our $\Phi^{(p)}$ but not in Feshbach’s $\Phi_n$. The width corresponding to this channel is contained in our complex $W_p$, whereas one must essentially invert a matrix (which can be done in a straightforward fashion) to introduce the widths into the denominators of the scattering amplitude in the formalism of Feshbach. The discussion following Eq. (51) shows that the eigenfunctions $\Phi^{(p)}$ and $\Phi_n$ are, aside from the above difference, very nearly the same.

The difference between the formalisms is that one 
can start from a formalism without joining radius, as 
Feshbach does, in which case the expression for the 
wave function is simple, but the transition to the scat- 
tering amplitude is somewhat complicated in detail. 
On the other hand, one can begin with a joining radius, 
in which case the scattering amplitude is simply ob- 
tained, and then carry out various sums as we have 
done to obtain results in which this radius does not 
appear. Whereas the former procedure has conceptual 
advantages, the intermediate formulas seem simpler in 
the latter procedure and the arguments about the 
spectrum of the compound states, which we have made 
to establish perturbation theory, appear to be simpler. 
A more general formalism for nuclear reactions has
been given by Bloch,’ and we have referred to his work 
at several points. By appropriate choice of represen- 
tation, one can obtain either the Wigner-Eisenbud or 
Kapur-Peierls theory from his formalism. Making the 
arguments of this article in his formalism would be 
completely equivalent to the treatment here, and we 
have chosen the simpler notation of the less general 
treatment. 

Finally, many of the arguments of this article were 
first formulated in the Wigner-Eisenbud formalism. 
This is especially true of the development of Sec. III, 3 
as shown by a given reference. However, for uniformity 
we have chosen to restate most of these arguments in 
our formalism which is just as suitable for them and is 
probably simpler for other considerations as discussed in Sec. I.
I. INTRODUCTION


Considerable information about the bremsstrahlung process has accumulated during the
past several years. This information includes various 
cross-section calculations and measurements, which 
have helped to provide a more accurate description of 
the process. Unfortunately this material has never been 
assembled and integrated in an easily referenced form, 
although some general reviews¹ on the subject are
available. This paper provides a coherent summary of 
the bremsstrahlung cross-section formulas and related 
data. The theoretical formulas and their specific limita- 
tions are presented in a form convenient for practical 
calculations. Estimates of their accuracy are given for 
cases where comparisons can be made with experi- 
mental results. Correction factors are indicated in 
either numerical or analytical form. A brief summary of 
other data pertaining to electron-electron and to thick 
target bremsstrahlung is also included. No results are 
presented for electron and photon polarization effects. 
Section IIB briefly discusses the problem of making 
exact cross-section calculations and indicates the gen- 
eral types of calculations that have been completed. A 
summary of the various cross-section formulas is given 
in Sec. IIC. Section IID gives useful graphical information derived from the various formulas in IIC. Section IIE lists corrections that can be applied to the above formulas. In Sec. IIF, experimental bremsstrahlung cross sections are compared with the theoretical results contained in Secs. IIC and D. Conclusions with regard to the accuracy of the theory are presented in Sec. IIG. Section III summarizes the very sparse material available on electron-electron bremsstrahlung. Finally, Sec. IV gives a brief treatment of thick target bremsstrahlung with information on the bremsstrahlung angular distributions (IVA), the spectra (IVB), and the production efficiencies (IVC).

1 H. Bethe and E. Salpeter, Encyclopedia of Physics (Springer-Verlag, Berlin, 1957), Vol. 35, p. 425; S. T. Stephenson, ibid., Vol. 30, p. 337.


II. BREMSSTRAHLUNG CROSS SECTIONS


The cross sections discussed in this section apply to the bremsstrahlung process² in which an electron is decelerated in the field of an atomic nucleus. These cross sections give direct estimates of the properties of the radiation emitted when electrons are incident on thin³ targets, and provide basic data for analyzing the thick target bremsstrahlung considered in Sec. IV.


A. Symbols, Constants, and Energy- 
Momentum Relations 


Eo, E =initial and final total energy of the electron in 
a collision, in moc? units. 

Po, p =initial and final momentum of the electron in 
a collision, in moc units. 

To, T =initial and final kinetic energy of the electron 
in a collision, in moc? units. 


k,k =energy and momentum of the emitted photon, 
in moc? and moc units.t 

&o = total energy of an electron incident on a thick 
target, in moc? units. 

To =kinetic energy of an electron incident on a 
thick target, in moc? units.T 

0o 0 =angles of po and p with respect to k. 

$ =angle between the planes (po,k) and (p,k). 

dQ, =element of solid angle, sin@od@od¢, in the direc- 
tion of k. 

dQ, =element of solid angle, sin@déd¢, in the direction 
of p. 

q =momentum transferred to the nucleus, in moc 
units. 


= po—p—k; = p? HHR? —2pok cosho 
+2pk cos6—2pop(cosé cosdo+siné sinbo cose). 
80,8 =ratio of the initial and final electron velocity 
in a collision to the velocity of light. 


2 Except for electron-electron bremsstrahlung which is briefly 
considered in Sec. III, no results are presented for other brems- 
strahlung processes, involving for example the acceleration of 
positrons or protons. : 

3A target is defined to be thin if both the electron scattering 
and energy loss processes have a negligible influence on the energy 
and angular distributions of the bremsstrahlung. Order of magni- 
tude estimates of such thin targets for particular energy regions 
can be found in the references listed in Sec. TIG. 

t This system of units for the symbols is used consistently 
throughout this paper. For cases in which the data are given in 
Mey units, these symbols have the multiplicative factor 0.51; for 
example, the kinetic energy in Mev units is represented by the 
quantity 0.517». \ 
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Z =atomic number of target material. 

ds; = bremsstrahlung cross section, differential. with 
respect to the parameter 7, in units of cm? per 
atom per incident electron. 


dr =volume element. 

T =radius vector from a center, in units of the 
Compton wavelength, Ao. 

a =angle of k with respect to the direction of the 


electron beam incident on a thick target. 


No  ~=6.0310% atoms (or molecules) per mole. 

c =3.00X 10 cm per sec. 

e =4.80X 10" esu= 1.60 10" coulomb. 

e = 1.44X 10" Mev cm. 

h =h/2r=6.58X10-" Mev sec=1.05X10~*" erg 
sec. 

he = 12.4 kev-angstroms. 

he =1.97X10! Mev cm. 

ħc/e =137. 

mo =9.1110-*8 g (electron mass). 

moc? =0.511 Mev. 

Xo =h/moc=3.86 X10 cm (Compton wavelength). 

To = e?/myc?=Ko/137 = 2.82 X 10” cm (classical elec- 
tron radius). 

$ = Z?r?/137= 22 5.78X 108 cm?. g 

do =h?/moe?= 0.530 10-8 cm (radius of hydrogen 
atom). 

= 137% p= (137)?r0 

do =8rro?/3 (Thomson formula) =6.64 10-75 cm?. 

e/ ao ig ile, ev. 

Io = ionization energy of hydrogen atom = 1/2 (137)? 


in moc? units. 
Z?Io =ionization energy of K electron (if<«1). 
1 Mev=1.60X10~ erg. 


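$E_0^2 = p_0^2+1$, $E^2 = p^2+1$.

$E_0 = T_0+1 = \dfrac{1}{(1-\beta_0^2)^{1/2}}$, $E = T+1 = \dfrac{1}{(1-\beta^2)^{1/2}}$.

$E_0 = k+E$.

$p_0 = [T_0(T_0+2)]^{1/2} = \dfrac{\beta_0}{(1-\beta_0^2)^{1/2}}$, $p = [T(T+2)]^{1/2} = \dfrac{\beta}{(1-\beta^2)^{1/2}}$.

$\beta_0 = p_0/E_0$, $\beta = p/E$.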
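The kinematic relations above can be sketched in a few lines of Python. This is only an illustration: the function name `kinematics`, the dictionary keys, and the use of 0.511 Mev as the conversion constant are choices made here, not part of the paper. Energies are converted to $m_0c^2$ units and momenta to $m_0c$ units, as in the symbol list.

```python
import math

M0C2_MEV = 0.511  # electron rest energy in Mev, as listed in the symbol table


def kinematics(T0_mev, k_mev):
    """Reduced energies, momenta and velocities for an electron of kinetic
    energy T0_mev (Mev) that emits one photon of energy k_mev (Mev).
    Hypothetical helper: names and return layout are illustrative only."""
    T0 = T0_mev / M0C2_MEV          # kinetic energy in m0c^2 units
    k = k_mev / M0C2_MEV            # photon energy in m0c^2 units
    if not 0.0 < k < T0:
        raise ValueError("photon energy must lie between 0 and T0")
    E0 = T0 + 1.0                   # initial total energy
    E = E0 - k                      # energy conservation, E0 = k + E
    T = E - 1.0                     # final kinetic energy
    p0 = math.sqrt(T0 * (T0 + 2.0))  # initial momentum
    p = math.sqrt(T * (T + 2.0))     # final momentum
    return {"E0": E0, "E": E, "k": k, "p0": p0, "p": p,
            "beta0": p0 / E0, "beta": p / E}
```

A call such as `kinematics(0.511, 0.2555)` describes an electron whose kinetic energy equals its rest energy and which radiates half of that kinetic energy.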


B. Types of Cross-Section Calculations 


The bremsstrahlung cross section $d\sigma$, for single photon
emission in a large cubic box of side $L$, is given by
the transition probability per atom per electron divided
by the incoming electron velocity. This cross section
can be expressed in dimensions of cm$^2$ as

$$d\sigma = \frac{w}{(p_0c/E_0)}\left(\frac{\hbar}{m_0c}\right)^{3}L^{3}, \tag{II-1}$$

where

$$w = (2\pi/\hbar)\,\rho_f\,|H_{if}|^2. \tag{II-2}$$


The term $\rho_f$ is the density of final states and can be
written as

$$\rho_f = \frac{pEk^2\,dk\,d\Omega_k\,d\Omega_p\,L^6}{(2\pi)^6\,m_0c^2}. \tag{II-3}$$


The term $H_{if}$ is the matrix element for the transition
of the system from an initial state before the emission
of the photon to a final state after the emission. The
quantity $|H_{if}|^2$ in formula (II-2) can be written as

$$|H_{if}|^2 = \frac{2\pi}{137\,kL^9}\,(m_0c^2)^2\left|\int\psi_f^*\,(\boldsymbol{\alpha}\cdot\boldsymbol{\lambda})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,\psi_i\,d\tau\right|^2. \tag{II-4}$$


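In the foregoing, $\boldsymbol{\lambda}$ is the unit polarization vector of the
photon, $\boldsymbol{\alpha}$ is the Dirac matrix, and $\psi_i$ and $\psi_f$ are the
Dirac wave functions for the initial and final electrons,
respectively. Therefore the cross section in cm$^2$ can be
written as

$$d\sigma = \frac{137\,r_0^2}{(2\pi)^4}\,\frac{pE_0E}{p_0}\left|\int\psi_f^*\,(\boldsymbol{\alpha}\cdot\boldsymbol{\lambda})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,\psi_i\,d\tau\right|^2 k\,dk\,d\Omega_k\,d\Omega_p. \tag{II-5}$$

The important quantity to be evaluated is the matrix
element $H_{if}$, defined in formula (II-4).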
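As a check on how formulas (II-1) through (II-4) combine into (II-5), the substitution can be written out explicitly. The sketch below abbreviates the matrix-element integral as $M$ and assumes the box conventions under which the factors of $L$ cancel, namely $L$ measured in Compton wavelengths and unit-amplitude wave functions so that $|H_{if}|^2$ carries $L^{-9}$. These conventions are inferred from the requirement that (II-5) be independent of $L$; they are not quoted from the original derivation.

```latex
\begin{align*}
w &= \frac{2\pi}{\hbar}\,\rho_f\,|H_{if}|^2
   = \frac{2\pi}{\hbar}\cdot
     \frac{pEk^2\,dk\,d\Omega_k\,d\Omega_p\,L^6}{(2\pi)^6\,m_0c^2}\cdot
     \frac{2\pi\,(m_0c^2)^2}{137\,kL^9}\,|M|^2
   = \frac{m_0c^2}{\hbar}\,
     \frac{pEk\,dk\,d\Omega_k\,d\Omega_p}{137\,(2\pi)^4L^3}\,|M|^2,\\[4pt]
d\sigma &= \frac{w}{p_0c/E_0}\left(\frac{\hbar}{m_0c}\right)^{3}L^{3}
   = \frac{1}{137}\left(\frac{\hbar}{m_0c}\right)^{2}
     \frac{pE_0E}{(2\pi)^4\,p_0}\,|M|^2\,k\,dk\,d\Omega_k\,d\Omega_p
   = \frac{137\,r_0^2}{(2\pi)^4}\,\frac{pE_0E}{p_0}\,|M|^2\,k\,dk\,d\Omega_k\,d\Omega_p .
\end{align*}
```

The last step uses $\hbar/m_0c = \bar\lambda_0 = 137\,r_0$ from the symbol list, so that $\bar\lambda_0^2/137 = 137\,r_0^2$.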

The problem of evaluating an “exact” expression for 
the cross section involves, therefore, the use in the 
matrix element of “exact” wave functions, which de- 
scribe an electron in a screened, nuclear Coulomb field. 
It is not possible to solve the Dirac wave equation in 
closed form for an electron in a Coulomb field, pri- 
marily because the wave function must be represented 
as an infinite series. Therefore, various approximate 
wave functions and procedures have been used. 

The cross-section calculations that have been made
may be classified either as nonrelativistic or relativistic
depending on whether the Schrödinger or Dirac form of
the Hamiltonian is used for the electron and field system.
The calculations have been carried out with (a) nonrelativistic
Coulomb wave functions (Sommerfeld);
(b) relativistic Coulomb wave functions (Sommerfeld-Maue)
valid to first order in $(Z/137)^2/l$, where $l$ is the
angular momentum quantum number that is the summation
index in the expansion of the wave function;
and (c) free-particle wave functions perturbed to first
order in $Z$ (Born-approximation procedure).

4 A detailed discussion of this problem is given by H. A. Bethe
and L. C. Maximon, Phys. Rev. 93, 768 (1954).

The nonrelativistic cross-section formulas derived in
the dipole approximation by Sommerfeld⁵ with Coulomb
wave functions have a complicated form with
hypergeometric functions and are difficult to evaluate.
Some numerical estimates⁶ of the Sommerfeld cross
sections have been made for selected values of the electron
energy, the target atomic number, and the photon
energy. However, the theory is only valid when $\beta_0$ is
small compared to unity, and can be expected to break
down for initial electron energies greater than a few
kilovolts. In addition, the theory disregards screening
effects, which are important for very low energies and
for targets with high atomic numbers. Because of these
limitations, results of the Sommerfeld theory are not
presented here.

Cross-section calculations with relativistic Coulomb
wave functions (Sommerfeld-Maue) including screening
corrections have been made by Olsen, Maximon, and
Wergeland⁷ and by Olsen and Maximon.⁸ Their formulas
are valid only in the extreme-relativistic region
(above 50 Mev). Their results have the form of an
additive correction factor to the Born-approximation
formulas.

The cross-section formulas calculated by the Born- 
approximation procedure with free-particle wave func- 
tions are available in a relatively simple analytical form 
for nonrelativistic and relativistic energies, with or 
without screening. In general, the Born approximation 
theory becomes less reliable as (a) the atomic number of 
the target increases, (b) the initial electron energy de- 
creases, and (c) the photon energy approaches the high- 
frequency limit. In spite of their limited validity, the 
Born-approximation formulas have been surprisingly 
successful in predicting the properties of the brems- 
strahlung radiation. Even where there is a breakdown 
of the Born approximation, the accuracy of the cross- 
section formulas is still reasonably good, and in the 
worst cases (except at the high-frequency limit), they 
can be expected to give at least the correct order of 
magnitude. Therefore, this paper emphasizes the Born- 
approximation cross-section formulas and includes 
various theoretical and empirical corrections to these 
formulas. Detailed references to the many papers in 
which these formulas are derived are given in Table III. 


C. Bremsstrahlung Cross-Section Formulas 
and Classification Diagrams 


A general classification of the various differential
forms of the bremsstrahlung cross section is presented
in Chart 1 for the Born-approximation formulas and in
Chart 2 for the extreme-relativistic formulas that contain
the Coulomb correction. The formulas represented
are summed over the directions of the electron spin
and the photon polarization vectors, and thus do not
include all of the possible differential forms of the cross
section. The primary formula gives the cross section
that is differential in photon energy and in photon and
electron emission angles. The remaining formulas that
branch out from this starting point are divided into two
main groups that are designated as screened or nonscreened.
Further subdivisions are made; these depend
on the type of screening approximation, and on whether
nonrelativistic, extreme-relativistic, small-angle, or
large-angle approximations are used. For most of the
cases, the charts include the names of the principal
authors associated with a particular formula.

The formulas are identified as follows: (a) the number
applies to a particular differential form of the cross
section; (b) the first letter indicates either B for Born
approximation (Chart 1) or C for Coulomb correction
(Chart 2); (c) the second letter indicates either S for
screening or N for no screening; and (d) the last letter
a, b, or c indicates further subdivisions for specific
approximations. The following notation has been
adopted here. The differential forms of the bremsstrahlung
cross section are designated by the symbol
$d\sigma_{\alpha,\beta,\ldots}$. This symbol is the bremsstrahlung cross section
that is differential only with respect to the parameters
given by the subscripts $\alpha, \beta, \ldots$, and is explicitly
defined by $d\sigma_{\alpha,\beta,\ldots} = (d^n\sigma/d\alpha\,d\beta\cdots)\,d\alpha\,d\beta\cdots$. The
unit of the cross section, $d\sigma_{\alpha,\beta,\ldots}$, is cm$^2$ per atom per
incident electron.

The symbols and definitions for the specific cross
sections are as follows.

(a) $d\sigma_{k,\theta_0,\theta,\phi}$ is the bremsstrahlung cross section that
is differential with respect to the photon energy, $k$, and
to the photon and electron emission angles, $\theta_0$, $\theta$, and $\phi$.
This formula contains the parameters $E_0$, $Z$, $k$, $\theta_0$, $\theta$,
and $\phi$.


CHART 1

Born Approximation Cross Section Formulas

$d\sigma_{k,\theta_0,\theta,\phi}$ (Differential in Photon Energy and in Photon and Electron Emission Angles.)
    Screened: 1BS (Bethe, Heitler, Sauter, Racah)
    Nonscreened: (Bethe, Heitler, $\gamma=\infty$); 1BN (Nonrelativistic)

$d\sigma_{k,\theta_0,\phi}$ (Differential in Photon Energy and Angle.)
    Screened: 2BS (Schiff, $\gamma=0$)
    Nonscreened: 2BN (Sauter); 2BN(a) (Extreme-Relativistic, Small Angles) (Sommerfeld); 2BN(b) (Extreme-Relativistic, Large Angles) (Hough)

$d\sigma_k$ (Differential in Photon Energy.)
    Screened: 3BS (Bethe, Heitler); 3BS(a) ($\gamma=0$); 3BS(b) (Any $\gamma$); 3BS(c) ($\gamma<2$); 3BS(d) ($2<\gamma<15$); 3BS(e) (Schiff, $\gamma=0$)
    Nonscreened: 3BN; 3BN(a) (Nonrelativistic) (Bethe, Heitler); 3BN(b) (Extreme-Relativistic) (Bethe, Heitler)

$\phi_{\rm rad}$ (Total Radiation Cross Section.)
    Screened: 4BS (Bethe, Heitler)
    Nonscreened: 4BN (Racah); 4BN(a) (Nonrelativistic); 4BN(b) (Extreme-Relativistic) (Racah)


5 A. Sommerfeld, Wellenmechanik (Frederick Ungar, New York, 1950), Chap. 7.
6 P. Kirkpatrick and L. Wiedmann, Phys. Rev. 67, 321 (1945).
7 Olsen, Maximon, and Wergeland, Phys. Rev. 106, 27 (1957).
8 H. Olsen and L. C. Maximon, Phys. Rev. 114, 887 (1959).


CHART 2

Extreme-Relativistic Cross Section Formulas with Coulomb Correction

$d\sigma_{k,\theta_0,\theta,\phi}$ (Differential in Photon Energy and in Photon and Electron Emission Angles.): 1CS (Olsen, Maximon, Wergeland)

$d\sigma_{k,\theta_0,\phi}$ (Differential in Photon Energy and Angle.): 2CS

$d\sigma_k$ (Differential in Photon Energy.): 3CS (Davies, Bethe, Maximon, Olsen)

$\phi_{\rm rad}$ (Total Radiation Cross Section.): 4CS


Formula 1BS—Differential in photon energy and in photon and electron emission angles.
Approximation (H). Reference formulas: (13) in reference (a), (2) in reference (b), (13) in reference (c), (29) in reference (e).

$$d\sigma_{k,\theta_0,\theta,\phi} = \frac{Z^2}{137}\left(\frac{r_0}{2\pi}\right)^2[1-F(q,Z)]^2\,\frac{dk}{k}\,\frac{p}{p_0}\,\frac{d\Omega_k\,d\Omega_p}{q^4}\left\{\frac{p^2\sin^2\theta}{(E-p\cos\theta)^2}(4E_0^2-q^2)+\frac{p_0^2\sin^2\theta_0}{(E_0-p_0\cos\theta_0)^2}(4E^2-q^2)\right.$$

$$\left.-\frac{2pp_0\sin\theta\sin\theta_0\cos\phi\,(4EE_0-q^2)}{(E-p\cos\theta)(E_0-p_0\cos\theta_0)}+\frac{2k^2(p^2\sin^2\theta+p_0^2\sin^2\theta_0-2pp_0\sin\theta\sin\theta_0\cos\phi)}{(E-p\cos\theta)(E_0-p_0\cos\theta_0)}\right\},$$

where

$$q^2 = p^2+p_0^2+k^2-2p_0k\cos\theta_0+2pk\cos\theta-2p_0p(\cos\theta\cos\theta_0+\sin\theta\sin\theta_0\cos\phi)$$

and $F(q,Z)$ = atomic form factor discussed in Sec. IIE(3).


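Formula 1BN—Differential in photon energy and in photon and electron emission angles.
Approximations (H), (B), (I). Reference formula (17) in reference (c).

$$d\sigma_{k,\theta_0,\theta,\phi} = \frac{Z^2}{137}\left(\frac{r_0}{\pi}\right)^2\frac{dk}{k}\,\frac{p}{p_0}\,\frac{d\Omega_k\,d\Omega_p}{q^4}\left\{p^2\sin^2\theta+p_0^2\sin^2\theta_0-2pp_0\sin\theta\sin\theta_0\cos\phi\right\},$$

where

$$q^2 = p^2+p_0^2-2pp_0(\cos\theta\cos\theta_0+\sin\theta\sin\theta_0\cos\phi).$$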


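Formula 2BS—Differential in photon energy and angle.
Approximations (H), (G), (M), (J), (K). Reference formula (1) in reference (h).

$$d\sigma_{k,\theta_0} = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\,y\,dy\left\{\frac{16y^2E}{(y^2+1)^4E_0}-\frac{(E_0+E)^2}{(y^2+1)^2E_0^2}+\left[\frac{E_0^2+E^2}{(y^2+1)^2E_0^2}-\frac{4y^2E}{(y^2+1)^4E_0}\right]\ln M(y)\right\},$$

where

$$y = E_0\theta_0;\qquad \frac{1}{M(y)} = \left(\frac{k}{2E_0E}\right)^2+\left(\frac{Z^{1/3}}{111(y^2+1)}\right)^2.$$

Comment: This formula becomes Formula 2BN(a) when $Z=0$ in $M(y)$.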


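Formula 2BN—Differential in photon energy and angle.
Approximations (H), (B). Reference formulas: (11) in reference (d), (4.1) in reference (f).

$$d\sigma_{k,\theta_0,\phi} = \frac{Z^2r_0^2}{8\pi\,137}\,\frac{dk}{k}\,\frac{p}{p_0}\,d\Omega_k\left\{\frac{8\sin^2\theta_0\,(2E_0^2+1)}{p_0^2\Delta_0^4}-\frac{2(5E_0^2+2EE_0+3)}{p_0^2\Delta_0^2}-\frac{2(p_0^2-k^2)}{Q^2\Delta_0^2}+\frac{4E}{p_0^2\Delta_0}\right.$$

$$+\frac{L}{pp_0}\left[\frac{4E_0\sin^2\theta_0\,(3k-p_0^2E)}{p_0^2\Delta_0^4}+\frac{4E_0^2(E_0^2+E^2)}{p_0^2\Delta_0^2}+\frac{2-2(7E_0^2-3EE_0+E^2)}{p_0^2\Delta_0^2}+\frac{2k(E_0^2+EE_0-1)}{p_0^2\Delta_0}\right]$$

$$\left.-\frac{4\epsilon}{p\Delta_0}+\frac{\epsilon^Q}{pQ}\left[\frac{4}{\Delta_0^2}-\frac{6k}{\Delta_0}-\frac{2k(p_0^2-k^2)}{Q^2\Delta_0}\right]\right\},$$

where

$$L = \ln\left[\frac{EE_0-1+pp_0}{EE_0-1-pp_0}\right];\quad \Delta_0 = E_0-p_0\cos\theta_0;\quad \epsilon = \ln\left[\frac{E+p}{E-p}\right];\quad \epsilon^Q = \ln\left[\frac{Q+p}{Q-p}\right];$$

$$Q^2 = p_0^2+k^2-2p_0k\cos\theta_0.$$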


Formula 2BN(a)—Differential in photon energy and angle.

Approximations (H), (B), (J), (K).

$$d\sigma_{k,\theta_0,\phi} = \frac{2Z^2r_0^2}{\pi\,137}\,\frac{E}{E_0}\,\frac{dk}{k}\,d\Omega_k\left\{\frac{16\theta_0^2E_0^4}{(1+\theta_0^2E_0^2)^4}-\frac{(E_0+E)^2E_0}{E(1+\theta_0^2E_0^2)^2}+2\ln\left(\frac{2E_0E}{k}\right)\left[\frac{(E_0^2+E^2)E_0}{E(1+\theta_0^2E_0^2)^2}-\frac{4\theta_0^2E_0^4}{(1+\theta_0^2E_0^2)^4}\right]\right\}.$$

Comment: This formula was obtained from Formula 2BN by making the high-energy and small-angle approximations. The same
result is obtainable from Formula 2BS by setting $Z=0$ in $M(y)$.


Formula 2BN(b)—Differential in photon energy and angle.
Approximations (H), (B), (J), (L). Reference formula (8) in reference (i).


Bre E dk d% E+E 2EE (SEo+2E) 2[—2Eè ln (k/Eo)] 
dok, aN )s oln —— + 
ak BP 1137 Eè k (1—costo)?|\ EEo Eo EE 

5 ELO?-+E(k—Eo cos9o)] Eok(1—cosdo) 


E,Q2(1—cos60) EQ@ 


[1—cos#o ] 


Q+E 
(30-+E(Zo+k)] nf )} 


— E 
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TABLE I.— (Continued). 


Formula 3BS—Differential in photon energy. : 
Approximations (H), (J). Reference formulas: (31), (34), (35) in reference (a); (62) in reference (b); (21), 
reference (c); (56)—(58b) in reference (j). 


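$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{\left[1+\left(\frac{E}{E_0}\right)^2\right]\left[\frac{\phi_1(\gamma)}{4}-\frac13\ln Z\right]-\frac23\,\frac{E}{E_0}\left[\frac{\phi_2(\gamma)}{4}-\frac13\ln Z\right]\right\}.$$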


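Formula 3BS(a)—Complete screening ($\gamma=0$ or $\phi_1(\gamma=0)=4\ln183$; $\phi_2(\gamma=0)=\phi_1(\gamma=0)-\frac23$).
Formula 3BS with $\gamma=0$.

$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{\left[1+\left(\frac{E}{E_0}\right)^2-\frac23\,\frac{E}{E_0}\right]\ln(183Z^{-1/3})+\frac19\,\frac{E}{E_0}\right\}.$$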


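Formula 3BS(b)—Arbitrary screening.

$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{\left[1+\left(\frac{E}{E_0}\right)^2\right]\left[\int_\delta^1(q-\delta)^2[1-F(q)]^2\frac{dq}{q^3}+1\right]-\frac23\,\frac{E}{E_0}\left[\int_\delta^1\left(q^3-6\delta^2q\ln\frac{q}{\delta}+3\delta^2q-4\delta^3\right)[1-F(q)]^2\frac{dq}{q^4}+\frac56\right]\right\},$$

where

$$\delta = k/(2E_0E).$$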


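Formula 3BS(c)—Intermediate screening I ($\gamma<2$).
Formula 3BS with $\phi_1(\gamma)$ and $\phi_2(\gamma)$ given in Fig. 1.


Formula 3BS(d)—Intermediate screening II ($2<\gamma<15$).
Formula 3BS with $\phi_1(\gamma)=\phi_2(\gamma)=19.19-4\ln\gamma-4c(\gamma)$ with $c(\gamma)$ given in Fig. 2.

$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{1+\left(\frac{E}{E_0}\right)^2-\frac23\,\frac{E}{E_0}\right\}\left[\ln\frac{2E_0E}{k}-\frac12-c(\gamma)\right].$$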


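Formula 3BS(e)—Differential in photon energy.
Approximations (H), (G), (M), (J). Reference formula (3) in reference (h).

$$d\sigma_k = \frac{2Z^2r_0^2}{137}\,\frac{dk}{k}\left\{\left[1+\left(\frac{E}{E_0}\right)^2-\frac23\,\frac{E}{E_0}\right]\left[\ln M(0)+1-\frac2b\tan^{-1}b\right]+\frac{E}{E_0}\left[\frac{2}{b^2}\ln(1+b^2)+\frac{4(2-b^2)}{3b^3}\tan^{-1}b-\frac{8}{3b^2}+\frac29\right]\right\},$$

where

$$b = \frac{2E_0EZ^{1/3}}{111k};\qquad \frac{1}{M(0)} = \left(\frac{k}{2E_0E}\right)^2+\left(\frac{Z^{1/3}}{111}\right)^2.$$

Comment: This formula is obtained from formula 50 of reference (b).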


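Formula 3BN—Differential in photon energy.
Approximations (H), (B). Reference formulas: (15) in reference (a), (16) in reference (c), (17) in ... in reference (e).

$$d\sigma_k = \frac{Z^2r_0^2}{137}\,\frac{dk}{k}\,\frac{p}{p_0}\left\{\frac43-2E_0E\,\frac{p^2+p_0^2}{p^2p_0^2}+\frac{\epsilon_0E}{p_0^3}+\frac{\epsilon E_0}{p^3}-\frac{\epsilon\epsilon_0}{p_0p}\right.$$

$$\left.+L\left[\frac{8E_0E}{3p_0p}+\frac{k^2(E_0^2E^2+p_0^2p^2)}{p_0^3p^3}+\frac{k}{2p_0p}\left(\frac{E_0E+p_0^2}{p_0^3}\,\epsilon_0-\frac{E_0E+p^2}{p^3}\,\epsilon+\frac{2kE_0E}{p^2p_0^2}\right)\right]\right\},$$

where

$$L = 2\ln\left[\frac{E_0E+p_0p-1}{k}\right];\qquad \epsilon_0 = \ln\left(\frac{E_0+p_0}{E_0-p_0}\right);\qquad \epsilon = \ln\left(\frac{E+p}{E-p}\right).$$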
Formula 3BN(a)—Differential in photon energy.
Approximations (H), (B), (I). Reference formula (18) in reference (c).

$$d\sigma_k = \frac{Z^2r_0^2}{137}\,\frac{16}{3}\,\frac{dk}{k}\,\frac{1}{p_0^2}\ln\left(\frac{p_0+p}{p_0-p}\right).$$


Formula 3BN(b)—Differential in photon energy.
Approximations (H), (B), (J). Reference formulas: (16)
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TABLE I.— (Continued). 


Formula 4BS—Total radiation cross section.
Approximations (H), (F), (J). Reference formulas: (47) in reference (a), (34) in reference (c), (62) in reference (j).

$$\phi_{\rm rad} = \frac{4Z^2r_0^2}{137}\left[\ln(183Z^{-1/3})+\frac{1}{18}\right].$$


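Formula 4BN—Total radiation cross section.
Approximations (H), (C). Reference formulas: (29) in reference (c), (41) in reference (e).

$$\phi_{\rm rad} = \frac{Z^2r_0^2}{137}\left\{\frac{12E_0^2+4}{3E_0p_0}\ln(E_0+p_0)-\frac{8E_0+6p_0}{3E_0p_0^2}\,[\ln(E_0+p_0)]^2-\frac43+\frac{2}{E_0p_0}\,F(x)\right\},$$

where

$$F(x) = \int_0^x\frac{\ln(1+y)}{y}\,dy\qquad\text{and}\qquad x = 2p_0(E_0+p_0).$$

For small $x$, $F$ can be expanded in the power series:

$$F(x) = x-\frac{x^2}{4}+\frac{x^3}{9}-\frac{x^4}{16}+\cdots.$$

For large $x$, $F$ is given by

$$F(x) = \tfrac16\pi^2+\tfrac12(\ln x)^2-F(1/x).$$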


Formula 4BN(a)—Total radiation cross section.
Approximations (H), (C), (I). Reference formulas: (32) in reference (c), (21) in reference (d).

$$\phi_{\rm rad} = (16/3)(Z^2r_0^2/137).$$

Formula 4BN(b)—Total radiation cross section.
Approximations (H), (C), (J). Reference formulas: (33) in reference (c), (41') in reference (e), (61) in reference (j),
(46) in reference (a), (22) in reference (d).

$$\phi_{\rm rad} = \frac{4Z^2r_0^2}{137}\left(\ln2E_0-\frac13\right).$$


(b) $d\sigma_{k,\theta_0,\phi}$ is the bremsstrahlung cross section that
is differential with respect to the photon energy, $k$,
and the emission angles $\theta_0$ and $\phi$. It can be obtained by
integrating the differential cross section in (a) over the
direction of the outgoing electron. This formula contains
the parameters $E_0$, $Z$, $k$, and $\theta_0$.

(c) $d\sigma_k$ is the bremsstrahlung cross section that is
differential with respect to the photon energy $k$. It can
be obtained by integrating the differential cross section
in (a) over the emission directions of the photon and the
electron. This formula contains the parameters $E_0$, $Z$,
and $k$.

(d) $\phi_{\rm rad}$ is the only cross-section symbol used here
that does not represent a differential form of the
bremsstrahlung cross section. It is equal to the quantity
$(1/E_0)\int_0^{T_0}k\,d\sigma_k$. This form of a total bremsstrahlung
cross section integrated over photon energy and photon
and electron emission angles was introduced by Heitler,⁹
who has defined it as the cross section for the energy
lost by radiation. This formula contains the parameters
$E_0$ and $Z$.

The Born-approximation formulas that apply to
Chart 1 are presented in Table I, and the extreme-relativistic
formulas with the Coulomb correction that
apply to Chart 2 are presented in Table II. The important
references and approximations for the formulas
in Tables I and II are listed in Table III. The explicit
expressions for the formulas in Tables I and II are not
necessarily the same as the formulas in the original
references because the attempt is made to use consistent
units and symbols, with energies and momenta expressed
in $m_0c^2$ and $m_0c$ units, respectively.

9 W. Heitler, The Quantum Theory of Radiation (Oxford University
Press, London, 1954), third edition, p. 242.
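The total radiation cross sections of Table I are simple enough to evaluate directly. The following sketch is a hedged transcription of Formulas 4BN, 4BN(a), 4BN(b), and 4BS, with every result expressed in units of $Z^2r_0^2/137$. The function names are hypothetical. $F(x)$ is computed from the small-$x$ series and the large-$x$ relation quoted under Formula 4BN, and the truncation at 20000 terms is an arbitrary choice made here.

```python
import math


def F(x, terms=20000):
    """F(x) = integral_0^x ln(1+y)/y dy, via the alternating power series
    for x <= 1 and the large-x relation for x > 1."""
    if x > 1.0:
        return math.pi ** 2 / 6.0 + 0.5 * math.log(x) ** 2 - F(1.0 / x, terms)
    total, power = 0.0, 1.0
    for n in range(1, terms + 1):
        power *= x
        term = power / (n * n)
        total += term if n % 2 else -term   # +x - x^2/4 + x^3/9 - ...
        if term < 1e-16:
            break
    return total


def phi_rad_4bn(T0):
    """Formula 4BN (no screening); T0 is the kinetic energy in m0c^2 units."""
    E0 = T0 + 1.0
    p0 = math.sqrt(T0 * (T0 + 2.0))
    a = math.log(E0 + p0)
    x = 2.0 * p0 * (E0 + p0)
    return ((12.0 * E0 ** 2 + 4.0) / (3.0 * E0 * p0) * a
            - (8.0 * E0 + 6.0 * p0) / (3.0 * E0 * p0 ** 2) * a ** 2
            - 4.0 / 3.0
            + 2.0 / (E0 * p0) * F(x))


def phi_rad_4bn_a():
    """Formula 4BN(a): nonrelativistic limit."""
    return 16.0 / 3.0


def phi_rad_4bn_b(T0):
    """Formula 4BN(b): extreme-relativistic limit without screening."""
    return 4.0 * (math.log(2.0 * (T0 + 1.0)) - 1.0 / 3.0)


def phi_rad_4bs(Z):
    """Formula 4BS: extreme-relativistic limit with complete screening."""
    return 4.0 * (math.log(183.0 * Z ** (-1.0 / 3.0)) + 1.0 / 18.0)
```

Evaluating `phi_rad_4bn` at very small and very large `T0` is a convenient way to see that the general formula approaches Formulas 4BN(a) and 4BN(b), respectively.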


(1) Born-Approximation Cross-Section Formulas


The Born-approximation calculations require the
initial and final electron kinetic energies in a collision
to be large enough to satisfy the conditions: $(2\pi Z/137\beta_0)$,
$(2\pi Z/137\beta)\ll1$. At extreme-relativistic energies, the
cross sections predicted by the Born-approximation 
formulas are larger than the true cross sections. For 
example, the value of the total cross section predicted 
for lead by the Born-approximation formula is about 
10% larger than the value predicted by more accurate 
formulas.” At very low energies, the situation is re- 
versed and the Born-approximation cross section is 
smaller than the true cross section. The energy region 
in which the Born-approximation formulas require only 
small corrections is approximately between 4 and 10 
Mev for the initial electron kinetic energy. As a rough 
guide, it is estimated that Born-approximation formulas 
for the total radiation cross section, $\phi_{\rm rad}$, are correct to
within 10% for initial electron kinetic energies above 2 
Mev and within a factor of two below 2 Mev. 
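Two of the statements above lend themselves to a quick numerical check. The sketch below evaluates the Born parameter $2\pi Z/137\beta$ for a given element and kinetic energy, and estimates how much the Born-approximation total cross section exceeds the Coulomb-corrected one for lead in the extreme-relativistic, complete-screening limit. The second function assumes that the high-$Z$ estimate $f(Z) = 0.925\,(Z/137)^2$ listed in Table II is adequate for lead and that the corrected formula is the Born formula with $f(Z)$ subtracted inside the bracket. Both function names are hypothetical.

```python
import math


def born_parameter(Z, T_mev, m0c2=0.511):
    """2*pi*Z / (137*beta) for an electron of kinetic energy T_mev (Mev).
    The Born approximation requires this to be small compared to unity."""
    T = T_mev / m0c2
    beta = math.sqrt(T * (T + 2.0)) / (T + 1.0)
    return 2.0 * math.pi * Z / (137.0 * beta)


def lead_born_to_corrected_ratio(Z=82):
    """Ratio of Formula 4BS to Formula 4CS, assuming the high-Z estimate
    f(Z) = 0.925 (Z/137)^2 for the Coulomb correction."""
    born = math.log(183.0 * Z ** (-1.0 / 3.0)) + 1.0 / 18.0
    f = 0.925 * (Z / 137.0) ** 2
    return born / (born - f)
```

Under these assumptions the ratio for lead comes out close to 1.1, in line with the "about 10% larger" statement in the text, while `born_parameter` shows that the formal validity condition is violated for heavy elements at all energies.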
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(2) Extreme-Relativistic Cross-Section Formulas 
with the Coulomb Correction 


The formulas in Table II are valid for arbitrary $Z$
and have been developed in a series of papers by Bethe
and Maximon,¹⁰ Olsen,¹¹ Olsen, Maximon, and Wergeland,⁷
Olsen and Maximon,⁸,¹² and Davies, Bethe, and
Maximon.¹³ Their calculations were carried out (a) with
Sommerfeld-Maue wave functions, and (b) with the
extreme-relativistic approximations: $E_0, E, k\gg1$, and
$p_0\theta_0\sim1$. These formulas are estimated⁸ to have an
accuracy of the order of $(Z/137)^2(\ln E/E)$, which is


FIG. 1. Screening factors, $\phi_1(\gamma)$ and $\phi_2(\gamma)$, for electron-nuclear
bremsstrahlung plotted as a function of $\gamma=100k/(E_0EZ^{1/3})$.
The curve marked "Hydrogen atom" was calculated with exact
wave functions. The curves for the Thomas-Fermi atom and a
bare nucleus differ by the quantity $4c(\gamma)$, where the function $c(\gamma)$
is plotted in Fig. 2.


FIG. 2. Screening factor, $c(\gamma)$, for electron-nuclear bremsstrahlung
plotted as a function of $\gamma=100k/(E_0EZ^{1/3})$.


10 H. Bethe and L. C. Maximon, Phys. Rev. 93, 768 (1954).
11 H. Olsen, Phys. Rev. 99, 1335 (1955).
12 H. Olsen and L. C. Maximon, Phys. Rev. 110, 589 (1958).
13 Davies, Bethe, and Maximon, Phys. Rev. 93, 788 (1954).


FIG. 3. Dependence of the Born-approximation cross section
integrated over the photon directions on the photon and electron
energy. The ordinate values for these curves are obtained from
Formula 3BN for 0.05- and 0.50-Mev electrons, and from Formula
3BS(e) for 5-, 50-, and 500-Mev electrons.


FIG. 4. Dependence of the bremsstrahlung spectrum shape on
the electron kinetic energy for a platinum target ($Z=78$). The
relative intensity (defined as proportional to the product of the
photon energy and number per unit time) is integrated over the
photon direction and is normalized to unity for zero photon
energies. The intensity values were computed from Formula
3BS(e). [Plot axes: RELATIVE INTENSITY versus $k/T_0$.]


better than 2% for electron kinetic energies above 50
Mev and for $Z$ equal to 80.


D. Graphical Representations of the Formulas


A general picture of the dependence of the cross-section
formulas in Sec. IIC on the electron and photon

TABLE II. Bremsstrahlung cross-section formulas with Coulomb correction.


Formula 1CS—Differential in photon energy and in photon and electron emission angles. 
Approximation (J), (N). Reference formula (7b.13) in reference (m). 


Ze (ro\2 dk p dd) (° (9? —g2)4(1—F(y)) 7° | (° T&F (x) ldx 
dormne= (2) ==> Í dy——————— | dp-pJ (qp) [O — g?) ] exp zie | — 
k po g qQ: y 0 0 x 


137 \2r 
| p° sin? pè sin*6o 2ppo sin sinbo coso (4EEo— g?) +2k? (p? sin?0-+ po? sin?ho—2ppo sind sinbo coso) 


2 


(48 ¢—¢) + —— 4 -¢) - 
(E—cos6)? (Eo— fo cos§o)* (E— p cos8) (Eo— po coso) 
where ; 
P= p+ pertk?—2pok cosbo+2pk cosð—2pop(cosð cosho+sinð sindo cosg) 
gz=pocosbotp cosd—k; gu®= po? sin*@o+p? sin*@—2pop sino sind coso. 


$F(x)$, $F(y)$ are atomic form factors discussed in Sec. IIE(3) and functions of the momentum variables $x$ and $y$.
$F(x)$ cannot be set identically to zero as discussed in reference (n).


Formula 2CS—Differential in photon energy and angle. 
Approximation (J), (N). Reference formula (7.2) in reference (n). 


2Z?rè dk dt 
= =a (EC +E) (3-+2P) —2EoE (1+41222)}, 
to” 


dok,Oo= 


where 
1 


1 5 
f= ; u=pðo; r=in(-)-2-12)+5(-) 
1+12 ô E 


ô R [e (8/#)] k 
5(-)= {Hi—F (g) #-1} dq; ô= ; 
td B/E Ë 2EoE 


Formula 2CS(a)—Complete screening ($\gamma=0$, where $\gamma=100k/(E_0EZ^{1/3})$).


111y 1112-3 
Formula 2CS with =o ) or r=n( )2-10, 
200 £ 


$f(Z) = 1.2021\,(Z/137)^2$ for low $Z$
$f(Z) = 0.925\,(Z/137)^2$ for high $Z$. See reference (k) for further discussion.


Formula 2CS(b)—Arbitrary screening. 
Formula 2CS with the form factor, F(q), as an arbitrary function. 


where 


Formula 2CS (c)—Intermediate screening. 
Formula 2CS with 5(6/£) given by 


624 
TR 0.5 1.0 2.0 4.0 8.0 15.0 20.0 25.0 30.0 35.0 40.0 45.0 50.0 60.0 70.0 80.0 90.0 100.0 120.0 
1216 
—(5/£) 0.0145 0.0490 0.1400 0.3312 0.6758 1.126 1.367 1.564 1.731 1.875 2.001 2.114 2.216 2.393 2.545 2.676 2.793 2.897 3.078 


Formula 2CN—Nonscreened case.
Formula 2CS with $\Gamma = \ln(1/\delta)-2-f(Z)$.


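Formula 3CS—Differential in photon energy.
Approximation (J). Reference formula (1) in reference (l).

$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{\left[1+\left(\frac{E}{E_0}\right)^2\right]\left[\frac{\phi_1(\gamma)}{4}-\frac13\ln Z-f(Z)\right]-\frac23\,\frac{E}{E_0}\left[\frac{\phi_2(\gamma)}{4}-\frac13\ln Z-f(Z)\right]\right\}.$$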


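Formula 3CS(a)—Complete screening ($\gamma=0$ or $\phi_1(\gamma=0)=4\ln183$, $\phi_2(\gamma=0)=\phi_1(\gamma=0)-\frac23$).
Formula 3CS with $\gamma=0$.

$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{\left[1+\left(\frac{E}{E_0}\right)^2-\frac23\,\frac{E}{E_0}\right]\left[\ln(183Z^{-1/3})-f(Z)\right]+\frac19\,\frac{E}{E_0}\right\}.$$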


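Formula 3CS(b)—Arbitrary screening.

$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{\left[1+\left(\frac{E}{E_0}\right)^2\right]\left[\int_\delta^1(q-\delta)^2[1-F(q)]^2\frac{dq}{q^3}+1-f(Z)\right]-\frac23\,\frac{E}{E_0}\left[\int_\delta^1\left(q^3-6\delta^2q\ln\frac{q}{\delta}+3\delta^2q-4\delta^3\right)[1-F(q)]^2\frac{dq}{q^4}+\frac56-f(Z)\right]\right\},$$

where

$$\delta = k/(2E_0E).$$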
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TABLE II.— (Continued). 


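Formula 3CS(c)—Intermediate screening I ($\gamma<2$).
Formula 3CS with $\phi_1(\gamma)$ and $\phi_2(\gamma)$ given in Fig. 1.


Formula 3CS(d)—Intermediate screening II ($2<\gamma<15$).
Formula 3CS with $\phi_1(\gamma)=\phi_2(\gamma)=19.19-4\ln\gamma-4c(\gamma)$ with $c(\gamma)$ given in Fig. 2.

$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{1+\left(\frac{E}{E_0}\right)^2-\frac23\,\frac{E}{E_0}\right\}\left[\ln\frac{2E_0E}{k}-\frac12-c(\gamma)-f(Z)\right].$$


Formula 3CN—Nonscreened case ($\gamma=\infty$ or $c(\gamma)=0$).
Formula 3CS with $\gamma=\infty$.

$$d\sigma_k = \frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left\{1+\left(\frac{E}{E_0}\right)^2-\frac23\,\frac{E}{E_0}\right\}\left[\ln\frac{2E_0E}{k}-\frac12-f(Z)\right].$$


Formula 4CS—Total radiation cross section.
Approximation (J). Reference formula (45) in reference (k).

$$\phi_{\rm rad} = \frac{4Z^2r_0^2}{137}\left[\ln(183Z^{-1/3})+\frac{1}{18}-f(Z)\right].$$


Formula 4CN—Total radiation cross section.
Approximation (J). Reference formula (44) in reference (k).

$$\phi_{\rm rad} = \frac{4Z^2r_0^2}{137}\left[\ln2E_0-\frac13-f(Z)\right].$$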


energies, the photon emission angle, and screening cor- 
rections is presented in Figs. 3-10. These graphs provide 
various types of theoretical intercomparisons primarily 
for energies above 1 Mev. Such a detailed examination 
of the predictions is useful only for the high-energy 
region where the theories are reasonably accurate and 
require much smaller corrections than in the low-energy 
region. The high-energy intercomparisons rely heavily 
on the extreme-relativistic predictions of Schiff,¹⁴ which
depend on the validity of the complete screening
approximation [see Sec. IIE(3)]. The Schiff formulas
are given in a relatively simple analytical form, and
have been used extensively for estimating the spectrum
shape from a high-energy accelerator even though other
more complicated formulas with intermediate-screening
approximations are believed to be more accurate (see
Table V).


(1) Dependence of the Bremsstrahlung Spectrum 
on Electron Energy 


Figure 3 shows the dependence of the bremsstrahlung
spectrum (integrated over the photon directions) on 
various initial electron kinetic energies for a platinum 
target (Z=78). The spectra for 0.05- and 0.5-Mev 
electrons were calculated from Formula 3BN. The 
spectra for 5-, 50-, and 500-Mev electrons were cal- 
culated from Formula 3BS(e). Figure 4 compares 
spectrum shapes predicted by Formula 3BS(e) for 
various electron energies. 
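Curves of the kind shown in Figs. 3 and 4 can be regenerated from the Table I formulas. The sketch below evaluates $k\,d\sigma_k/dk$ in units of $Z^2r_0^2/137$ from Formula 3BN and from Formula 3BS(e), and includes the complete-screening Formula 3BS(a) for comparison. It assumes the standard Bethe-Heitler and Schiff forms of these formulas; the function names and the choice of ordinate normalization are illustrative and may differ from those used for the published figures.

```python
import math


def k_dsigma_3bn(T0, k):
    """k * dsigma/dk from Formula 3BN (Born, no screening).
    T0 and k are in m0c^2 units; result in units of Z^2 r0^2 / 137."""
    E0 = T0 + 1.0
    E = E0 - k
    p0 = math.sqrt(E0 * E0 - 1.0)
    p = math.sqrt(E * E - 1.0)
    eps0 = 2.0 * math.log(E0 + p0)   # equals ln[(E0+p0)/(E0-p0)]
    eps = 2.0 * math.log(E + p)      # equals ln[(E+p)/(E-p)]
    L = 2.0 * math.log((E0 * E + p0 * p - 1.0) / k)
    bracket = (8.0 * E0 * E / (3.0 * p0 * p)
               + k * k * (E0 ** 2 * E ** 2 + p0 ** 2 * p ** 2) / (p0 ** 3 * p ** 3)
               + k / (2.0 * p0 * p) * ((E0 * E + p0 ** 2) / p0 ** 3 * eps0
                                        - (E0 * E + p ** 2) / p ** 3 * eps
                                        + 2.0 * k * E0 * E / (p ** 2 * p0 ** 2)))
    return (p / p0) * (4.0 / 3.0
                       - 2.0 * E0 * E * (p ** 2 + p0 ** 2) / (p ** 2 * p0 ** 2)
                       + eps0 * E / p0 ** 3 + eps * E0 / p ** 3
                       - eps * eps0 / (p0 * p)
                       + L * bracket)


def k_dsigma_3bs_e(T0, k, Z):
    """k * dsigma/dk from Formula 3BS(e) (Schiff, integrated over angle)."""
    E0 = T0 + 1.0
    E = E0 - k
    r = E / E0
    z3 = Z ** (1.0 / 3.0)
    b = 2.0 * E0 * E * z3 / (111.0 * k)
    ln_M0 = -math.log((k / (2.0 * E0 * E)) ** 2 + (z3 / 111.0) ** 2)
    atan_b = math.atan(b)
    return 2.0 * ((1.0 + r * r - 2.0 * r / 3.0) * (ln_M0 + 1.0 - 2.0 / b * atan_b)
                  + r * (2.0 / b ** 2 * math.log(1.0 + b * b)
                         + 4.0 * (2.0 - b * b) / (3.0 * b ** 3) * atan_b
                         - 8.0 / (3.0 * b * b) + 2.0 / 9.0))


def k_dsigma_3bs_a(T0, k, Z):
    """k * dsigma/dk from Formula 3BS(a) (complete screening)."""
    r = (T0 + 1.0 - k) / (T0 + 1.0)
    return 4.0 * ((1.0 + r * r - 2.0 * r / 3.0) * math.log(183.0 * Z ** (-1.0 / 3.0))
                  + r / 9.0)
```

Dividing each value by its $k\to0$ limit gives a spectrum shape normalized to unity at zero photon energy, which is the normalization described for Fig. 4.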


(2) Dependence of the Bremsstrahlung Spectrum 
on Photon Angle 


Figures 5(a)-(e) show the dependence of the spectrum
shape on the reduced photon angle, $E_0\theta_0$, as obtained
from Formula 2BS. The figures show that as the
emission angle increases, the relative number of high-frequency
photons increases until the trend reverses at
the larger angles. For comparison, the spectrum shape
integrated over the emission angle is evaluated from

TABLE III. Approximations, conditions of validity, and references
for bremsstrahlung formulas of Tables I and II.


Approximation                     Condition of validity


A. Nonscreened OOZ (1+ prhe (E0E/k) 

B. Nonscreened                    $137Z^{-1/3}\gg(E_0E/k)$

C. Nonscreened                    $E_0\ll137Z^{-1/3}$
D. Complete screening GOZ (1+ p0) (E0E/k) 4 
E. Complete screening             $137Z^{-1/3}\ll(E_0E/k)$
F. Complete screening             $E_0\gg137Z^{-1/3}$

G. Approximate screening          $(Ze/r)\exp(-r/a)$
   potential:

H. Born approximation             $(2\pi Z/137\beta_0)$, $(2\pi Z/137\beta)\ll1$

I. Nonrelativistic                $\beta_0\ll1$
J. Extreme relativistic           $E_0, E, k\gg1$

K. Small angles                   $\sin\theta_0=\theta_0$

L. Large angles A0>>0 P 

M. Approximation in electron 69S (Z!/111E0) LS 
angle integration. Result J 
not accurate for 

N. Small angles <p <5 


a H. Bethe and W. Heitler, Proc. Roy. Soc. (London) A146, 83 (1934).
b H. Bethe, Proc. Cambridge Phil. Soc. 30, 524 (1934).
c W. Heitler, The Quantum Theory of Radiation (Oxford University Press, London, 1954), third edition, p. 244.
d F. Sauter, Ann. Physik 20, 404 (1934).
e G. Racah, Nuovo cimento 11 (1934).
f R. L. Gluckstern and M. H. Hull, Jr., Phys. Rev. 90, 1030 (1953).
g A. Sommerfeld, Wellenmechanik (Frederick Ungar, New York, 1950), Chap. 7.
h L. I. Schiff, Phys. Rev. 83, 252 (1951).
i P. V. C. Hough, Phys. Rev. 74, 80 (1948).
j E. Segrè, Experimental Nuclear Physics (John Wiley & Sons, Inc., New York, 1953).
k Davies, Bethe, and Maximon, Phys. Rev. 93, 788 (1954).
l H. Olsen, Phys. Rev. 99, 1335 (1955).
m Olsen, Maximon, and Wergeland, Phys. Rev. 106, 27 (1957).

14 L. I. Schiff, Phys. Rev. 83, 252 (1951).


Formula 3BS(e) and is shown by the dashed line. In
Figs. 6(a)-(e), the dependence of the cross section
(Formula 2BS) on the photon emission angle, $\theta_0$, is
plotted for various photon and electron energies.

The spectrum shape integrated¹⁵ over the photon
directions with the limits from zero to a maximum
value of $\theta_0$ equal to $\Theta$ is of practical interest to
experimentalists. Figures 7(a)-(e) show the spectra obtained
for different values of $E_0\Theta$ by integrating Formula 2BS
within the above limits of $\theta_0$. These curves facilitate
estimates of the change in thin-target spectra for
different experimental arrangements that subtend various
angles. In Figs. 8(a)-(c), the curves give estimates
of the fraction of the total number of photons at any
given energy that are included within the angular limits
from zero to $E_0\Theta$; these curves are obtained by graphical
integration from Figs. 6(a), (c), and (e) for initial electron
kinetic energies of 10, 40, and 300 Mev.

[Figure 5, panels (a)-(e): RELATIVE INTENSITY versus $k/T_0$ for $Z=78$ and $0.51T_0$ = 10, 20, 40, 90, and 300 Mev; each panel includes a curve labeled "INTEGRATED OVER $\theta_0$".]

FIG. 5. (a) Dependence of the Schiff spectrum shape on the photon
emission angle, $\theta_0$, for 10-Mev electrons and for $Z=78$. The data
are obtained from Formula 2BS (solid lines) and from Formula
3BS(e) (dashed line). The values for the intensities (defined as
proportional to the product of the photon energy and number per unit
time) are normalized to unity at the zero photon energy. (b) Dependence
of the Schiff spectrum shape on the photon emission
angle, $\theta_0$, for 20-Mev electrons and for $Z=78$. (c) Dependence of
the Schiff spectrum shape on the photon emission angle, $\theta_0$, for
40-Mev electrons and for $Z=78$. (d) Dependence of the Schiff
spectrum shape on the photon emission angle, $\theta_0$, for 90-Mev
electrons and for $Z=78$. (e) Dependence of the Schiff spectrum
shape on the photon emission angle, $\theta_0$, for 300-Mev electrons.

15 J. H. Hubbell, J. Appl. Phys. 30, 981 (1959).
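The integration of Formula 2BS up to a limiting reduced angle $E_0\Theta$, which underlies Figs. 7 and 8, can be sketched numerically. The code below is an illustration only: it assumes the usual Schiff form of Formula 2BS with $y = E_0\theta_0$, uses Simpson's rule with a step count chosen arbitrarily here, and returns $k\,d\sigma/dk$ in units of $Z^2r_0^2/137$. The function names are hypothetical.

```python
import math


def schiff_2bs(T0, k, Z, y):
    """Integrand of Formula 2BS, k * dsigma / (dk dy), in units of
    Z^2 r0^2 / 137, with y = E0 * theta0 the reduced photon angle."""
    E0 = T0 + 1.0
    E = E0 - k
    u = y * y + 1.0
    inv_M = (k / (2.0 * E0 * E)) ** 2 + (Z ** (1.0 / 3.0) / (111.0 * u)) ** 2
    ln_M = -math.log(inv_M)
    return 4.0 * y * (16.0 * y * y * E / (u ** 4 * E0)
                      - (E0 + E) ** 2 / (u ** 2 * E0 ** 2)
                      + ((E0 ** 2 + E ** 2) / (u ** 2 * E0 ** 2)
                         - 4.0 * y * y * E / (u ** 4 * E0)) * ln_M)


def spectrum_within_cone(T0, k, Z, y_max, steps=2000):
    """Simpson integration of Formula 2BS from y = 0 to y_max = E0 * Theta."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even count
    h = y_max / steps
    total = schiff_2bs(T0, k, Z, 0.0) + schiff_2bs(T0, k, Z, y_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * schiff_2bs(T0, k, Z, i * h)
    return total * h / 3.0
```

The ratio `spectrum_within_cone(T0, k, Z, y_max) / spectrum_within_cone(T0, k, Z, large_y)` gives the fraction of photons of energy `k` emitted inside the cone, which is the quantity plotted in Fig. 8. With `Z = 0` and a large upper limit the integral reduces to the nonscreened extreme-relativistic spectrum, which is a useful consistency check.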

Figures 5-8 present some predictions of the Born- 
approximation formulas given in Table I. For com- 
parison, the spectrum shapes as a function of the photon 
emission angle that are predicted by the more accurate 
extreme-relativistic Formula 2CS(c) in Table II, are 
shown in Figs. 9(a) through (e) with a normalization 
of unity for zero photon energy. The spectra for elec- 
tron kinetic energies of 10, 20, and 40 Mev, Figs. 9(a), 
(b), and (c), are predicted with a zero Coulomb correction
factor, $f(Z)=0$, in Formula 2CS(c), and the
spectra for electron kinetic energies of 90 and 300 Mev, 
Figs. 9(d) and (e), include the Coulomb correction 
factor for Z=78. A comparison of the spectral shapes 
with and without the Coulomb correction factor shows 
only small differences compared to the larger effects that 
occur with different types of screening approximations. 


(3) Screening Effects and Coulomb Corrections 


Figures 10(a)-(e) intercompare Formula 2BS (0°
Schiff), Formula 3BS(e) (Schiff's formula integrated
over the photon angle), Formula 3BS (Bethe-Heitler's
formula integrated over the photon angle), and the
latter formula including the Davies, Bethe, Maximon
correction [Formula 3CS and Sec. IIE(1)]. The three
curves that are integrated over photon angle are appreciably
different in shape. For example, the curves
labeled "Schiff" and "Davies, Bethe, Maximon" are
10% different for an electron kinetic energy of 10 Mev
at a fractional photon energy of 0.7 with the normalization
used in this figure. For the sake of completeness,
the spectra corrected for multiple scattering are also
plotted in these figures. The multiple scattering correction
as calculated by Hisdal¹⁶ is discussed in Sec. IV.


E. Corrections for the Cross-Section Formulas 


Various corrections have been obtained for the for- 
mulas given in Sec. IIC. These corrections may be 
classified according to three types: (1) Coulomb cor- 
rections, (2) high-frequency-limit corrections, and (3) 
screening corrections. In each case, the correction is re- 
stricted to a particular energy region, and is intended 
to apply only to the formula for a particular differential 
form of cross section as specified below. 


(1) Coulomb Corrections 


(a) Nonrelativistic energies. In the nonrelativistic
region where $T_0 \ll 1$, Elwert¹⁷ has estimated a multiplicative
Coulomb correction factor for the cross-section
Formula 3BN(a). The Elwert factor, $f_E$, can be
written as

$$f_E = \frac{\beta_0\{1-\exp[-(2\pi Z/137\beta_0)]\}}{\beta\{1-\exp[-(2\pi Z/137\beta)]\}}. \tag{II-6}$$

This factor is valid only if $(Z/137)(\beta^{-1}-\beta_0^{-1}) \ll 1$. This
requirement forbids the use of the Elwert factor near
the high-frequency limit. In addition, the Elwert correction
was derived on the basis of a comparison between
the nonrelativistic Born-approximation and the
nonrelativistic Sommerfeld calculations. Therefore the
factor is restricted to nonrelativistic electron energies.
For higher electron energies (of the order of the electron
rest energy), the experimental results in Sec. IIF show
that the Elwert factor breaks down. As a rough guide,
the Elwert factor may be expected to give results that
are accurate to about 10% for electron energies below
about 0.1 Mev.
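The Elwert factor and its validity condition can be evaluated numerically. The following is a minimal sketch, not the authors' procedure: it assumes the form $f_E=\beta_0\{1-\exp[-2\pi Z/137\beta_0]\}/\beta\{1-\exp[-2\pi Z/137\beta]\}$, takes energies in Mev with an assumed rest energy of 0.511 Mev, and uses hypothetical function names (`elwert_factor`, `elwert_validity`).

```python
import math

ALPHA_INV = 137.0   # 1/alpha, rounded as in the text
MC2_MEV = 0.511     # electron rest energy in Mev (assumed value)


def beta(total_energy):
    """Velocity v/c of an electron whose total energy is given in units of mc^2."""
    return math.sqrt(1.0 - 1.0 / total_energy ** 2)


def _energies(T0_mev, k_mev):
    """Initial and final total electron energies in units of mc^2."""
    E0 = 1.0 + T0_mev / MC2_MEV
    E = E0 - k_mev / MC2_MEV
    if E <= 1.0:
        raise ValueError("photon energy must lie below the high-frequency limit")
    return E0, E


def elwert_factor(Z, T0_mev, k_mev):
    """Multiplicative Elwert factor f_E (assumed form, see lead-in)."""
    E0, E = _energies(T0_mev, k_mev)
    b0, b = beta(E0), beta(E)
    a = 2.0 * math.pi * Z / ALPHA_INV
    return b0 * (1.0 - math.exp(-a / b0)) / (b * (1.0 - math.exp(-a / b)))


def elwert_validity(Z, T0_mev, k_mev):
    """Left-hand side of the condition (Z/137)(1/beta - 1/beta0) << 1."""
    E0, E = _energies(T0_mev, k_mev)
    return (Z / ALPHA_INV) * (1.0 / beta(E) - 1.0 / beta(E0))
```

For example, `elwert_validity(79, 0.05, 0.049)` is much larger than `elwert_validity(79, 0.05, 0.01)`, which illustrates why the factor cannot be used close to the high-frequency limit.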

(b) Intermediate energies. In the energy region from
roughly 0.1 to 2.0 Mev, Coulomb corrections to the
Born-approximation formulas are not available in
analytical form. Therefore these corrections must be
estimated empirically from experimental results (Sec.
IIF). For the cross-section formulas differential in
photon energy, $d\sigma_k$, such empirical corrections cannot

16 E. Hisdal, Phys. Rev. 105, 1821 (1957).
17 G. Elwert, Ann. Physik 34, 178 (1939).

Fig. 6. Angular dependence of the Schiff cross section (Formula 2BS, Z=78) at various photon energies for the following
electron energies: (a) 10 Mev, (b) 20 Mev, (c) 40 Mev, (d) 90 Mev, and (e) 300 Mev.


be determined in enough detail from the available data
to cover the whole energy region. However, corrected
estimates of the integrated cross section, $\phi_{\mathrm{rad}}$, are given
in Sec. IIF from which empirical correction factors can
be obtained. The results indicate that the corrections
to the Born-approximation formulas for $\phi_{\mathrm{rad}}$ are as
large as a factor of two in the energy region close to the
electron rest energy, and less than 10% in the energy
region from about 4 to 20 Mev.
(c) Extreme-relativistic energies. In this energy region,
formulas that include the Coulomb correction for
the differential cross sections $d\sigma_{k,\theta_0,\phi}$ and $d\sigma_k$ are given
in Table II. A comparison of the formulas in Tables I
and II shows that the Coulomb correction can be
applied to the Born-approximation formulas for $d\sigma_k$ by
the addition of

$$\Delta = -\frac{4Z^2r_0^2}{137}\,\frac{dk}{k}\left[1+\left(\frac{E}{E_0}\right)^2-\frac{2}{3}\,\frac{E}{E_0}\right]f(Z), \tag{II-7}$$

where f(Z) is approximately equal! to 1.20(Z/137)? for 
low Z and 0.925(Z/137)? for high Z. This additive term 
is independent of the type of screening approximation 
that is used and is similar to the correction derived for 
the pair production process.!! For lead and energies 
above 50 Mev, the correction decreases the Born-approximation
$\phi_{\mathrm{rad}}$ with intermediate screening by
about 10%. The corrected cross section should be
accurate to about 2%.
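As an illustration of the size of this additive Coulomb term, the sketch below evaluates $f(Z)$ from the two approximations quoted above and the term itself per unit photon energy. It assumes the term has the form $-(4Z^2r_0^2/137)(dk/k)[1+(E/E_0)^2-\tfrac{2}{3}(E/E_0)]f(Z)$; the function names and the `high_z` switch are hypothetical, and the text does not state where "low Z" ends and "high Z" begins.

```python
ALPHA_INV = 137.0  # 1/alpha, rounded as in the text


def f_coulomb(Z, high_z=True):
    """Approximate f(Z): 0.925 (Z/137)^2 for high Z, 1.20 (Z/137)^2 for low Z."""
    coeff = 0.925 if high_z else 1.20
    return coeff * (Z / ALPHA_INV) ** 2


def coulomb_term(Z, E0, k, high_z=True):
    """Additive Coulomb term for d(sigma_k)/dk, in units of r0^2 per mc^2.

    E0 (initial total energy) and k (photon energy) are in units of mc^2;
    the final total energy is E = E0 - k. The functional form is an
    assumption stated in the lead-in.
    """
    x = (E0 - k) / E0
    bracket = 1.0 + x * x - (2.0 / 3.0) * x
    return -(4.0 * Z ** 2 / ALPHA_INV) * bracket * f_coulomb(Z, high_z) / k
```

The term is always negative, so it lowers the Born-approximation cross section, and with the quadratic approximation for $f(Z)$ it scales as $Z^4$.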

Accurate experimental data corroborating the cross- 
section values predicted by these formulas are not yet 
available. However, confirmation is available in the
results of absolute pair cross sections. The ratios of the 
experimental pair-production cross sections to the 
Born-approximation cross sections are found to agree 
with the Davies, Bethe, Maximon™ values, as shown 
in Fig. 11. 

(2) High-Frequency-Limit Corrections 

The formulas in Tables I and II are derived on the 
basis of certain approximations which do not permit an 
evaluation of the cross section at the high-frequency 
limit. This shortcoming has been emphasized by various 
experimental studies¹⁸ which indicate that the cross
section has a finite value at this limit.

18 W. C. Miller and B. Waldman, Phys. Rev. 75, 425 (1949);
Fuller, Hayward, and Koch, ibid. 109, 630 (1958); D. Jamnik
(private communication).
19 U. Fano, Phys. Rev. (to be published).

Recent calculations made in the Sauter approximation
(expansion in powers of $Z/137\beta_0$ and $Z/137$) by
Fano predict a finite value for the cross section at the
high-frequency limit. In contrast, the cross section with
the Born approximation (expansion in powers of
$Z/137\beta_0$ and $Z/137\beta$) becomes zero at the limit. The
cross-section formulas for the high-frequency limit
obtained by Fano are
Zr 2 in3 
[doco lee tye a 
137? RES k (1—Bo coso) — 


X (1+ 48 0(Eo— 1) (Eo—2) (1—60 cos6o)} (IL 


and, after integration over $\theta_0$,


Zid eee 
[do bp k=7e= ‘— - Bo oe 
137° k =) 
pee 
3 _ (Zo+1) 2Boltc? \ 


* The cross section differen 
therefore, both the Sauter-approxima’ 
mation calculations predict ‘that the 
frequency limit is zero for @ equa 
Fano, Koch, and Motz²¹ have shown that Formula
(II-9) overestimates the cross section at the high-frequency
limit and that a more accurate estimate is
obtained for a given electron kinetic energy if this formula
is multiplied by the ratio of the "exact" to the
Sauter photoelectric cross sections.²² A summary of
their results is given in Fig. 12 which shows the dependence
of the bremsstrahlung cross section (integrated
over photon direction) at the high-frequency limit on
the incident electron energy for aluminum and gold
targets. The solid lines (Sauter-Fano) are predicted by
Formula (II-9) and the dashed lines (corrected Sauter-Fano)
are estimated to be the corrected cross-section
values. A comparison of the theoretical and experimental
values indicates that the true cross sections at
the high-frequency limit are predicted by the dashed
curves with an accuracy of approximately 20%.

21 Fano, Koch, and Motz, Phys. Rev. 112, 1679 (1958).
22 Detailed formulas for the Sauter photoelectric cross section
and for the "exact" cross sections of Sauter-Stobbe and Nagasaka
are given by Heitler⁹ and by G. White Grodstein, Natl. Bur.
Standards Circ. No. 583 (1957). The "exact" photoelectric cross-section
formula for high energies has been calculated recently by
R. H. Pratt, thesis, University of Chicago (June, 1959), and Phys.
Rev. (to be published).


(3) Screening Corrections

(3) Screening Corrections 
where $V(r)$ is the potential that determines the interaction
for the bremsstrahlung process and $r$ is the radius
vector in units of the Compton wavelength, ƛ₀. This
potential for an atom is represented by the sum,
$V_n+V_e$, where $V_n$ is the potential arising from the
nuclear charge and $V_e$ is the potential arising from the
charge of the atomic electrons. If the atomic electron
form factor is defined as

$$F_e(q,Z) = \frac{4\pi}{Ze}\int \rho(r)\,\frac{\sin qr}{qr}\,r^2\,dr, \tag{II-11}$$

where $\rho(r)$ is the electron charge distribution, then the
matrix element, $M$, can be written as proportional to
the quantity $(Z/q^2)(F_n-F_e)$. $F_n$ is the nuclear form
factor which is roughly equal to unity. Therefore, the
unscreened differential cross-section formulas may be
corrected for screening effects by including the multiplicative
factor $[1-F_e]^2$.

23 S. J. Biel and E. H. S. Burhop, Proc. Phys. Soc. (London)
A68, 165 (1955).
For a Thomas-Fermi model, $F_e$ depends on the quantity
$qZ^{-1/3}$ where $q$ has a minimum value of $(p_0-p-k)$.
At higher energies, $q_{\min}$ becomes equal to $(k/2E_0E)$ and
screening calculations are expressed in terms of
$\gamma = 100k/(E_0EZ^{1/3})$. $\gamma$ is approximately equal to the
radius of the Thomas-Fermi atom ($r_{TF}=137Z^{-1/3}$) divided
by $r_{\max}$, where $r_{\max}$ is the maximum impact parameter
discussed by Heitler⁹ and is equal to $q_{\min}^{-1}$.
If $r_{\max}$ is large compared to the nuclear radius but small
compared to the atomic radius $r_{TF}$, then $\gamma$ is large and
$F_e(q,Z)\approx 0$. If $r_{\max}$ is of the order of $r_{TF}$, then $\gamma\sim 1$ and
screening must be taken into account. If the impact
parameter is of the order of the nuclear radius, then the
distribution of the nuclear charge must be included by
a nuclear form factor although the influence of the
distribution of the atomic electrons can be neglected.

The dependence of $r_{\max}$ on the initial electron kinetic
energy at all energies can be obtained by setting $r_{\max}$
equal to $(p_0-p-k)^{-1}$. The results are shown in Fig. 13
for $k$ equal to $0.1T_0$, $0.5T_0$, and $0.9T_0$. Also, the dashed
lines give the Thomas-Fermi atomic radii for beryllium
and gold. A comparison of $r_{\max}$ with $r_{TF}$ shows that $r_{\max}$
is larger than $r_{TF}$ at low and high energies. To be
specific, screening effects can be expected to become
important over a large part of the spectrum for electron
kinetic energies above approximately 5 Mev and below
approximately 10 kev. For low fractional photon energies
where $k<0.1T_0$, screening effects are important
for all values of $T_0$. It is interesting to observe that for
the high photon energies the screening effects are the
least important for values of $T_0$ approximately equal to
the electron rest energy.
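The comparison of the maximum impact parameter with the Thomas-Fermi radius can be reproduced numerically. The sketch below is illustrative only: it assumes $r_{\max}=(p_0-p-k)^{-1}$ and $r_{TF}=137Z^{-1/3}$ in units of the Compton wavelength, an electron rest energy of 0.511 Mev, and hypothetical function names.

```python
import math

MC2_MEV = 0.511  # electron rest energy in Mev (assumed value)


def q_min(T0_mev, k_fraction):
    """Minimum momentum transfer p0 - p - k, in units of mc.

    k_fraction is the photon energy as a fraction of the initial kinetic
    energy T0 (0 < k_fraction < 1).
    """
    E0 = 1.0 + T0_mev / MC2_MEV
    k = k_fraction * (E0 - 1.0)
    E = E0 - k
    p0 = math.sqrt(E0 * E0 - 1.0)
    p = math.sqrt(E * E - 1.0)
    return p0 - p - k


def r_max(T0_mev, k_fraction):
    """Maximum impact parameter in units of the Compton wavelength."""
    return 1.0 / q_min(T0_mev, k_fraction)


def r_thomas_fermi(Z):
    """Thomas-Fermi atomic radius 137 Z^(-1/3) in the same units."""
    return 137.0 * Z ** (-1.0 / 3.0)
```

With these definitions, `r_max(50.0, 0.1)` greatly exceeds `r_thomas_fermi(79)`, so screening matters, whereas `r_max(0.511, 0.9)` is well below it, in line with the remark that screening is least important for high photon energies when $T_0$ is near the electron rest energy.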

The accuracy obtainable with a bremsstrahlung formula
corrected for screening depends on the validity
of the extreme-relativistic approximations and on the
adequacy of the atomic model used to provide the form
factor. Only the latter will be commented on here. The
most extensive calculations and applications have been
the Hartree self-consistent field model is more accurate
but more difficult to apply. The atomic form factors
predicted by the two models have been compared by
Nelms and Oppenheim²⁴ and are given in Fig. 14. The
curves in this figure show that the accuracy of the 
Thomas-Fermi model decreases as the atomic number 
decreases. 

Information concerning the influence of the form fac- 
tor differences on the bremsstrahlung cross section can 
be obtained by referring to pair production calculations. 
The nuclear momentum distribution in the pair production
process at extreme-relativistic energies as calculated
by Jost, Luttinger, and Slotnik,²⁵ is given in
Fig. 15. Their results show that the most probable $q$
values are of the order of 0.1. Table IV gives the ratio
of the Thomas-Fermi to the Hartree form factors for
representative $q$ values, as obtained by Nelms and

24 A. T. Nelms and I. Oppenheim, J. Research Natl. Bur.
Standards 55, 53 (1955).
25 Jost, Luttinger, and Slotnik, Phys. Rev. 80, 189 (1950).
Fig. 8. (a) Dependence of the Schiff cross section on the angular
integration limit, $E_0\theta_0$, for various photon energies with 10-Mev
electrons. The curves are derived by graphical integration of the
curves in Figs. 6(a), (c), and (e), and may also be derived analytically
by the integration of Formula 2BS. (b) Dependence of the
Schiff cross section on the angular integration limit, $E_0\theta_0$, for
various photon energies with 40-Mev electrons. (c) Dependence
of the Schiff cross section on the angular integration limit, $E_0\theta_0$,
for various photon energies with 300-Mev electrons.


is not available for the bremsstrahlung process. However,
general conclusions are possible on the basis of a
comparison of the maximum impact parameters for
bremsstrahlung and pair production. The maximum
impact parameter for bremsstrahlung is $(2E_0E/k)$ and
the similar expression for pair production is $(2E_+E_-/k)$
where $E_+$ and $E_-$ are the total energies of the positron
and electron. By varying the values of $E$ and $k$ for
fixed $E_0$ in bremsstrahlung and the values of $E_+$ and
$E_-$ for fixed $k$ in pair production, we find that the important
impact parameters in bremsstrahlung are
larger on the average than those in pair production. This
fact explains why the screening effect is much larger on
$\phi_{\mathrm{rad}}$ than on $\phi_{\mathrm{pair}}$ for equal values of $E_0$ and $k$ (see, for
example, the total cross sections for the two processes
in reference 9, pp. 252 and 262). The larger screening
effect indicates that the use of the Hartree form factor
Fig. 9. (a) Dependence of the Olsen-Maximon spectrum shape
on the photon emission angle, $\theta_0$, for 10-Mev electrons. The values
for the intensities (defined as proportional to the product of the
photon energy and number per unit time) are obtained from Formula
2CS(c) with $f(Z)=0$ and are normalized to unity at
zero photon energy. (b) Dependence of the Olsen-Maximon spectrum
shape on the photon emission angle, $\theta_0$, for 20-Mev electrons.
(c) Dependence of the Olsen-Maximon spectrum shape on
the photon emission angle, $\theta_0$, for 40-Mev electrons. (d) Dependence
of the Olsen-Maximon spectrum shape on the photon
emission angle, $\theta_0$, for 90-Mev electrons and $f(Z)=0.925(Z/137)^2$.
(e) Dependence of the Olsen-Maximon spectrum shape on
the photon emission angle, $\theta_0$, for 300-Mev electrons and
$f(Z)=0.925(Z/137)^2$.
Fig. 10. (a) Comparison of 10-Mev spectrum shapes predicted
by Formula 2BS for $\theta_0=0^\circ$ (Schiff), Formula 3BS(e) (Schiff integrated
over photon angle), Formula 3BS (Bethe-Heitler integrated
over photon angle), Formula 3CS (Davies, Bethe, Maximon
integrated over photon angle), and the Hisdal calculations with
multiple scattering corrections for $\theta_0=0^\circ$ (see Sec. IVB). (b) Comparison
of 20-Mev spectrum shapes. (c) Comparison of 40-Mev
spectrum shapes. (d) Comparison of 90-Mev spectrum shapes.
(e) Comparison of 300-Mev spectrum shapes.




in place of the Thomas-Fermi factor will have a greater
effect on $\phi_{\mathrm{rad}}$ than on $\phi_{\mathrm{pair}}$.

The first detailed study of the influence of form
factors on screened bremsstrahlung cross sections was
made by Bethe.²⁷ Bethe's calculations, which are summarized
in the formulas of Tables I and II, consider
four types of screening:


1, complete screening condition: $\gamma\approx 0$;
2, intermediate screening condition I: $\gamma<2$;
3, intermediate screening condition II: $2<\gamma<15$;
4, no screening condition: $\gamma\gg 1$.
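The screening conditions listed above can be tied to concrete energies through the screening parameter. The sketch below is an illustration under stated assumptions: it takes $\gamma=100k/(E_0EZ^{1/3})$ with energies in units of the electron rest energy (assumed 0.511 Mev), treats $\gamma>15$ as the practical "no screening" boundary, and uses hypothetical function names. Complete screening is the limit $\gamma\to 0$ and is not given a separate threshold here.

```python
MC2_MEV = 0.511  # electron rest energy in Mev (assumed value)


def screening_gamma(Z, T0_mev, k_mev):
    """Screening parameter gamma = 100 k / (E0 E Z^(1/3)), energies in mc^2."""
    E0 = 1.0 + T0_mev / MC2_MEV
    k = k_mev / MC2_MEV
    E = E0 - k
    if E <= 1.0:
        raise ValueError("photon energy must lie below the high-frequency limit")
    return 100.0 * k / (E0 * E * Z ** (1.0 / 3.0))


def screening_regime(gamma):
    """Map gamma onto the intermediate I, intermediate II, or no-screening condition.

    The cut at 15 is an assumption taken from the upper limit of
    intermediate condition II.
    """
    if gamma < 2.0:
        return "intermediate I"
    if gamma <= 15.0:
        return "intermediate II"
    return "no screening"
```

For 10-Mev electrons on gold, photon energies of 5, 9, and 9.9 Mev fall in the intermediate I, intermediate II, and no-screening regimes respectively, which shows how the appropriate formula changes across a single spectrum.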


The Bethe²⁷ and Bethe-Heitler²⁸ screening calculations
with intermediate conditions I and II were performed
numerically using the tabulations of the atomic form
factor for the Thomas-Fermi model given by Bethe.²⁹

In the work of Schiff analytical calculations were
made possible by the use of the complete screening
condition ($\gamma\approx 0$) and an approximate screened atom
potential, $V$, given by $(Ze/r)\exp(-r/a)$, where
$a=111Z^{-1/3}$. The atomic form factor, $F_e(q,Z)$, corresponding
to this potential is given by the quantity
$[1+(aq)^2]^{-1}$. For many purposes the Schiff Formulas
2BS and 3BS(e) are sufficiently accurate. Schiff notes
that compared to the intermediate screening Formula
3BS, the complete-screening Formula 3BS(e) "is
larger than it should be by less than 2% for moderate
values of Z and is never more than 4% high in the worst
case of large Z and energies such that screening is
incomplete."

A third procedure for including form factor effects
was developed by Molière.³⁰ By approximating the
Thomas-Fermi potential with a simple analytical expression,
he obtained the following relation:

$$\frac{[1-F_e(q,Z)]}{q^2}=\sum_{i=1}^{3}\frac{\alpha_i}{\beta_i^2+q^2}, \tag{II-12}$$

where $\alpha_1=0.10$, $\alpha_2=0.55$, $\alpha_3=0.35$,
$\beta_i=(Z^{1/3}/121)\,b_i$; $b_1=6.0$, $b_2=1.20$, $b_3=0.30$.


The Molière function has been applied by Olsen and
Maximon to obtain intermediate screening formulas
that include Coulomb corrections.
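The two analytical form factors discussed here are easy to compare directly. The following sketch assumes the Schiff form $F_e=[1+(aq)^2]^{-1}$ with $a=111Z^{-1/3}$ and the Molière relation $[1-F_e]/q^2=\sum_i\alpha_i/(\beta_i^2+q^2)$ with $\beta_i=(Z^{1/3}/121)b_i$; the function names are hypothetical and $q$ is in units of $mc$.

```python
MOLIERE_ALPHA = (0.10, 0.55, 0.35)
MOLIERE_B = (6.0, 1.20, 0.30)


def schiff_form_factor(q, Z):
    """Form factor of the exponentially screened potential (Ze/r) exp(-r/a)."""
    a = 111.0 * Z ** (-1.0 / 3.0)
    return 1.0 / (1.0 + (a * q) ** 2)


def moliere_form_factor(q, Z):
    """Form factor from Moliere's three-term fit to the Thomas-Fermi potential."""
    z13 = Z ** (1.0 / 3.0)
    total = 0.0
    for alpha_i, b_i in zip(MOLIERE_ALPHA, MOLIERE_B):
        beta_i = b_i * z13 / 121.0
        total += alpha_i / (beta_i ** 2 + q * q)
    return 1.0 - q * q * total
```

Both expressions equal unity at $q=0$ (full screening of the nuclear charge) and vanish at large $q$; the Molière coefficients $\alpha_i$ sum to one, which is what guarantees the large-$q$ limit.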

The most accurate predictions of screening correc- 
tions to bremsstrahlung cross sections for specific 
target elements can be obtained by the use of the 
Hartree form factors in the formulas that permit the 
use of arbitrary form factors, e.g., Formulas 3BS(b) 
and 2CS(b). Unfortunately, the screening corrections 
for these formulas must be evaluated numerically and 
are not as convenient to use as the complete screening 
formulas just discussed. 


27 H. Bethe, Proc. Cambridge Phil. Soc. 30, 524 (1934).
28 H. Bethe and W. Heitler, Proc. Roy. Soc. (London) A146, 83
(1934).
29 H. Bethe, Ann. Physik 5, 385 (1930).
30 G. Molière, Z. Naturforsch. 2a, 133 (1947).
Fig. 11. Dependence of the pair cross-section ratio, $d\sigma_{\mathrm{pair}}(\mathrm{true})/d\sigma_{\mathrm{pair}}(\mathrm{Born})$,
on the photon energy. The solid curve is taken from
Grodstein²² and the dashed curve is taken from reference 13.
F. Comparison of Theory and Experiment


Very few experimental determinations of the bremsstrahlung
cross section are available for comparison
with the estimates given in Secs. IIC, D, and E. At
present, experimental data on cross sections have been
obtained for electron kinetic energies of 34 kev³¹ by
Amrehn³² and Röss,³² 50 kev by Motz and Placious,³³
90 to 180 kev by Mausbeck³⁴ and Zeh,³⁴ 0.5 and 1.0
Mev by Motz,³⁵ and 2.72, 4.54, and 9.66 Mev by

Fig. 12. Dependence of the bremsstrahlung cross section at the
high-frequency limit, integrated over photon direction, on the
incident electron kinetic energy. These data are obtained from
reference 21, and the dashed curves are estimated to give the
most accurate values for the cross section.

31 Most of the experimental data that are available in this low
energy region have been produced by the pioneering work of
Kulenkampff and co-workers. Their measurements give extensive
information about relative angular distributions and spectra
for thin targets, and show general agreement with the nonrelativistic
Sommerfeld theory. The details of their various experiments are not
included in this report which is primarily concerned with
cross-section measurements and comparisons with the Born-approximation
theory.

32 H. Amrehn, Z. Physik 144, 529 (1956); D. Röss, thesis,
University of Würzburg (December, 1957).

33 J. W. Motz and R. C. Placious, Phys. Rev. 109, 23

34 H. Mausbeck, thesis, University of Würz
thesis, University of Würzburg (1957)

35 J. W. Motz, Phys. Rev. 1
Starfelt and Koch.³⁶ The important results of these
studies are combined and summarized below.


(1) Cross-Section Differential in Photon 
Energy and Angle 


For electron kinetic energies that are small compared
with the electron rest energy, the experimental results³¹⁻³³
show general agreement with the Sommerfeld
theory except for certain minor discrepancies which
probably occur because the theory does not account
for relativistic and screening effects. On the other hand,
the Born-approximation theory (Formula 2BN) is
seriously inadequate in this energy region and no
analytical correction factors for the Born-approximation
formula are available in this differential form.
Furthermore, no quantitative studies are available on
the importance of screening.

For electron kinetic energies that are of the same
order of magnitude as the electron rest energy, the
Born-approximation theory (Formula 2BN) underestimates
the experimental cross section³⁵ as shown by
the comparison in Fig. 16. These data also show that
the differences between the theory and experiment increase
with (a) the photon energy, (b) the photon angle,
and (c) the atomic number of the target.

For electron kinetic energies that are large compared
with the electron rest energy, the experimental results³⁶

36 N. Starfelt and H. W. Koch, Phys. Rev. 102, 1598 (1956).
Fig. 13. Dependence of $r_{\max}$ on the initial electron kinetic energy
for $k$ equal to $0.1T_0$, $0.5T_0$, and $0.9T_0$. $r_{\max}$ is the reciprocal of the
minimum momentum transferred to the nucleus ($=p_0-p-k$), and
is given in units of the Compton wavelength, ƛ₀. The dashed lines
give the values of the Thomas-Fermi radius ($=137Z^{-1/3}$) for beryllium
and gold.
agree within 10% with the Born-approximation theory.
For example, in Fig. 17, the experimental cross sections
for gold at 4.54 Mev³⁶ show general agreement with the
predictions of the screened, extreme-relativistic Schiff 
Formula 2BS and of the unscreened Sauter Formula 
2BN. There are differences in detail (generally less 
than 10% in this energy region): (a) near the high- 
frequency limit, the experimental cross sections are 
greater than the Schiff cross sections which in turn are 
greater than the Sauter cross sections; (b) in the low- 
frequency region, the experimental cross sections show 
good agreement with the Schiff cross sections, but are 
less than the unscreened Sauter cross sections. For low 
Z targets, there is better agreement with the Sauter 
formula.


(2) Cross-Section Differential in Photon Energy


A comparison of experimental and theoretical values
for the cross section differential in photon energy, $d\sigma_k$,
is given in Figs. 18-21 for electron energies of 0.05, 0.5,
1.0, and 4.5 Mev. Each of these figures gives the estimates
of (a) the Born-approximation cross sections
[Formulas 3BN or 3BN(a)]; (b) the corrected Sauter-Fano
cross sections at the high-frequency limit [Sec.
IIE(2)]; (c) the Elwert-Born approximation cross sections
[Sec. IIE(1)]; and (d) the experimental results.
The solid lines show the cross sections computed from
Formula 3BN(a) for 0.05 Mev, and from Formula 3BN for
Fig. 14. Evaluation²⁴ of the atomic form factor, $F(q,Z)$, for the Hartree self-consistent field model (solid lines) and for the
Thomas-Fermi model (dashed line), as a function of the nuclear momentum transfer, $q$.


0.5, 1, and 4.5 Mev. The dashed lines give the Born- 
approximation cross sections corrected by the Elwert 
factor defined in Formula (II-4). The comparison with 
the experimental results shows that the Elwert correc- 
tion gives the most accurate results at very low energies 
(below 0.1 Mev). For electron kinetic energies of the 
order of the electron rest energy, the cross sections ob- 
tained with the Elwert correction factor are still less 
than the experimental values (by as much as a factor 
of two in the worst case). For very high energies, the 
Born-approximation theory overestimates the actual 
cross sections, and the Elwert factor is no longer valid, 
although it gives good agreement with experiment in the 
5-Mev cross-over region [see Sec. IIF(3)]. The corrected
Sauter-Fano cross sections at the high-frequency
limit show good agreement with the experimental results
as noted previously in Fig. 12.


(3) Total Cross Section 


The experimental values for the total cross section,
$\phi_{\mathrm{rad}}$ (defined in Sec. IIC), are shown in Fig. 22 by the
closed and open circles for initial electron kinetic energies
of 0.05 Mev, 0.5 Mev, and 1.0 Mev. The
theoretical values are shown by the solid lines, which
are predicted by Formula 4BN(a) in the region where
$T_0<0.5$ and by Formula 4BN for no screening. The
curves that include screening corrections for Z=13 and
79 are obtained by numerical integration of the intermediate
screening Formulas 3BS(c) and 3BS(d). For




Fig. 15. Momentum distributions²⁵ for the recoil nucleus in nuclear
pair production for several photon energies. The relative numbers of
recoils are plotted in Figs. 15(a), (b), and (c) as a function of the
momentum, $q$. These curves are summarized in (d) and are compared
with the asymptotic curve.
Fig. 16. Dependence of the bremsstrahlung cross section
$d\sigma_{k,\theta_0,\phi}$ on photon energy and angle, $\theta_0$, for 0.5- and 1.0-Mev
electrons. The theoretical cross sections shown by the solid
curves are calculated from Formula 2BN, and the experimental
values for gold are given by the open circles.
Fig. 17. Dependence of the bremsstrahlung cross section
$d\sigma_{k,\theta_0,\phi}$ for gold, on the photon energy and angle, $\theta_0$, for 4.54-Mev
electrons. The theoretical cross sections are given by the solid
curve (Schiff, Formula 2BS), and by the dashed curve (Sauter,
Formula 2BN). The experimental values³⁶ for gold are given by
the open circles.


extreme-relativistic energies, the triangles give the
most accurate theoretical cross-section values³⁷ for
Z=79, which are estimated by numerical integration
of the Coulomb-corrected Formula 3CS. The most
accurate values for $\phi_{\mathrm{rad}}$, which are estimated from the


37 For Z=13, the corrected values for energies above 50 Mev
differ only slightly (by less than one percent) from the values
shown by the solid line for Z=13.
Fig. 18. Dependence of the bremsstrahlung cross section integrated
over photon angle on the photon energy for 0.05-Mev
electrons. The Born-approximation cross sections shown by the
solid curves are calculated from Formula 3BN(a), and the Born-Elwert
cross sections shown by the dashed curves are obtained
from the product of Formula 3BN(a) and the Elwert factor,
Formula (II-4). The experimental values³³ are shown by the open
and closed circles for gold and aluminum, respectively. The corrected
Sauter-Fano values at the high-frequency limit are estimated
in reference 21.


above combined data, are shown by the dashed curve
for Z=13 and the dot-dashed curve for Z=79.
Approximate correction factors for the Born-approximation
$\phi_{\mathrm{rad}}$ values with screening have been estimated
as a function of the initial electron kinetic energy from


by the empirical (dashed and dot-dashed) curves and
by the Born-approximation curves with screening.
These estimated factors are given in Fig. 23, and show 
that the ratios are equal to unity at the energy of 
approximately 10 Mev for aluminum and 6 Mev for es 


quency region because of a crossover effect (see ] 
in Bethe and meee reference 1), which is- 
kinetic energies approximately equal to
rest energy.
Fig. 19. Dependence of the bremsstrahlung cross section integrated
over photon angle on the photon energy for 0.5-Mev electrons.
The Born-approximation cross sections shown by the solid
curves are calculated from Formula 3BN, and the Born-Elwert
cross sections shown by the dashed curves are obtained from the
product of Formula 3BN and the Elwert factor, Formula (II-4).
The experimental values³⁵ are shown by the open and closed
circles for gold and aluminum, respectively. The corrected Sauter-Fano
values at the high-frequency limit are estimated in reference 21.


G. Summary 


A survey of the foregoing data leads to some general 
conclusions about the accuracy of the cross-section 
values predicted by the various formulas and correction 
factors. Also, suggestions can be made for selecting 
formulas that give the best estimates for the cross sec- 
tion or that can be easily evaluated to give reasonably 
accurate results. These judgments are summarized in 
the following. 

In Table I, the screened formulas depend on the extreme-relativistic
approximation and therefore are valid
only in the energy region $T_0\gg 1$. For $T_0\lesssim 1$, only the
nonscreened formulas are applicable.³⁸ The nonscreened
formulas require relatively large correction factors except
in the region near the crossover energy (see Fig.
22). At the extreme relativistic energies the nonscreened
formulas are less accurate than the screened formulas.

In Table II, the extreme-relativistic cross section
formulas for $d\sigma_{k,\theta_0,\phi}$ and $d\sigma_k$ are estimated to have an
accuracy that is given approximately by the factor
$(Z/137)^2(\ln E/E)$. For $f(Z)=0$, the formulas in Table II


: i i hat includes screening effects 
Born-approximation formula that incl seller 

1 i -relativistic approximation, has been given by 

une ee eae Rev. 90, 1030 (1953). This formula 


else oF to the low-frequency région and has been found 
equate in the high-frequency region. 
are the same as the Born-approximation formulas in
Table I except for differences in the screening corrections
which are reviewed in Sec. IIE(3).

An estimate of the general accuracy with which the
formulas in Tables I and II predict the cross-section
values over the whole range of electron energies can be
obtained from a comparison of the theoretical and experimental
values for $\phi_{\mathrm{rad}}$ in Figs. 22 and 23.

For the cross-section differential in photon energy,
$d\sigma_k$, a summary of the corrected formulas for specified
energy ranges of the incident electron is given in Table
V. Conservative estimates of the accuracies of these
Fig. 20. Dependence of the bremsstrahlung cross section integrated
over photon angle on the photon energy for 1.0-Mev
electrons. The Born-approximation cross sections shown by the
solid curves are calculated from Formula 3BN, and the Born-Elwert
cross sections shown by the dashed curves are obtained
from the product of Formula 3BN and the Elwert factor, Formula
(II-4). The experimental values³⁵ are shown by the open and closed
circles for gold and aluminum, respectively. The corrected Sauter-Fano
values at the high-frequency limit are estimated in reference 21.


formulas have been made on the basis of the experimental
data assembled in this report. The greatest uncertainties
are in the energy range from 0.10 to 2.0
Mev. Because of the uncertainties of screening effects,
no corrected formulas are given for the energy region
below 0.1 Mev. These corrected formulas are tentative
and it can be expected that some will be replaced by
more accurate expressions as more data become
available.

For the cross-section formulas differential in photon
energy and angle, $d\sigma_{k,\theta_0,\phi}$, no quantitative corrections
are available for low and intermediate energies because
of insufficient data. For extreme relativistic energies,
the most accurate estimates (3%) for $d\sigma_{k,\theta_0,\phi}$ are given
by Formula 2CS.


III. ELECTRON-ELECTRON BREMSSTRAHLUNG


The bremsstrahlung cross-section formulas for electron-nuclear
interactions in Sec. IIC vary as $Z^2$. For targets
with high atomic numbers, the additional influence
of electron-electron bremsstrahlung can be included
approximately by replacing $Z^2$ by $Z(Z+1)$. However,
for very low Z elements such as hydrogen or beryllium,
the electron-electron bremsstrahlung contributions must
be included more accurately. Cross-section calculations
for this process are complicated because of the exchange
character of the interaction in which there is a large
energy and momentum transfer to the recoil electron,
in contrast to the electron-nuclear bremsstrahlung process
in which the nucleus is assumed to be infinitely
heavy. No complete calculations are available for predicting
the detailed features of electron-electron bremsstrahlung.³⁹
A summary of pertinent results that have
been obtained is given in the following.

Fig. 21. Dependence of the bremsstrahlung cross section integrated
over photon angle on the photon energy for 4.54-Mev
electrons. The Born-approximation cross sections shown by the
solid curve are calculated from Formula 3BN, and the Born-Elwert
cross sections shown by the dashed curve are obtained
from the product of Formula 3BN and the Elwert factor, Formula
(II-4). The experimental values³⁶ are shown by the open circles
for gold. The corrected Sauter-Fano values at the high-frequency
limit are estimated in reference 21.
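The relative weight of the approximate electron-electron contribution obtained by replacing $Z^2$ with $Z(Z+1)$ can be seen from a one-line calculation. The helper below is a hypothetical illustration, not a formula from this report: it simply returns the ratio $Z(Z+1)/Z^2 = 1 + 1/Z$.

```python
def ee_enhancement(Z):
    """Ratio Z(Z+1)/Z^2 by which the Z^2 cross section is scaled."""
    if Z <= 0:
        raise ValueError("Z must be a positive atomic number")
    return Z * (Z + 1) / Z ** 2


if __name__ == "__main__":
    for name, Z in (("hydrogen", 1), ("beryllium", 4), ("gold", 79)):
        # Fractional increase over the pure electron-nucleus Z^2 value.
        print(f"{name:10s} Z={Z:3d}  increase = {100 * (ee_enhancement(Z) - 1):.1f}%")
```

The increase is about 1% for gold but 25% for beryllium and 100% for hydrogen, which is why the crude replacement is acceptable only for high atomic numbers.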


A. Maximum Photon Energy 


In the electron-electron bremsstrahlung process, the
maximum photon energy that is available in the
laboratory system at the laboratory angle $\theta_0$ is³⁹

$$k_{\max}=F/(1-\sqrt{F}\cos\theta_0), \tag{III-1}$$

where $F$ is equal to $(E_0-1)/(E_0+1)$. Table VI gives
some values of $k_{\max}$ at zero and 90 degrees obtained from
Formula (III-1) for various incident electron kinetic

39 For a general review of the available theories on electron-electron
bremsstrahlung, see J. Joseph and F. Rohrlich, Revs.
Modern Phys. 30, 354 (1958).
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Fic. 22. Dependence of the total radiation cross section, 
praal = (1/Eo) fo? *kdox,], on the initial electron kinetic energy, To. 
The solid lines are obtained from Formula 4BN for no screening, 
and from the numerical integration of Formulas 3BS(c) and 
3BS(d) with screening corrections for Z equal to 13 and 79. The 
experimental points*:35 are shown by the open and closed circles 
for a gold and aluminum target, respectively. The values shown by 
the triangles are estimated by numerical integration of Formula 
3CS for Z=79. On the basis of the experimental data at the low 
energies and the theoretical values (triangles) predicted by the 
exact theory at the extreme relativistic energies, the dashed 
curves have been drawn as an estimate of the most accurate ¢raa 
values for Z equal to 13 and 79. 


energies. From the very sparse experimental informa- 
tion**:86 available on electron-electron bremsstrahlung, 
some results? have shown reasonably good agreement 
with the values of kmax predicted by Formula (III-1). 
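Formula (III-1) is simple enough to evaluate directly. The following is a minimal sketch, assuming that $E_0$ is the total energy of the incident electron in units of the electron rest energy (the convention implied by the definition of $F$); the function name `k_max` and the choice of degrees for the angle are illustrative and do not come from the text.

```python
import math


def k_max(E0: float, theta0_deg: float) -> float:
    """Maximum photon energy (rest-energy units) from Formula (III-1).

    E0         -- total energy of the incident electron in rest-energy
                  units (assumed convention)
    theta0_deg -- laboratory photon angle in degrees
    """
    if E0 <= 1.0:
        raise ValueError("E0 must exceed 1 (the electron rest energy)")
    F = (E0 - 1.0) / (E0 + 1.0)
    return F / (1.0 - math.sqrt(F) * math.cos(math.radians(theta0_deg)))


if __name__ == "__main__":
    # Print k_max at 0 and 90 degrees for a few incident energies.
    for E0 in (2.0, 5.0, 20.0):
        print(E0, round(k_max(E0, 0.0), 4), round(k_max(E0, 90.0), 4))
```

At 90 degrees the expression reduces to $F$ itself, and at zero degrees it stays below the incident kinetic energy $E_0-1$, because the recoil electron must carry part of the energy.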


B. Cross-Section Formulas for Free Electrons 
(1) Nonrelativistic Energies 


TABLE V. Corrected cross-section formulas for $d\sigma_k$.

Kinetic energy range
for incident electron,  Corrected cross-section                 Restrictions        Estimated
Mev                     formula (a)                                                 accuracy
0.01-0.10               $d\sigma_k = f_E\,d\sigma_k^{3BN(a)}$   $k > 0.01T_0$       ±5%
0.10-2.0                $d\sigma_k = A f_E\,d\sigma_k^{3BN}$    $k > 0.01T_0$       ±20%
2.0-15                  $d\sigma_k = A\,d\sigma_k^{3BN}$        $\gamma > 15$       (b)
                        $\quad = A\,d\sigma_k^{3BS(d)}$         $2 < \gamma < 15$   ±5%
                        $\quad = A\,d\sigma_k^{3BS(c)}$         $\gamma < 2$        ±5%
15-50                   $d\sigma_k = d\sigma_k^{3BN}$           $\gamma > 15$       (b)
                        $\quad = A\,d\sigma_k^{3BS(d)}$         $2 < \gamma < 15$   ±3%
                        $\quad = A\,d\sigma_k^{3BS(c)}$         $\gamma < 2$        ±3%
50-500                  $d\sigma_k = d\sigma_k^{3BN}$           $\gamma > 15$       (b)
                        $\quad = d\sigma_k^{3CS(a)}$            $2 < \gamma < 15$   ±3%
                        $\quad = d\sigma_k^{3CS(b)}$            $\gamma < 2$        ±3%

where $f_E$ is defined in Formula (II-6), $A$ is the
correction factor given in Fig. 23, and $\gamma$ is equal to
the quantity $100k\,(E_0EZ^{1/3})^{-1}$.

(a) The superscripts for $d\sigma_k$ give the formula numbers defined

(b) No estimated accuracy is given at photon energies
frequency limit of the spectrum. If better accuracy is
the cross section at the high-frequency limit
dashed curves in Fig. 12, and the spectrum shape
this end to the curves given by

Fig. 23. Approximate correction factors for the
Born-approximation $\phi_{rad}$ values with screening shown in
Fig. 22. These factors have been estimated from the ratios of the
empirical (dashed and dot-dashed) curves to the
Born-approximation curves with screening in Fig. 22.

In contrast to the electron-nucleus and electron-positron
systems, the electron-electron system has no dipole moment.
Therefore the electron-electron bremsstrahlung cross section
becomes zero for calculations based only on the nonrelativistic
dipole approximation. Garibyan has made calculations beyond
the dipole
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(2) Extreme-Relativistic Energies 


Several calculations based on the extreme-relativistic
approximations give the following approximate formula
for the cross-section differential in photon energy:

$$d\sigma_k = \frac{4r_0^2Z}{137}\,\frac{dk}{k}\left[1+\left(\frac{E}{E_0}\right)^2-\frac{2}{3}\frac{E}{E_0}\right]\left[\ln\frac{2E_0E}{k}-\frac{1}{2}\right], \tag{III-4}$$

which is similar in form to the electron-nuclear
cross-section Formula 3BN(b).
Fig. 24. Screening factors,$^{41}$ $\psi_1$ and $\psi_2$, for
electron-electron bremsstrahlung. The curve marked "Hydrogen
atom" was calculated$^{41}$ with exact wave functions. For free
electrons, $\psi_1=\psi_2=\psi$.

The total radiation cross section obtained from Formula
(III-4) is given as

$$\phi_{rad} = \frac{4r_0^2Z}{137}\left[\ln(2E_0)-\frac{1}{3}\right]. \tag{III-5}$$
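The step from Formula (III-4) to Formula (III-5) can be checked numerically. The sketch below assumes the definition $\phi_{rad}=(1/E_0)\int k\,d\sigma_k$ used elsewhere in the paper, works in units of $4r_0^2Z/137$, substitutes $x=E/E_0$, and takes the integration range as $0<x<1$, which is the extreme-relativistic idealization. The function names and the midpoint rule are illustrative choices, not part of the original.

```python
import math


def integrand(x: float, E0: float) -> float:
    """Integrand of (1/E0) * integral of k dsigma_k, in units of 4 r0^2 Z / 137.

    x = E/E0 is the fractional energy kept by the electron, so k = E0 (1 - x).
    """
    shape = 1.0 + x * x - (2.0 / 3.0) * x
    return shape * (math.log(2.0 * E0 * x / (1.0 - x)) - 0.5)


def phi_rad_numeric(E0: float, n: int = 100_000) -> float:
    """Midpoint-rule integral over 0 < x < 1 (end-point singularities are integrable)."""
    h = 1.0 / n
    return h * sum(integrand((i + 0.5) * h, E0) for i in range(n))


def phi_rad_closed(E0: float) -> float:
    """Bracket of Formula (III-5): ln(2 E0) - 1/3."""
    return math.log(2.0 * E0) - 1.0 / 3.0


if __name__ == "__main__":
    for E0 in (20.0, 100.0, 1000.0):
        print(E0, phi_rad_numeric(E0), phi_rad_closed(E0))
```

The two columns should agree to within the accuracy of the quadrature, which confirms that the constant in (III-5) follows from the bracket $[\ln(2E_0E/k)-\tfrac12]$ in (III-4).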


C. Cross-Section Formulas with 
Binding Corrections 


The influence of atomic binding on the electron-electron
bremsstrahlung cross section has been calculated only in the
extreme-relativistic approximation. With the Thomas-Fermi
model, the corrected formula for the cross-section differential
in photon energy is

where $\epsilon$ is equal to $100k\,(4E_0EZ^{2/3})^{-1}$, and
$\psi_1$ and $\psi_2$ are given$^{41}$ by the data in Fig. 24. For
complete screening where $\epsilon\approx 0$, the cross section
becomes

$$d\sigma_k = \frac{4r_0^2Z}{137}\,\frac{dk}{k}\left[1+\left(\frac{E}{E_0}\right)^2-\frac{2}{3}\frac{E}{E_0}\right]\ln\frac{530}{Z^{2/3}}. \tag{III-7}$$

The total radiation cross section which is obtained
for the complete screening case from (III-7) is given by

$$\phi_{rad}' = \frac{4r_0^2Z}{137}\,\ln\frac{530}{Z^{2/3}}. \tag{III-8}$$

A comparison of this Formula (III-8) with the electron-nuclear
bremsstrahlung cross-section Formula 4BS shows that the Z
electrons in an atom increase the electron-nuclear cross
section by the factor $\eta$ so that the total cross section
becomes

$$\phi_{rad}^{\mathrm{total}} = Z(Z+\eta)\left(\phi_{rad}^{4BS}/Z^2\right). \tag{III-9}$$

Fig. 25. Dependence of the radiation probability correction
factor, $K\,(=\phi_{rad}/\phi_{rad}^*)$, on the initial electron
kinetic energy and the target atomic number.

For complete screening, $\eta$ is given by

$$\eta = \ln\frac{530}{Z^{2/3}}\bigg/\left(\ln\frac{183}{Z^{1/3}}+\frac{1}{18}\right), \tag{III-10}$$

which varies from 1.04 for magnesium to 0.88 for lead.
For most cases, a value of $\eta$ equal to unity is sufficiently
accurate.
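The quoted range of $\eta$ can be reproduced from Formula (III-10). The following is a minimal sketch; the function names are illustrative, and the atomic numbers 12 (magnesium) and 82 (lead) are standard values supplied here rather than taken from the text.

```python
import math


def eta(Z: float) -> float:
    """Complete-screening factor of Formula (III-10)."""
    numerator = math.log(530.0 / Z ** (2.0 / 3.0))
    denominator = math.log(183.0 / Z ** (1.0 / 3.0)) + 1.0 / 18.0
    return numerator / denominator


def electron_enhancement(Z: float) -> float:
    """Ratio Z (Z + eta) / Z^2 implied by Formula (III-9)."""
    return (Z + eta(Z)) / Z


if __name__ == "__main__":
    for name, Z in (("magnesium", 12), ("lead", 82)):
        print(name, round(eta(Z), 3), round(electron_enhancement(Z), 4))
```

Because $\eta$ stays close to unity over this range, the ratio $(Z+\eta)/Z$ differs little from the simple $Z(Z+1)$ replacement mentioned at the start of this section.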


IV. THICK-TARGET BREMSSTRAHLUNG 
PRODUCTION 


Bremsstrahlung is produced in thick targets for most
practical cases. In this discussion, a target is defined to
be thick if the scattering and energy loss processes that
occur as the electrons traverse the target have an
appreciable influence on the bremsstrahlung production.
In principle, a complete description of the bremsstrahlung
emitted from a given target can be obtained from the cross
sections for the pertinent elementary processes. For example,
the angular distribution of the total bremsstrahlung power,
the shape of the bremsstrahlung spectrum from an x-ray tube,
or the efficiency of bremsstrahlung production can be
calculated if detailed data are available with regard to the
bremsstrahlung and electron scattering (elastic and inelastic)
processes. However, any such analysis is necessarily a
complicated procedure, since the calculation for the
energy loss and scattering of the primary electrons and
the absorption of the x-rays in the target must be included
with the cross-section information of Sec. II. Also, the
analysis depends on the characteristics of a given
experimental situation. For these reasons, this paper does not
give a complete, systematic treatment of thick-target
bremsstrahlung production; instead, it is confined to the
presentation of pertinent experimental data as well as useful
analytical results and procedures. Also, emphasis is placed on
thick-target results that give absolute data on photon
intensities and bremsstrahlung production efficiencies.

41 J. A. Wheeler and W. E. Lamb, Phys. Rev. 55, 858 (1939),
and 101, 1836 (1956).

Some of the analytical results for thick-target
bremsstrahlung are most conveniently expressed in terms of
certain quantities which are defined in the following
discussion. When an electron traverses a target, the
average energy lost in the path length element dx by
radiation can be written as

$$-dE_0 = NE_0\,(K\phi_{rad}^*)\,dx = KE_0\,dt, \tag{IV-1}$$

where $N$ is the number of target atoms per cm$^3$ and
$K\phi_{rad}^*$ is equal to the cross section $\phi_{rad}$ defined
in Sec. IIC. $\phi_{rad}^*$ is equal to
$(4Z^2r_0^2/137)\ln(183Z^{-1/3})$ cm$^2$, which is approximately
the same as the expression for $\phi_{rad}$ at
extreme-relativistic energies (see Formula 4BS). $K$ is defined
as the radiation probability correction factor and is plotted
in Fig. 25 for various values of the target atomic number and
the electron kinetic energy. The length $t$ is given in units
of the radiation length, $t_0$, which is defined as

$$t_0 = 1/(N\phi_{rad}^*)\ \text{cm}. \tag{IV-2}$$

Values for $t_0$ in units of g/cm$^2$ as a function of the target
atomic number are plotted in Fig. 26.
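Formula (IV-2) can be evaluated for a given element once the number of atoms per gram is known. The sketch below is illustrative only: the classical electron radius, Avogadro's number, and the atomic weights are standard values assumed here, not quantities quoted in the text, and the function names are hypothetical. Expressing $N$ per gram instead of per cm$^3$ gives $t_0$ directly in g/cm$^2$.

```python
import math

R0_CM = 2.818e-13     # classical electron radius in cm (assumed standard value)
AVOGADRO = 6.022e23   # atoms per mole (assumed standard value)


def phi_rad_star(Z: float) -> float:
    """phi_rad* = (4 Z^2 r0^2 / 137) ln(183 Z^(-1/3)), in cm^2."""
    return 4.0 * Z ** 2 * R0_CM ** 2 / 137.0 * math.log(183.0 * Z ** (-1.0 / 3.0))


def radiation_length_g_cm2(Z: float, A: float) -> float:
    """Radiation length t0 of Formula (IV-2) in g/cm^2.

    A is the atomic weight in g/mole, so AVOGADRO / A is atoms per gram.
    """
    atoms_per_gram = AVOGADRO / A
    return 1.0 / (atoms_per_gram * phi_rad_star(Z))


if __name__ == "__main__":
    for name, Z, A in (("aluminum", 13, 26.98), ("lead", 82, 207.2)):
        print(name, round(radiation_length_g_cm2(Z, A), 2), "g/cm^2")
```

These values contain only the nuclear $Z^2$ term. Including the atomic electrons through the factor $Z(Z+\eta)$ of Sec. III would shorten the radiation length somewhat, most noticeably for light elements.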


A. Thick-Target Bremsstrahlung 
Angular Distributions 


(1) Nonrelativistic and Intermediate Energies 


For electron energies that are small or comparable
to the electron rest energy, no analytical or empirical
formulas have been derived for estimating the
bremsstrahlung angular distribution from thick targets, and
only a few experimental results are available.

The results that are presented for the nonrelativistic and
intermediate energy region where $T_0 < 1$

Fig. 26. Radiation lengths in grams per square centimeter
for various materials.

In contrast to the extreme-relativistic region, the
radiation intensity produced at these low energies is
important at large angles, and is about the same order
of magnitude at both zero and ninety degrees. However,
because the absorption of the bremsstrahlung photons
in the target is large, the angular distribution of the
bremsstrahlung is largely dependent on the target
geometry in specific experimental situations. One of the
few examples in which angular distribution data are
presented in a more general way with corrections for
the geometry and the target absorption is to be found
in the measurements of Buechner, Van de Graaff,
Burrill, and Sperduto for initial electron kinetic
energies in the region from 1.25 to 2.35 Mev. Their
results for beryllium and gold targets are given in Fig.
27. The curves show the angular dependence of the
radiation intensity integrated over photon energy for
specified electron energies. These data indicate that the
intensity ratio at zero and ninety degrees is approximately
10 for beryllium and 3 for gold at 1.5 Mev, and
approximately 40 for beryllium and 4 for gold at 2.35
Mev. Also from these data, we can obtain the following
empirical expressions for the power radiated at zero
degrees:


$$I_{(\alpha=0)}(\mathrm{Au}) = 9.4\,(T_0)^{3}\ \text{roentgens per minute per ma at 1 meter for gold,}$$

$$I_{(\alpha=0)}(\mathrm{Be}) = 0.92\,(T_0)^{3.4}\ \text{roentgens per minute per ma at 1 meter for beryllium,} \tag{IV-3}$$

where $T_0$ is the electron kinetic energy in $m_0c^2$ units for
the electrons incident on the target, and $\alpha$ is the angle
between the photon direction and the direction of the
incident electron beam.

Buechner, Van de Graaff, Burrill, and Sperduto, Phys. Rev.
74, 1348 (1948).

Then with the approximate conversion factor of one roentgen
equal to 3000±500 ergs/cm$^2$ for photons with energies in the
range from 0.1 to 2 Mev,$^{44}$ we have

$$I_{(\alpha=0)}(\mathrm{Au}) = 0.5\,(T_0)^{3}\ \text{watts/ma-steradian,}$$

$$I_{(\alpha=0)}(\mathrm{Be}) = 0.05\,(T_0)^{3.4}\ \text{watts/ma-steradian.} \tag{IV-4}$$

From these equations, the fraction, $R$, of the total incident
electron kinetic energy that is radiated per steradian at zero
degrees is

$$R_{(\alpha=0)}(\mathrm{Au}) = 10^{-3}\,(T_0)^{2}\ \text{for gold,}$$

and

$$R_{(\alpha=0)}(\mathrm{Be}) = 10^{-4}\,(T_0)^{2.4}\ \text{for beryllium.} \tag{IV-5}$$
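The numerical coefficients in Formulas (IV-4) and (IV-5) follow from those in (IV-3) by unit conversion alone. The sketch below retraces that arithmetic under stated assumptions: the conversion factor of 3000 ergs/cm$^2$ per roentgen is the one quoted in the text, while the electron rest energy of 0.511 Mev and all function names are supplied here for illustration.

```python
ERG_PER_CM2_PER_ROENTGEN = 3000.0   # approximate factor quoted in the text
ELECTRON_REST_ENERGY_EV = 0.511e6   # assumed value of m0 c^2 in electron volts


def roentgen_rate_to_watts_per_sr(r_per_min_at_1m: float) -> float:
    """Convert roentgens/minute at 1 meter into watts per steradian."""
    erg_per_cm2_s = r_per_min_at_1m * ERG_PER_CM2_PER_ROENTGEN / 60.0
    erg_per_s_sr = erg_per_cm2_s * 100.0 ** 2   # 1 sr at 1 m covers 1e4 cm^2
    return erg_per_s_sr * 1.0e-7                # erg/s to watts


def beam_power_watts_per_ma(T0: float) -> float:
    """Power carried by 1 ma of electrons with kinetic energy T0 (m0 c^2 units)."""
    return T0 * ELECTRON_REST_ENERGY_EV * 1.0e-3   # volts times amperes


if __name__ == "__main__":
    for name, coeff in (("gold", 9.4), ("beryllium", 0.92)):
        watts = roentgen_rate_to_watts_per_sr(coeff)
        fraction = watts / beam_power_watts_per_ma(1.0)
        print(name, round(watts, 3), f"{fraction:.1e}")
```

The coefficient 9.4 becomes about 0.47 watt/ma-steradian and 0.92 becomes about 0.046, which round to the values in (IV-4). Dividing by the beam power of about $511\,T_0$ watts per ma lowers the power of $T_0$ by one and gives coefficients near $10^{-3}$ and $10^{-4}$, as in (IV-5).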


(2) Relativistic Energies 


At high energies, estimates of the bremsstrahlung
angular distribution from thick targets have been made
on the basis of the following simplifying approximations.
First, the thin-target spectrum integrated over
photon angle (Formula 3BS) is assumed to represent
the spectrum shape for any angle. Second, the intrinsic
(thin target) angular spread of the bremsstrahlung
(Formula 2BS) is neglected at large angles where
$\alpha > \mathcal{E}_0^{-1}$ but not at small angles where
$\alpha < \mathcal{E}_0^{-1}$; therefore, at large angles the photon
is assumed to have the same direction as the electron that is
multiply scattered before it radiates.

Fig. 27. Angular dependence of the thick-target bremsstrahlung
intensity integrated over photon energy for 1.25- to 2.35-Mev
electrons. These results were obtained by Buechner, Van de
Graaff, Burrill, and Sperduto and include corrections for the
target. [Panels: beryllium, Z=4, and gold, Z=79. Ordinate:
total intensity in roentgens per minute per milliampere at
1 meter.]

44 W. V. Mayneord, Brit. J. Radiol. Suppl. No. 2, 136 (1950).
For photons outside this energy range, the conversion factor has
a significant energy dependence and the factor must be weighted
by the bremsstrahlung spectrum shape.

With these approximations, the following analytical
results have been obtained. For large angles where
$\alpha > \mathcal{E}_0^{-1}$, the fraction, $R$, of the total incident
electron kinetic energy that is radiated per steradian at the
angle, $\alpha$, is given$^{45}$ as

$$R_{\alpha} = \frac{K\mathcal{E}_0^2}{1760\pi}\left[-\mathrm{Ei}\left(-\frac{\mathcal{E}_0^2\alpha^2}{1760\,t}\right)\right], \tag{IV-6}$$

where $\mathcal{E}_0$ is the total energy in $m_0c^2$ units of the
electron incident on the target, and Ei is the exponential
integral$^{46}$

$$-\mathrm{Ei}(-y) = \int_y^{\infty}\frac{e^{-z}}{z}\,dz > 0 \quad \text{for } y>0. \tag{IV-7}$$


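For small angles where $\alpha < \mathcal{E}_0^{-1}$, Muirhead,
Spicer, and Lichtblau$^{47}$ have obtained the following
expression for the bremsstrahlung angular distribution

$$R_{\alpha} = \frac{K\mathcal{E}_0^2}{1760\pi}\left\{-\mathrm{Ei}\left(-\frac{\mathcal{E}_0^2\alpha^2}{1760\,t}\right)+\mathrm{Ei}\left(-\frac{\mathcal{E}_0^2\alpha^2}{7.15}\right)\right\}. \tag{IV-8}$$

This formula gives good agreement with experimental
data$^{47}$ and can readily be evaluated at small angles by
keeping the first term in the expansion of the exponential
integrals which is

$$\mathrm{Ei}(-z_1)-\mathrm{Ei}(-z_2) \rightarrow \ln(z_1/z_2) \quad \text{for } z_1, z_2 \rightarrow 0. \tag{IV-9}$$

Thus

$$R_{(\alpha=0)} = \frac{K\mathcal{E}_0^2}{1760\pi}\,\ln 246t, \quad \text{for } t > 2\times10^{-?}. \tag{IV-10}$$

For $t=0.1$ and $\mathcal{E}_0=3$, this formula agrees reasonably
well with the result predicted by the low-energy Formula
(IV-5). For thin targets, this "on-axis" intensity
becomes$^{45}$

$$R_{(\alpha=0)} = Kt\mathcal{E}_0^2/4\pi, \quad \text{for } t \leq 2\times10^{-?}. \tag{IV-11}$$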
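The passage from Formula (IV-8) to the logarithm in Formula (IV-10) can be verified numerically. The sketch below is illustrative: it evaluates $-\mathrm{Ei}(-x)$ with the standard power series $-\gamma-\ln x-\sum_{n\ge1}(-x)^n/(n\,n!)$, which is adequate only for small arguments, and the function names, the argument limit, and the sample values of $\alpha$ are assumptions made for the example.

```python
import math

EULER_GAMMA = 0.5772156649015329


def minus_ei_minus(x: float, terms: int = 60) -> float:
    """-Ei(-x) = integral from x to infinity of exp(-z)/z dz, by power series.

    Intended for 0 < x <= 5 only; larger arguments lose accuracy.
    """
    if not 0.0 < x <= 5.0:
        raise ValueError("series is only used here for 0 < x <= 5")
    total = -EULER_GAMMA - math.log(x)
    term = 1.0
    for n in range(1, terms + 1):
        term *= -x / n          # term is now (-x)^n / n!
        total -= term / n
    return total


def bracket_iv8(E0: float, alpha: float, t: float) -> float:
    """Curly bracket of Formula (IV-8): -Ei(-E0^2 a^2 / 1760 t) + Ei(-E0^2 a^2 / 7.15)."""
    z = (E0 * alpha) ** 2
    return minus_ei_minus(z / (1760.0 * t)) - minus_ei_minus(z / 7.15)


if __name__ == "__main__":
    E0, t = 3.0, 0.1
    for alpha in (1e-1, 1e-2, 1e-3):
        print(alpha, bracket_iv8(E0, alpha, t), math.log(1760.0 * t / 7.15))
```

As $\alpha$ decreases, the bracket approaches $\ln(1760t/7.15)\approx\ln 246t$, which is the logarithm appearing in (IV-10).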


Estimates of the ratio $R_{\alpha}/R_{(\alpha=0)}$ for tungsten
(Z=74) are given in Fig. 28 for three target thicknesses.
Several conclusions for high-energy angular distributions
can be drawn from the form of the Formulas
(IV-8), (IV-10), and (IV-11). The logarithmic form of
Formula (IV-10) shows that most of the radiation comes
from the front part of the target. Also, since the fractional
energy radiated depends on $\mathcal{E}_0^2$, the total energy
radiated at zero degrees will depend on $\mathcal{E}_0^3$. Two
additional effects influence the dependence on $\mathcal{E}_0$ of
the total radiated energy. The factor $K$, according to Fig.
25, increases slightly with $\mathcal{E}_0$, and for very thick
targets the effective $t$ in Formula (IV-10) will increase
logarithmically with $\mathcal{E}_0$. Therefore, the total energy
radiated on the axis of the bremsstrahlung beam will depend on
at least a 3.2 exponent for a thin target and on a
slightly higher exponent for a thick target. The specific
exponent to be used will depend on the energy range
of interest, the effective target thickness, and the
experimental geometry.

45 J. D. Lawson, Nucleonics 10, No. 11, 61 (1952).

46 Exponential integral functions are tabulated in National
Bureau of Standards Tables of Sine, Cosine, and Exponential
Integrals (U. S. Government Printing Office, Washington, D. C.),
Vols. 1 and 2 (MT5 and MT6).

47 Muirhead, Spicer, and Lichtblau, Proc. Phys. Soc. (London)
A65, 59 (1952).

Fig. 28. Theoretical bremsstrahlung angular distributions from
thick tungsten targets for relativistic energies. These data are
obtained from the National Bureau of Standards Handbook 55.
$R_{\alpha}$ is defined as the fraction of the total incident
electron kinetic energy that is radiated per steradian at the
angle $\alpha$.


B. Thick-Target Bremsstrahlung Spectra 
(1) Nonrelativistic and Intermediate Energies® 


In this low-energy region the radiation has a broad
angular distribution (see IVA), and the dependence of
the spectrum shape on photon angle is important.$^{48}$ No
general analytical expressions which accurately predict
the spectrum as a function of angle for any experimental
situation are available at these energies. Part of the
difficulty has been the inadequacy of the Born-approximation
cross-section differential in photon energy and
angle (Formula 2BN). Nevertheless it has been possible
to obtain reasonable agreement between theoretical and
experimental thick-target spectrum shapes shown in
Fig. 29 for a particular application with an initial
electron kinetic energy of 1.4 Mev, photon angles of
zero and ninety degrees, and a tungsten target. In this
example, the experimental results confirm the theoretical
dependence of the spectrum shape on photon
angle after distortions due to photon absorption in the
target and surrounding materials are eliminated. The
results also show that the relative number of photons
in the high-frequency region increases as the emission
angle becomes smaller. This trend is just opposite to
the behavior observed for thin-target spectra.

48 One of the earliest experimental indications of this
dependence was found by C. E. Wagner, Physik. Z. 21, 621 (1920).

49 Miller, Motz, and Cialella, Phys. Rev. 96, 1344 (1954).

Fig. 29. Relative spectral intensities at 0° and 90° for 1.4-Mev
electrons incident on a thick-tungsten target. The solid curves
are obtained from theoretical estimates that include electron
scattering effects and photon absorption in the materials
surrounding the target. The experimental values have been
normalized and are shown by the open (zero degrees) and closed
(90 degrees) circles. To obtain absolute spectral intensities in
Mev per steradian per Mev per incident electron, the ordinate
should be multiplied by $10^{-?}$ for the theoretical curves and by
$2.1\times10^{-3}$ for the experimental points. [Ordinate: relative
intensity.]


With regard to estimates of the shape of the spectrum
integrated over the photon direction, Kramers$^{50}$
obtained the following simple, analytical expression:

$$I_k = A\,(k_0-k), \tag{IV-12}$$

where $I_k$ is the energy radiated in all directions in the
energy interval, $(k, k+dk)$, $A$ is a proportionality
constant, and $k_0$ is the photon energy at the high-frequency
limit. This result was derived on the basis of a
nonrelativistic, semiclassical calculation, in which electron
scattering effects (including backscattering) were neglected
and only the electron energy loss was considered.
In spite of these limitations and because of its simplicity,
the Kramers Formula (IV-12) has been used extensively
to estimate the thick-target spectrum (not including the
characteristic radiation$^{51}$) at a given angle for various
experimental cases, with corrections included for the
photon absorption in the target and surrounding materials.
Results obtained for various electron energies
in this low-energy region have shown general qualitative
agreement between the theoretical (Kramers) and
experimental spectrum shapes,$^{52}$ and indicate that
Formula (IV-12) is satisfactory, at least for order of
magnitude estimates.
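A linear spectrum of the Kramers type has simple integral properties that are convenient for order-of-magnitude work. The sketch below assumes the form $I_k=A(k_0-k)$ with $A$ independent of $k$; the function names are illustrative. Integrating gives a total radiated energy of $Ak_0^2/2$, so the fraction of the energy carried by photons above a threshold $k$ is $(1-k/k_0)^2$, independent of $A$.

```python
def kramers_intensity(k: float, k0: float, A: float = 1.0) -> float:
    """Energy radiated per unit photon energy, I_k = A (k0 - k), for 0 <= k <= k0."""
    if not 0.0 <= k <= k0:
        return 0.0
    return A * (k0 - k)


def total_energy(k0: float, A: float = 1.0) -> float:
    """Integral of I_k from 0 to k0."""
    return 0.5 * A * k0 ** 2


def fraction_above(k: float, k0: float) -> float:
    """Fraction of the radiated energy carried by photons with energy above k."""
    if k <= 0.0:
        return 1.0
    if k >= k0:
        return 0.0
    return (1.0 - k / k0) ** 2


if __name__ == "__main__":
    k0 = 1.0
    for k in (0.25, 0.5, 0.75):
        print(k, fraction_above(k, k0))
```

For example, only one quarter of the radiated energy lies in the upper half of the spectrum, which shows how strongly the thick-target spectrum is weighted toward low photon energies before any absorption correction is applied.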


50 H. A. Kramers, Phil. Mag. 46, 836 (1923).

51 For estimates of the thick-target characteristic radiation
intensities that are superimposed on the continuous spectrum, see
A. H. Compton and S. K. Allison, X-Rays in Theory and Practice
(D. Van Nostrand Company, Inc., Princeton, New Jersey, 1949).

52 A summary of such comparisons is given in Natl. Bur.
Standards (U.S.) Handbook 62, 20-24 (1957).


(2) Relativistic Energies” 


Two complementary procedures for calculating thick-target
spectra at high energies which include effects of
electron scattering in the target are given by Penfold$^{53}$
and Hisdal.$^{54}$ The Penfold calculations estimate the
thick-target effects primarily in the high-frequency
region and give the spectrum integrated over photon
directions up to a maximum angle, $\Gamma$, with respect to
the direction of the incident electron beam. The Hisdal
calculations estimate the thick-target effects on the
over-all spectrum shape in the forward direction and
should not be applied to the high-frequency region.

The Penfold method$^{53,55}$ assumes that (a) the Schiff
Formula 3BS(e) describes the intrinsic spectrum at all
angles; (b) the electron energy loss rather than electron
scattering in the target produces the predominant
effect on the shape of the spectrum for large values of
$\Gamma$; (c) no electron radiates more than one photon; and
(d) the photon absorption in the target is negligible.
With these approximations, Penfold obtained the following
formula for the thick-target spectrum integrated
over photon direction to a maximum angle $\Gamma$ determined
by the detector:


$$P_k = nN_0\int_{k+1}^{\mathcal{E}_0} S(\mathcal{E}_0, E_0, \Gamma, x_0)\,d\sigma_k\,dE_0, \tag{IV-13}$$

where $P_k$ is the number of photons in the energy interval
$k$ to $k+dk$, $N_0$ is the number of target atoms per cm$^3$,
$n$ is the number of electrons incident on the target,
$\mathcal{E}_0$ is the total energy of the incident electron in
$m_0c^2$ units, $x_0$ is the target thickness in g/cm$^2$, and
$d\sigma_k$ is given by Formula 3BS(e) for electrons with energy
$E_0$. The function $S$ represents the probability that
radiation produced by the electrons reaches the detector, and
can be written as

$$S(\mathcal{E}_0, E_0, \Gamma, x_0) = \int_0^{x_0} B_3(\mathcal{E}_0, E_0, x_0, x)\,B_4(\Gamma, E_0, x)\,B_5(x_0, x)\,dx, \tag{IV-14}$$

where the function $B_3$ gives the fraction of the radiation
emitted by electrons with energy $E_0$ at the target depth
$x$, $B_4$ is the fraction of electrons that penetrate beyond
the thickness $x$, and $B_5$ accounts for path length
straggling. These $B$ functions require involved numerical
evaluations, and the results are described in
detail in the Penfold report. Motz, Miller, and
Wyckoff$^{56}$ have estimated the thick-target spectrum for
a particular experimental situation in which the
bremsstrahlung is produced with an 11.3 Mev (kinetic energy)
electron beam incident on a tungsten target (approximately
0.010-inch thick), and is measured on the beam
axis with a small detector ($\Gamma\approx 0$). They used the
following simplified, analytical form for the thick-target
generating function:

$$S = 1-\exp\left[-\frac{1}{51(\mathcal{E}_0-E_0)}\left(\frac{\mathcal{E}_0}{E_0}\right)^2\right], \tag{IV-15}$$

and, as shown in Fig. 30, have obtained good agreement
with experimental results. For the more general calculations,
Penfold used Formulas (IV-13) and (IV-14) to
estimate the thick-target spectrum shapes for an incident
electron kinetic energy of 15 Mev, a 0.020-inch
platinum target, and for two detectors which subtend
different angles on the electron beam axis ($\Gamma=10$
degrees, $\Gamma\gg 10$ degrees). A comparison is made in Fig. 31
of these two Penfold results (curves C and D) with the
spectrum shape predicted by Formula 3BS(e) (curve A)
and with the shape resulting from the application of the
$S$ function in Formula (IV-15) (curve B). The curves
show that Formulas 3BS(e) and (IV-15) give a greater
number of photons in the high-frequency region relative
to the total number in the spectrum compared with the
more accurate spectral shape predicted by the Penfold
procedure. For certain cases, the spectrum shape predicted
by the simplified Formula (IV-15) may be
sufficiently accurate.

53 A. Penfold, University of Illinois Report (unpublished).

54 E. Hisdal, Phys. Rev. 105, 1821 (1957); E. Hisdal, Arch. for
Math. Naturvidenskab 54, No. 3, 1 (1957).

55 A similar but less general method has been used by R. Wilson,
Proc. Phys. Soc. (London) A66, 638 (1953). Wilson's calculations
did not include electron scattering effects in the target and his
results give a spectrum shape averaged over the photon directions.

56 Motz, Miller, and Wyckoff, Phys. Rev. 89, 968 (1953).


Fig. 30. Bremsstrahlung intensity spectrum in the forward
direction for 11-Mev electrons incident on a thick-tungsten
target.$^{56}$ The thin-target Born spectrum, modified by the
photon absorption in the materials surrounding the target, is
shown by the solid curve. The dashed curves show the spectra
expected for a 10-mil and a 20-mil target, and the experimental
values are given by the open circles. [Ordinate: intensity
(ergs cm$^{-2}$ sec$^{-1}$ Mev$^{-1}$); abscissa: photon energy, Mev.]

Fig. 31. Comparison of the spectrum shapes predicted by the
thick target Penfold calculations$^{53}$ and by Schiff's thin target
Formula 3BS(e) for 15-Mev electrons. The Schiff curve A shows
the spectrum integrated over photon angle. Curve B is the
spectrum shape predicted by Formula (IV-13) with the simplified
$S$ function given by Formula (IV-15). Curve C is the spectrum
shape predicted by the Penfold calculations [Formulas (IV-13)
and (IV-14)] for a 0.020-in. platinum target and a detector which
subtends an angle, $\Gamma$, of 10 degrees. Curve D gives the
Penfold spectrum for $\Gamma\gg 10$ degrees.

The Hisdal method$^{54,57}$ assumes that (a) the spectrum
variation with angle as given by the Schiff Formula 
2BS has the dominant effect on the thick-target spec- 
trum shape; (b) the energy loss of electrons in the 
target is negligible; (c) no electron radiates more than 
one photon; and (d) the photon absorption in the 
target is negligible. With these approximations, Hisdal 
has calculated tables for estimating the spectrum shape 
to be expected in a small detector placed on the electron 
beam-target axis. Hisdal’s results are given in terms 
of a correction factor which multiplies Formula 3BS (e) 
for a given value of k to obtain the corrected spec- 
trum for a particular target thickness. Examples of 
spectra calculated by Hisdal’s method are given in 
Figs. 32(a)-(e) for 10-, 20-, 40-, 90-, and 300-Mev 
electrons, and are compared with the Schiff spectra 
integrated over photon direction [Formula 3BS(e)]. 
When the detector subtends a large solid angle at the 
target, the measured spectrum shape is given by the 
cross section integrated over the photon direction. 
Figures 7 and 8 give data for the spectrum shape in- 
cluded within a given detector angle. If this shape for 
a given experimental arrangement is estimated to be 
similar to the zero degree spectrum, then the Hisdal 
correction will be important; if this shape is estimated 
to be more similar to the spectrum integrated over all 
angles, then Hisdal’s correction will be unimportant. 


C. Efficiency for Bremsstrahlung Production 
The bremsstrahlung production efficiency for a given
electron kinetic energy and target material is defined
as the ratio of the total bremsstrahlung power radiated
when an electron current is incident on a target to the
total power in the incident electron beam. The results
of various theoretical and experimental determinations
of the efficiency are given in the following.

57 Similar calculations that apply only to a specific target
material and electron energy have been made by A. Sirlin, Phys.
Rev. 106, 637 (1957). While the Hisdal calculations include only
the Schiff complete screening approximations, Sirlin examined the
effects of both the intermediate and complete screening
approximations on the spectrum shape.

Fig. 32. Comparison of bremsstrahlung spectrum shapes
predicted by the thick-target calculations of Hisdal$^{54}$ and
by the thin-target calculations of Schiff for (a) 10-Mev
electrons, (b) 20-Mev electrons, (c) 40-Mev electrons,
(d) 90-Mev electrons, and (e) 300-Mev electrons. The
Schiff spectrum is integrated over the photon direction,
Formula 3BS(e), and the Hisdal curve gives the spectrum
in the forward direction, $\Gamma=0$, after corrections have been
made for multiple scattering in the target (see Sec. IVB).
The values of the intensity (defined as proportional to the
product of the photon energy and number per unit time)
are normalized to unity for zero photon energies. [Panel labels:
$0.51T_0$ = 10, 20, 90, and 300 Mev; Z=78 (thickness = 0.0005
inches). Ordinate: relative intensity.]


(1) Nonrelativistic and Intermediate Energies


The efficiency results for this energy region apply 
only to the cases in which the electrons expend all of 
their kinetic energy in the target. Experimental deter- 
minations are complicated by (a) the large photon 
absorption in the target and (b) the large electron back- 
scattering from the target. In the available measure- 
ments of the efficiency, corrections have been made for 
effect (a) but not for effect (b). Therefore, these experi- 
mental efficiencies are less than the values that would 
be obtained if all of the electrons were completely 
stopped in the target. 

In this low-energy region, most experimental and
theoretical results$^{58}$ are in agreement within a factor
of two with an efficiency, $\epsilon$, given by the following
formula:

$$\epsilon = 5\times10^{-4}\,ZT_0. \tag{IV-16}$$


(2) Relativistic Energies 


(a) Intermediate thickness targets—A target is de- 
fined to have an intermediate thickness if the incident 
electrons do not expend all of their energy as they 
traverse the target. This condition usually exists in 
high-energy electron accelerators. 

The efficiency of bremsstrahlung production for 
targets having an intermediate or small (fo) thickness 
can be estimated from the expression 


[MOEA ual Re (IV-17) 


where $(d\mathcal{E}_0)_R$ is the energy loss by radiation, $t$ is
the target thickness in units of the radiation length $t_0$
[Formula (IV-2)] and $K$ is the radiation probability
correction factor given in Fig. 25.

(b) Thick targets. For thick targets, the incident
electrons lose all of their energy in the target. Formula
(IV-16) obviously does not apply at high energies for
which the efficiency must remain less than one. An
approximate relation for the efficiency in this high
energy region has been derived$^{59}$ by assuming that the
total electron energy loss per unit path length can be
written as

$$-\frac{d\mathcal{E}_0}{dx} = \left(\frac{\rho Z}{A}\right)\{6+3.6\times10^{-3}Z\mathcal{E}_0\}, \tag{IV-18}$$

where $\rho$ is the density and $A$ the atomic weight of the
target material, and where the first term is the collision
loss and the second term is the radiation loss. Integration
from the initial energy $\mathcal{E}_0$ to 1 gives the following
distance, $x_0$, traveled by an electron in losing all of its
energy:

$$x_0 = \frac{A}{3.6\times10^{-3}\rho Z^2}\,\ln(1+6\times10^{-4}Z\mathcal{E}_0). \tag{IV-19}$$

Then the efficiency becomes

$$\epsilon = 1-\frac{\text{collision loss}}{\mathcal{E}_0} = 1-\frac{6\rho Z}{A}\,\frac{x_0}{\mathcal{E}_0} \approx \frac{3\times10^{-4}ZT_0}{1+3\times10^{-4}ZT_0}. \tag{IV-20}$$

This procedure does not account for the large fluctuations
in the radiation loss process and, therefore, provides
only a rough estimate of the efficiency. At low
energies, Formula (IV-20) reduces to $3\times10^{-4}ZT_0$, which
agrees roughly with Formula (IV-16). Some representative
values of the efficiency obtained from Formula
(IV-20) are given in Table VII. These values do not
include corrections for the x-ray absorption in the
target, which cannot be neglected for most situations.

TABLE VII. Approximate percentage efficiencies
for x-ray production.

$T_0$:       2      4      10     20     50     600
Carbon      0.36   0.72   1.77   3.47   8.3    52
Aluminum    0.77   1.54   3.75   7.2    16.3   70
Iron        1.54   3.0    7.2    13.5   28.1   82
Lead        4.7    9.0    19.7   33     59     94

58 These results are summarized by H. Kulenkampff, "Physics
of the Electron Shells," Fiat Rev. Ger. Sci. 1939-1946, 95; R. D.
Evans, The Atomic Nucleus (McGraw-Hill Book Company, Inc.,
New York, 1955), p. 616; and by S. T. Stephenson (reference 1).

59 H. W. Koch and J. W. Wyckoff, IRE Trans. on Nuclear Sci.
NS-5, No. 3 (1958).
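The entries of Table VII can be regenerated from the closed form on the right-hand side of Formula (IV-20), with Formula (IV-16) shown alongside for comparison at low energies. The sketch below is illustrative: the function names are hypothetical, the atomic numbers 6, 13, 26, and 82 are standard values supplied here, and $T_0$ is used exactly as it appears in the column headings of the table.

```python
def efficiency_low_energy(Z: float, T0: float) -> float:
    """Formula (IV-16): efficiency = 5e-4 Z T0 (low-energy region only)."""
    return 5.0e-4 * Z * T0


def efficiency_thick_target(Z: float, T0: float) -> float:
    """Closed form of Formula (IV-20): u / (1 + u) with u = 3e-4 Z T0."""
    u = 3.0e-4 * Z * T0
    return u / (1.0 + u)


if __name__ == "__main__":
    elements = (("Carbon", 6), ("Aluminum", 13), ("Iron", 26), ("Lead", 82))
    energies = (2, 4, 10, 20, 50, 600)
    for name, Z in elements:
        row = [round(100.0 * efficiency_thick_target(Z, T0), 2) for T0 in energies]
        print(name, row)
```

Evaluated this way, the closed form agrees with the tabulated percentages to within rounding, with one exception: for lead at $T_0=50$ the formula gives about 55 percent, while the table as printed here shows 59. The form $u/(1+u)$ also makes the limits explicit, reducing to $3\times10^{-4}ZT_0$ for small $u$ and staying below unity for large $u$.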
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1. INTRODUCTION ó 


CONSIDER an electron trap in a solid with two
states, u (upper) and g (ground). A fundamental
question is: how does an electron transition affect the
normal modes of the crystal? The answer
can be obtained either with the use of a phenomenological
macroscopic model or by the use of

The phenomenological theory, due to Fröhlich (F1,* 
F2, and H4; also Born and Huang, B2, p. 82), ignores 
details of the lattice structure. In a polar crystal 
Fröhlich assumes that the forces on the ions, as well 
as the polarization, depend exclusively on the macro- 
scopic electric field and relative ionic displacements 
at the point in question. This model leads to pure 
transverse and longitudinal modes. It is only true for 
the longest “optical” waves in which the positive and 
negative ions vibrate 180° out of phase. This theory 
does not account for the change in the character of the 
normal mode with decreasing wavelength. In NaCl the 
longitudinal optical branch in certain directions has a 
frequency drop of 57%, and the shortest waves are 
pure acoustical, in which the ions vibrate in phase 
(M3). Fröhlich’s model is useful in the theory of
reststrahlen where one deals with electromagnetic 
radiations whose wavelength is large compared to the 
inter-ionic distances, but this approach is questionable 
for phenomena which depend critically on short range 
effects. 

The configurational coordinate model assumes that 
a very small number of local modes surround a point 
imperfection. This approach is supported by recent 
calculations. (See M2 and references therein.) One 
would expect the local modes to have frequencies in the 
gap between the acoustical and optical branches. This 
suggests that their frequency spread may be small. We 
refer to this model by the letters LM. It is assumed 
that there are one or several “effective” frequencies 
associated with the local modes. The extreme situation 
where there is only one mode is referred to by the letters 
CC (configuration coordinate, D1). The LM models 
cannot give a completely correct picture since long 
range effects must exist in polar solids. The Fröhlich
model cannot either, since it ignores local lattice
distortions.

One of the most important effects of the electron- 
phonon interaction is the broadening of optical absorp- 
tion lines. Studies of these phenomena will give in- 
formation regarding the effects of electronic transitions 
on normal modes. 

In general, rigorous quantum mechanics has been 
used on this problem when the phenomenological model 
is employed. The original techniques required infinitely 
small interaction between the trapped electron and the 
modes. It, therefore, could not be used in the LM or 
CC models. 

While these developments were taking place, Williams
(W2), as well as Williams and Hebb (W3), developed
an independent treatment which was applied to the CC
model. This approach makes a series of approximations
and is sometimes referred to as “semiclassical.” A more
detailed justification of Williams’ method has been
made by Lax (L1), Klick (K2), and Dexter (D1). This
development shows that at very low temperatures the
shape of an absorption band should be Gaussian
(provided that the electronic transition does not induce
a frequency change in the CC mode). Further, the
position of maximum absorption is temperature
independent even if the frequency changes.


* Refer to items in the Bibliography at the end of the paper.

Since 1950 the rigorous quantum mechanical method 
has been fully developed, and one can now apply it to 
various LM models. Williams’ simpler approach is still 
very useful, since it gives a vivid picture of the absorp- 
tion process. However, it is beset by many approxi- 
mations, and the final results do not always agree with 
more rigorous calculations. 

To summarize the development so far, we have 
several models—phenomenological (Fröhlich), local 
modes, and configurational coordinate. We have two 
means of calculating shapes of bands—Williams’, and 
the rigorous quantum mechanical approach. For some 
models the second method can be carried through in 
detail. Most realistic ones, however, require Williams’ 
approach. A rather formal theory for all models has 
been worked out by Kubo and Toyozawa (K6). 

This paper attempts to synthesize the rigorous theory 
and to apply it to various models. We start by examin- 
ing the Hamiltonian to be used (Part I). Fortunately, 
it can be written in a form that applies to all the models. 
We next discuss the properties of these Hamiltonians 
and point out the essential difference between phonon- 
electron and photon-electron interactions. Only in 
special types of solid state problems may they be 
considered similar. At this stage, one can see why 
multiphonon processes occur. Some of the results of 
Part I are known, but their interrelations have not 
been fully understood. This has led to appreciable 
confusion. Some of the mathematical techniques are 
those used by Born and Huang in connection with 
another problem. 

After this development, the photon absorption 
problem is considered rigorously and in some detail 
(Part II). The Hamiltonian developed permits a 
unification of the field which is most desirable, since 
experimental results are being obtained. 

Applications of the theory are not considered because 
the author does not trust the presently available data. 
The absorption band associated with the F center in 
KCl has been measured a great many times. It seems 
to fulfill the requirement stated in the first paragraph. 
Various laboratories, however, do not report the same 
results. A typical set of half-width,† H, measurements
is shown in Table I.

At low temperatures H can be measured to about 


† Half-width is defined in Appendix I.
TABLE I. H of KCl.

θ (°K)      H (ev)   Observer
28          0.19     Mollwo (M10)
4           0.18     Russell and Klick (R1)
4           0.17     Burstein and Oberly (B5)
5           0.168    Compton and Klick (C3)
5           0.17     Duerig and Markham (D2)
5           0.163    Markham and Konitzer (M4)
85 (App)    0.26     Molnar (M11)
87          0.23     Mollwo (M10)
87          0.243    Kanzaki (K1)
77          0.3      van Doorn and Haven (D3)
77          0.22     Russell and Klick (R1)
77          0.20     Burstein and Oberly (B5)
78          0.20     Duerig and Markham (D2)
78          0.193    Konitzer and Markham (K4)
77          0.196    Compton (C2)


five parts per thousand so that the variations cannot be 
attributed to experimental error (M1). The author 
doubts that the scatter is related to the sample purity, 
and the temperature variation of H is too small to 
explain the discrepancies. Experiments have shown 
(K3) that, unless special care is taken, any values 
between 0.19 ev and 0.25 ev can be obtained for H 
(in KCl at 78°K). Improper quenching and very slight 
exposure to light produce new bands, and H then 
corresponds to a composite. 

Although not comparing the results with experiment, 
an attempt is made to present the argument in such a 
manner that a comparison with data is readily possible. 
Indeed, a comparison has been made with recent data 
on the F center in KCl. The agreement on the whole 
is satisfactory. It is hoped that in the next few years 
sufficient experimental data will become available to 
justify and amplify the theory presented here. 

The first theory of the shape of an absorption band 
in a solid is due to Smakula (S2), who assumed that a 
trapped electron behaves like a damped oscillator in a 
dielectric medium. Nothing in this theory explains the 
temperature dependence of the damping term. The 
first attempt to associate the width theoretically with 
the thermal vibration was done independently by 
Muto (M13), by Huang and Rhys (H5), and by Pekar 
(P1) (also P2, 3, and 4). Huang-Rhys and Pekar used 
Fröhlich’s model and assumed that the electron
interacts only an infinitesimally small amount with all 
the longitudinal optical modes. Thus they limited 
themselves to first order terms in the various expansions, 
as did Muto. 

Pekar obtained two expressions for the shape; it 
should be nonsymmetric at low temperatures but 
Gaussian at high temperatures. He employed the 
reststrahlen frequency combined with the Lyddane-
Sachs-Teller (LST) relation to determine the longi-
tudinal frequency (B2, pp. 82-128). He also made a
comparison of his theory with Mollwo’s data (M10). 

O’Rourke (O1) made a fundamental extension of the
previous calculations, eliminating the restriction that
the electron-phonon interaction must be infinitely small.
His most general expressions are limited by the assump- 
tion that electronic transitions do not cause frequency 
changes. To use his expressions one must assume that a 
single “effective” vibrational frequency exists. His 
results apply to the Fröhlich Hamiltonian, the CC 
model, as well as to an LM model, if a single effective 
frequency exists. Pekar’s expansions may therefore be 
combined with the CC or LM model to calculate the 
shape rigorously at very low temperatures. We do this 
and show that the shape is not the one predicted on the 
basis of the Williams’ approximation. Present low- 
temperature data indicate that the more rigorous 
treatment is correct. 
At about the same time, Lax (L2) developed some 
completely general relations between the moments of
the distribution and the temperature. Meyer (M8 and
9) extended the development of Lax, giving a general
approach to the problem. This method is not limited to
Fröhlich’s Hamiltonian which was employed by Meyer.
For the calculation to be valid, it is necessary to use an 
orthogonal set of eigenfunctions. We show here that 
this is true. Further, we indicate that Meyer omits 
some important terms. 
Meyer also made a serious attempt to analyze
Mollwo’s data by means of the new theories. His
results are most impressive and by themselves give a
good proof of the reliability of his calculation and the
validity of the LST relation. This is not confirmed by
more recent measurements (see Table I). The method
of analysis to be employed is of fundamental importance.
An alternate analysis of Mollwo’s data
indicates that the effective frequency is $3.2\times10^{12}$ sec$^{-1}$
for KCl. The LST relation gives $6.3\times10^{12}$ sec$^{-1}$. This
difference suggests that Fröhlich’s model does not
apply to the F center in KCl. Other analyses (R1, K4)
support this view. The means of analysis is the important
difference between Meyer’s conclusion and
those obtained recently.

FIG. 1. Schematic diagram showing the effects of trapping an electron at an imperfection.

This field has been reviewed by Pekar (P3, P4), Lax 
(L2), Meyer (M9), Klick and Schulman (K2a), and 
Dexter (D1). Pekar’s reviews, though out of date, are 
extremely important contributions to the field. Lax’s 
review concerns itself primarily, though not entirely, 
with Williams’ approximation. Meyer’s review is limited 
to Fröhlich’s model and omits some important de- 
velopments made by Pekar. Klick and Schulman con- 
sider only Williams’ approach. Dexter considered both 
aspects of the problem, applying the quantum me-
chanical treatment to Fröhlich’s model and Williams’
approximation to the CC model. 

This paper concentrates on the pure quantum 
mechanical approach and treats all models. Some 
aspects of the problem are considered in greater detail 
than has been done previously. These aspects are most 
important since they can be checked experimentally. 

The theory is not developed in full; instead, results 
are quoted from the recent book of Born and Huang
(B2), which is cited by the letter B. Those parts of the 
theory which have not been presented in book or 
review article form are more fully discussed. Although 
the theory applies to any imperfection with two or 
more bound states, the development is closely related 
to the F center, since this has been studied in greatest 
detail. 


Part I. The Hamiltonian 


2. PHYSICAL CONCEPTS INVOLVED IN ELECTRON- 
PHONON INTERACTIONS 


The trapping of an electron at a site has three effects. 
These can be intuitively understood by means of Fig. 1, 
which is a schematic representation of the nuclei 
(shielded by the inner electrons) for a one-dimensional
polar crystal. 

The nuclei of the solid are assumed to be initially 
distributed uniformly. When an electron is trapped 
nearby (to the left of the diagram) the following effects 
occur. 


(1) The electron induces a force on the nuclei,
causing a nonuniform shift of the rest positions, as
indicated by the arrows. This action is very much like
a compression (or expansion) of a spring and leads to 
the storing of energy. 

(2) The forces between the nuclei are slightly 
modified, leading to a shift in the angular frequencies 
of the lattice, $\Delta\omega$. At absolute zero the probable position
of an ion is given by a Gaussian curve if an Einsteinian 
model is used (no coupling between ions). The trapping 
causes a broadening or narrowing of the distribution. 
The Einsteinian solid is used here only to illustrate the 
effect. 

(3) Finally, there is a third effect similar to a photon- 
electron interaction—the electron’s self-energy. The 
electrons and nuclei readjust their probable distributions 
so that the total energy is at a minimum. This effect 
may be visualized by means of time dependent pertur- 
bation theory leading to electron-phonon transitions, 
or by time independent perturbation theory (or even 
variational methods) leading to wave functions which 
are no longer single products of electronic and nuclear 
wave functions. This is referred to as the electron- 
nuclear correlation. 


These effects also occur when an electron redistributes 
itself, i.e., acquires a new wave function by jumping 
from one bound state to another, or by transitions to 
the conduction band. Item (3) is similar to phenomena 
which occur in electrodynamics, while (1) and (2) arise 
in systems made of light and heavy particles. The 
relative importance of these effects depends on the 
forces within the solid. In many solids, effects (1) and 
(2) are so small that phonon-electron interaction 
problems resemble problems in electrodynamics. This 
is true in metals and in some semiconductors. In 
general, the phonon-electron problem has a different
character because of effects (1) and (2). Our interest 
here is the difference between phonon-electron and 
photon-electron interactions; hence, effect (3) is not 
considered in detail. Usually the effects are considered 
independently. 

Treatments of the correlation problem appear in several
places and take various forms. Born and Huang (B, 
p. 166) considered it briefly in a general manner, while 
Haken (H1) studied its effect in a special one- 
dimensional model. 

Simpson (S3) and Pekar (P3) considered effect (1) 
in their calculations on the F center. Williams (W2)
in an ambitious and detailed study calculated the
displacements which occur when a Tl⁺ ion replaces a K⁺
ion in a KCl lattice. Williams finds a change in the
second derivative of his CC plots which is related to $\Delta\omega$.
Klick and Schulman (K2a) have attempted to compute
$\Delta\omega$ from experimental data.

The idea of a large frequency shift was invoked by 
Seitz (S1) to explain the observations of Dutton and 
Maurer (D4). They measured a thermal activation 
energy for the V₁ band in KBr of 0.23 ev compared to
an optical activation energy of 3.0 ev. These ideas have 
some theoretical difficulties. A similar problem exists 
for the F’ center whose optical activation energy in 
KBr is 1.3 ev compared to a thermal activation energy 
of 0.29 ev (D4). 

A problem which involves all three effects is an 
electron at the bottom of the conduction band of a
polar solid. The electron interacts strongly with the 
surrounding ions. Due to the large difference in the 
velocity of the electrons and the nuclei, the forces on 
the nuclei depend on the average distribution of the 
electron, thus the electron displaces the surrounding 
ions. This causes effects (1) and (2) and results in a 
potential well. The well in turn traps the electron. The 
electron will also be influenced by the polar modes of 
vibration about the well. This is similar to the inter- 
action between an electron and the zero-point electro- 
magnetic vibrations. The presence of the well may be 
referred to as self-trapping,‡ while the electron-phonon
interaction may be called the polaron effect. In reality, 
they cannot be separated, although at present the 
connection between the two is not completely clear. 


3. FORMAL THEORY 
The complete Hamiltonian for a solid of unit volume is

$$H = T_e + T + V(R). \tag{3.1}$$

Here $T_e$ is the kinetic energy operator for all the electrons,
$T$ is the kinetic energy operator for the $N$ nuclei,
and $V$ is the total potential energy of the system. $V$
depends on the position of all the electrons (denoted by
r when explicitly stated) and all the nuclei (denoted by
R). We include in $V$ implicitly any point imperfection
such as a missing ion. The notation of (B) will be used,
namely, the coordinate of a nucleus is $X_\alpha(k)$ ($\alpha=1, 2, 3$
and $k=1, 2, \cdots, N$). The explicit expression for $T$ is

$$T = -\frac{\hbar^2}{2}\sum_k \frac{1}{M_k}\nabla_k^2, \tag{3.2}$$

where all the nuclei need not have the same mass, $M_k$.

The Born-Oppenheimer technique assumes that for a
fixed value of R the eigenfunctions, $\varphi_n(R)$, and eigenvalues,
$\epsilon_n(R)$, associated with the operator

$$h_e(R) = T_e + V(R) \tag{3.3}$$

are known.


‡ The assumption has been made that the self-trapping concept
implies a stationary distortion of the lattice. There is no reason
for this assumption. Indeed, Seitz and the author (M6) suggested
at an early date that a self-trapped electron might move through
the lattice. The work of Castner and Känzig (C1) implies that a
self-trapped hole has relatively little mobility, though ...
Next, one uses the Hamiltonian

$$h_v(R_n) = T + \epsilon_n(R) - \epsilon_n(R_n) \tag{3.4}$$

to obtain a second set of eigenfunctions $\chi$ and eigenvalues
$\epsilon_v$. The point $R_n$ of (3.4) is defined by the
relations

$$[\partial\epsilon_n/\partial X_\alpha(k)]_{R_n} = 0. \tag{3.4a}$$

To terms in $[\Delta X_\alpha(k)]^2$ we may write (3.4) in the form

$$h_v = T + \frac{1}{2}\sum_{\alpha,\beta,k,l}\left[\frac{\partial^2\epsilon_n}{\partial X_\alpha(k)\,\partial X_\beta(l)}\right]_{R_n}\Delta X_\alpha(k)\,\Delta X_\beta(l). \tag{3.5}$$

Here $\Delta X_\alpha(k)$ is the displacement relative to $R_n$. One
may introduce normal coordinates into (3.5) (B, p. 173)
which are linear functions of the $\Delta X_\alpha(k)$. The result is

$$h_v = \frac{1}{2}\sum_j (p_j^2 + \omega_j^2 q_j^2), \tag{3.6}$$

where

$$p_j = (\hbar/i)(\partial/\partial q_j), \tag{3.6a}$$

and $j$ goes from 1 to $3N$. The $q$’s have the dimension
of length times (mass)$^{1/2}$ and the $\omega$’s are given by the
secular determinant,

$$\left|\frac{1}{(M_kM_l)^{1/2}}\left[\frac{\partial^2\epsilon_n}{\partial X_\alpha(k)\,\partial X_\beta(l)}\right]_{R_n} - \omega^2\delta_{\alpha\beta}\delta_{kl}\right| = 0. \tag{3.7}$$

Here $\delta$ is the Kronecker delta.

The individual term in (3.6) corresponds to a simple
harmonic oscillator with an eigenfunction, $\chi_{v_j}$, and an
eigenvalue, $\hbar\omega_j(v_j+\tfrac{1}{2})$. The eigenfunctions associated
with $h_v$ are products of such functions and are denoted
by $\chi_{nv}$. $n$ corresponds to an electronic state and appears
through the use of $\epsilon_n(R)$, and $v_j$ is the vibrational
quantum number of the $j$th mode. $v$ is the sum of the
$v_j$’s. Since we are mainly concerned with two states, $n$
will at times be replaced by $g$ (ground) and $u$ (upper).
Usually the subscript $n$ will not be required. $\varphi$ and $\chi$
will be used for the ground state while $\varphi'$ and $\chi'$ will be
employed for the excited state.

To complete the notation, the following definitions 
are made: 


$$(\varphi_{n'}|O|\varphi_n) = \int \varphi_{n'}^{*}\,O\,\varphi_n\,dr, \tag{3.8}$$

where the integration is over all positions of the electrons, and

$$\{\chi_{n'v'}|O|\chi_{nv}\} = \int \chi_{n'v'}^{*}\,O\,\chi_{nv}\,dR, \tag{3.9}$$

where the integration is over all positions of the nuclei.
$O$ is any operator. It will be omitted when $O=1$.
The essential problem is to find approximate eigenfunctions
for the total Hamiltonian (3.1). By using the
foregoing equations, we may define two approximations:


(1) the static: $\Psi(S) = \varphi(R_n)\chi$, (3.10)


(2) the adiabatic: $\Psi(A) = \varphi(R)\chi$. (3.11)


Neither (3.10) nor (3.11) is a complete solution of (3.1).
For some problems one desires to split $H$ into a zeroth-
order Hamiltonian and a residual which represents a
small perturbation. The way this is done depends on the
form of $\Psi$ employed.

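For the static approximation

$$H_0(S) = h_e(R_n) + h_v(R_n), \tag{3.12a}$$

$$H_1(S) = V(R) - V(R_n) - \frac{1}{2}\sum_{\alpha,\beta,k,l}\left[\frac{\partial^2\epsilon_n}{\partial X_\alpha(k)\,\partial X_\beta(l)}\right]_{R_n}\Delta X_\alpha(k)\,\Delta X_\beta(l). \tag{3.12b}$$

On the other hand, in the adiabatic approximation,

$$H_0(A)\,\varphi(R)\chi = \chi\,h_e(R)\,\varphi(R) + \varphi(R)\,T\chi, \tag{3.13a}$$

$$H_1(A)\,\varphi(R)\chi = -\hbar^2\sum_k \frac{1}{M_k}\left[\nabla_k\chi\cdot\nabla_k\varphi(R) + \tfrac{1}{2}\chi\nabla_k^2\varphi(R)\right]. \tag{3.13b}$$

$H_0(A)$ assumes that $\varphi(R)$ permutes with $T$.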

An elegant method of arriving at (3.12) was given
many years ago by Born and Oppenheimer (B3). This
treatment is reproduced in a simpler form by (B, pp.
166-173), who have also found a systematic way to
obtain the adiabatic approximation. In the latter case,
the $\chi$’s contain higher-order anharmonic terms. The
Born-Oppenheimer technique involves several steps.
Equation (3.4a) is a particularly important one, since
it is involved in $h_v$. Limitations on this technique are
discussed in Sec. 5(b).

To obtain the eigenvalues of (3.13a) we may proceed
as follows:

$$H_0(A)\,\varphi(R)\chi = \varphi(R)[\epsilon_n(R) + T]\chi = \varphi(R)[\epsilon_n(R_n) + h_v]\chi = [\epsilon_n(R_n) + \epsilon_v]\,\varphi(R)\chi, \tag{3.14a}$$

where (3.4a) and (3.5) have been used. The eigenvalues
of $H_0(S)$ and $H_0(A)$ are equal since

$$H_0(S)\,\varphi(R_n)\chi = [\epsilon_n(R_n) + \epsilon_v]\,\varphi(R_n)\chi. \tag{3.14b}$$

In view of (3.14a) and (3.14b) we define the total
energy of the system as $E_{nv} = \epsilon_n(R_n) + \epsilon_v$. We shall
again omit the subscripts, using $E$ and $E'$.

The method of splitting H into parts is based on a 
physical intuition and the fact that the nuclear mass 
is much heavier than the electron mass. Mathe- 
matically, the splitting has an unfortunate aspect, since 
some of the nicest mathematical properties do not 
hold for the $H_0$’s.

One may show (M4) that (1) the static eigenfunctions 
do not form an orthogonal complete set while (2) the 
adiabatic eigenfunctions form an orthogonal set with 
respect to r and R space. 

The latter can be shown by noting that the $\varphi(R)$’s
form a complete set of functions in r space for a fixed R,
and that the $\chi$’s form a complete set in R space for a
fixed $n$. Further, we note

$$\{\varphi_{n'}(R)\chi_{n'v'}\,|\,\varphi_n(R)\chi_{nv}\} = \{\chi_{n'v'}\,(\varphi_{n'}(R)|\varphi_n(R))\,\chi_{nv}\} = \{\chi_{n'v'}|\delta_{n'n}|\chi_{nv}\} = \delta_{n'n}\delta_{v'v}. \tag{3.15}$$

(3) It can be shown that (B, p. 189)

$$\frac{\partial\epsilon_n(R)}{\partial X_\alpha(k)} = \left(\varphi_n(R)\left|\frac{\partial V}{\partial X_\alpha(k)}\right|\varphi_n(R)\right) \tag{3.16}$$

(Feynman’s Theorem).

For a perfect ionic crystal the lowest eigenvalue of
$h_e$ is the total interionic potential, i.e., the coulombic,
repulsive, and high-order terms, from which one
determines the normal modes by well-developed
methods. In the case of interest, all cells in our solid
are not equivalent, since there is a point imperfection.
In the case of the F center there is a missing negative
ion with a trapped electron. Mazur, Montroll, and
Potts (M2) have shown that the modes for such a
crystal are not identical to those of a perfect lattice.
A missing ion creates local modes surrounding the
imperfection. These are the vibrations of greatest
interest to our problem. In a general way the motion
of the ions right around an imperfection is determined
primarily by the local modes (Bjork, B1). The object
here is not to describe these modes in detail but to
indicate what effects electron transitions have on them.
The exact relation between local and crystal modes is an
extremely complex problem. One of the best ways to
study the local modes is to measure their frequencies
indirectly. This can be done from measurements of the
thermal broadening of absorption and emission bands
(Part II).


(a) Ground State 


The development so far is completely general. We 
now use this development for a crystal with an electron 
trap. Assume that there are two (or more) bound 
states. Every state generates its own $\epsilon_n$, $\omega_j$’s, and $q_j$’s.
We start the calculation by using the values appropriate
for the ground state. To stress this point we write
$\epsilon_g$ for $\epsilon_n$; $\omega_j(g)$ for $\omega_j$; $h_v(g)$ for $h_v$, etc. We shall retain
the notation $q_j$ and $p_j$, however.


(b) Upper State 


We could make all the equations apply to the upper 
state by simply replacing the subscript g by u. The 
p’s and q’s of (3.6), however, will be different. The 
object is to find relations between the two sets; hence, 
we proceed in a different manner. In this case, the
eigenvalue of $h_e$ is $\epsilon_u(R)$ and the trapping energy may
be defined as

$$\Delta\epsilon(R) = \epsilon_u(R) - \epsilon_g(R). \tag{3.17}$$

$\Delta\epsilon$ is approximately equal to the energy the trapped
electron acquires when it is excited from one state to
another (the nuclear coordinates remain at R).
Actually, the difference in energy is related to all the
electrons rather than just one.

Since we are interested in the $q$’s, $\epsilon_u(R)$ is expressed
as

$$\epsilon_u(R) = \epsilon_g(R) + \Delta\epsilon(R) = \epsilon_g(R_g) + \frac{1}{2}\sum_j \omega_j^2(g)\,q_j^2 + \Delta\epsilon(R_g) + \sum_j \epsilon_j q_j + \frac{1}{2}\sum_{j,k}\epsilon_{jk}\,q_jq_k + \cdots, \tag{3.18}$$

where

$$\epsilon_j = [\partial\Delta\epsilon/\partial q_j]_{R_g}, \tag{3.18a}$$

and

$$\epsilon_{jk} = [\partial^2\Delta\epsilon/\partial q_j\,\partial q_k]_{R_g}. \tag{3.18b}$$

The dimension of $\epsilon_j$ is (mass)$^{1/2}$ (length) (time)$^{-2}$ while
the dimension of $\epsilon_{jk}$ is simply (time)$^{-2}$. The second term
in (3.18) arises because $\epsilon_g(R) - \epsilon_g(R_g) = \frac{1}{2}\sum_j \omega_j^2(g)\,q_j^2$,
which follows from the definition of the $q$’s.

At this point the theory runs into some difficulties.
Equation (3.4) assumes that terms in $\Delta X_\alpha(k)\,\Delta X_\beta(l)\,\Delta X_\gamma(m)$
are so small that they can be neglected. If the
$q$’s are determined by thermal vibrations, they will be
small at low temperatures. Equation (3.18) includes,
however, the displacement which occurs during a
transition, i.e., the difference between $R_g$ and $R_u$
defined by

$$[\partial\epsilon_g/\partial R]_{R_g} = 0 \tag{3.19a}$$

and

$$[\partial\epsilon_u/\partial R]_{R_u} = 0. \tag{3.19b}$$

Actual calculations of the difference between the R’s
have been made on some simple models. $R_g - R_u$ is
about 0.3 Å for the nearest neighbor in the case of a
Tl⁺ trap in KCl (W2) and 0.06 Å for the self-trapped
electron (M6). These results are to be compared to the
root-mean-square displacement of an atom (Einsteinian
model with a mass equal to 30 times that of the proton
and frequency of $5\times10^{12}$ sec$^{-1}$). For $n=0$ (the quantum
vibrational number), it is 0.06 Å and increases to
0.13 Å for $n=2$. X-ray measurements of the root-
mean-square amplitude of vibration in NaCl at 86°K
give 0.15 Å for Na⁺ and 0.13 Å for Cl⁻ (L3, p. 50).
These large values must be attributed to the lower-
frequency modes. The self-trapping calculations showed
that the first anharmonic term was of no importance,
due to a fortuitous geometry. However, this is not a
general conclusion. If we include additional terms in
(3.18), there seems to be no apparent simple way to
relate the normal modes in the upper and lower states.
Therefore, we consider (3.18) to be exact. For shallow
traps and for any trap in nonionic crystals, this
assumption must be good. It must also be a fair approximation
for deeper traps in ionic crystals, such as the
F center.
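The Einsteinian comparison figures quoted above can be checked with the standard harmonic-oscillator result $\langle x^2\rangle_n = (2n+1)\hbar/(2m\omega)$. The sketch below is a rough check, not part of the original treatment. It assumes that the quoted frequency is $5\times10^{12}$ cycles per second (an exponent chosen here because it reproduces the quoted displacements) and that it is an ordinary rather than angular frequency; the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34        # J s
PROTON_MASS = 1.67262192e-27  # kg


def rms_displacement_angstrom(n, mass_in_proton_masses, frequency_hz):
    """RMS displacement of a 1-D harmonic oscillator in level n, in angstroms."""
    mass = mass_in_proton_masses * PROTON_MASS
    # Assumption: the quoted frequency is in cycles per second, so omega = 2*pi*nu.
    omega = 2.0 * math.pi * frequency_hz
    mean_square = (2 * n + 1) * HBAR / (2.0 * mass * omega)
    return math.sqrt(mean_square) * 1e10


if __name__ == "__main__":
    for n in (0, 2):
        print(n, rms_displacement_angstrom(n, 30, 5e12))
```

With these inputs the two levels come out close to the 0.06 Å and 0.13 Å given in the text, which is comparable to the calculated 0.06 Å shift for the self-trapped electron and smaller than the 0.3 Å shift for the Tl⁺ trap.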

It is further assumed that the modes are nondegenerate.
To those familiar with ... assumption indeed. One must ...
approximation is not supported ...
(M3). Secondly, we may be dealing to a large extent
with local modes. This is supported by the analyses on
the shape of the absorption bands. These are in the
forbidden gap and cluster about a single value. There
is no reason, however, to assume that the local modes
are actually degenerate, and the above assumption is a
natural one at the present stage of development. This
assumption probably does not affect any major conclusions
of this paper.
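Now the transformation

$$q_j' = q_j + {\sum_k}'\,\frac{\epsilon_{jk}}{\omega_j^2(g) - \omega_k^2(g)}\,q_k \tag{3.20}$$

is made. The prime on the sum indicates that $j \neq k$.
Neglecting terms of order $\epsilon_{jk}\epsilon_{kl}$ or $\epsilon_j\epsilon_{kl}$ (for $k \neq l$),
(3.20) gives an orthogonal transformation. To the
same approximation we write

$$\epsilon_u(R) = \epsilon_g(R_g) + \Delta\epsilon(R_g) + \frac{1}{2}\sum_j \omega_j^2(u)\,(q_j')^2 + \sum_j \epsilon_j q_j', \tag{3.21}$$

$$\omega_j^2(u) = \omega_j^2(g) + \epsilon_{jj}. \tag{3.21a}$$

The new equilibrium position is obtained from
(3.19b). Since the $\Delta X$’s are linear functions of the $q$’s,
(3.19b) gives

$$(\partial/\partial q_j')\,\epsilon_u = \omega_j^2(u)\,q_j' + \epsilon_j = 0. \tag{3.22}$$

The shift in the equilibrium position is

$$\Delta q_j = -\epsilon_j/\omega_j^2(u). \tag{3.23}$$

Thus, the final transformation

$$Q_j = q_j' + \epsilon_j/\omega_j^2(u) = q_j' - \Delta q_j \tag{3.24}$$

gives the energy of the upper state in the form

$$\epsilon_u(R) = \epsilon_g(R_g) + \Delta\epsilon(R_g) - \frac{1}{2}\sum_j \frac{\epsilon_j^2}{\omega_j^2(u)} + \frac{1}{2}\sum_j \omega_j^2(u)\,Q_j^2 \tag{3.25}$$

$$= \epsilon_u(R_u) + \frac{1}{2}\sum_j \omega_j^2(u)\,Q_j^2. \tag{3.25a}$$

By direct substitution one may show that

$$\sum_j \frac{\partial^2}{\partial q_j^2} = \sum_j \frac{\partial^2}{\partial q_j'^2} = \sum_j \frac{\partial^2}{\partial Q_j^2}; \tag{3.26}$$

hence, the transformation does not modify $T$.

The first three terms on the right of (3.25) appear
in the eigenvalue of $h_e$. The derivation serves to show
the relation between the $q$’s and $Q$’s. This is of essential
importance and not the final form of $\epsilon_u$.

This concludes the formal treatment. The assumptions
which have been made are now summarized to
emphasize their limitations. These are that


(1) all cubic terms can be neglected;
(2) terms in $\epsilon_{jk}\epsilon_{kl}$ and $\epsilon_j\epsilon_{kl}$ for $k \neq l$ can be neglected;
(3) the normal modes are nondegenerate.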
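
The step from (3.21) to (3.25) is a completion of the square in each mode. Written out for a single mode, as a sketch using only the quantities $\epsilon_j$, $\omega_j(u)$, $q_j'$, and $Q_j$ defined in (3.18a), (3.21a), (3.20), and (3.24):

```latex
\begin{align*}
\tfrac12\,\omega_j^2(u)\,(q_j')^2 + \epsilon_j\,q_j'
  &= \tfrac12\,\omega_j^2(u)\Bigl[(q_j')^2 + \frac{2\epsilon_j}{\omega_j^2(u)}\,q_j'\Bigr] \\
  &= \tfrac12\,\omega_j^2(u)\Bigl[q_j' + \frac{\epsilon_j}{\omega_j^2(u)}\Bigr]^2
     - \frac{\epsilon_j^2}{2\,\omega_j^2(u)} \\
  &= \tfrac12\,\omega_j^2(u)\,Q_j^2 - \frac{\epsilon_j^2}{2\,\omega_j^2(u)} .
\end{align*}
```

Summing over $j$ gives the third and fourth terms on the right of (3.25). Because $Q_j$ differs from $q_j'$ only by a constant, derivatives with respect to the two coordinates coincide, which is why the kinetic energy is unchanged by this last transformation.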
In view of the work of Montroll and co-workers, we
know that lattice perturbation can localize some of the 
modes. This effect does not appear above. The trans- 
formation (3.20) without the neglect of higher-order 
terms (item 2) may cause a radical change in the modes. 
This change most probably does not occur however 
during an electronic transition. 
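
How much is lost by dropping the higher-order terms of item 2 can be illustrated on a two-mode toy problem. The sketch below is an illustration under stated assumptions, not a calculation from the paper: the quadratic form $\frac{1}{2}\sum_j\omega_j^2(g)q_j^2+\frac{1}{2}\sum_{j,k}\epsilon_{jk}q_jq_k$ is taken for two modes only, the numerical values are hypothetical and in arbitrary units, and the function names are introduced here. It compares the exact squared frequencies with the first-order result (3.21a).

```python
import math


def exact_squared_frequencies(w1_sq, w2_sq, e11, e22, e12):
    """Exact eigenvalues of [[w1_sq + e11, e12], [e12, w2_sq + e22]]."""
    a = w1_sq + e11
    b = w2_sq + e22
    mean = 0.5 * (a + b)
    half_gap = math.hypot(0.5 * (a - b), e12)
    low, high = mean - half_gap, mean + half_gap
    # Return the eigenvalues in the same order as the diagonal entries.
    return (low, high) if a <= b else (high, low)


def first_order_squared_frequencies(w1_sq, w2_sq, e11, e22):
    """Eq. (3.21a): only the diagonal coupling shifts each squared frequency."""
    return (w1_sq + e11, w2_sq + e22)


def max_relative_error(w1_sq, w2_sq, e11, e22, e12):
    """Largest relative error of (3.21a) against the exact 2x2 result."""
    exact = exact_squared_frequencies(w1_sq, w2_sq, e11, e22, e12)
    approx = first_order_squared_frequencies(w1_sq, w2_sq, e11, e22)
    return max(abs(x - y) / abs(x) for x, y in zip(exact, approx))


if __name__ == "__main__":
    # Hypothetical values in arbitrary units.
    print(max_relative_error(1.0, 2.0, 0.05, -0.03, 0.02))  # well-separated modes
    print(max_relative_error(1.0, 1.01, 0.05, -0.03, 0.5))  # nearly degenerate modes
```

For well-separated modes and weak off-diagonal coupling the first-order frequencies are accurate to a small fraction of a percent, while for nearly degenerate modes with strong coupling the error becomes of order unity, in line with the remark that the full transformation may change the modes radically.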

The third assumption probably is not a fundamental 
one, although the first two may be. The problem of 
degenerate modes is that the $q$’s and $p$’s of Eq. (3.6)
are not unique. Consider two degenerate modes $q_1$ and
$q_2$. The substitution $q_1''=q_1+q_2$ and $q_2''=q_1-q_2$ will
not change the required form of the Hamiltonian, and
the q” modes are equally satisfactory. The additional 
terms which occur in the upper state put a requirement 
on the possible linear combination of the q’s. These can 
be obtained from arguments similar to those used in
degenerate quantum mechanical perturbation theory. 
If we use the proper definition of the q’s, the degenerate 
problem is similar to the nondegenerate one. For this 
reason, the author feels that item 3 is not an essential 
restriction. The author has not carried through the 
details of this argument. Certainly the transformations 
(3.20) and (3.24) allow one to make a simple synthesis 
of many theoretical ideas at a time when it is most 
needed from the experimental point of view. Hence, 
we assume their validity without any further ado. 

For a point imperfection we may expect two types of
$\epsilon_j$. The $\epsilon_j$’s for the local modes will be especially large
in solids with some polar characteristics. This leads to
the LM approximation. The $\epsilon_j$’s for individual longitudinal
modes are very small; the net effect, however,
may be large. This leads to Hamiltonians similar to
those employed by Fröhlich. For these modes the $\epsilon_{jk}$’s
must be extremely small and the transformation (3.20)
need not be used.

Item 1 of Sec. 2 enters into the problem through 
(3.19b) or (3.24), item 2 through $\omega_j(g)$ or (3.21a),
while item 3 arises through (3.12b) or (3.13b) depending
on the form of $\Psi$ employed.

The technique used on the F center by Huang and
Rhys and by Pekar assumes that $\epsilon_j$ is of the order
of $1/N$. This does not hold for local modes; hence the
expansion techniques are questionable. O’Rourke does
not require that $\epsilon_j \to 0$ in the first part of his paper,
and thus includes localized modes.

The division made above regarding the types of 
modes is not completely satisfactory since the Fröhlich 
description is somewhat oversimplified. The usual 
classification of modes, i.e., acoustical versus optical,
transverse versus longitudinal, applies only to the long
modes in polar crystals (M3). The majority of the
modes actually have a wavelength of the order of
several interionic distances and do not fit into the above
classification. The experimentalist should remember
that the LST relation is not completely reliable, and,
therefore, should not take it too seriously. The theorist
should attempt to formulate his equations in such a
manner that he does not rely on the extreme simplification
involved in this relation. As our understanding
of the problems treated here increases (this involves 
further experimental data), we shall undoubtedly obtain 
much more reliable information regarding the vibra- 
tional modes in solids. 


4. PHYSICAL INTERPRETATION OF OPTICAL 
AND THERMAL TRANSFORMATIONS 


A physical interpretation of Sec. 3 is now attempted. 
It starts with a one-dimensional Franck-Condon 
diagram which will subsequently be elaborated. Next, 
the various activation energies are defined and their 
interrelations discussed. The results are applied to 
the $V_1$ center.

We start by considering the following simple model:
The temperature is 0°K. Only one mode is affected by
the transition. This implies that only one $\epsilon_j$ is different
from zero. In Fig. 2 we plot $\epsilon_u$ and $\epsilon_g$ as a function of a
coordinate $q_j$. It is convenient, for now, to set $\epsilon_{jj}=0$;
hence $\omega_j(u)=\omega_j(g)$. Although $\epsilon_g$ is an eigenvalue of
the electronic Hamiltonian, it is useful to divide it into two
parts, namely, $\epsilon_g(R_g)=\epsilon_g(0)$, which is considered as an
electronic energy of the ground state, and $\epsilon_g(q_j)-\epsilon_g(0)$,
which is considered as a lattice potential energy. This division
is justified by Eqs. (3.14a and b). According to the
simplified Franck-Condon principle, the system may
absorb a photon and jump from point a to point b on $\epsilon_u$.
The use of the Born-Oppenheimer technique assumes
that the electron has sufficient kinetic energy to occupy
all the space required by $|\phi'|^2$ before there has been
an appreciable displacement of $q_j$ from 0. $\epsilon_g$ has been
replaced by $\epsilon_u$, giving an additional "force" $-\epsilon_j$ on $q_j$.
In the case considered in Fig. 2, $\epsilon_j = -\omega_j^2\,\Delta q_j$. Hence the
total potential energy of the lattice relative to point
b is $\tfrac12\omega_j^2 q_j^2 + \epsilon_j q_j$, and its minimum appears at
$q_j = -\epsilon_j/\omega_j^2$, in agreement with (3.23). As soon as the force
on the $q_j$th mode has changed, its equilibrium value has as
well, although the ions are not at $-\epsilon_j/\omega_j^2$.

During the shift from b to c, $\Delta\epsilon$ decreases by
$-\epsilon_j\,\Delta q_j = \epsilon_j^2/\omega_j^2$, while the potential energy of the
lattice increases by $\tfrac12\epsilon_j^2/\omega_j^2$, thus giving a total decrease
of $\tfrac12\epsilon_j^2/\omega_j^2$. Hence the third term of (3.25) is composed
of two effects, the decrease of electronic energy
and the increase of lattice energy. The simplicity of the
interpretation arises because of the assumption that the
restoring forces are not affected by the jump of the
electron in the trap.

FIG. 2. Plots of $\epsilon_g$ and $\epsilon_u$ against $q_j$.

Quantum mechanics (still assuming that $\epsilon_{jj}=0$) only
slightly modifies the picture. The stationary states are
represented by horizontal lines since normal vibrations
have kinetic as well as potential energy. The most
probable transitions arise when the classical turning
points of the upper and lower states occur for the same
value of $q_j$. While this is the most probable, it is not the
only one; indeed, the probability of "nonvertical"
transitions explains the shape of the emission and
absorption bands in solids.

The theory developed in Sec. 3 is now used to define
the various types of activation energies. Neglecting
the broadening due to the emission or absorption of
phonons and the zero-point energy, and limiting our
consideration to 0°K, we may define the following:
the activation energy for emission of a photon, $E^e$ (c → d);
the activation energy for absorption of a photon, $E^a$
(a → b); and the thermal activation energy, which is the same
for emission and absorption, $E^t$ (a → c). The a, b, c,
and d refer to Fig. 2.

As has been stressed previously, the Stokes' shift,
which is the difference between $E^a$ and $E^e$, is related to
the third term of (3.25). After an absorption (from
a to b), the force $-\epsilon_j$ is added, and the system relaxes by
going to c. For $\epsilon_{jj}=0$, the stored energy after an emission
equals the stored energy after an absorption, i.e.,

$$\epsilon_g(-\epsilon_j/\omega_j^2) - \epsilon_g(0) = \epsilon_u(0) - \epsilon_u(-\epsilon_j/\omega_j^2). \tag{4.1}$$

In (4.1) the arguments of the $\epsilon$'s refer to values of q, not Q.
Summarizing, the absorption activation energy includes 
the readjustment energies in both the upper and ground 
states; the emission activation energy excludes both, 
while the thermal includes the readjustment energy in 
only one state. 

Although (4.1) was derived by considering the
displacement of only one mode, it is quite general
because $\epsilon_g$ and $\epsilon_u$ are functions only of the q's and not
of the path taken. The displacements can be compared
one by one to show that for an actual trap the total
stored energy is the same in both states. In general then,
one may write

$$E^e + 2\,\delta E = E^t + \delta E = E^a, \tag{4.2}$$

where $\delta E$ is the readjustment energy $\tfrac12\sum_j \epsilon_j^2/\omega_j^2$, or

$$E^t = E^a - \tfrac12(E^a - E^e) = \tfrac12(E^a + E^e). \tag{4.2a}$$

If $E^e = 0$, $E^t = \tfrac12 E^a$; hence, in general,

$$E^t \ge \tfrac12 E^a \quad (\text{for } \epsilon_{jj}=0). \tag{4.2b}$$
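As a minimal sketch of how relation (4.2) is used in practice, the following helper takes measured absorption and emission energies and returns the thermal activation energy and the readjustment energy. The function name and the sample energies are hypothetical and chosen only for illustration; the sketch assumes $\epsilon_{jj}=0$, as the text does.

```python
def thermal_activation_energy(e_abs, e_em):
    """Return (E_t, delta_E) from Eq. (4.2): E_e + 2*dE = E_t + dE = E_a.

    Valid only under the assumption eps_jj = 0 made in the text.
    """
    if e_em < 0 or e_abs < e_em:
        raise ValueError("expect 0 <= E_e <= E_a")
    delta_e = 0.5 * (e_abs - e_em)    # readjustment energy, half the Stokes shift
    e_thermal = 0.5 * (e_abs + e_em)  # Eq. (4.2a)
    return e_thermal, delta_e


# Hypothetical band energies in ev, not taken from any measurement.
e_t, d_e = thermal_activation_energy(e_abs=2.0, e_em=1.0)
print(e_t, d_e)
```

Because $E^e$ cannot be negative, the returned thermal energy never falls below half the absorption energy, which is the content of (4.2b).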
In some problems we may be able to introduce an
effective frequency (Secs. 9 and 10). In this case

$$\delta E = \tfrac12\sum_j \epsilon_j^2/\omega_j^2 = \hbar\omega\Bigl\{\tfrac12\sum_j \epsilon_j^2/\hbar\omega_j^3\Bigr\} = \hbar\omega S, \tag{4.3}$$

where $\omega$ is the effective frequency, defined by (4.3), and

$$S = \tfrac12\sum_j \epsilon_j^2/\hbar\omega_j^3 \tag{4.3a}$$

is the Huang-Rhys factor. One may measure S directly for
some transitions (Sec. 11). It is the ratio of the stored
energy at b to the average effective phonon energy. For
the F center in KCl, S ~ 30 (K4).
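The definitions of the readjustment energy, the Huang-Rhys factor, and the effective frequency can be illustrated numerically. The sketch below assumes the reading $S = \tfrac12\sum_j \epsilon_j^2/\hbar\omega_j^3$ for (4.3a), works in units with $\hbar = 1$, and uses made-up mode parameters; the function name is hypothetical.

```python
def effective_mode(eps, omega, hbar=1.0):
    """Return (delta_E, S, omega_eff) for lists of eps_j and omega_j.

    delta_E   = 1/2 * sum(eps_j^2 / omega_j^2)          readjustment energy
    S         = 1/2 * sum(eps_j^2 / (hbar*omega_j^3))   assumed form of (4.3a)
    omega_eff = delta_E / (hbar * S)                    defined through (4.3)
    """
    delta_e = 0.5 * sum(e * e / (w * w) for e, w in zip(eps, omega))
    s = 0.5 * sum(e * e / (hbar * w ** 3) for e, w in zip(eps, omega))
    return delta_e, s, delta_e / (hbar * s)


# Two hypothetical modes with equal coupling and frequencies 1 and 2.
delta_e, s, omega_eff = effective_mode(eps=[1.0, 1.0], omega=[1.0, 2.0])
print(delta_e, s, omega_eff)
```

With a single mode the effective frequency reduces to that mode's own frequency; with several modes it is a weighted average that favors the low-frequency modes, since they contribute most to S.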
A relaxation of (4.2) occurs when $\epsilon_{jj} \neq 0$. When
$\epsilon_{jk}$ ($j \neq k$) is small compared to $\epsilon_{jj}$, the effect of the
second derivative is to shift the frequencies of the
modes. One may speak of a frequency shift superimposed
on a displacement of the equilibrium position.
Now (4.2b) does not apply, and $E^t$ can be much
smaller than $\tfrac12 E^a$. In other cases it can be even larger
than $E^a$, as illustrated in Fig. 3,§ where only one mode
is considered. If $E^t \ll E^a$, $\epsilon_{jj}$ must be of the order of
$\omega_j^2$. For $\epsilon_{jj} < 0$, one obtains the strange case where
$E^a < E^t$. However, the author knows of no such case
in nature.
In the self-trapping calculations made by Seitz and
the author (M6) for NaCl, items 2 and 3 of Sec. 2
were not considered; hence, $E^t = \tfrac12 E^a$. This gives
0.34 ev for $E^t$, not the value originally reported. The
error arises from an erroneous formulation of Feynman's
theorem (Eq. 3.16). The agreement between the calculation
of Fröhlich, Pelzer, and Zienau (F3) and the
reported value of $E^t$ (i.e., 0.13 ev) must be fortuitous.
The author believes that higher-order terms than $\epsilon_j$
must be included. Further, the mobility of the trap
has to be included; hence the value 0.34 ev is a very
Fle D eCi1min ore. 
problem of the intersection of the potential
considered.

In regard to the values of $E^t$ and $E^a$ for the F' center,
$\epsilon_{jj}$ must almost equal $\omega_j^2$ for the local modes
which cluster about negative-ion vacancies. The upper
F' state corresponds to an F center and an electron in
the conduction band. The lower state corresponds to 
two electrons in a negative-ion vacancy. To the approxi- 
mation of interest here we may assume that the free 
electron does not interact with the modes. $\Delta\epsilon(R)$ of
Eq. (3.17) is simply the binding energy of the second
electron in the F' center. Since there is a radical
difference in the upper and lower states, the assumption
of $\epsilon_{jj} \approx \omega_j^2(g)$ seems plausible. The author finds it hard
to believe that a corresponding situation exists for a
hole trapped at a positive-ion vacancy. One assumes
that it possesses two bound states. This is Seitz's model
for the $V_1$ band in KCl and KBr. The hole must be
spread out similar to an electron in the F center.
Transitions of the hole from an excited to a ground 
state could influence the vibrational frequencies but 
hardly enough to change the activation energy from 
3.0 ev to 0.23 ev. Our present knowledge of the V 
centers is so meager that the possibility of an alternate 
interpretation of the experiment should not be 
overlooked. 


5. PROPERTIES OF THE TOTAL HAMILTONIAN 


In Sec. 3 a theory was developed which is completely 
general, provided one may use the Born-Oppenheimer 
technique. In this section we compare this development 
to alternate formulations of the problem in order to 
contrast the ideas. We also compare the development 
of Sec. 3 with some problems in electrodynamics, in 
particular, to contrast the behavior of phonons and 
photons. 

In complex systems one likes to split the Hamiltonian
into parts which interact only slightly. Thus, in
electrodynamics, it is divided into the radiation field $H_{rad}$, the
part related to the charged particles $H_{el}$, and the
interaction term $H_{int}$, which relates the electrons to the
photons. One usually develops expressions for $H_{rad}$
and $H_{el}$ without regard to each other. The introduction
of an electron in a vacuum does not affect $H_{rad}$, nor
does the introduction of black-body radiation affect
$H_{el}$. When one considers a trapped electron in an
electromagnetic field one adds $H_{int}$ to $H_{rad} + H_{el}$
without modifying either.

The solid state problem is more complex, and the
method of splitting the Hamiltonian depends on the
problem at hand. Once $\Psi$ is selected, the term corresponding
to $H_{int}$ is given by (3.12b) or (3.13b). The problem is
to split $H_0(A)$ or $H_0(S)$ in two. These we denote by
$H_e$ and $H_L$. In general, $H_e$ and $H_L$ are intimately
coupled and cannot be separated. One may use a
purely mathematical argument like Born and Oppenheimer
(B, p. 166) to evaluate the importance of the
various terms. The convergence of the approximation
has not been studied, however. Crude calculations show
that the adiabatic approximation must be used with
caution for shallow traps (4). The splitting procedure
is discussed in a semi-intuitive manner; further 
analytical proofs are most desirable. 




(a) Trapped Electron 


For an electron trapped at a point imperfection, we 
may use (3.1) or a form which is related more closely 
to the way actual calculations are made. For the latter 
purpose we define the impurity center Hamiltonian. 
With proper care one may make the treatment com- 
pletely rigorous. Thus we write 


$$H = t + T + V_L(R) + V_I(R) + V_e(R, r). \tag{5.1}$$

Here t is the kinetic energy of the trapped electron;
$V_L + V_I$ is the interionic or interatomic potential;
$V_e(R, r)$ is the potential energy due to an excess electron
at r when the ions (or atoms) are at R.

It is convenient for pedagogical purposes to split
$V_L + V_I$ into two parts. $V_L$ is the potential energy for
the perfect lattice, and $V_I$ arises from the point
imperfection. Thus, in the case of the F center, $V_I$ is due
to the negative-ion vacancy. Mott and Littleton
considered this problem (M12), so one may call $V_I$ the
Mott-Littleton term. In the more conventional
treatment, $V_L + V_I$ arises from the coulombic and repulsive
forces of the crystal. In treatments using the Born- 
Oppenheimer technique, it arises from the eigenvalue 
of an electronic Hamiltonian which includes all the 
electrons except the trapped one (B, p. 173). 

First, deep traps are considered; hence, we are 
required to use the adiabatic approximation. Equation 
(3.13) gives 


$$(H_L + H_e)\,\phi(R)\chi = H_0(A)\,\phi(R)\chi = \phi T\chi + (t + V_L + V_I + V_e)\,\phi\chi, \tag{5.2}$$

$$H_1(A) = H - H_0(A). \tag{5.2a}$$


There is a striking contrast between our problem and 
a photon field. The splitting in electrodynamics is done 
purely classically and in an elegant manner. Here we 
have had to introduce quantum mechanical operators 
and further assume an approximate wave function. 
How does one split $H_0$ into two parts?

Following the development in Sec. 3, we expand
about $R_n$, defined by

$$\left[\frac{\partial}{\partial X_\alpha(k)}\bigl(\Delta\epsilon_n + V_L + V_I\bigr)\right]_{R_n} = 0. \tag{5.3}$$

$\Delta\epsilon_n$ is an eigenvalue of $t + V_e(R, r)$. We split $H_0(A)$ by
writing

$$H_L = T' + V_L + V_I + \Delta\epsilon_n(R) - \Delta\epsilon_n(R_n) \tag{5.4}$$

and

$$H_e = t + V_e. \tag{5.5}$$

$T'$ means that T only operates on $\chi$. We note that

$$H_0 \neq H_L + H_e. \tag{5.6}$$

When the sum operates on $\phi\chi$ an extra function
$\Delta\epsilon_n(R) - \Delta\epsilon_n(R_n)$ appears. $H_L$ includes approximately
the cohesive energy of the distorted crystal.||
The modes defined by (5.3) and (5.4) depend on the
state of the electron due to the presence of $\Delta\epsilon_n$. Using
the foregoing development, we may understand the
various simplifications that have been made.

(1) The phenomenological approach (a) ignores the
presence of $V_I$ and (b) makes a crude approximation
of the actual Hamiltonian by assuming that the equation

$$\{H_L - V_I - \Delta\epsilon_n(R) + \Delta\epsilon_n(R_n)\}\chi = \mathcal{E}\chi \tag{5.7}$$

gives pure longitudinal and pure transverse optical
modes. It further assumes that the lower branches
are pure acoustical.

These extreme simplifications are inadequate for
some problems involving normal modes in alkali
halides. The interaction between the trapped electron
and the phonons is obtained by adding $\Delta\epsilon_n$ to (5.7);
namely, one requires solutions of the following equation:

$$\{H_L - V_I\}\chi = \mathcal{E}\chi. \tag{5.8}$$

In principle the inclusion of $\Delta\epsilon_n$ leads to no difficulties,
and the method developed in Sec. 3 can be employed.
Meyer has developed a perturbation scheme using $V_e$.
This, however, leads to the omission of some terms. In
Appendix II we compare both methods.

(2) The local modes approach: One assumes that
the eigenequation

$$\{H_L - \Delta\epsilon_n(R) + \Delta\epsilon_n(R_n)\}\chi = \mathcal{E}\chi \tag{5.9}$$

gives two types of modes, $q_l$, which are local, and $q_c$,
which pertain to the perfect crystal. For this approach
to be valid, $\Delta\epsilon_n$ must be independent of the $q_c$. If $\Delta\epsilon_n$
is only a function of a single localized mode then we have
the CC model. The $\Delta\epsilon$'s can again be used to calculate
the effect of the trapped electron on the modes. A
priori, there is no clear reason for believing that $\Delta\epsilon_n$
will be independent of the $q_c$'s. Only extremely complex
calculations, not available at present, could tell the exact
form of $\Delta\epsilon(q_l, q_c)$. One may resolve this problem somewhat
by obtaining equations for the shape of absorption
and emission bands in solids with adjustable parameters.
These parameters can be evaluated by analysis of the
experimental data. This, in turn, will give basic
information regarding the relations among $\Delta\epsilon_n$, $q_c$, and
$q_l$. Part II of this paper is devoted to this problem.


(b) Free Electron 


Consider next a free electron in a metal or
semiconductor. Two arguments suggest that the
Born-Oppenheimer technique should not be used. First,
consider the effect a free electron at r has on the mod


|| Approximately because We use R 
stead of the values of R determined by min 


made from Bloch functions. The electron proceeds in a
straight line for about 200 Å and then gets reflected
through an angle. If the electron has the energy of 3 ev
(approximately the Fermi level of a metal) it collides
with the lattice every $2\times10^{-14}$ sec. It is not confined to
a local region, but wanders in random fashion, so that
an average effect over a period of vibration is meaningless.
Analytically, for a pure Bloch function, $\epsilon_j$ equals
zero, so such an electron does not exert a force on the
normal modes, as shown below. Therefore, one splits
the system's Hamiltonian as follows ($V_I = 0$)


$$H_L = T + V_L(R), \tag{5.10a}$$

$$H_e = t + V_e(R_L), \tag{5.10b}$$

and

$$H_1 = V_e(R) - V_e(R_L), \tag{5.10c}$$

to obtain the usual theory (W4, Chap. IX). The normal
modes about $R_L$ are determined from $V_L$. Here another
assumption is made, that one should use $\phi(R_L)$ rather
than $\phi(R)$. The continuously readjusting wave function
requires a type of kinetic energy, and one suspects that 
the loss in potential energy does not compensate for 
this gain (M4). 
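The collision interval used in the free-electron argument above can be checked with a rough estimate. The sketch below assumes a free-electron dispersion (kinetic energy $\tfrac12 mv^2$ with the bare electron mass), which is an assumption of this illustration rather than a statement from the text; the function name is hypothetical.

```python
import math

ELECTRON_MASS_KG = 9.1093837e-31
EV_TO_JOULE = 1.602176634e-19


def collision_interval(energy_ev, mean_free_path_m):
    """Time between collisions for a free electron of the given kinetic energy."""
    speed = math.sqrt(2.0 * energy_ev * EV_TO_JOULE / ELECTRON_MASS_KG)
    return mean_free_path_m / speed


# 3 ev electron travelling about 200 angstroms between reflections.
interval = collision_interval(energy_ev=3.0, mean_free_path_m=200e-10)
print(f"{interval:.2e} s")
```

The result is of the order of $10^{-14}$ sec, comparable to or shorter than a lattice vibration period, which is why averaging the electron's effect over a period of vibration is not meaningful here.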

The distinction between the static and the adiabatic
approximation is academic, since, as Haug (H2) has
shown, first-order perturbation theory gives the
same results for both. The treatment assumes that one
does not use the Born-Oppenheimer technique for the
free electron, that is, that $H_L$ determines the q's.
This requires that $\epsilon_j(R_L)=0$. Proof of this relation for a
Bloch electron is now given.

The interaction between an electron and the lattice
in the conduction band has the form

$$H_1 = \sum_j a_j A_j, \tag{5.11}$$

where the $A_j$'s are determined from (5.10a). To a
sufficient approximation we write

$$a_j = \mathrm{const}\,\exp(i\mathbf{k}_j\cdot\mathbf{r}), \tag{5.12}$$

where $\mathbf{k}_j$ is the wave-number vector of the jth mode.
The Bloch function for a conduction electron may be
written in the form

$$\psi_p = u_p(\mathbf{r})\,\exp(i\mathbf{p}\cdot\mathbf{r}), \tag{5.13}$$

where $\mathbf{p}$ is the momentum vector and $u_p$ has the
periodicity of the lattice. Hence from (3.16) it follows
that

$$\epsilon_j = (\psi_p\,|\,a_j\,|\,\psi_p) = \mathrm{const}\int |u_p|^2 \exp(i\mathbf{k}_j\cdot\mathbf{r})\,d\tau = 0, \tag{5.14}$$

since $u_p$ is periodic. This shows that the $A_j$'s describe the
true modes.¶ This argument also applies to nonpolar
semiconductors.
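The vanishing of the integral in (5.14) can be seen in a one-dimensional toy model. The sketch below assumes a chain of `n_cells` unit cells with periodic boundary conditions and a made-up periodic density $|u_p|^2 = 1 + 0.5\cos(2\pi x/a)$; both the density and the function name are hypothetical and serve only to illustrate the argument.

```python
import cmath
import math


def bloch_force_integral(n_mode, n_cells=8, points_per_cell=64):
    """Riemann sum of |u_p|^2 * exp(i*k*x) over a 1-D crystal of n_cells cells.

    k = 2*pi*n_mode / (n_cells * a) is an allowed mode wave number.
    """
    a = 1.0  # lattice constant in arbitrary units
    k = 2.0 * math.pi * n_mode / (n_cells * a)
    dx = a / points_per_cell
    total = 0j
    for i in range(n_cells * points_per_cell):
        x = i * dx
        density = 1.0 + 0.5 * math.cos(2.0 * math.pi * x / a)  # hypothetical |u_p|^2
        total += density * cmath.exp(1j * k * x) * dx
    return total


print(abs(bloch_force_integral(3)))  # mode inside the first zone
print(abs(bloch_force_integral(0)))  # k = 0: just the normalization integral
```

For a mode with $k_j \neq 0$ inside the first zone the phase factor averages the lattice-periodic density to zero over the crystal, while the $k_j = 0$ case simply returns the total integrated density.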



arrive at the expression for «j in a more rigorous
c i da gge pal Hi| pp) =C only if p— gtk=0 Ewa,
) with £,=0]. Here p= q; hence, for k;0, ¢=0

This proof as it stands should apply to polar crystals
as well. Detailed considerations show that this development
is incomplete. This follows because some $\epsilon_{jj}$ are
zero in spite of the fact that $\Delta\epsilon_n(R)$ is not small. For
this case M, has a major influence on. the q’s, and the 
presence of the electron may generate a new set of 
modes. Here the problem does not correspond to the 
one in electrodynamics. 

Applying this approach, Eq. (5.10), to deep traps
where the electron's kinetic energy is large, (5.14) does
not hold, and exceptionally large values of the interaction
are obtained, simply because the q's derived
from (5.10) are not the appropriate ones. These $\chi$'s
form an orthogonal set, although a force $-\epsilon_j(R_L)$
remains on the nuclei at $A_j=0$.


(c) Comparison with Quantum Electrodynamics 


Coming now to some general comparisons between
quantum electrodynamics and the theory developed
here, no equation in quantum electrodynamics corresponds
to (3.4a). First, consider case (b), where all the
$\epsilon_j(R_L)$'s are zero and the A's are the true modes. Hence
$H_{rad}$ corresponds to $H_L$, $H_{el}$ corresponds to $H_e$, and
$H_{int}$ corresponds to $H_1$. The frequency and wavelength
of the photon modes are determined for free space and
are not influenced by the electrons, although their
number is determined in part by electronic transitions.
The number of phonons is also determined by the
electronic state, if we use time-dependent perturbation
theory, although the q's are obtained from $H_L$. The
polaron problem in semiconductors and metals corresponds
to the self-energy problem in electrodynamics.
This is also true for many optical problems in valence
semiconductors, such as the band-to-band transition
in germanium, silicon, and diamond. Here the
normal modes do indeed behave in a manner similar
to photons, and the word "phonons" is most
appropriate.

This is to be contrasted to the situation where
method (a) should be used. The term which mixes the
electronic and normal mode wave functions or causes
transitions is $H_1(A)$ of (5.2a). The $\epsilon_j$'s and $\epsilon_{jj}$'s have no
counterpart in quantum electrodynamics. They generate
a new set of q's for every electron level and cause an
essential difference between the two fields. Their
presence makes multiphonon processes natural in some
solids. In this case, optical problems are similar to those
occurring in diatomic molecules (Part II). The similarity
between phonons and photons breaks down. If one
applies method (b) to polar crystals, there is no simple
way to separate the energy associated with the poor
choice of the modes and that caused by $H_1(S)$.

Thus we have two types of problems in
phonondynamics. In one the $\epsilon$'s can be ignored and use can be
made of the tools of modern electrodynamics. In the
other, we are in a different domain, where the elementary
concepts of electrodynamics do not serve as a
guide. The first situation is in a sense a "special" case.


6. CONCLUSIONS OF PART I 


1. By the use of the Born-Oppenheimer method, we
have shown what effects electron transitions have on
the normal modes of vibration. The problem has been
formulated in a new manner to show the interrelation
of the various terms. The two major effects are (a) a
shift in the equilibrium position of the normal modes,
a first-order effect; (b) a change in the frequencies of
vibration, a second-order effect. The analytical reason
for these shifts is shown and related to the change in the 
electronic binding energies of traps. The relations which 
are derived from the microscopic Hamiltonian are 
completely general. It was indicated why one would not 
expect the calculations made by Huang and Rhys, by 
Pekar, and by Lax to apply to an F center. To obtain 
an accurate theoretical expression, greater care must 
be taken in the use of various expansions. 

2. Various activation energies are defined and general 
relations obtained between them. These impose re- 
strictions on the interpretation of some experiments 
done on color centers. 

3. A comparison has been made of quantum electro- 
dynamics and phonondynamics. Terms not present in
the former arise. They account for multiphonon 
processes. These additional terms play a very important 
role in the shape of optical absorption and emission 
bands. They may enter into the polaron problem. The 
electron in the conduction band of some polar solids 
may create new local modes about itself. These local 
modes do not appear in electrodynamics, hence the 
polaron problem in some crystals is not equivalent to 
problems in electrodynamics. These terms make it 
possible for a hole to be self-trapped, as indicated by 
the experiments of Castner and Kanzig (C1). To 
understand the stability of some of the new types of 
centers, the molecular color centers, we must examine 
the phonon-electron Hamiltonian discussed here in 
fuller detail. 


Part II. Broadening of Absorption Lines Due
to Multiphonon Processes


7. INTRODUCTION TO PART II 


In the following sections the theory developed in 
Part I is applied to the broadening of absorption bands 
in solids. We only consider transitions between two 
bound states. The breadth of a line may be due to 
several causes. 


(a) One is the Heisenberg principle, which requires
an uncertainty in the energy of a level, provided its
lifetime is finite. An alternate way of looking at this
problem is to use higher-order terms in the standard
time-dependent perturbation theory (H3, p. 181). This
explains the breadth of the hydrogenic lines at low
pressure. The breadth of lines due to this effect should
be very roughly of the order of that observed in atoms.

TABLE II. Breadth of absorption lines in ev.

Atomic
  One of the Hα lines of hydrogen               3×10⁻⁵
  A very wide atomic line of Cu I (at 4540 Å)   6×10⁻⁴
Solids
  The 278 cm⁻¹ line of B doped Si               7X10-
  The F center breadth in KCl at 0°K            0.16
  The F' center breadth in KCl                  about 1
  A narrow band in LiF (P5)                     0.01
  Energy of one optical phonon                  0.01-0.08

(b) Broadening is also caused by the interaction
between centers. We must assume that there is more
than one trap in a solid so that (5.1), which applies
only to an individual imperfection, does not contain
the whole Hamiltonian. The sum of the individual
Hamiltonians does not form the total Hamiltonian, for one
must consider interactions between centers. Without
these additional terms, $\epsilon_u$ represents a highly degenerate
level, since any imperfection can be excited. With the
interaction terms, the degeneracy of $\epsilon_u$ is removed, and
the excited states spread out into a narrow band. Since no
detailed calculation has been made, it is not known whether
the degeneracy of $\epsilon_u$ can be completely removed. If a crystal
has $5\times10^{16}$ centers per cc, the mean separation is 270 Å.
This type of interaction must be small in polar solids,
since 270 Å is large compared to the spread of the
electron wave functions. For example, the excited state
of the F center in KCl does not extend beyond 13 Å
(Smith, S6). The interaction may not be small if the
imperfection is such that the spread in the wave function
is of the order of the distance between centers.
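The mean separation quoted above follows from the concentration alone, taking the separation as the inverse cube root of the number density. A minimal sketch (the function name is hypothetical):

```python
def mean_separation_angstrom(centers_per_cc):
    """Mean spacing between centers, n**(-1/3), converted from cm to angstroms."""
    spacing_cm = centers_per_cc ** (-1.0 / 3.0)
    return spacing_cm * 1.0e8  # 1 cm = 1e8 angstroms


separation = mean_separation_angstrom(5e16)
print(round(separation), "angstroms")
```

The same function shows how quickly the separation shrinks with concentration: a thousandfold increase in the number of centers reduces the spacing only by a factor of ten.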

(c) Although this effect may make a small contribution
to the broadening of an absorption line in solids,
our interest is in the fact that during an optical transition
no strict selection rules apply to the $v_j$'s. If
$\epsilon_j = \epsilon_{jj} = 0$, strict rules would apply, and this effect
would not make a contribution to the problem.


Effects (a) and (b) are essentially temperature 
independent, while (c) depends strongly on the 
temperature. 

In view of the complexity of the problem, equations
are derived with parameters which are to be evaluated
experimentally. This is the first necessary step, since
at present our knowledge is far too meager to attempt
exact a priori calculations. Some experimental facts
will be presented before embarking on the theoretical
details.

The width of lines varies over a large range. Some
typical values are given in Table II. Thus the width of
some lines in solids approaches the broadest found in
atomic spectra. In general, lines in solids are considered
broader, however. In polar crystals some bands are
very much wider, and the absorption or emission of a
single phonon will not explain the phenomenon. The
marked temperature dependence of the shape of some
lines strongly suggests that (c) can dominate the others.
Detailed studies (M5, K4) support this conclusion.

The theory developed here does not apply to the F'
band since the upper state is not bound. It is not
completely clear why some bands in polar materials are
extremely narrow. One may simply assume that
certain parameters, to be derived, are very small. On 
the other hand, more complex concepts may be involved, 
such as the breakdown of the Born-Oppenheimer 
technique; further studies are required. The theory 
developed applies to bands whose width is approxi- 
mately equal to the F center and is temperature 
dependent. 

Experimentally, the absorption, $\alpha$, is measured as a
function of the wavelength or the photon energy $\epsilon$
(in ev). The observed $\alpha(\epsilon)$'s are not simple analytical
curves; hence a way must be found to characterize
them. We may in general define: $\epsilon_m$, the value of $\epsilon$ where
$\alpha$ has its maximum value, $\alpha_m$; $\epsilon_r$, where $\alpha$ has half its
maximum value on the red side of $\epsilon_m$; and $\epsilon_v$, the
corresponding point on the violet. Further useful
definitions follow:

the moments
$$M_n = \int_0^\infty \alpha(\epsilon)\,\epsilon^n\,d\epsilon, \tag{7.1}$$

the average
$$\bar\epsilon = M_1/M_0, \tag{7.2}$$

$$m^2 = \frac{1}{M_0}\int_0^\infty (\epsilon - \bar\epsilon)^2\,\alpha(\epsilon)\,d\epsilon, \tag{7.3}$$

and the half-width
$$H = \epsilon_v - \epsilon_r. \tag{7.4}$$

In general,
$$m^2 = (M_2/M_0) - \bar\epsilon^2. \tag{7.5}$$
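The quantities defined in (7.1) to (7.5) are straightforward to evaluate from a tabulated absorption curve. The sketch below uses the trapezoidal rule and linear interpolation; the function names and the synthetic Gaussian band are hypothetical, and the half-width routine assumes a single peak whose wings fall below half maximum inside the table.

```python
import math


def integrate(x, y):
    """Trapezoidal rule on a tabulated curve."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def band_moments(energy, alpha):
    """Return (M0, mean energy, m^2) following Eqs. (7.1), (7.2), and (7.5)."""
    m0 = integrate(energy, alpha)
    m1 = integrate(energy, [a * e for a, e in zip(alpha, energy)])
    m2 = integrate(energy, [a * e * e for a, e in zip(alpha, energy)])
    mean = m1 / m0
    return m0, mean, m2 / m0 - mean * mean


def half_width(energy, alpha):
    """Return H = e_v - e_r, Eq. (7.4), by linear interpolation."""
    peak = max(alpha)
    half = 0.5 * peak
    i_peak = alpha.index(peak)

    def crossing(i_above, i_below):
        # Interpolate between a sample at or above half maximum and one below it.
        slope = (energy[i_below] - energy[i_above]) / (alpha[i_below] - alpha[i_above])
        return energy[i_above] + (half - alpha[i_above]) * slope

    i = i_peak
    while alpha[i - 1] >= half:   # walk down the red side
        i -= 1
    k = i_peak
    while alpha[k + 1] >= half:   # walk down the violet side
        k += 1
    return crossing(k, k + 1) - crossing(i, i - 1)


# Synthetic Gaussian band (hypothetical numbers, for illustration only).
a, e_m = 10.0, 2.0
energy = [1.0 + 0.001 * i for i in range(2001)]
alpha = [math.exp(-(a * (e - e_m)) ** 2) for e in energy]

m0, mean, m_sq = band_moments(energy, alpha)
H = half_width(energy, alpha)
print(mean, m_sq, H, H * H / m_sq)
```

For real data the wings of a band are often hidden by neighboring bands, so the moments are more sensitive to the integration limits than the half-width is.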


The absorption is due to upward transitions induced 
by the vector potential of the light which falls on the 
crystal. To relate the measured absorption at photon 
energy e to the transitions, four factors must be 
considered. 


(1) The vector potential at the imperfection has to
be related to the intensity of light falling on the crystal.
Lorentz suggested a way of doing this. It is not known
whether his theory applies rigorously to traps in solids,
and deviations from his simple equation would not be
surprising. In any case the Poynting vector (the light
intensity) is proportional to the square of the vector
potential at the imperfection.

(2) The second factor is due to the $\psi$'s which give
matrix elements. The expression may be simplified by
the introduction of an oscillator strength, f. Classically,
the second item contains no unknown since f equals
unity. Quantum mechanically, the problem requires
exact wave functions, which at present are unknown.
Most probably, we shall have to introduce the f factor
empirically for some time to come.

(3) The imperfection concentration.

(4) Finally, one must consider the shape factor. This
can be done by multiplying the usual atomic absorption
equation, item (2), by a function $G_n(\epsilon)$. $G_n$ arises from
items (a), (b), and (c) considered previously. In all
cases studied so far, $G_n$ has the property $\int G_n(\epsilon)\,d\epsilon = 1$.

The measured absorption at $\epsilon$ is the product of the
above four factors. If one could make exact calculations,
the impurity concentration could be treated as an
unknown and computed from a single point, $\epsilon$. Even
without an exact calculation, one might evaluate the
factor associated with (1) and (2) from experimental
data and obtain the concentration with the use of a
single constant. This would require a detailed knowledge
of $G_n$, which is lacking. The $G_n$ can be eliminated by an
integration over $\epsilon$. This procedure is impractical in
many cases, for one cannot measure the absorption at
the wings of the band (due to overlapping from other
absorption bands). The integration technique is also
extremely lengthy. Many years ago Smakula suggested
replacing $\int\alpha(\epsilon)\,d\epsilon$ by a product of $\alpha_m$ times H. This
requires the introduction of a constant $a_s$ such that
$a_s G_m H = 1$, where $G_m$ is the peak value of G.
To illustrate this procedure, let us assume that $G_n$
has the triangular shape, i.e.,

$$G_n = \begin{cases}
0 & \text{for } \epsilon > a, \\
(1/H)\,(a-\epsilon)/(a-\epsilon_m) & \text{for } a > \epsilon > \epsilon_m, \\
(1/H)\,(\epsilon-b)/(\epsilon_m-b) & \text{for } \epsilon_m > \epsilon > b, \\
0 & \text{for } b > \epsilon.
\end{cases} \tag{7.6}$$

The area under $G_n$ is unity. The integration over the
frequency equals $G_m H$, and $a_s = 1$ for the shape given
by (7.6). Hence, the product $G_m H$ or $\alpha_m H$ eliminates
the fourth factor. The advantage of this procedure is
that the temperature dependence of H seems to be
eliminated.**
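The statement that $a_s = 1$ for the triangular shape can be verified in a few lines. The sketch below relies on the fact that a triangle is at half height over exactly half its base, wherever the peak sits between b and a; the function name and the sample limits are hypothetical.

```python
def triangular_band(a, b):
    """Return (H, area, a_s) for the triangular shape of Eq. (7.6).

    a and b are the violet and red limits of the band, with a > b.
    """
    half_width = 0.5 * (a - b)      # width at half height is half the base
    peak = 1.0 / half_width         # G_m = 1/H, as in Eq. (7.6)
    area = 0.5 * (a - b) * peak     # area of a triangle
    a_s = area / (peak * half_width)
    return half_width, area, a_s


print(triangular_band(a=3.0, b=1.0))
```

Both the area and the constant come out as unity for any choice of limits, so the product of peak value and half-width reproduces the integrated shape exactly for this idealized band.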

The essence of Smakula's famous equation is the
substitution of $G_m H$ for $\int G(\epsilon)\,d\epsilon$, not the exact form
of $G_n(\epsilon)$. Therefore we refer to the general equation by
his name and call $a_s$ Smakula's constant. In this
equation $a_s$ is always multiplied by two factors arising
from (1) and (2); hence the actual value of $a_s$ is of little
experimental importance. Nevertheless, $a_s$ is a measure
of the shape of an absorption band, as are the variables
defined in Eqs. (7.1) to (7.5).

** "Seems to be," since there is practically no experimental
data to support or refute this assumption.

It is useful to determine H, m, and $a_s$ for several
types of $G_n(\epsilon)$, namely,

The Gaussian

$$G_n = \frac{a}{\sqrt{\pi}}\exp\{-a^2(\epsilon-\epsilon_m)^2\}. \tag{7.7}$$

Hence,

$$H(G) = \frac{1}{a}\,[4\ln 2]^{1/2}. \tag{7.7a}$$

Further,

$$m^2 = (1/2a^2) = (1/8\ln 2)\,H^2 = (1/5.545)\,H^2 \tag{7.7b}$$

and

$$a_s(G) = (1/G_m H) = [\pi/4\ln 2]^{1/2} = 1.0645. \tag{7.7c}$$

The Lorentzian

$$G_n = \frac{b}{\pi}\,\frac{1}{[1 + b^2(\epsilon-\epsilon_m)^2]}. \tag{7.8}$$

Now

$$H(L) = (2/b) \tag{7.9a}$$

and

$$a_s(L) = (\pi/2) = 1.5708. \tag{7.9b}$$

$M_2$ does not exist for (7.8).
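The numerical constants in (7.7a) to (7.9b) can be reproduced directly from the closed forms. This is a minimal sketch with hypothetical function names; the width parameters passed in are arbitrary, since Smakula's constant does not depend on them.

```python
import math


def gaussian_shape(a):
    """Return (H, m^2, a_s) for the normalized Gaussian of Eq. (7.7)."""
    g_m = a / math.sqrt(math.pi)              # peak value
    h = 2.0 * math.sqrt(math.log(2.0)) / a    # Eq. (7.7a)
    m_sq = 1.0 / (2.0 * a * a)                # Eq. (7.7b)
    return h, m_sq, 1.0 / (g_m * h)           # Eq. (7.7c)


def lorentzian_shape(b):
    """Return (H, a_s) for the normalized Lorentzian of Eq. (7.8)."""
    g_m = b / math.pi
    h = 2.0 / b                               # Eq. (7.9a)
    return h, 1.0 / (g_m * h)                 # Eq. (7.9b)


h_g, m_sq, a_s_g = gaussian_shape(a=10.0)
h_l, a_s_l = lorentzian_shape(b=10.0)
print(a_s_g, h_g * h_g / m_sq, a_s_l)
```

The Lorentzian constant is about 48 percent larger than the Gaussian one because of the slowly decaying wings, which carry a substantial part of the area outside the half-width.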


The Double Gaussian

$G_n$ might be made of two Gaussians, one for the
violet side with $a = a_v$ and another for the red side, with
$a = a_r$. Then

$$G_m = (2/\sqrt{\pi})\,[(1/a_r) + (1/a_v)]^{-1}, \tag{7.10}$$

$$H(DG) = (\ln 2)^{1/2}\,[(1/a_r) + (1/a_v)], \tag{7.11a}$$

$$m^2 = \tfrac12\,[(1/a_r) + (1/a_v)]^{-1}\,[(1/a_r^3) + (1/a_v^3)], \tag{7.11b}$$

and

$$a_s(DG) = a_s(G) = 1.0645. \tag{7.11c}$$

Various attempts have been made to describe the shape
of the absorption band associated with F centers and
all three types have been considered. One of the most
successful is the double Gaussian. For this curve $a_s$ is
the same as for the single Gaussian.
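The double-Gaussian relations can be checked numerically in the same way. The sketch below assumes the reading of (7.11b) given by the second moment taken about the peak position, which is an interpretation of the formula rather than something the text states; the function name and the sample widths are hypothetical.

```python
import math


def double_gaussian_shape(a_r, a_v):
    """Return (G_m, H, m^2, a_s) for a double Gaussian with red/violet widths.

    m^2 is taken about the peak, an assumed reading of Eq. (7.11b).
    """
    width_sum = 1.0 / a_r + 1.0 / a_v
    g_m = 2.0 / (math.sqrt(math.pi) * width_sum)             # Eq. (7.10)
    h = math.sqrt(math.log(2.0)) * width_sum                 # Eq. (7.11a)
    m_sq = 0.5 * (1.0 / a_r ** 3 + 1.0 / a_v ** 3) / width_sum
    return g_m, h, m_sq, 1.0 / (g_m * h)                     # Eq. (7.11c)


g_m, h, m_sq, a_s = double_gaussian_shape(a_r=8.0, a_v=12.0)
print(a_s, m_sq)
```

The width parameters cancel in the product of peak value and half-width, which is why the constant equals the single-Gaussian value for any degree of asymmetry.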


The Pekarian 


This curve is discussed in Sec. 9 and Appendix IV. To
obtain $a_s(P)$, numerical integration techniques were
used, since the analytic form is complex. It actually
depends on H. However, unless the half-width is very
small,

$$m^2 = H^2/5.57 \tag{7.12a}$$

and

$$a_s(P) = 1.07, \tag{7.12b}$$

which agrees with Eqs. (7.7b) and (7.7c). A Pekarian
curve resembles a double Gaussian, so that the similarity
of $a_s(P)$, $a_s(G)$, and $a_s(DG)$ is not surprising.

In Smakula's original derivation $a_s(L)$ was used. In
recent years experimental studies indicate that the
Gaussians or double Gaussians give better agreement
with the observed curves. Dexter has suggested that
one might replace $a_s(L)$ by $a_s(G)$. There are good
arguments for this change. The absorption shapes,
however, are not true Gaussians, and the suggestion
simply means that we replace one empirical constant
with another without gaining any fundamental
information. There is some evidence that the band shape
is Pekarian.

Since in Smakula's equation $a_s$ is always multiplied by
two other factors, the actual value of $a_s$ is of little
experimental importance. The danger in making the
change is that confusion will result. The day may come
when one will know the true form of $G_n$ and have a
correct expression for $a_s$. As this day has not yet
arrived, the standard form of Smakula's equation
is preferable.


8. GENERAL FORMULATION OF 
ABSORPTION PROBLEM 


In this section the absorption problem is formulated 
in general. The next two sections discuss the shape and 
moments. Since deep traps are to be considered, the 
adiabatic approximation must be used. The electron- 
phonon interaction term, i.e., (3.13b), will be ignored. 
The Hamiltonians of the solid can therefore be written 
in the forms: for the ground state 


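$$H_g = \epsilon_g(\theta) + T_g + V_g \tag{8.1}$$
or
$$E = \epsilon_g(\theta) + \sum_j \hbar\omega_j(g)\,(v_j + \tfrac12), \tag{8.1a}$$
while for the excited state
$$H_u = \epsilon_u(\theta) + T_u + V_u \tag{8.2}$$
or
$$E' = \epsilon_u(\theta) + \sum_j \hbar\omega_j(u)\,(v_j' + \tfrac12), \tag{8.2a}$$
where $v_j$ is the vibrational quantum number of the $j$th mode when the electron is in the ground state, while $v_j'$ corresponds to the upper state; $\epsilon_g$ and $\epsilon_u$ are evaluated at $R_g$ and $R_u$, (3.19a) and (3.19b). $H_{g,u}$ is an operator, while $E$ is the total energy, (3.14a). $V_g$ and $V_u$ are $\epsilon_g(R) - \epsilon_g(R_g)$ and $\epsilon_u(R) - \epsilon_u(R_u)$, respectively [see Eq. (3.4)].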

Effects of thermal expansions are not included in 
the theory of Part I. The thermal expansion of the 
crystal arises from the dependence of the vibrational 
frequencies on the volume (Slater, S5, p. 199). In 
principle, the thermal expansion may influence all the 
variables. It is believed that the primary effect of the 
temperature ($\theta$) is on $\epsilon_g$ and $\epsilon_u$.

To arrive at an expression for the absorption coefficient, we start with the standard equation for the energy absorbed by an atom when light falls on it (2 of Sec. 7). This expression must be corrected for the presence of the medium and for various broadening effects, as discussed in the last section (1 and 4). Since Dirac's delta functions are used, special care must be employed. First-order perturbation theory gives the following expression in vacuum (H3, p. 180):
$$S_{\rm atomic} = \frac{4\pi^2}{3}\,\frac{e^2}{\hbar c}\,\nu\,|\{\Psi'(A)|\mathbf{r}|\Psi(A)\}|^2\,I_\nu, \tag{8.3}$$
where $\nu$ is the angular frequency of the absorbed light; $\Psi'(A)$ and $\Psi(A)$ are the total wave functions for the ground and excited states, Eq. (3.11); and $I_\nu$ is the intensity of the light per unit angular frequency interval, i.e., the energy of the light which falls on the crystals in the angular frequency interval $\Delta\nu$ is $I_\nu\,\Delta\nu$.

FIG. 4. Schematic diagram showing the broadening due to the emission or the absorption of phonons.
In a medium, the following modifications must be made.

(1a) If $n_0^2$ is the high-frequency dielectric constant, the intensity $I_\nu$ (Poynting vector) is $(c n_0 F^2/4\pi)$. Since the transition is caused by $F^2$, (8.3) must be divided by $n_0$.

(1b) The space average field must be replaced by the local field. Since they are proportional to each other, we introduce a constant $b_m$. For a Lorentz local field, $b_m = \tfrac19(2 + n_0^2)^2$ (B, p. 100).

(2) The [...] lines have a finite width.

(1a) and (1b) correspond to 1 of Sec. 7. (2) corresponds to the broadening due to the uncertainty [...] and the interaction between neighboring [...] (8.4)

$$\int g\,d\nu = 1 \tag{8.5}$$
and is not the same as $G_a$. Figure 4 illustrates the problem. There is an over-all absorption band and a fine structure. We may expect to observe the fine structure only if the half-widths of the $g$'s are less than the energy of one phonon (i.e., $\hbar\omega$). Figure 4 assumes that one is dealing with a single phonon frequency. When a center interacts with phonons of several frequencies, i.e., if there is dispersion, the $\epsilon_j$'s and $\epsilon_{ij}$'s can cause a broadening of the "lines" shown in Fig. 4. This problem has been discussed by Krivoglaz and Pekar (K5). There seems to be no simple way to separate this effect from those discussed in Sec. 7. This effect is not considered further here.

We now digress slightly to indicate how the trap-trap interaction produces a broadening. The breadth of an electron-spin resonance line also arises from two effects: one is because of the interaction between the spins (spin-spin), and the other is because of the interaction of the electron with the magnetic moments of the surrounding nuclei (hyperfine). The hyperfine interaction often has a much larger effect. The problem is thus similar to the one considered here. Strangely enough, we have a much better understanding of the broadening problem in the newer field (electron-spin resonance).

For this development the solid is considered to be a large molecule of unit volume. There will therefore be $N$ imperfections. We ignore the lattice vibrations, hence denote by $\mathbf{R}_j$ the rest position of the $j$th imperfection, and by $\mathbf{r}_j$ the position of its electron (relative to $\mathbf{R}_j$). Again two bound states per trap are assumed. The total ground state eigenfunction has the following form in the zeroth approximation, where the interactions between imperfections are ignored,
$$\Phi_0(g) = \prod_j \varphi(\mathbf{R}_j + \mathbf{r}_j). \tag{8.6}$$
(8.6) uses the Hartree approximation (S2, p. 234), while the excited state is of the form
$$\Phi_0(u,k) = \varphi(\mathbf{R}_1 + \mathbf{r}_1)\cdots\varphi(\mathbf{R}_{k-1} + \mathbf{r}_{k-1})\,\varphi'(\mathbf{R}_k + \mathbf{r}_k)\,\varphi(\mathbf{R}_{k+1} + \mathbf{r}_{k+1})\cdots\varphi(\mathbf{R}_N + \mathbf{r}_N). \tag{8.7}$$
It is $N$-fold degenerate, since any imperfection can be excited.

We next consider two perturbations: (1) due to the optical waves and (2) due to the imperfection-imperfection interaction. The first has the form [H3, Eq. (13.7)],
$$\sum_j (e/mc)\,\mathbf{p}_j(e)\cdot\mathbf{A}(\mathbf{R}_j + \mathbf{r}_j), \tag{8.8}$$
where $\mathbf{p}_j(e)$ is the electron's momentum and $\mathbf{A}$ is the vector potential. If one combines (8.6), (8.7), and (8.8) with the usual theory of absorption (H3, p. 143 and p. 179), the following transition matrix is obtained,
$$|(\Phi_0(u,k)|\mathbf{R}_{\rm eff}|\Phi_0(g))|^2, \tag{8.9}$$
where
$$\mathbf{R}_{\rm eff} = \sum_j \mathbf{r}_j \exp(-i\boldsymbol{\kappa}\cdot\mathbf{R}_j)$$
and $\boldsymbol{\kappa}$ is the wave vector $(\nu/c)$ of the exciting light. The total effect is a sum of (8.9) over all the upper states. In this approximation (8.9) is just $N|(\varphi'|\mathbf{r}|\varphi)|^2$, as one would expect.

The second perturbation produces changes in (8.6) and (8.7). We first apply the trap-trap interaction and then use (8.8). Since $\Phi_0(g)$ is nondegenerate, its energy level will be slightly shifted. The perturbation, however, will affect $\Phi_0(u,k)$ more profoundly. It removes some or all of the degeneracy and the upper level will fan out (Fig. 5). The various eigenfunctions are denoted by $\Phi_1(u,k)$. One may no longer replace $\mathbf{R}_{\rm eff}$ by $\mathbf{r}$, since the phase factor is of utmost importance; hence, the matrix element becomes
$$(\Phi_1(u,k)|\mathbf{R}_{\rm eff}|\Phi_1(g)). \tag{8.10}$$
Many of the cross terms in the square of the matrix cancel due to the random distribution of the $\mathbf{R}_j$'s. The old theory of spectroscopic stability (V1, p. 137) gives some useful information. It can be applied directly to (8.10) to show that
$$\sum_k |(\Phi_0(u,k)|\mathbf{R}_{\rm eff}|\Phi_0(g))|^2 = N|(\varphi'|\mathbf{r}|\varphi)|^2 = \sum_k |(\Phi_1(u,k)|\mathbf{R}_{\rm eff}|\Phi_1(g))|^2. \tag{8.11}$$


The perturbation is assumed to be small enough that the $\Phi_1(u,k)$'s can be expanded in terms of the $\Phi_0(u,k)$'s. The next step is to define $g(\nu)$ by the relation
$$\sum_k{}' |(\Phi_1(u,k)|\mathbf{R}_{\rm eff}|\Phi_1(g))|^2 = N g(\nu)\,|(\varphi'|\mathbf{r}|\varphi)|^2\,\Delta\nu. \tag{8.12}$$
The summation is restricted to states where the energy difference lies between $\hbar\nu \pm \tfrac12\hbar\,\Delta\nu$. The left-hand expression is simply the sum over all the possible transitions, while the right-hand side defines $g(\nu)$. The theorem of spectroscopic stability assures that $\int g\,d\nu = 1$ as required by (8.5).

The development is quite formal and gives no information regarding the shape and width of $g$. These are the required steps, however, to introduce the factor properly into the theory of optical absorption. It shows that the upper level widens.


FIG. 5. Schematic diagram showing the effect of the interaction between centers. [Level diagram: ground state and excited state, without interaction and with interaction.]


This development indicates why one would not expect sharp lines in the absorption spectra of solids, except when one "imperfection" is very well shielded from another. Hence the absence of sharp lines in the spectroscopy of solids is to be expected (see the edge emission of CdS, K2a, p. 117). This factor also accounts for the fact that the F center absorption does not have a fine structure. Unfortunately, we are unable to estimate the width of $g$, or to separate this broadening from the one caused by dispersion (K5).

It is assumed that $g$ is wider than $\hbar\omega$, where $\omega$ is an effective phonon angular frequency. The $g$ function is next replaced by the Dirac $\delta$ function. To obtain the moments, we integrate over many such functions. For the shape calculation, more care has to be used. On introducing the $\delta$ function into (8.3) and recalling items (1a) and (1b), we obtain
$$s(\nu) = b\,(4\pi^2/3)\,(e^2/\hbar c)\,\nu\,|\{\Psi'(A)|\mathbf{r}|\Psi(A)\}|^2\,\delta\{\nu - (1/\hbar)(E' - E)\}\,I_\nu. \tag{8.13}$$
The dimensions of $S_{\rm atomic}$ and $s(\nu)$ are not the same. $S_{\rm atomic}$ has the dimensions of $ml^2t^{-3}$ ($m$ for mass, $l$ for length, and $t$ for time). This arises because $I_\nu$ has the dimension of $mt^{-2}$, not energy per $l^2t$. Equation (8.3) assumes implicitly a $\delta$ function (S2, p. 215). In (8.13) we have written the $\delta$ function explicitly; hence $s(\nu)$ has the dimensions of $ml^2t^{-2}$.

The absorption cross section per imperfection in cm² is defined by
$$\sigma(\nu) = \frac{\text{energy absorbed per unit time in interval } \Delta\nu}{\text{energy falling on imperfection per unit time for unit area in interval } \Delta\nu}$$
or
$$\sigma(\nu) = b\,(4\pi^2/3)\,(\nu e^2/\hbar c)\,|\{\Psi'(A)|\mathbf{r}|\Psi(A)\}|^2\,\delta\{\nu - (1/\hbar)(E' - E)\}. \tag{8.13a}$$

This cross section must be summed over all the point imperfections in unit volume. Before summing (8.13a) we must consider the probability of finding a given imperfection with energy $E$. This is given by (B, p. 178)
$$P(v) = N\rho(v), \tag{8.14}$$
where
$$\rho(v_j) = e^{-v_j\beta_j}(1 - e^{-\beta_j}) = 2e^{-(v_j + \frac12)\beta_j}\sinh\tfrac12\beta_j \tag{8.14a}$$
and
$$\beta_j = \hbar\omega_j(g)/k\theta. \tag{8.14b}$$
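The two forms of the weight in (8.14a) and its normalization are easy to confirm numerically. The following is a minimal sketch for a single mode; the function names and the cutoff `vmax` are illustrative assumptions, not part of the original development.

```python
import math


def rho(v, beta):
    """First form of (8.14a): exp(-v*beta) * (1 - exp(-beta))."""
    return math.exp(-v * beta) * (1.0 - math.exp(-beta))


def rho_sinh(v, beta):
    """Second form of (8.14a): 2 * exp(-(v + 1/2)*beta) * sinh(beta/2)."""
    return 2.0 * math.exp(-(v + 0.5) * beta) * math.sinh(0.5 * beta)


def mean_quantum_number(beta, vmax=2000):
    """Thermal mean of v; should equal 1 / (exp(beta) - 1)."""
    return sum(v * rho(v, beta) for v in range(vmax))


beta = 0.5
total = sum(rho(v, beta) for v in range(2000))
print(total, mean_quantum_number(beta), 1.0 / math.expm1(beta))
```

The mean returned here is the quantity that enters the band-shape formulas of Sec. 9 as the mean quantum number of a mode in the ground state.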


The time average absorption constant is $\sum_{v,v'} P(v)\,\sigma(\nu)$. To obtain the actual measured absorption coefficient, the $\delta$ function must be eliminated. This can be done by the integration
$$\alpha_a = \frac{1}{\omega}\,N\int_{\nu - \frac12\omega}^{\nu + \frac12\omega}\sum_{v,v'}\rho(v)\,\sigma(x)\,dx. \tag{8.15}$$
On averaging $\sigma(\nu)$, the error of replacing $g$ by $\delta$ is eliminated. For the calculation of the moments, one need not use (8.15) but simply integrate $\sum_{v,v'} \nu P(v)\,\sigma(\nu)$ over many narrow "lines."

To carry the solution further, the delta functions must be expressed in terms of integrals. $\nu$ appears in the $\delta$ function and as a multiplier of this function. If the oscillator strength $f$ is introduced (S4),
$$f(R) = (2/3)\,(m/\hbar)\,\nu\,|(\varphi'(R)|\mathbf{r}|\varphi(R))|^2, \tag{8.16}$$
$\alpha_a$ takes the following form
$$\alpha_a = \frac{1}{\omega}\int_{\nu - \frac12\omega}^{\nu + \frac12\omega} dx\,\frac{2\pi^2e^2}{cm}\,bN\sum_{v,v'}\rho(v)\,|\{\chi'|f^{1/2}|\chi\}|^2\,\delta[x - (1/\hbar)(E' - E)], \tag{8.17}$$
which nicely hides the frequency dependence of the term in front of the $\delta$ function. The mass of the electron ($m$) is introduced by using (8.16). It can be considered as an adjustable parameter provided we have a knowledge regarding the behavior of $f$. The next step is to forget about $f$'s dependence on $\nu$ and $R$ and write
$$\alpha_a(\nu) = \frac{2\pi^2e^2}{cm}\,bNf\,\frac{1}{\omega}\int_{\nu - \frac12\omega}^{\nu + \frac12\omega} G_a(x)\,dx, \tag{8.18}$$
$$G_a(\nu) = \sum_{v,v'}\rho(v)\,|\{\chi'|\chi\}|^2\,\delta\{\nu - (1/\hbar)(E' - E)\}. \tag{8.18a}$$
In Sec. 10 we shall show that $G_a$ is normalized. In (8.18) the shape and moments only depend on the $\chi$'s. The assumption that $f$ is independent of $R$ is known as the Condon approximation. Meyer has eliminated this restriction. There is no good theoretical justification for the steps taken in obtaining (8.18). The frequency spread may be large over the bands, and one can hardly replace $\nu$ by an "average" value. In emission, a $\nu^3$ appears before the $\delta$ function, and it certainly cannot be ignored. For instance, from the data on the emission from the F center in KCl at 77°K (D3) we know that $\epsilon_r({\rm em}) = 1.08$ ev and $\epsilon_v({\rm em}) = 1.35$ ev. Since $(1.25)^3 \approx 2$, the frequency factor could lead to an experimentally detectable skewness. One can assume that $f$ is a function of the $q$'s and of $\nu$ and still solve the moments problem. The present absorption data are not good enough to justify these added complications; they are considered only in Appendix V.
The theoretical problem is to know whether to treat $|(\varphi'(R)|\mathbf{r}|\varphi(R))|^2$ or $|(\varphi'(R)|\mathbf{p}_j(e)|\varphi(R))|^2$ as the fundamental quantity independent of the frequency. The sum rules are in terms of the $f$ (S4, p. 503). Dirac's formulation of the radiation problem deals with matrix elements of $\mathbf{p}_j(e)$, while the elementary theory uses the matrix elements of the electron displacement. To refine the theory, this problem must first be resolved. This has not been done so far.

One cannot simply borrow theorems which hold for atoms since the phonon modes give additional degrees of freedom to the problem and their effects on the sum rules have to be explored with care.

Peierls, many years ago (S2, p. 671), considered the absorption problem when the centers have a lattice geometry. In this case the $\Phi_1(u,k)$'s have a rather simple form and strict selection rules exist for the transitions. The problem here has only an indirect relation to the one considered by Peierls. We assume a random distribution of imperfections; hence, no strict selection rules.

Lax and Burstein (L2a) have formulated the above in an alternate manner. They consider $H_I$ as a perturbation which broadens the $\Psi$'s, making bands out of the ground and excited states. This is an interesting and useful point of view.

Although there are limitations on (8.18a) we shall now use it to obtain the absorption coefficient when the $\epsilon_{ij}$'s equal zero (Sec. 9), and the distribution moments in the most general case (Sec. 10). The foregoing equations apply rigorously to all the models discussed in the general introduction.


9. SHAPE OF THE BAND


Using a development of O'Rourke and Pekar, the integral $G_a(\nu)$ can be evaluated giving expressions for $\alpha_a$. The problem, however, has to be simplified by the use of two assumptions: (1) that $\epsilon_{ij} = 0$, and (2) that an effective mean frequency exists. Use is made of Mehler's formula and various expansions of the modified Bessel function.

The delta function can be expressed in the integral form (H3, p. 66)
$$\delta(\nu) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i\nu t}\,dt. \tag{9.1}$$
By substituting (8.14a) and (9.1) into (8.18a), one obtains
$$G_a(\nu) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-i(\nu - \nu_{ug})t}\prod_j F_j(t)\,dt, \tag{9.2}$$
where
$$F_j = 2\sinh\tfrac12\beta_j\iint\sum_{v_j}\exp[-\lambda_j(v_j + \tfrac12)]\,\chi_{v_j}(q_j)\chi_{v_j}(q_j')\;\sum_{v_j'}\exp[-\mu_j(v_j' + \tfrac12)]\,\chi'_{v_j'}(q_j)\chi'_{v_j'}(q_j')\,dq_j\,dq_j', \tag{9.3}$$
$$\nu_{ug} = (1/\hbar)[\epsilon_u(\theta) - \epsilon_g(\theta)], \tag{9.3a}$$
$$\lambda_j = \beta_j + i\omega_j t, \tag{9.3b}$$
and $\mu_j = -i\omega_j t$.

Footnote: $f$ is assumed to be real; thus $f^{1/2}$ is real. This allows [...] the notation without any limitation on the [...]. [...] is small, we may assume that $\nu$ is [...].

To obtain (9.3) we have employed (3.9), (3.24), (8.1a), and (8.2a). Also, a term has been added and subtracted in the exponent to make the next step simpler. $\mu_j$ arises from (9.1), while $\lambda_j$ is a result of a combination of (9.1) and (8.14a). Finer mathematical points, such as the use of (9.1) and the rearrangement of the order of the summation, will not be considered.

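O'Rourke now uses Mehler's formula which can be written
$$\sum_{v_j} e^{-(v_j + \frac12)\xi}\,\chi_{v_j}(q_j)\chi_{v_j}(q_j') = a_j(2\pi\sinh\xi)^{-1/2}\exp\{-\tfrac14a_j^2[(q_j + q_j')^2\tanh\tfrac12\xi + (q_j - q_j')^2\coth\tfrac12\xi]\}, \tag{9.4}$$
where
$$a_j^2 = \omega_j/\hbar \tag{9.4a}$$
(Appendix III). The substitution of (9.4) into (9.3) makes the integrand take the simple form $\exp[-ax^2 + bx]$. The operation can now be carried out by the use of a standard formula, giving
$$F_j = \exp\Big\{-\frac{\epsilon_j^2}{\omega_j^3\hbar}\,\frac{1}{\coth(\frac12\beta_j + \frac12i\omega_jt) - \coth\frac12i\omega_jt}\Big\} = \exp\Big\{-\frac{\epsilon_j^2}{2\omega_j^3\hbar}\,[\coth\tfrac12\beta_j - i\sin\omega_jt - \coth\tfrac12\beta_j\cos\omega_jt]\Big\}. \tag{9.5}$$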


The steps between (9.3) and (9.5) are essentially elementary, although slightly involved. The use of Mehler's formula does not require that the frequency of the two states be equal. One may obtain more general expressions for the $F_j$'s [Vasileff (V1), or Dexter (D1)]. We want to obtain $\alpha_a(\nu)$, hence must obtain a rather simple expression for $\prod_j F_j$. This is the reason for requiring that all the $\epsilon_{ij}$ vanish.
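The equality of the two exponents in (9.5), and the related identity (9.11) with $\eta = \tfrac12 i\beta$ and $x = \omega t - \eta$, can be spot-checked with complex arithmetic. The sketch below is only a numerical sanity check under the sign conventions used in this section; the function names and sample points are illustrative.

```python
import cmath
import math


def coth(z):
    return cmath.cosh(z) / cmath.sinh(z)


def exponent_first_form(beta, wt):
    """1 / [coth(beta/2 + i*wt/2) - coth(i*wt/2)], bracket of the first form of (9.5)."""
    return 1.0 / (coth(0.5 * beta + 0.5j * wt) - coth(0.5j * wt))


def exponent_second_form(beta, wt):
    """(1/2) [coth(beta/2) - i sin(wt) - coth(beta/2) cos(wt)], second form of (9.5)."""
    c = 1.0 / math.tanh(0.5 * beta)
    return 0.5 * (c - 1j * math.sin(wt) - c * math.cos(wt))


def identity_9_11_gap(beta, wt):
    """|lhs - rhs| of (9.11) with eta = i*beta/2 and x = wt - eta."""
    x = wt - 0.5j * beta
    lhs = 1j * math.sin(wt) + math.cos(wt) / math.tanh(0.5 * beta)
    rhs = cmath.cos(x) / math.sinh(0.5 * beta)
    return abs(lhs - rhs)


samples = [(0.4, 0.3), (1.0, 1.7), (3.0, 4.0)]
for beta, wt in samples:
    gap_9_5 = abs(exponent_first_form(beta, wt) - exponent_second_form(beta, wt))
    print(beta, wt, gap_9_5, identity_9_11_gap(beta, wt))
```

The sample points avoid $\omega t = 0$ (and multiples of $2\pi$), where the first form is a removable 0/0 expression.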
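Now
$$\prod_j F_j = \exp\Big\{-\sum_j\frac12\Big(\frac{\epsilon_j^2}{\hbar\omega_j^3}\Big)(\coth\tfrac12\beta_j - i\sin\omega_jt - \coth\tfrac12\beta_j\cos\omega_jt)\Big\}. \tag{9.6}$$
The existence of the following equality (for any $\theta$ and $t$) must be assumed
$$\sum_j\frac12\Big(\frac{\epsilon_j^2}{\hbar\omega_j^3}\Big)[\coth\tfrac12\beta_j - i\sin\omega_jt - \coth\tfrac12\beta_j\cos\omega_jt] = [\coth\tfrac12\beta - i\sin\omega t - \coth\tfrac12\beta\cos\omega t]\sum_j\frac{\epsilon_j^2}{2\hbar\omega_j^3}. \tag{9.7}$$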


The relation certainly holds for the CC and Fröhlich's model. If the local modes cluster about a point between the acoustical and optical branches, (9.7) should be a good approximation. One would not expect (9.7) to hold if the important modes are a combination of several types (longitudinal optical and local) involving radically different frequencies. Combined with (8.14b), it defines the "effective" or "average" frequency $\omega$. In Sec. 11 a method is described which can be used to determine whether or not several effective frequencies exist. For the time being, (9.7) is used. Introducing the famous Huang and Rhys factor, we write
$$G_a(\nu) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dt\,e^{-i(\nu - \nu_{ug})t}\exp\{-S[\coth\tfrac12\beta - i\sin\omega t - \coth\tfrac12\beta\cos\omega t]\}, \tag{9.8}$$
where
$$S = \sum_j\frac{E_j}{\omega_j\hbar} = \frac12\sum_j\frac{\epsilon_j^2}{\omega_j^3\hbar}. \tag{9.8a}$$

$S$ is nondimensional. Equation (9.8) resembles integrals which appear in the definition of Bessel's function. To relate $G_a$ to these functions, the following transformation is introduced:
$$\eta = \tfrac12i\beta. \tag{9.9}$$
The integration variable of (9.8) is now changed to
$$x = \omega t - \eta. \tag{9.10}$$
Since one may show, with the use of (9.9) and (9.10), that
$$i\sin\omega t + \coth\tfrac12\beta\cos\omega t = \operatorname{csch}\tfrac12\beta\,\cos x, \tag{9.11}$$
Eq. (9.8) takes the form***
$$G_a(\nu) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{dx}{\omega}\exp\{-i(p\eta + px) - S\coth\tfrac12\beta + S\operatorname{csch}\tfrac12\beta\,\cos x\}, \tag{9.12}$$
where
$$p = (\nu - \nu_{ug})/\omega. \tag{9.13}$$
One may show that†††
$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\exp\{y\cos\xi - ip\xi\}\,d\xi = \sum_k\delta(p - k)\,I_k(y), \tag{9.13a}$$
where $I_k$ is the modified Bessel function and $k$ is an integer. Now (9.12) takes the form
$$G_a(\nu) = \frac{e^{-ip\eta - S\coth\frac12\beta}}{\omega}\sum_{k=-\infty}^{\infty}\delta(k - p)\,I_k[S\operatorname{csch}\tfrac12\beta]. \tag{9.14}$$

*** [...] an alternate path of integration is used. This can be justified by taking contour integrals and assuming that $p$ is [...].

††† To establish (9.13a) one must use an integral expression for $I_k$ (J1, p. 547), and it is noted that $\delta(x) = (1/2\pi)\sum_n e^{inx}$, etc.

Equation (9.14) requires that $p$ or $(1/\omega)\sum_j\omega_j(v_j' - v_j)$ be an integer for any value of $v_j$ and $v_j'$. This means that $\omega = \omega_j$. Evidently Eq. (9.7), which holds for any $t$, can only be satisfied under this condition. Therefore, the derivation holds exclusively for this condition.

The author believes that this is the result of using the $\delta$ function in Eq. (8.13) and that in reality Eq. (9.14) will give the correct shape (envelope, see Fig. 4) even if the frequencies vary over a small range. We thus write
$$p = v' - v. \tag{9.15}$$
Equation (9.14) can be put in the alternate form
$$G_a(\nu) = \Big[\frac{\bar v + 1}{\bar v}\Big]^{p/2}\frac{1}{\omega}\exp[-S(2\bar v + 1)]\sum_k\delta(k - p)\,I_k\{2S[\bar v(\bar v + 1)]^{1/2}\}, \tag{9.16}$$
where $\bar v = \tfrac12[\coth(\beta/2) - 1]$, the mean quantum number of a normal mode in the ground state.

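The actually observed shape is obtained from (8.18); hence,
$$\alpha_a = \frac{2\pi^2e^2}{cm}\,bNf\,\frac{1}{\omega}\Big[\frac{\bar v + 1}{\bar v}\Big]^{p/2}\exp[-S(2\bar v + 1)]\,I_p\{2S[\bar v(\bar v + 1)]^{1/2}\}, \tag{9.17}$$
where $p$ is the integer closest to $(1/\omega)(\nu - \nu_{ug})$. $\omega$ appears in (9.17) because the unknown $g$ function has been effectively replaced by the step function
$$g = \begin{cases} 0 & \text{for } \nu < (1/\hbar)(E' - E) - \tfrac12\omega, \\ 1/\omega & \text{for } (1/\hbar)(E' - E) - \tfrac12\omega \le \nu \le (1/\hbar)(E' - E) + \tfrac12\omega, \\ 0 & \text{for } (1/\hbar)(E' - E) + \tfrac12\omega < \nu. \end{cases} \tag{9.18}$$
The factor $(1/\omega)\exp[-S(2\bar v + 1)]$ does not affect the frequency dependence of the absorption, i.e., the shape. It is convenient to introduce an average shape factor
$$\bar G_a = \Big[\frac{\bar v + 1}{\bar v}\Big]^{p/2} I_p\{2S[\bar v(\bar v + 1)]^{1/2}\}. \tag{9.19}$$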


The term $1/\omega$ is a measure of the width of the band. Larger $\omega$'s cause wider absorption bands provided that the $\epsilon_j$'s remain the same. We shall see that [...] so that
$$\int G_a(\nu)\,d\nu = 1$$
(Appendix IV and Sec. 10).

Equation (9.19) is exact for the models considered provided (9.7) applies. It does not require that $\epsilon_j \to 0$. This is not obvious unless the derivation of O'Rourke is used. Two alternate forms of $\bar G_a$ will now be found using some developments originally due to Pekar (R4).
(a) Low Temperature


If $\theta \to 0$, $\bar v$ is a small quantity. It is useful to define $z = 2S[\bar v(\bar v + 1)]^{1/2}$, which is also a small quantity. Here $v \to 0$ and $p > 0$; hence, our concern is for the situation where $p$ is a positive integer. Now we expand $I_p$ as follows (J1, p. 542):
$$I_p(z) = \frac{(z/2)^p}{p!}\Big[1 + \frac{(z/2)^2}{p + 1} + \frac{(z/2)^4}{2(p + 2)(p + 1)} + \cdots\Big] \tag{9.20}$$
and obtain
$$\bar G_a(\nu) \approx G_l(\nu) = S^p/p!. \tag{9.21}$$
This approximation improves as $p$ increases. For large $S$, one may use the Stirling approximation and show that the maximum occurs at
$$p_l = S. \tag{9.22}$$
This corresponds to the angular frequency
$$\nu_l = \nu_{ug} + S\omega. \tag{9.22a}$$
Since $p$ is an integer, $S$ in (9.22) refers to the nearest integer value. The violet side of the band corresponds to higher values of $p$, where (9.21) is a better approximation.

It seems appropriate to refer to (9.21) by the name Pekarian, since Pekar was the first to employ expansion (9.20) in relation to this problem. The use of Eq. (9.21) requires the inequality
$$z^2/4p \ll 1. \tag{9.23}$$
At the point of maximum absorption this is equivalent to the condition
$$S\bar v \ll 1 \quad \text{(Pekarian criterion).} \tag{9.24}$$


The shape of the absorption band of some F centers 
seems to be given by (9.21) (K4). Little is known about 
this type of curve. Its behavior is studied in Appendix 
IV. 
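A short numerical look at the Pekarian (9.21) shows the features used in the discussion: the peak near $p = S$ and the skewness toward the violet side. The sketch below multiplies $S^p/p!$ by $e^{-S}$, the low-temperature limit of the factor $\exp[-S(2\bar v + 1)]$, so that the weights sum to one; the value $S = 20.5$ and all names are illustrative assumptions.

```python
import math


def pekarian(p, S):
    """Normalized Pekarian weight exp(-S) * S**p / p! for integer p >= 0."""
    return math.exp(-S + p * math.log(S) - math.lgamma(p + 1))


S = 20.5
weights = [pekarian(p, S) for p in range(200)]
total = sum(weights)
p_peak = max(range(len(weights)), key=weights.__getitem__)
mean_p = sum(p * w for p, w in enumerate(weights))
second_moment = sum((p - mean_p) ** 2 * w for p, w in enumerate(weights))
print(total, p_peak, mean_p, second_moment)
```

In units of $\omega$ the second moment about the mean equals $S$, and the weight five lines above the peak exceeds the weight five lines below it, which is the asymmetry referred to in the text.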


(b) High Temperature


At high temperatures $\bar v$ is not small and $z \gg 1$. As the temperature increases, $z$ approaches the value $2Sk\theta/\hbar\omega$. This suggests the use of the asymptotic expansion‡‡‡
$$I_p(z) \sim \frac{e^z}{(2\pi z)^{1/2}}\sum_{n=0}^{\infty}(-1)^n\frac{(4p^2 - 1^2)(4p^2 - 3^2)\cdots[4p^2 - (2n - 1)^2]}{n!\,(8z)^n}. \tag{9.25}$$

‡‡‡ This type of expansion is discussed by Jeffreys and Jeffreys (J1). Their equation for $I_p$, however, is slightly in error.

Our interests are not limited to small values of $p$ or to the first few terms of the sum. $p$ may have the value of 100; hence, for appropriate values of $z$ the major contribution arises from terms with high values of $n$. If the series can be broken off before $n = p$, the simple form is obtained,
$$I_p(z) \approx \frac{e^z}{(2\pi z)^{1/2}}\Big[1 - \frac{p^2}{2z} + \frac{1}{2!}\Big(\frac{p^2}{2z}\Big)^2 - \frac{1}{3!}\Big(\frac{p^2}{2z}\Big)^3 + \cdots\Big] = \frac{1}{(2\pi z)^{1/2}}\exp\Big(z - \frac{p^2}{2z}\Big). \tag{9.26}$$
For (9.26) to hold, one requires that $(2n - 1)^2 \ll 4p^2$ for all important terms. This will be true if $p^2/2z < 1$. When $p^2/2z > 1$, the expansion (9.26) must be explored further. The highest terms of the series occur (approximately) at $n = p^2/2z$, or Eq. (9.26) requires that $4p^2 \gg (p^4/z^2)$. These facts suggest that criteria for the applicability of (9.26) are
$$|2z/p| \gg 1 \tag{9.27a}$$
and (Gaussian criteria)
$$|p| \gg 1. \tag{9.27b}$$
The high-temperature shape is obtained by substituting (9.26) in (9.19),
$$\bar G_a(\nu) = G_h(\nu) \approx \frac{1}{(2\pi z)^{1/2}}\exp\Big\{z - \frac{p^2}{2z} + \frac12p\ln\Big(1 + \frac{1}{\bar v}\Big)\Big\}. \tag{9.28}$$
Its maximum occurs at
$$p_h = \tfrac12z\ln\Big(1 + \frac{1}{\bar v}\Big) = \frac{\hbar\omega}{2k\theta}\,z \tag{9.29}$$
$$= S \tag{9.29a}$$
and
$$\nu_h = \nu_{ug} + (\hbar\omega^2/2k\theta)\,z \tag{9.30}$$
$$= \nu_{ug} + S\omega. \tag{9.30a}$$
The peak value of $G_h$ is
$$G_h({\rm max}) = \frac{1}{(2\pi z)^{1/2}}\exp\Big\{z\Big(1 + \frac{\hbar^2\omega^2}{8k^2\theta^2}\Big)\Big\}. \tag{9.31}$$
Hence
$$G_h = G_h({\rm max})\exp\Big\{-\frac{1}{2z}\Big(p - \frac{\hbar\omega}{2k\theta}\,z\Big)^2\Big\} \tag{9.32}$$
$$= G_h({\rm max})\exp\Big\{-\frac{1}{2z\omega^2}(\nu - \nu_h)^2\Big\}. \tag{9.32a}$$
Thus at high temperatures the absorption should be Gaussian provided (9.27a) and (9.27b) apply.

TABLE III.

θ (°K)   v̄             4kθ/ħω   [v̄(v̄+1)]^½/θ    z for S = 30
    5    3.11×10⁻¹³     0.139    1.11×10⁻⁷       3.34×10⁻⁵
   10    5.58×10⁻⁷      0.278    7.47×10⁻⁵       4.48×10⁻²
   20    7.47×10⁻⁴      0.555    1.36×10⁻³       1.63
   30    8.30×10⁻³      0.834    3.04×10⁻³       5.47
   50    5.95×10⁻²      1.39     5.02×10⁻³       15.1
   75    0.172          2.08     5.96×10⁻³       26.8
  100    0.311          2.78     6.40×10⁻³       38.4
  200    0.949          5.56     6.80×10⁻³       81.6
  400    2.31           11.1     6.91×10⁻³       165.
  600    3.69           16.7     6.93×10⁻³       250.
  800    5.07           22.2     6.96×10⁻³       334.
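The quality of the Gaussian form (9.26), and the role of criterion (9.27a), can be illustrated by comparing it with the series value of $I_p(z)$. The sketch below is a numerical illustration only; both sides are multiplied by $e^{-z}$ to avoid overflow, and the sample pairs $(p, z)$ and function names are assumptions chosen for the example.

```python
import math


def scaled_bessel_i(p, z, terms=600):
    """exp(-z) * I_p(z) for integer p >= 0 and z > 0, from the power series."""
    log_half_z = math.log(0.5 * z)
    return sum(
        math.exp((2 * m + p) * log_half_z - math.lgamma(m + 1) - math.lgamma(m + p + 1) - z)
        for m in range(terms)
    )


def gaussian_form(p, z):
    """exp(-z) times the right-hand side of (9.26)."""
    return math.exp(-p * p / (2.0 * z)) / math.sqrt(2.0 * math.pi * z)


def relative_error(p, z):
    exact = scaled_bessel_i(p, z)
    return abs(gaussian_form(p, z) - exact) / exact


print(relative_error(5, 50.0))   # 2z/p = 20: criterion (9.27a) satisfied
print(relative_error(40, 50.0))  # 2z/p = 2.5: criterion (9.27a) violated
```

With $2z/p = 20$ the two expressions agree closely, while with $2z/p = 2.5$ the Gaussian form is visibly too small, in line with the expectation that deviations appear far out on the violet side.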


If the $\epsilon_j$'s do not approach zero, $p$ will be greater than unity, and criterion (9.27b) will hold, except in the region where $p = 0$, i.e., $\nu = \nu_{ug}$. Since $p_h > 0$, this occurs to the red of the peak. In the broad bands which arise in polar solids, this will not be a real limitation on the applicability of (9.32a).

To test criterion (9.27a) at the peak, it is combined with (9.29) to give
$$2z/p_h = 4k\theta/\hbar\omega \gg 1. \tag{9.33}$$


In Table III some typical values of $\bar v$, $z$, $(4k\theta/\hbar\omega)$ and $(1/\theta)[\bar v(\bar v + 1)]^{1/2}$ are given. The phonon frequency selected was $2\pi(3 \times 10^{12})$ sec⁻¹ (M5). The table indicates that (9.33) applies above room temperature. Since $p < p_h$ for the red side of the band, Eq. (9.27a) should hold there.
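The entries of Table III follow directly from the definitions of $\bar v$ and $z$. The sketch below recomputes them, assuming the quoted phonon frequency $\omega/2\pi = 3 \times 10^{12}$ sec⁻¹ and $S = 30$ (the value that reproduces the last column); the constants are modern SI values, so small differences in the last digit from the printed table are to be expected.

```python
import math

H_PLANCK = 6.62607015e-34  # J s
K_BOLTZ = 1.380649e-23     # J / K
PHONON_HZ = 3.0e12         # omega / (2 pi), value quoted in the text


def table_row(theta, S=30.0):
    """Return (vbar, 4k*theta/(hbar*omega), sqrt(vbar*(vbar+1))/theta, z) at theta in K."""
    beta = H_PLANCK * PHONON_HZ / (K_BOLTZ * theta)  # hbar*omega / (k*theta)
    vbar = 1.0 / math.expm1(beta)
    root = math.sqrt(vbar * (vbar + 1.0))
    return vbar, 4.0 / beta, root / theta, 2.0 * S * root


for theta in (5, 10, 20, 30, 50, 75, 100, 200, 400, 600, 800):
    vbar, ratio, root_over_theta, z = table_row(theta)
    print(f"{theta:4d}  {vbar:10.3e}  {ratio:6.3f}  {root_over_theta:10.3e}  {z:10.3e}")
```

The fourth column levels off above about 100°K, which is the insensitivity of $z/\theta$ to temperature that the text relies on.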

If we multiply $\bar G_a$ by $(1/\omega)\exp\{-S(2\bar v + 1)\}$ one sees that $\int G_a\,d\nu = 1$, provided one neglects small quantities. In the next section we show that this relation holds exactly.

Deviation from a Gaussian form should be looked for on the violet side of the peak where $p > p_h$; here (9.27b) could break down. This treatment does not suggest that a band should be Gaussian for all $\nu$, although at medium temperatures one would suspect that the red side would approach such a shape. At some temperatures the following conditions may occur: $(2z/p) > 1$ for the red side (Gaussian), and $(z^2/4p) < 1$ on the violet side (Pekarian).

We may obtain an expression for $H$ at high temperatures. Equations (7.7) and (9.32a) show that $2z\omega^2 = 1/a^2$ and
$$H = (5.545)^{1/2}\,\hbar\omega z^{1/2} = (5.545)^{1/2}\,S^{1/2}\hbar\omega\,\operatorname{csch}^{1/2}\tfrac12\beta. \tag{9.34}$$
For high $\theta$
$$H = (11.090\,Sk\theta\hbar\omega)^{1/2}. \tag{9.34a}$$
Equation (9.30) predicts that the peak of the absorption band is temperature dependent. The dependence is given by $\nu_{ug}(\theta)$ and $z/\theta$. In Table III the dependence of $[\bar v(\bar v + 1)]^{1/2}/\theta$ on $\theta$ is given. This quantity is proportional to $z/\theta$ of (9.30). Except at low temperatures, where our approximations break down, it is insensitive to variations of $\theta$.


Fic. 6. Diagram illustrating the broadening effect. The upper 
and lower configurational coordinates are shown. The distribution 
probability at 0°K is given in the lower left corner. Point "a" is at $\epsilon_g(R_g)$, while "c" is at $\epsilon_u(R_u)$.


The dependence of $\nu_h$ and $H$ on temperature given
by (9.30) and (9.34) is due to the approximations used 
in this section. These relations require correction, but 
the high-temperature prediction is valid and (9.34a) 
holds. 

The physical interpretation of the above equations is indicated in Fig. 6. Although the development considers more than one mode, this description is limited to the CC model. This is done for convenience only, as the arguments can be generalized to any number of effective modes. Since the $\epsilon_{ij}$'s are zero, the frequencies are not affected by the transition. In Fig. 6 $\epsilon_g(q)$ and $\epsilon_u(q)$ are plotted against $q$. The horizontal lines are various values of $E = \epsilon_g + \hbar\omega(v + \tfrac12)$ and $E' = \epsilon_u + \hbar\omega(v' + \tfrac12)$. The relative probability of finding a particular value of $q$, when the electron is in the ground state, is shown by the shaded area on the lower left. The probable distribution, $\rho(q)$, at absolute zero is $\chi^2(q)$ for $v = 0$. At any finite temperature it is
$$\rho(q) = \frac{1}{\pi^{1/2}\sigma}\exp\{-q^2/\sigma^2\}, \tag{9.35}$$
$$\sigma = \Big(\frac{\hbar}{\omega}\Big)^{1/2}\coth^{1/2}\tfrac12\beta. \tag{9.35a}$$
We obtain (9.35) from (8.14a) and (9.4). At high temperatures $\sigma^2$ has the value $2k\theta/\omega^2$, and (9.35) has a form one can obtain by Boltzmann's statistics if the potential energy of the system is assumed to be $(1/2)\omega^2q^2$. The potential energy of the system, however, is not $(1/2)\omega^2q^2$, since (9.35) is the sum over many states unless $\theta = 0$°K.
The first question is: what is $S$? It is simply the ratio of the difference in potential energy at 0 (b) and at $\Delta q$ (c) to the energy of a phonon, $\hbar\omega$ (Fig. 2). This follows from (3.24) and (9.8a). The maximum of the absorption occurs for transitions from $v = 0$ to $v' = S$, as is seen with the use of (9.22) and (9.29a). This is the vertical transition if we ignore the zero-point energy. One would expect this to be the peak because of the shape of $\rho(q)$. So far complex analysis has not led to anything very surprising.

The bandshape at low temperature, however, is non-Gaussian; indeed it is nonsymmetric. This means that $|\{\chi'_{S+1}|\chi_0\}|^2$ does not equal $|\{\chi'_{S-1}|\chi_0\}|^2$. Here 0 and $S \pm 1$ refer to the quantum states of $\chi$ and $\chi'$. This lack of symmetry leads to the skewness in absorption curves at low temperatures. This has created a serious problem in the experimental analysis (K2a, p. 119). We stress that this lack of symmetry stems from the properties of the $\chi$'s and not $\epsilon_u(q) - \epsilon_g(q)$, which for this problem has the simplest form possible, namely
$$\hbar\nu = \hbar\nu_{ug} + \tfrac12\omega^2(\Delta q)^2 - \omega^2(\Delta q)\,q. \tag{9.36}$$
$\nu$ is a linear function of $q$.

At high temperatures the spread on the red side occurs because of transitions where $v_j' - v_j$ is smaller than $S$. The violet side is affected by transitions from the ground state where $v_j'$ is greater than $S$. The theory suggests that these transitions tend to make the curve more symmetrical and in the extreme case, Gaussian.

The Williams’ approximation suggests that the 
transition probability at frequency v is obtained from 
(9.35) and (9.36). First, (9.36) is solved for q; then the 
absorption is obtained from $\rho(q)$. Since $\rho(q)$, however, is composed of several states (unless $\theta = 0$°K), this approach is inconsistent, since $\epsilon_u(q) - \epsilon_g(q)$ does not equal $E' - E$. Further, the matrix elements are somewhat asymmetrical.

The question of the asymmetry of the absorption 
band is an extremely fundamental one. The Williams’ 
approximation happens to give the correct temperature 
dependence of H (Sec. 10). Equation (9.34) might also 
be a good approximation. Actually, it is a poor one. 
It is the asymmetry of the curves, i.e., Eq. (9.21), which 
gives us the necessary insight to understand what is 
really happening. 

In conclusion, one should mention that Krivoglaz and Pekar (K5) report a shape where the term const$(\nu - \nu_h)^3$ is added to the expression in the curly brackets of Eq. (9.32a). This term will create a skewness when $(\nu - \nu_h)$ is not too small. As $\nu$ approaches $\nu_h$, the curve becomes symmetric and almost Gaussian. This fits the experimental data better than (9.32a), since there is another adjustable parameter. The experimental data seem to resemble a double Gaussian curve more nearly, however. There is no evidence to believe that a discontinuity in the absorption occurs at $\nu = \nu_h$.

Arguments by Kubo and Toyozawa (K6) indicate that the simple model employed here gives a symmetrical shape at high temperatures. This same conclusion can be obtained from the higher moments calculated by O'Rourke. The conclusion of Sec. 9(b) regarding the symmetry of the shape is not due to (9.25). Experimental data suggest (K4) that the results may not be completely valid. However, the simplifying assumptions give some useful information regarding the low-temperature shape and the transition probabilities.


10. METHOD OF MOMENTS 


The method of moments was developed by Lax (L1) 
and extended by Meyer (M8). We now combine the 
development of Part I with this powerful approach. 

Returning to Eq. (8.13a), we first sum over v; and 
then take a weighted average over v: using (8.14) ; thus, 


K x 
OO ras avf Deo dl{x| Pix Hx Ilh 


1 
Xexp [e-e (10.1) 
i 


K=2rbeN/cm. (10.1a) 


In this approach e(v) is used. To obtain (10.1) we have 
made use of (9.1). Av implies a weighted average over 
the ground vibrational states. By expanding exp{iZ’t/h} 
in series, one may show that 


x’ expfiE't/h} = e'Hutiiy'. (10.2) 
Now (10.1) takes the form 
K 
a(v) = Av fa Den 
2r 
X {x | fient, x} { x! |fe yei», (10.3) 


We note that $E'$ depends on $v'$; $H_u$, however, does not. Therefore, we take it out of the sum. Equation (10.3) is a product of three integrals: one over $t$; the other over the nuclear coordinates involved in the first { }; while the third arises because of the second { }. Since the $\chi$'s form a complete set in $q$ space, we use the following property of eigenfunctions (H3, p. 66)

$$\sum_{v'} \chi'(q)\,\chi'(\tilde q) = \delta(q-\tilde q) \tag{10.4}$$

and write

$$\alpha(\nu) = \frac{K}{2\pi}\int dt\,\mathrm{Av}\{\chi|f^{1/2}e^{iH_u t/\hbar}f^{1/2}e^{-iH_g t/\hbar}|\chi\}\,e^{-i\nu t}. \tag{10.5}$$


Now the Condon approximation is used to remove $f$ from the matrix element, giving

$$\alpha(\nu) = \frac{Kf}{2\pi}\int dt\,\mathrm{Av}\{\chi|e^{i(H_u-H_g)t/\hbar}|\chi\}\,e^{-i\nu t}. \tag{10.6}$$

The quantum mechanics is contained in the order of the operators; thus one cannot remove $\exp\{i(H_u-H_g)t/\hbar\}$ from the matrix (Appendix V).

By assuming that $f$ is frequency independent, we see that $\alpha$ is the Fourier transform of the function $Kf\,\mathrm{Av}\{\chi|\exp[i(H_u-H_g)t/\hbar]|\chi\}$ and it follows that (M7, p. 252)

$$Kf\,\mathrm{Av}\{\chi|e^{i(H_u-H_g)t/\hbar}|\chi\} = \int \alpha(\nu)e^{i\nu t}\,d\nu. \tag{10.7}$$

Setting $t=0$, we find

$$\int \alpha(\nu)\,d\nu = Kf. \tag{10.8}$$

Therefore, it follows that $G_n(\nu)$ of (8.18a) is normalized. The thermal vibration spreads the absorption band but does not affect the area. Equation (10.8) is based on the approximation that $f$ is independent of frequency. Had we assumed that $(\varphi'|r|\varphi)$ is frequency dependent, the zero moment would be temperature sensitive. Using (7.1), it follows that

$$M_0 = \hbar Kf. \tag{10.9}$$


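Further,

$$\left[\frac{\hbar^2}{i}\frac{\partial}{\partial t}\int \alpha(\nu)e^{i\nu t}\,d\nu\right]_{t=0} = \int \epsilon\,\alpha\,d\epsilon = \frac{\hbar^2}{i}Kf\,\mathrm{Av}\left[\frac{\partial}{\partial t}\{\chi|e^{i(H_u-H_g)t/\hbar}|\chi\}\right]_{t=0} \tag{10.10}$$

and

$$M_1 = \hbar Kf\,\mathrm{Av}\{\chi|H_u-H_g|\chi\}. \tag{10.10a}$$

Now,

$$\left[-\hbar^3\frac{\partial^2}{\partial t^2}\int \alpha(\nu)e^{i\nu t}\,d\nu\right]_{t=0} = \int \epsilon^2\alpha\,d\epsilon \tag{10.11}$$

and

$$M_2 = \hbar Kf\,\mathrm{Av}\{\chi|(H_u-H_g)^2|\chi\}. \tag{10.12}$$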
By similar methods we may evaluate the higher moments. For these situations, one may not write

$$\exp\{(i/\hbar)(H_u-H_g)t\}$$

for

$$\exp\{(i/\hbar)H_u t\}\exp\{(-i/\hbar)H_g t\}$$

as explained in Appendix V.

To evaluate (10.10) and (10.12), expressions for the operator $H_u-H_g$ as well as $\mathrm{Av}\{\chi|q_j^n|\chi\}$ ($n$ is an integer) are required. The matrix elements can be calculated from the properties of the $\chi$'s and are (E, p. 173-181)

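$$\mathrm{Av}\{\chi|q_j|\chi\} = 0 \tag{10.13a}$$
$$\mathrm{Av}\{\chi|q_j^2|\chi\} = [\hbar/2\omega_j(g)]\coth\tfrac12\beta_j \tag{10.13b}$$
$$\mathrm{Av}\{\chi|q_j^3|\chi\} = 0 \tag{10.13c}$$
$$\mathrm{Av}\{\chi|q_j^4|\chi\} = 3[\hbar/2\omega_j(g)]^2\coth^2\tfrac12\beta_j. \tag{10.13d}$$

$\beta_j$ is defined by (8.14b) where the ground state $\omega_j$ is used.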
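
The averages (10.13b) and (10.13d) can be checked numerically. The following is a minimal sketch, assuming the standard harmonic-oscillator diagonal elements $\{n|q^2|n\} = (\hbar/2\omega)(2n+1)$ and $\{n|q^4|n\} = (\hbar/2\omega)^2(6n^2+6n+3)$ and Boltzmann weights $e^{-\beta n}$; the function name, the cutoff `n_max`, and the units ($\hbar=1$) are illustrative choices and do not come from the text.

```python
import math

def thermal_averages(omega, beta, hbar=1.0, n_max=400):
    """Boltzmann-weighted Av{q^2} and Av{q^4} for a single mode.

    beta = hbar*omega/(k*theta); q is the reduced (mass-weighted) coordinate.
    The sums are truncated at n_max levels (an assumption of this sketch).
    """
    weights = [math.exp(-beta * n) for n in range(n_max)]
    z = sum(weights)
    scale = hbar / (2.0 * omega)
    q2 = sum(w * scale * (2 * n + 1) for n, w in enumerate(weights)) / z
    q4 = sum(w * scale**2 * (6 * n * n + 6 * n + 3) for n, w in enumerate(weights)) / z
    return q2, q4

omega, beta = 2.0, 0.7                       # arbitrary test values
q2, q4 = thermal_averages(omega, beta)
coth_half = 1.0 / math.tanh(0.5 * beta)
print(q2, (1.0 / (2.0 * omega)) * coth_half)            # compare with (10.13b)
print(q4, 3.0 * ((1.0 / (2.0 * omega)) * coth_half)**2)  # compare with (10.13d)
```

The two numbers on each printed line should agree to rounding error, which reflects the relation $\mathrm{Av}\{q^4\} = 3[\mathrm{Av}\{q^2\}]^2$ implied by (10.13b) and (10.13d).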

An expression for $(H_u-H_g)$ is obtained from (8.1), (8.2) and the expansion technique employed in Sec. 3. It now follows that

$$H_u-H_g = \Delta\epsilon(\theta)+\sum_j \epsilon_j q_j+\tfrac12\sum_j\epsilon_{jj}q_j^2+\tfrac12{\sum_{jk}}'\epsilon_{jk}q_jq_k, \tag{10.14}$$

$$\Delta\epsilon(\theta)=\epsilon_u(R_g)-\epsilon_g(R_g)=\hbar\nu_{ug}+\tfrac12\sum_j\epsilon_j^2/\omega_j^2(u). \tag{10.14a}$$

To obtain (10.14) and (10.14a) use has been made of (3.27). When comparing Secs. 9 and 10, it should be remembered that $\Delta\epsilon$ does not equal $\hbar\nu_{ug}$. The prime on the sum means that $j\neq k$. The temperature dependence of $\Delta\epsilon$ is due to the expansion of the crystal. We assume that the $\epsilon_j$'s and $\epsilon_{jk}$'s are independent of $\theta$. The derivatives are evaluated at the equilibrium position of the ground state, and the $q_j$'s are the normal modes for this state.

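The evaluation of $M_1$ and $M_2$ is now straightforward. Using (10.9) and (10.10)

$$\bar\epsilon = (M_1/M_0) = \Delta\epsilon(\theta)+\tfrac14\sum_j\epsilon_{jj}[\hbar/\omega_j(g)]\coth\tfrac12\beta_j. \tag{10.15}$$

To evaluate $M_2$ we square (10.14), i.e.,

$$\begin{aligned}(H_u-H_g)^2 ={}& \Delta\epsilon^2(\theta)+\sum_j\epsilon_j^2q_j^2+{\sum_{jk}}'\epsilon_j\epsilon_kq_jq_k\\ &+\tfrac14\sum_j\epsilon_{jj}^2q_j^4+\tfrac14{\sum_{jk}}'\epsilon_{jj}\epsilon_{kk}q_j^2q_k^2\\ &+2\Delta\epsilon(\theta)\sum_j\epsilon_jq_j+\Delta\epsilon(\theta)\sum_j\epsilon_{jj}q_j^2\\ &+\Delta\epsilon(\theta){\sum_{jk}}'\epsilon_{jk}q_jq_k+\sum_{jk}\epsilon_j\epsilon_{kk}q_jq_k^2,\end{aligned} \tag{10.16}$$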


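where again terms involving $\epsilon_{jj}\epsilon_{kl}$ and $\epsilon_j\epsilon_{kl}$ ($k\neq l$) have been omitted. Using (10.12) and (10.13) results in

$$\begin{aligned}(M_2/M_0)={}&\Delta\epsilon^2(\theta)+\sum_j[\hbar/2\omega_j(g)]\epsilon_j^2\coth\tfrac12\beta_j\\ &+\tfrac{3}{16}\sum_j\epsilon_{jj}^2[\hbar/\omega_j(g)]^2\coth^2\tfrac12\beta_j\\ &+\tfrac{1}{16}{\sum_{jk}}'\epsilon_{jj}\epsilon_{kk}[\hbar^2/\omega_j(g)\omega_k(g)]\coth\tfrac12\beta_j\coth\tfrac12\beta_k\\ &+\tfrac12\Delta\epsilon(\theta)\sum_j\epsilon_{jj}[\hbar/\omega_j(g)]\coth\tfrac12\beta_j.\end{aligned} \tag{10.17}$$

A combination of (7.3), (10.15), and (10.17) gives the important equation

$$m^2=\sum_j\epsilon_j^2[\hbar/2\omega_j(g)]\coth\tfrac12\beta_j+\tfrac18\sum_j\epsilon_{jj}^2[\hbar/\omega_j(g)]^2\coth^2\tfrac12\beta_j. \tag{10.18}$$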

We made only two assumptions: that $f$ is not a function of $\nu$ or $q$, and that the adiabatic approximation can be used to terms of $\epsilon_{jj}$, $\epsilon_{kl}$, and $\epsilon_j\epsilon_{kl}$, where $k\neq l$.§§§ These equations are extremely general. The possibility of calculating the $q$'s and $\epsilon$'s about a point imperfection with any degree of accuracy is small. Hence, the equations as they stand are of little use. We must make further assumptions to obtain relations between $\bar\epsilon$, $m$, and $\theta$ which involve parameters to be evaluated experimentally. The generality of the equations assures one that terms will not be omitted. Thus, we may apply the equations to the Huang-Rhys problem (the Fröhlich approximation) and compare our results with those of O'Rourke and Meyer. On the other hand, we may make the approximation used by Klick.


(a) CC Model 


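Here, there is only one $q$ whose ground state has the angular frequency $\omega(g)$; hence, from Sec. 3,

$$X=q/M^{1/2}, \tag{10.19}$$

where $M$ is an effective mass of the order of an ionic mass. For this model

$$\begin{aligned}\Delta\epsilon(q)&=\hbar\nu_{ug}+\tfrac12\omega^2(u)(q-\Delta q)^2-\tfrac12\omega^2(g)q^2\\ &=\hbar\nu_{ug}+\tfrac12\omega^2(u)(\Delta q)^2-\omega^2(u)(\Delta q)q+\tfrac12[\omega^2(u)-\omega^2(g)]q^2.\end{aligned} \tag{10.20}$$

Using the definition of (3.18),

$$\epsilon=-\omega^2(u)\Delta q=-\omega^2(u)M^{1/2}\Delta X \tag{10.21a}$$

and

$$\epsilon_{jj}=\omega^2(u)-\omega^2(g)\approx 2\omega(g)\Delta\omega. \tag{10.21b}$$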


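Equations (10.15) and (10.18) now have the form

$$\bar\epsilon=\Delta\epsilon(\theta)+\tfrac12\hbar\Delta\omega\coth\tfrac12\beta \tag{10.22a}$$

and

$$m^2=\left\{\frac12\frac{\omega^2(u)}{\hbar\omega(g)}M(\Delta X)^2\right\}\hbar^2\omega^2(u)\coth\tfrac12\beta+\tfrac12\hbar^2(\Delta\omega)^2\coth^2\tfrac12\beta. \tag{10.22b}$$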
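
As a consistency check, the general expressions (10.15) and (10.18) can be evaluated for a single mode and compared with the CC forms (10.22a) and (10.22b). The sketch below is illustrative only: the numerical parameters are hypothetical, energies are in ev, the function names are invented for the example, and $\epsilon_{jj}$ is set equal to $2\omega(g)\Delta\omega$ as in (10.21b).

```python
import math

HBAR = 6.582e-16   # ev sec
K_B = 8.617e-5     # ev per deg K

def coth(x):
    return math.cosh(x) / math.sinh(x)

def general_moments(delta_eps, modes, theta):
    """(eps_bar, m^2) from (10.15) and (10.18); modes = [(eps_j, eps_jj, omega_g), ...]."""
    eps_bar, m2 = delta_eps, 0.0
    for eps_j, eps_jj, omega_g in modes:
        c = coth(HBAR * omega_g / (2.0 * K_B * theta))
        eps_bar += 0.25 * eps_jj * (HBAR / omega_g) * c
        m2 += eps_j**2 * (HBAR / (2.0 * omega_g)) * c
        m2 += 0.125 * eps_jj**2 * (HBAR / omega_g)**2 * c**2
    return eps_bar, m2

def cc_moments(delta_eps, S, omega_g, omega_u, d_omega, theta):
    """(eps_bar, m^2) from the CC forms (10.22a) and (10.22b), with S the curly bracket."""
    c = coth(HBAR * omega_g / (2.0 * K_B * theta))
    eps_bar = delta_eps + 0.5 * HBAR * d_omega * c
    m2 = S * (HBAR * omega_u)**2 * c + 0.5 * (HBAR * d_omega)**2 * c**2
    return eps_bar, m2

# Hypothetical single-mode parameters (not fitted to any data).
S, omega_g, d_omega = 30.0, 2.0e13, 2.0e12
omega_u = omega_g + d_omega
eps_sq = 2.0 * S * HBAR * omega_g * omega_u**2    # epsilon^2 from the definition (10.23)
eps_jj = 2.0 * omega_g * d_omega                  # (10.21b)
general = general_moments(2.3, [(math.sqrt(eps_sq), eps_jj, omega_g)], 300.0)
special = cc_moments(2.3, S, omega_g, omega_u, d_omega, 300.0)
print(general, special)
```

Both tuples should coincide, which shows that (10.22a) and (10.22b) are the single-mode case of (10.15) and (10.18).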


A generalized $S$ is defined by the use of (10.18) and (10.22b),

$$S=\frac12\sum_j\frac{\epsilon_j^2}{\hbar\omega_j(g)\omega_j^2(u)}=\frac12\sum_j\frac{\omega_j^2(u)}{\hbar\omega_j(g)}(Q_j-q_j)^2. \tag{10.23}$$

The last step uses (3.24). $S$ equals the expression in the curly brackets of (10.22b). This definition agrees with (9.8a) and is equivalent to the one used by Meyer. An exact comparison is somewhat involved since Meyer takes a thermal average over modes which do not include $V_I$ and $V_e$ of (5.2). The $S$ for absorption and emission need not be the same. Both $\omega(g)$ and $\omega(u)$ enter into the definition.


§§§ We also assume that the modes are nondegenerate (Sec. 3).

The second term in (10.22b) is new; it arises from a superior approximation.‖‖‖ To estimate its effect we set $\theta=0°$K and $\Delta\omega\approx\omega$. Now

$$m^2=\hbar^2\omega^2\{S+(1/2)\}=\hbar^2\omega^2S[1+(1/2S)]. \tag{10.22c}$$

If $S$ is of the order of 30, the correction term is of the order of 0.01, which is at the experimental limit of accuracy. If $S$ were smaller than 30, and if $\Delta\omega$ is large, one might be able to detect the second term in (10.22b). This would be done from the temperature behavior of $m$.

Equation (10.22a) has a great deal less information than (10.22b), since no reliable information regarding $\Delta\epsilon(\theta)$ exists. As stated, $\Delta\epsilon$ may change because of the lattice expansion. $\epsilon_m$ for the F center seems to be a function of the interionic distance, $d$. From empirical data, one may obtain the following relation, $\epsilon_m(d)=\mathrm{const}\;d^{-1.8}$ (Mollwo (M10), Ivey (I1)). Combining this with density measurements at various temperatures, one may calculate $\Delta\epsilon(\theta)$. Using Henglein's (H3a) expansion data for KCl, one obtains

$$\epsilon_m(193°\mathrm{K})-\epsilon_m(273°\mathrm{K})=0.011\ \mathrm{ev}$$
$$\epsilon_m(80°\mathrm{K})-\epsilon_m(273°\mathrm{K})=0.025\ \mathrm{ev},$$

where $\bar\epsilon$ has been replaced by $\epsilon_m$. This change of $\epsilon_m$ is too small to explain the observed temperature variation (~0.040 ev and 0.070 ev), which is an indication that $\Delta\omega\neq0$ for this particular absorption band. One might attempt to combine the Ivey relation with (10.22a).

On the other hand, we may assume that $\Delta\epsilon(\theta)$ equals a constant. This can be tested by plotting $\epsilon_m$ against $\coth\beta/2$, $\beta$ having been evaluated empirically from (10.22b). If the data give the equation $A+B\coth\beta/2$, one may evaluate $\Delta\epsilon$ and $\Delta\omega$. There is a danger in this procedure since $\Delta\epsilon(\theta)$ may also vary as $A+B\coth\beta/2$ over a limited range of temperatures [for small $\beta$, $\coth\beta/2\approx(2/\beta)$]. In this situation the coefficient of $\coth\beta/2$ would not equal $\tfrac12\hbar\Delta\omega$. As the range of $\theta$ increases, the possibility of accidental correlation decreases.
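
The test just described amounts to a straight-line fit of $\epsilon_m$ against $\coth\beta/2$. The following sketch illustrates the procedure on synthetic numbers only; the values of `A_TRUE`, `B_TRUE`, the temperatures, and the frequency are hypothetical and are not measurements, and the helper names are invented for the example.

```python
import math

HBAR = 6.582e-16   # ev sec
K_B = 8.617e-5     # ev per deg K

def coth(x):
    return math.cosh(x) / math.sinh(x)

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx)**2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

OMEGA = 2.0e13                  # assumed already known from the half-width analysis
A_TRUE, B_TRUE = 2.31, -0.004   # synthetic Delta-epsilon and (1/2) hbar Delta-omega, in ev
thetas = [20.0, 80.0, 150.0, 220.0, 300.0]

xs = [coth(HBAR * OMEGA / (2.0 * K_B * t)) for t in thetas]
ys = [A_TRUE + B_TRUE * x for x in xs]          # synthetic "peak positions"

A_fit, B_fit = fit_line(xs, ys)
delta_omega = 2.0 * B_fit / HBAR                # from B = (1/2) hbar Delta-omega in (10.22a)
print(A_fit, B_fit, delta_omega)
```

With real data the fitted $B$ gives $\Delta\omega$ only if $\Delta\epsilon(\theta)$ is truly constant; as noted above, a thermal drift of $\Delta\epsilon$ with the same functional form would be absorbed into $B$.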

Using recent data on the F center in KCl, $\omega$ has been evaluated with the use of (10.22b) (without the second term). From (10.22a) $\omega$, $\bar\epsilon$, and $\Delta\omega$ were determined. One again concludes that the correction to (10.22b) is very small.

Klick and Schulman (K2a, p. 110) have used the Williams' approximation to obtain an expression for $H$. It is identical to the first term of (10.22b), provided $H$ and $m$ are proportional. This does not imply that the derivation given here is identical to the one obtained previously. The latter method does not give the second term of Eq. (10.22b), nor the Pekarian shape (at low temperature). The latter is readily observable and requires understanding. If one uses Eq. (9.34), $H$ depends on the csch $\beta/2$ instead of coth $\beta/2$, which indicates that the expression given by Williams and Hebb (W3) and by Lax (L2) is fortuitous.

‖‖‖ A term of this nature appears in the formal treatment of Kubo and Toyozawa (K6).

(b) Huang-Rhys Problem


In this case we assume that all the modes have a single frequency given by the LST relation; then

$$\bar\epsilon=\Delta\epsilon(\theta)+\tfrac12\sum_j\hbar\Delta\omega_j\coth\tfrac12\beta \tag{10.24a}$$

and

$$m^2=\hbar^2\omega^2(u)S\coth\tfrac12\beta+\tfrac12\Big(\sum_j\hbar^2\Delta\omega_j^2\Big)\coth^2\tfrac12\beta. \tag{10.24b}$$

Equation (10.24a) was first obtained by O'Rourke, using the approximation that $\Delta\omega$ is very small. Later Meyer obtained this equation by an alternate method. These authors did not note the thermal dependence of $\Delta\epsilon(\theta)$. The first term of Eq. (10.24b) has been given before, while the second term is new; see, however, (K6). It may be detected when careful measurements are made on bands with small $S$'s.


(c) Double-Frequency Model 


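We may apply (10.15) and (10.18) to more complex situations. An alternate model is one with two effective ground frequencies $\omega_1$ and $\omega_2$. For this situation

$$\bar\epsilon=\Delta\epsilon(\theta)+\frac{\hbar}{4\omega_1(g)}\Big(\sum_j\epsilon_{jj}\Big)\coth\tfrac12\beta_1+\frac{\hbar}{4\omega_2(g)}\Big(\sum_l\epsilon_{ll}\Big)\coth\tfrac12\beta_2 \tag{10.25a}$$

and

$$m^2=\hbar^2\omega_1^2(u)S_1\coth\tfrac12\beta_1+\hbar^2\omega_2^2(u)S_2\coth\tfrac12\beta_2. \tag{10.25b}$$

Here

$$S_1=\tfrac12\sum_j[1/\hbar\omega_j(g)\omega_j^2(u)]\epsilon_j^2 \tag{10.26a}$$
$$S_2=\tfrac12\sum_l[1/\hbar\omega_l(g)\omega_l^2(u)]\epsilon_l^2 \tag{10.26b}$$
$$\beta_1=(\hbar/k\theta)\omega_1(g)\quad\text{and}\quad\beta_2=(\hbar/k\theta)\omega_2(g). \tag{10.26c}$$

$j$ is summed over the modes associated with the frequency $\omega_1$, and $l$ is over the $\omega_2$ modes. The second term in (10.18) is omitted.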

We again use an effective frequency and write

$$S_1=[1/2\hbar\omega_1(g)\omega_1^2(u)]\sum_j\epsilon_j^2. \tag{10.27}$$

Since the data are always limited to a range in temperature, this may be less restrictive than (9.7), which must hold for every value of $\theta$. If (10.27) is to hold for every value of $\beta_j$, there cannot be a spread in the values of the $\omega$'s.

We may assume that $S_1$ is due to the local modes and $S_2$ is associated with the longitudinal optical modes; $\beta_1$ will be roughly half $\beta_2$. The question is whether one can detect the difference between (10.25a and b) and (10.22a and b) or (10.24a and b). To answer this question, we [...]


11. ANALYSIS OF EXPERIMENTAL DATA


In the last two sections, we derived several equations which can be compared with experimental data. Various attempts have been made to analyze the data with radically different results. Some of the difficulties stem from the data themselves, yet this is not the major problem. While at low temperatures (Table I) there is considerable disagreement, this is not the case at room temperature (Table IV). The agreement between Mollwo's (M10) and the most recent data (M5) is good. Russell and Klick's (R1) value of $H$ seems to be too large at all temperatures.

Russell and Klick conclude that (10.22b) (without the second term) applies to the F center, provided the angular frequency in KCl is $1.6\times10^{13}$ sec$^{-1}$. On the other hand, Pekar (P2 and 4) and Meyer (M8 and 9) conclude that the same equation holds, provided a value of $3.95\times10^{13}$ sec$^{-1}$ is used. It was obtained from the LST relation. Pekar and Meyer use Mollwo's data. The reason for the difference stems from the method of treating the data. One may use Mollwo's data with an alternate method of analysis and obtain $2\times10^{13}$ sec$^{-1}$ for $\omega$.

TABLE IV. F center in KCl at room temperature.

Experimenter          (M10)    (R1)           (M5)
$\epsilon_m$ (ev)     2.20     [illegible]    2.225
$H$ (ev)              0.35     0.39           0.35


Pekar's analysis is most remarkable (P4, p. 129 ff and p. 145 ff). He had only one adjustable constant, the effective mass of the electron. It was evaluated from the experimental $\epsilon_m$ at room temperature; hence, his results should apply to the most recent data. Using this value, Pekar calculated $S$ by employing (9.8a) and (3.18a). For the relation between the $q$'s and the $X$'s he used the very long wave approximation (B, p. 213). His value of $S$ for KCl is 23.8. Using this value, the LST relation (10.24b), and (7.7b), we compute the values in Table V. Equation (7.7b) was used to relate $m$ and $H$; this gives

$$H^2=8\ln2\,(\hbar\omega)^2S\coth\tfrac12(\hbar\omega/k\theta). \tag{11.1}$$
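
The calculated half-widths quoted in Table V can be approximately reproduced from the Gaussian relation $H^2 = 8\ln 2\,m^2$ together with the first term of (10.24b). The sketch below assumes $S=23.8$ and the LST frequency $3.95\times10^{13}$ sec$^{-1}$ quoted in the text; the physical constants and the function name are supplied for the example.

```python
import math

HBAR = 6.582e-16   # ev sec
K_B = 8.617e-5     # ev per deg K

def half_width(S, omega, theta):
    """Half-width H in ev for a Gaussian band with m^2 = (hbar*omega)^2 * S * coth(hbar*omega/2k*theta)."""
    hw = HBAR * omega
    if theta == 0.0:
        c = 1.0                                   # coth -> 1 as theta -> 0
    else:
        x = hw / (2.0 * K_B * theta)
        c = math.cosh(x) / math.sinh(x)
    return hw * math.sqrt(8.0 * math.log(2.0) * S * c)

H_0 = half_width(23.8, 3.95e13, 0.0)
H_300 = half_width(23.8, 3.95e13, 300.0)
print(f"H(0 K) = {H_0:.3f} ev, H(300 K) = {H_300:.3f} ev")
```

The results fall close to the calculated entries of Table V (0.30 ev and 0.43 ev); small differences in the last digit depend on the constants used.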
The calculation of H at 300°K does not agree with the 


(11.1). However the corrected numerical values of H 


ly quite remarkable, since 


A 


ıssed. if one has to use Eqs. 
various values of j, the 
TABLE V. Comparison of Pekar's parameters with experiment.

$\theta$      $H$ (cal)    $H$ (exp)
300°K         0.43         0.35
0°K           0.30         0.16


problem becomes hopeless since the number of adjustable parameters is too large. The opposite point of view, which is very naive, is that one exists and that Eqs. (10.22a) and (10.22b) or (10.24a) and (10.24b) apply. If we further assume that $\Delta\epsilon$ is independent of temperature, there are only four adjustable parameters and the equations of interest take the forms‡‡‡


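$$\bar\epsilon=\Delta\epsilon+B\coth\tfrac12(\hbar\omega/k\theta) \tag{11.2}$$
$$m^2=\hbar^2\omega^2S\coth\tfrac12(\hbar\omega/k\theta)+2B^2\coth^2\tfrac12(\hbar\omega/k\theta). \tag{11.3}$$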


Next we may assume that $B^2$ is very small compared to $\hbar^2\omega^2S$ and hence can be omitted from (11.3). This is true for the F center in KCl. To obtain $m$ from an arbitrary curve is a laborious process involving a great deal of numerical work. In some cases, one may be able to replace $m$ by $H$ [Eqs. (7.7b) and (7.12b)]; hence in the simplest situation we wish to test critically the relation

$$H^2=H^2(0)\coth\tfrac12(\hbar\omega/k\theta), \tag{11.4}$$

where $H(0)$ is the half-width at 0°K. It can readily be obtained from measurements at low temperatures (approx 4°K). The validity of (11.4) is based on the assumption that the ratio of $m$ to $H$ is temperature independent. This has to be tested by numerical integration.

Equation (11.4) suggests plotting $\coth^{-1}[H/H(0)]^2$ against $1/\theta$. We have the desired test if $1/\theta$ can be varied over a large range of values. A straight line going through the origin is a clear indication of the validity of (11.4) and the development in Sec. 10. Further, it shows that there is one effective mode, so the theory developed in Sec. 9 can be used. Such a plot is a more critical test of the theory than the previous methods of analysis, since one deals with straight lines rather than a complex curve. If one combines this technique of analysis with Mollwo's data, one obtains the $2\times10^{13}$ sec$^{-1}$ value mentioned above.
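
The plot described above can be sketched numerically. Since (11.4) gives $\coth^{-1}[H^2/H^2(0)] = \hbar\omega/2k\theta$, a least-squares line through the origin has slope $\hbar\omega/2k$. The example below uses synthetic half-widths generated from (11.4) itself, with a hypothetical $H(0)$ and temperature list; it illustrates the procedure and is not an analysis of measured data.

```python
import math

HBAR = 6.582e-16   # ev sec
K_B = 8.617e-5     # ev per deg K

def coth(x):
    return math.cosh(x) / math.sinh(x)

def arccoth(y):
    """Inverse hyperbolic cotangent, valid for y > 1."""
    return 0.5 * math.log((y + 1.0) / (y - 1.0))

def omega_from_widths(thetas, widths, H0):
    """Slope through the origin of arccoth[(H/H0)^2] against 1/theta, converted to omega."""
    xs = [1.0 / t for t in thetas]
    ys = [arccoth((H / H0)**2) for H in widths]
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return 2.0 * K_B * slope / HBAR

# Synthetic data generated from (11.4); omega_true and H0 are hypothetical inputs.
omega_true, H0 = 2.0e13, 0.16
thetas = [50.0, 80.0, 120.0, 200.0, 300.0]
widths = [H0 * math.sqrt(coth(HBAR * omega_true / (2.0 * K_B * t))) for t in thetas]

omega_fit = omega_from_widths(thetas, widths, H0)
print(omega_fit)
```

Points taken at very low temperature, where $H^2/H^2(0)$ is close to unity, make the argument of `arccoth` ill conditioned, so such points should be left out when real data are used.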

Three alternate methods have been suggested.


(1) Klick (K2) plots $H$ against $\theta^{1/2}$. At high temperatures, a straight line is obtained whose slope is proportional to $(S\omega)^{1/2}$. At moderately low temperatures, a departure from a straight line is obtained and the analysis requires the determination of this departure. At extremely low temperatures (about 25°K), one gets a straight horizontal line. While Klick's approach is useful, one cannot determine the validity of (11.4) in a critical manner since the points are not supposed to lie on a straight line.

(2) Pekar (P3) has plotted $H$ against $\theta$. In this case one does not get a straight line except at extremely low temperatures. An analysis of experimental data by this method is difficult, and the author knows of no detailed study using this approach.

(3) Meyer (M8) has plotted $(H/\hbar\omega)^2$ against $\coth\tfrac12(\hbar\omega/k\theta)$ using a known $\omega$ (from the LST relation). An erroneous $\omega$ can be detected only at low temperatures (under 100°K). If a great many points are available below 100°K, one can see that his technique does not give a straight line except for the correct value of $\omega$. Thus, using recent data on the F center in KCl (M5) one may show that his $\omega$ is incorrect. Mollwo's data had too few points in the critical region, so that Meyer was unable to detect the deviation. Actually, Meyer was unable to confirm (11.4) and was forced to add a constant [Appendix V and Eq. (11.5)].


‡‡‡ We assume here that $(\sum_j\hbar\Delta\omega_j)^2=\sum_j(\hbar\Delta\omega_j)^2$ which, of course, need not be true except in the CC model. The $B$ from (11.2) will however give us the order of magnitude of $B^2$ in (11.3).


We are particularly interested in what happens when there is a breakdown in the approximation made in arriving at (11.4). In particular, how does Eq. (10.25b) look on a plot of $\coth^{-1}[H^2/H^2(0)]$ against $1/\theta$? This question is only meaningful provided the ratio of $\omega_1$ to $\omega_2$ is not near unity, and $S_1$ and $S_2$ are approximately equal. One cannot hope to detect the difference between Eqs. (10.25b) and (11.4) with the use of experimental data unless there is an appreciable contribution from both types of modes, and the frequencies are distinct.




One may assume that both the local modes and the longitudinal optical modes interact with the trapped electron. The local modes should be in the gap between the acoustical and optical branches. In NaCl and KCl this gap is roughly at one-half the frequency of the longest longitudinal optical modes; hence the ratio of the $\omega$'s should be of the order of 2.

To understand the problem various $\coth^{-1}[H(\theta)/H(0)]^2$ against $1/\theta$ plots have been made using several ratios of $S_1$ to $S_2$ and several ratios of $\omega_1$ to $\omega_2$ (Fig. 7).**** In making these graphs, the values of $S_1$, $S_2$, $\omega_1$, and $\omega_2$ were assumed to be known. Then, $H^2(\theta)$ was calculated. From this $H^2$, $\coth^{-1}[H^2(\theta)/H^2(0)]$ was computed. Since the data refer only to a limited range of temperatures, the curvature shown in Fig. 7 will not be detected. A sure indication of the difference between (11.4) and (10.25b) is in the intercept, where $1/\theta=0$. If the line goes through the origin, we have a fair experimental proof that (11.4) applies, and that there is a single effective frequency. If not, one must attempt to express $H$ by means of a more complete equation. Strangely enough, the data on the F center in KCl can be interpreted in this very simple manner (K4).
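
The intercept test can be illustrated with a short calculation. The sketch below assumes $m\sim H$, takes $H^2$ proportional to the right-hand side of (10.25b), and fits a straight line over a limited range of temperatures; the frequencies, the equal values of $S_1$ and $S_2$, and the temperature list are hypothetical choices made for the example.

```python
import math

HBAR = 6.582e-16   # ev sec
K_B = 8.617e-5     # ev per deg K

def coth(x):
    return math.cosh(x) / math.sinh(x)

def arccoth(y):
    return 0.5 * math.log((y + 1.0) / (y - 1.0))

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx)**2 for x in xs)
    return my - slope * mx, slope

def width_ratio_sq(theta, modes):
    """H^2(theta)/H^2(0) for modes = [(S_i, omega_i), ...], following the form of (10.25b)."""
    num = sum(S * w**2 * coth(HBAR * w / (2.0 * K_B * theta)) for S, w in modes)
    den = sum(S * w**2 for S, w in modes)
    return num / den

def intercept_at_origin(modes, thetas):
    xs = [1.0 / t for t in thetas]
    ys = [arccoth(width_ratio_sq(t, modes)) for t in thetas]
    return fit_line(xs, ys)[0]

thetas = [100.0, 150.0, 200.0, 250.0, 300.0]          # limited experimental range
single = intercept_at_origin([(20.0, 2.0e13)], thetas)
double = intercept_at_origin([(10.0, 2.0e13), (10.0, 4.0e13)], thetas)
print(single, double)
```

For the single-frequency case the fitted intercept vanishes to rounding error, while for the two-frequency case the straight line fitted over the limited range misses the origin.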

Similarly, in Fig. 8 plots have been made with the use of Meyer's equation, namely,

$$H^2=C+D\coth\tfrac12(\hbar\omega/k\theta). \tag{11.5}$$

This equation assumes that $H^2$ is a linear function of $\coth\tfrac12(\hbar\omega/k\theta)$ [...]


**** In Figs. 7 and 8 we assume that $m\sim H$.
In some cases $m$ can be replaced by $H$.


[...] interception at $1/\theta=0$.

Use of this kind of plot requires measurements at both low and high temperatures. Measurements at extremely low temperatures cannot be used, since $H^2/H^2(0)$ approaches unity, and there is a large uncertainty in the $\coth^{-1}$ of $H^2/H^2(0)$ due to slight uncertainty in the evaluation of $H^2$ (one or two parts in a thousand). This is one disadvantage of this plot. The other proposed methods are no better in this temperature region if (11.4) applies.
As suggested by Meyer, one may test (11.2) by plotting $\bar\epsilon$ or $\epsilon_m$ against $\coth\tfrac12(\hbar\omega/k\theta)$, $\omega$ having been determined from (11.4). If $\Delta\epsilon$ is independent of $\theta$ one will obtain a straight line. Of course, $\Delta\epsilon$ could vary as a constant plus a multiple of $\coth\beta/2$ for a limited range of $\theta$.

No completely satisfactory method suggests itself for testing the shape of an absorption band, since we do not have a simple analytic relation between $\alpha$ and $\epsilon$. It is fairly well established that (7.7) and (7.8) will not apply to absorption bands in solids, except in very special cases. For detailed comparison of the theory and experiment, see the forthcoming paper (K4).


12. CONCLUSIONS OF PART II 


The theory developed in the first sections is applied 
to an impurity center with two bound states. Considera- 
tions are limited to absorption and, for a large part, to 
the simplest model (Secs. 9, 11, and 17). It has thus 
been possible to make exact calculations. 

The broadening of an absorption band has at least several causes. The band is composed of narrow subbands. The width of these arises from the uncertainty principle, from the center-center interaction, and from the dispersion of the normal modes. The envelope curve (the height of the subbands) is due to the displacement of the normal modes during an optical transition. The over-all shape arises from terms which appear in phonondynamics and are absent in electrodynamics. The broadening effects are illustrated in Fig. 9. The energy levels of the system are drawn on the left-hand side. At absolute zero before the transition, the system is in its ground state ($v=0$). Various transition probabilities exist for the excitation to levels in the excited state ($v'=0, 1, 2, \cdots$). The heights of the vertical lines on the right are proportional to these transition probabilities (Eq. 9.21). The result is an asymmetric curve which resembles the F center in KCl (K4). We have plotted $v'$ from right to left as is usually done. For emission one may exchange $v$ and $v'$. Here the emitted energy is $\Delta\epsilon(\theta=0)-\hbar\omega(v-v')$, and the diagram should be "flipped over." The absorption curve shown on the right is probably not limited to the model on the left but only requires that all the normal modes (whose $\epsilon_j$ do not equal zero) have an effective single frequency (Sec. 9). The generalization of Fig. 9 to any model with a single effective frequency is a result of the use of Mehler's formula. This is a result of the fundamental contribution of O'Rourke.


FIG. 9. Band shape at $\theta=0°$K for $S=20$. The vertical lines on the right are proportional to the transition probability. Note that the band is wider on the high-energy side of $\epsilon_m$ than on the low-energy side, and the lack of symmetry for all values of $\alpha$.

The shape presented is thus exact for the model considered at absolute zero. To extend these calculations to higher temperatures, various approximations must be made. The asymmetry disappears and the agreement with experiment is much poorer. One may indeed show rigorously that the model on the left of Fig. 9 gives a symmetrical curve at high temperatures. This can be done most simply by calculating the third moment (O1). Hence the model has to be expanded to predict the shape observed at high temperatures in the F center of KCl (K4).

As to the accuracy of the theory developed, the simple model (Sec. 9) does not predict the correct relation between $E^a$ and $E^e$, the absorption and emission energies. For example, in the F center of KCl, $S=30$, $\hbar\omega=0.012$ ev, and $E^a$ (at 0°K) $=2.3$ ev. From (4.2a), $E^e=E^a-2\hbar\omega S=1.6$ ev compared to the experimental value of 1.2 ev (V1). For this situation we may not let the frequency change be zero, and a more complex calculation is required (Sec. 10). For large values of $S$, we do not have to describe the potential curve over the entire range if we are interested only in emission or in absorption shapes. Returning to Fig. 2, the absorption shape is determined primarily by the potential curves about points "a" and "b." Likewise the emission shape is determined primarily from the curves about "c" and "d." Therefore, one would expect that the general features given in Sec. 9 will be observed. While the general features are given by the development, the calculations are not truly exact. The change in the frequency and higher-order terms must have a detectable influence. These corrections must somehow be used to explain the asymmetry at high temperatures. How to do this remains an unsolved problem.

In Secs. 10 and 18 several more general models are considered where various corrections are introduced in the expression for the moments. The effects on $\bar\epsilon$ and $m^2$ are, however, small and usually beyond the range of detection, except for the case where we have two radically different frequencies. We thus conclude that the basic equations have considerable validity. This field needs more reliable data and its complete analysis in terms of absorption and emission. In Sec. 11 some methods of analysis are suggested.
13. APPENDIX I. LIST OF SYMBOLS


Although an attempt was made to associate a single physical concept with every symbol, this has not always been possible; thus, $\epsilon$ has been used for an eigenvalue as well as for a photon energy. Similarly, $m$ has two distinct meanings. Usually subscripts have been employed to differentiate between physical concepts, or the symbols have been employed at distinct parts of the text. Gaussian units have been employed except for experimental values which are reported in electron volts. Only the important symbols are listed here.


$a$—Smakula's constant—ratio of the area under an absorption curve to the product $\alpha_m H$; also $(G_m H)^{-1}$
$b$—local field correction (Sec. 8)
$c$—velocity of light
$e$—charge on the electron
$f$—oscillator strength (8.16)
$g$—shape factor associated with items (a) and (b) of Sec. 7, as well as the dispersion of the modes
$\hbar$—Planck constant divided by $2\pi$
$h_e$—electronic Hamiltonian (3.3)
$h_v$—vibration Hamiltonian (3.4)
$k$—Boltzmann constant
$(k)$—the $k$th nucleus
$m$—see (7.3) except (8.16)
$q_j$—reduced coordinate of the $j$th normal mode (In Part II it is associated with the ground state.)
$\Delta q_j$—shift in the equilibrium position $q_j'-Q_j$ (3.24)
$p$—see (9.13)
pr—see (9.22)
pr—see (9.29)
$p_j$—momentum operator associated with $q_j$
$r$—coordinate of all the electrons of interest to the problem
$t$—kinetic energy operator associated with the excess electron
$v_j$—vibration quantum number associated with the $j$th mode, $v_j$ (ground), and $v_j'$ (upper)
$v$—total vibrational quantum number, i.e., $\sum_j v_j$
$\bar v$—thermal average quantum number of single mode
B—reference (B2)
$E$—total energy in vibrational state $v$ associated with the ground-electronic state
$E'$—total energy in vibrational state $v'$ associated with the upper-electronic state
E*—activation energy for emission

[E — thermal activation energy 

Fa—see (9.19)
$G_n$—normalized shape factor associated with the absorption or emission of phonons
$G_m$—maximum value of $G_n$
$H$—width at half-height (half-width) (7.4)
$\mathcal{H}$—total Hamiltonian [see (3.12), (3.13), (5.4), and (5.10) for various meanings]
$K$—see (10.1a)
LST—Lyddane-Sachs-Teller relation (see B, p. 86)
$M_n$—$n$th moment of the absorption curve (7.1)
$N$—number of imperfections per unit volume
$Q_j$—reduced coordinate of the $j$th mode associated with the upper state
$P(v)$—see (8.14)
$R$—coordinate of all the nuclei
$R_L$—value of $R$ which minimizes $V_L$
R,—see (3.4a)
$S$—Huang-Rhys factor $\tfrac12\sum_j\epsilon_j^2/\hbar\omega_j^3$ (4.3a) and (10.23)
$T$—kinetic energy operator for all the nuclei
$T'$—$T$ which only operates on the $\chi$'s
$T_e$—kinetic energy operator for all the electrons
$V$—total potential energy of the system
$V_e$—potential energy due to an excess electron
$V_I$—potential energy due to a point imperfection
$V_L$—potential energy of a perfect lattice
$X_\alpha(k)$—actual coordinate of the $k$th particle in the $\alpha$th direction
$\alpha$—absorption coefficient
$\alpha_m$—absorption coefficient at $\epsilon_m$
$\beta_j$—see (8.14b)
$\epsilon$—stands for an eigenvalue
$\epsilon_g$—ground state eigenvalue of $h_e$
$\epsilon_j$—see (3.18a)
$\epsilon_{jk}$—see (3.18b)
$\epsilon_n$—$n$th eigenvalue of $h_e$
$\epsilon_u$—upper state eigenvalue of $h_e$
$\epsilon_v$—vibrational eigenvalue
$\Delta\epsilon(R)=\epsilon_u(R)-\epsilon_g(R)$—see after (5.3) for special meaning in Secs. 5(a) and 14
$\Delta\epsilon$ or $\Delta\epsilon(\theta)=\Delta\epsilon(R_g)$ in Part II
$\epsilon$—also stands for a photon energy
$\epsilon_m$—photon energy for maximum absorption
$\bar\epsilon$—see (7.2)
$\theta$—absolute temperature
$\nu$—angular frequency of the absorbed light
$\nu_{ug}$—see (9.3a)
$\sigma$—absorption cross section
$\varphi_n$—electronic eigenfunction [$\varphi$ (ground) and $\varphi'$ (upper) are also used]
$\chi_j$, $\chi_j'$—ground- and upper-state vibrational eigenfunctions associated with the $j$th mode
$\chi$, $\chi'$—total vibrational eigenfunctions of the ground and excited states
[...] of the system
$\omega$—the effective angular frequency of the interacting phonons
$\omega_j$—angular frequency associated with the $j$th mode


14. APPENDIX II. MEYER-PEKAR METHOD 


This appendix relates the method developed for the trapped electron [Sec. 5(a)] to the techniques used previously. Meyer (M8) has made the most extensive calculation. We now split $H_0(A)$ of (5.2) as follows:

$$H_L=T'+V_I+V_L \tag{14.1}$$
$$H_e=t+V_e(R_L) \tag{14.2}$$

and

$$H_{Le}=V_e(R)-V_e(R_L). \tag{14.3}$$

The division of $H_0$ here is not identical to the one made previously. Equation (14.1) determines a set of normal modes, $A_j$, from an expansion about the point defined by

$$\left[\frac{\partial}{\partial X_\alpha(k)}(V_I+V_L)\right]_{R_L}=0. \tag{14.4}$$

There is a close similarity [see Eqs. (3.12a) and (3.12b)] between $H_L+H_e$ and $H_0(S)$ and between $H_{Le}$ and $H_1(S)$. The $H_0(S)$ and $H_1(S)$ here correspond to the impurity model of Sec. 5. The potential energy term in $h_e(R)$ has been modified slightly. Meyer refers to $H_{Le}$ as the lattice-center interaction. This term is actually given by $H_1(A)$.

An appropriate way to find a solution is to expand $H_{Le}$ as follows,

$$H_{Le}=\sum_j a_jA_j+\tfrac12\sum_{jk}b_{jk}A_jA_k. \tag{14.5}$$

We assume that the eigenfunctions $\varphi_n$ and eigenvalues $\Delta\epsilon_n$ of $H_e$ are known at $R_L$. The perturbation method is now used to obtain eigenvalues and eigenfunctions of $H_e+H_{Le}$ for any $R$. Thus,

$$\varphi_n(R)=\varphi_n+\sum_{j,m}\frac{(\varphi_m|a_j|\varphi_n)}{\Delta\epsilon_n-\Delta\epsilon_m}\varphi_mA_j+\text{higher terms}. \tag{14.6}$$

In the last term $m\neq n$, and we assume that the eigenvalues associated with $H_e$ are nondegenerate. The argument $R_L$ of $\varphi$ and $\Delta\epsilon$ on the right-hand side of (14.6) has been omitted. The higher-order terms which are proportional to $A_jA_k$ arise from the use of second-order perturbation theory as well as from the last term of (14.5).

The $\varphi_n(R)$'s must form an orthogonal set with respect to $r$ for a fixed value of $R$. This follows since the complete series (14.6) gives eigenfunctions of the Hamiltonian $H_e+H_{Le}$. This was not evident to Meyer.
Associated with (14.6) is the eigenvalue

$$\begin{aligned}\Delta\epsilon_n(R)={}&\Delta\epsilon_n+\sum_j(\varphi_n|a_j|\varphi_n)A_j+\tfrac12\sum_{jk}(\varphi_n|b_{jk}|\varphi_n)A_jA_k\\ &+\sum_{jk}\sum_m\frac{1}{\Delta\epsilon_n-\Delta\epsilon_m}(\varphi_n|a_j|\varphi_m)(\varphi_m|a_k|\varphi_n)A_jA_k.\end{aligned} \tag{14.7}$$

Terms of (14.7) may be identified with those which arise in an expansion about $R_L$. We assume proper convergence. In such an expansion, terms similar to those in (3.18) appear. Thus,

$$\epsilon_j=(\varphi_n|a_j|\varphi_n) \tag{14.8a}$$

and

$$\epsilon_{jk}=(\varphi_n|b_{jk}|\varphi_n)+2\sum_m\frac{1}{\Delta\epsilon_n-\Delta\epsilon_m}(\varphi_n|a_j|\varphi_m)(\varphi_m|a_k|\varphi_n). \tag{14.8b}$$

Equation (14.8a) is a form of (3.16). The first term in (14.8b) was omitted by Meyer. One may use the $\epsilon_{jk}$'s exactly as in Sec. 3 to determine the relation between the true $q$'s and the $A$'s. One should stress that $H_{Le}$ does not arise from a true electron-phonon interaction but occurs only because a fictitious definition of the phonon field was used. On occasion the $A$'s are assumed to have a very simple form, and the $\epsilon$'s obtained by these means correspond only very approximately with the real ones.


15. APPENDIX III. MEHLER’S FORMULA 


Mehler's formula, Eq. (9.4), is "well known" in the theory of Hermite polynomials and has been used since it was discovered in 1866. Its proof is not available in standard texts. The steps are rather simple, providing one interchanges two integrals with an infinite sum. A proof due to Hardy is presented. It originally appears in a paper by Watson (W1). We shall not concern ourselves with the convergence of the series, so the proof as presented is not completely rigorous. A standard integral of interest is

$$\int_{-\infty}^{\infty}\exp[-a^2u^2+ibu]\,du=\frac{\sqrt{\pi}}{a}\exp[-b^2/4a^2]; \tag{15.1}$$

hence,

$$H_v(x)=(-1)^v\exp[x^2]\frac{d^v}{dx^v}\exp[-x^2]=\frac{(-2i)^v\exp[x^2]}{\sqrt{\pi}}\int_{-\infty}^{\infty}u^v\exp[-u^2+2ixu]\,du. \tag{15.2}$$
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ELECTRON TRAPS 
One would like to examine the sum o 4 
2 Se 
D bxv(x)x0(y) a 
0 


1 
Ern T exp{ —3(x" +9") } eH, (x) Ho(y) ; 


(—2tuw)” 
= aena [= — 


Xexp{—w—w*+ 2ixu+2iyw}deudw. (15.3) 


For convenience, the usual $\pi^{1/4}$'s which appear in the
$\chi$'s have been omitted; v labels the quantum state of $\chi$.
Hardy's device is to change the order of the sum and
the integrals. Further, he replaces $\sum_{v}[(-2tuw)^{v}/v!]$ by
$\exp\{-2tuw\}$. The convergence of (15.3) is such that
this is permissible, provided $t^{2}<1$. Watson states this
requirement but does not give details. One can see that
(15.4) requires this relation. After this step, Eq. (15.1)
is used twice, going from right to left. Thus,


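$$\begin{aligned}
\sum_{v=0}^{\infty} t^{v}\chi_{v}(x)\chi_{v}(y)
&= \frac{1}{\pi}\exp\{\tfrac12(x^{2}+y^{2})\}\int\!\!\int \exp\{-u^{2}-2tuw-w^{2}+2ixu+2iyw\}\,du\,dw\\
&= \pi^{-1/2}\exp\{\tfrac12(x^{2}-y^{2})\}\int \exp\{-(1-t^{2})u^{2}+2i(x-ty)u\}\,du\\
&= (1-t^{2})^{-1/2}\exp\Big\{\tfrac12(x^{2}-y^{2})-\frac{(x-ty)^{2}}{1-t^{2}}\Big\}\\
&= (1-t^{2})^{-1/2}\exp\Big\{-\frac{(x^{2}+y^{2})}{2}\,\frac{1+t^{2}}{1-t^{2}}+\frac{2txy}{1-t^{2}}\Big\}. \tag{15.4}
\end{aligned}$$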


If we set $t=e^{-\xi}$ ($\xi>0$), Eq. (15.4) gives (9.4) after some
elementary transformations which use the properties 
of the hyperbolic functions. 
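
Since the proof above is admittedly not rigorous, a numerical comparison of the two sides of (15.4) is a useful check. The following is a minimal sketch in Python (not part of the original paper). It assumes the normalization used in this Appendix, $\chi_v(x)=(2^v v!)^{-1/2}\exp(-x^2/2)H_v(x)$ with the $\pi^{1/4}$ factors omitted; the function names are illustrative only.

```python
import math


def chi_sequence(x, n_terms):
    """Return [chi_0(x), ..., chi_{n_terms-1}(x)] with
    chi_v(x) = H_v(x) exp(-x^2/2) / sqrt(2^v v!)  (pi^(1/4) omitted).

    Uses the recurrence that follows from H_{v+1} = 2x H_v - 2v H_{v-1}:
    chi_{v+1} = sqrt(2/(v+1)) x chi_v - sqrt(v/(v+1)) chi_{v-1}.
    """
    chi = [math.exp(-0.5 * x * x)]
    if n_terms > 1:
        chi.append(math.sqrt(2.0) * x * chi[0])
    for v in range(1, n_terms - 1):
        nxt = (math.sqrt(2.0 / (v + 1)) * x * chi[v]
               - math.sqrt(v / (v + 1)) * chi[v - 1])
        chi.append(nxt)
    return chi


def mehler_sum(x, y, t, n_terms=200):
    """Left-hand side of (15.4): truncated sum over v of t^v chi_v(x) chi_v(y)."""
    cx = chi_sequence(x, n_terms)
    cy = chi_sequence(y, n_terms)
    return sum((t ** v) * a * b for v, (a, b) in enumerate(zip(cx, cy)))


def mehler_closed(x, y, t):
    """Right-hand side of (15.4); requires t^2 < 1."""
    d = 1.0 - t * t
    exponent = -0.5 * (x * x + y * y) * (1.0 + t * t) / d + 2.0 * t * x * y / d
    return math.exp(exponent) / math.sqrt(d)


if __name__ == "__main__":
    for (x, y, t) in [(0.0, 0.0, 0.6), (0.7, -0.3, 0.5), (1.5, 1.2, 0.8)]:
        print(x, y, t, mehler_sum(x, y, t), mehler_closed(x, y, t))
```

The truncated sum converges like $t^{v}$, which is the numerical counterpart of the requirement $t^{2}<1$ stated above.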


16. APPENDIX IV. PEKARIAN CURVE 


Properties of Pekarian curves, Eq. (9.21), are
discussed here. This equation gives the relative probability
for a transition from $v=0$ to any $v'=p$ (Eq.
9.15). Since the number of modes does not enter into
the problem, we again use the CC model as an illustration.
The Pekarian shape is not limited to this case
but also applies to the Fröhlich and LNM models. The
only serious limitation is the use of Eq. (9.7).

Absorption shapes do not seem to be completely
symmetric (K2a, p. 118), and one may inquire as to the
symmetry properties of the Pekarian curve. Consider
first the case where S=20. The most probable transitions
occur for $v'=S-1$ and $S$. If one assumes that
the transition probability for these jumps is unity, then
one may obtain the properties in Table VI. The transition
probabilities do not lie symmetrically about the
midpoint, which is very nearly at 19.5.

TABLE VI. The symmetry properties of a Pekarian curve, S=20.

p                        23               22           21      20   19   18      17           16
(S^p/p!)/(S^S/S!)        20³/(21·22·23)   20²/(21·22)  20/21   1    1    19/20   19·18/20²    17·18·19/20³
Transition probability   0.7529           0.8658       0.9524  1    1    0.9500  0.8550       0.7268

To stress the asymmetry of the problem, Fig. 9 is
included. The left-hand side is similar to Fig. 6 except
that the upper state has been shifted twice as much,
and only the case where $v=0$ is considered. On the
right-hand side $|\{\chi'|\chi\}|^{2}$ is plotted against $v'$. On the
x axis $v'$ increases in going from right to left, in agreement
with experimental procedure. It is evident
that $\epsilon_{v}-\epsilon_{m}\neq\epsilon_{m}-\epsilon_{r}$, so that the Pekarian curve is
asymmetric.

In Fig. 10 three Pekarian curves for S=3, 10, and
20 are plotted. On the curves the ratio of $\epsilon_{v}-\epsilon_{m}$ to
$\epsilon_{m}-\epsilon_{r}$ is noted. These values show that the curve is
slightly lopsided for all values of S and resembles a
double Gaussian. The violet side is indeed almost a
perfect Gaussian.

To calculate various factors associated with a
Pekarian curve, we use (9.17) and let $\theta=0$. The following
formula is obtained:

$$|\{\chi_{p}'|\chi_{0}\}|^{2} = (S^{p}/p!)\,e^{-S}. \tag{16.1}$$

0 and p are the total vibrational quantum numbers of
the states $\chi$ and $\chi'$. The term $1/\omega$ has not been included.
It determines the width (or height) of the curve. Also,
we set $\omega=\hbar\omega=1$. Using (10.9) and (10.24b) we note
that

$$M_{0} = \sum_{p}|\{\chi_{p}'|\chi_{0}\}|^{2} = 1 \tag{16.2}$$

and

$$m^{2} = S. \tag{16.3}$$

$M_{0}=1$; it is temperature independent [Eq.
(10.9)]. Hence, we know that the relation

$$\sum_{v}P(v)\sum_{v'}|\{\chi_{v'}'|\chi_{v}\}|^{2} = 1 \tag{16.4}$$

holds for any $\theta$. Now, since

$$\sum_{v}P(v) = 1, \tag{16.5}$$

we conclude that

$$\sum_{v'}|\{\chi_{v'}'|\chi_{v}\}|^{2} = 1 \tag{16.6}$$

for any v. This is a sum-rule which has one surprising
feature: it does not involve the frequency. In atomic
spectra, the sum-rules involve the transition frequency.
This means that we may not accept the atomic sum-rules
without further careful considerations.

By numerical calculation we obtain Table VII. Here
the lines of Fig. 9 have been replaced by the envelope
curve. Column 3 is the value of the absorption at the
peak (at $p_{1}$).†††† The fifth column is the area (planimeter)
under the curves when the factor $e^{-S}$ is included.
This area might be slightly larger than unity because
we are replacing a sum by a smooth curve. At $S-\tfrac12$
the envelope is larger than the values at S or at S-1.
To obtain the sixth column, use was made of (16.3), and
the seventh is obtained from the definition of Smakula's
constant found in Sec. 7, combined with (16.2). Columns
6 and 7 should be compared with the values found for
the Gaussian shape, that is, 5.545 and 1.064. The
Lorentzian value of $a_{s}$ is 1.57. The deviation from the
Gaussian values is very small. The variation for small
values of S is probably real; however, we attach no
importance to the slight deviation found in the last
four rows. One may readily superimpose various
Pekarian curves unless S is smaller than 10.

TABLE VII. Some properties of Pekarian curves.

S     p₁     α_max    H       Area   H²/m² = H²/S   a_s(P)
3     2.5    0.229    4.18    1      5.82           1.065
10    …      …        …       …      …              1.07
20    19.5   0.0892   10.56   1.01   5.58           1.07
…     …      0.0728   12.02   1.01   5.56           1.07
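
The entries of Table VI and the sum rule can be reproduced directly from the Pekarian weights of Eq. (16.1). The sketch below is illustrative Python, not part of the original paper; it assumes the zero-temperature form $S^{p}e^{-S}/p!$ with integer p, and the function names are hypothetical.

```python
import math


def pekarian(p, S):
    """Zero-temperature Pekarian weight S^p exp(-S) / p!, Eq. (16.1).

    Evaluated through logarithms so that large S and p do not overflow.
    """
    return math.exp(p * math.log(S) - S - math.lgamma(p + 1))


def relative_probabilities(S, p_values):
    """Weights normalized to the value at p = S, as in Table VI."""
    reference = pekarian(S, S)
    return {p: pekarian(p, S) / reference for p in p_values}


def total_weight(S, p_max=400):
    """Sum of the weights over p; the sum rule requires this to be unity."""
    return sum(pekarian(p, S) for p in range(p_max + 1))


if __name__ == "__main__":
    table = relative_probabilities(20, [23, 22, 21, 20, 19, 18, 17, 16])
    for p, value in table.items():
        print(p, round(value, 4))
    print("sum of weights:", total_weight(20))
```

Lines equally far from the midpoint 19.5 (for example p=23 and p=16) come out with different weights, which is the asymmetry discussed in the text.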


17. APPENDIX V. EFFECTS WHICH OCCUR 
WHEN f IS NOT A CONSTANT 


We now consider what happens to the equations of
Sec. 10 when the simplifying assumptions regarding f
are removed. One might use the Lax-Meyer method
throughout, but the algebraic complexities become too
involved. In Sec. 10 we replaced $\exp(iH_{u}t/\hbar)\exp(-iH_{g}t/\hbar)$
by $\exp[i(H_{u}-H_{g})t/\hbar]$. It is correct
only to terms in $(H_{u}-H_{g})^{2}$. This arises because
$\exp(iH_{u}t/\hbar)$ is only a shorthand notation for an infinite
series and the terms in the two sums do not in general
commute. Hence, this method is not used in the first
section of this Appendix.

FIG. 10. Three Pekarian curves for various values of S. Note the lack of symmetry. (Ordinate: relative absorption; abscissa: phonons created.)

†††† Since no fine structure is observed p will not be limited to
integer values.

The two problems are: what happens to the expression
for $\bar\epsilon$ and $m^{2}$ when f becomes a linear function of
the frequency, or a linear function of the q's?


(a) Oscillator Strength Depends on
the Frequency


First, we consider the effects of having the oscillator 
strength proportional to the frequency. For this 
purpose, the dipole matrix M is introduced.‡‡‡‡ It is
assumed to be a constant. Further, we limit our con- 
siderations to the case where one may assume a single 
frequency and omit the ¢;,’s, i.e., the situation treated 
in Sec. 9. 

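Combining (8.16), (8.18), and (9.8) results in

$$\alpha(\nu) = bN\,\frac{4\pi^{2}\nu}{3\hbar c}\,|(\varphi'|e\mathbf{r}|\varphi)|^{2}\,G(\nu)
= K'M^{2}\nu\,\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i(\nu_{ug}-\nu)t}
\exp\{-S[\coth\tfrac12\beta - i\sin\omega t - \coth\tfrac12\beta\,\cos\omega t]\}\,dt, \tag{17.1}$$

where

$$K' = 4\pi^{2}bN/3\hbar c \tag{17.1a}$$

and

$$M = (\varphi'|e\mathbf{r}|\varphi). \tag{17.1b}$$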


Since our interest is in the moments of the distribution, 
we assume that the absorption is a series of delta 
functions. 

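Taking the Fourier transformation of $\alpha(\nu)/\nu$, one
arrives at

$$\int_{-\infty}^{\infty}[\alpha(\nu)/\nu]\,e^{i\nu t}\,d\nu = K'M^{2}e^{i\nu_{ug}t}
\exp\{-S[\coth\tfrac12\beta - i\sin\omega t - \coth\tfrac12\beta\,\cos\omega t]\}. \tag{17.2}$$

From this it follows that

$$M_{0} = (\hbar/i)(\partial/\partial t)\Big\{\int[\alpha(\nu)/\nu]\,e^{i\nu t}\,d\nu\Big\}_{t=0}
= \hbar K'M^{2}\{\nu_{ug}+S\omega\} = K'M^{2}\Delta\epsilon = K\hbar f, \tag{17.3}$$

where

$$f = 2m\bar\nu M^{2}/\hbar e^{2} \tag{17.3a}$$

and

$$\bar\nu = \Delta\epsilon/\hbar. \tag{17.3b}$$

Use has been made of (10.1a) and (10.14). (17.3)
should be compared to (10.9). Further,

$$M_{1} = -\hbar^{2}(\partial^{2}/\partial t^{2})\Big\{\int[\alpha(\nu)/\nu]\,e^{i\nu t}\,d\nu\Big\}_{t=0}
= K'M^{2}[(\Delta\epsilon)^{2}+\hbar^{2}\omega^{2}S\coth\tfrac12\beta]
= K'M^{2}\Delta\epsilon[\Delta\epsilon+(\hbar^{2}\omega^{2}/\Delta\epsilon)S\coth\tfrac12\beta] \tag{17.4}$$

and

$$M_{2} = -(\hbar^{3}/i)(\partial^{3}/\partial t^{3})\Big\{\int[\alpha(\nu)/\nu]\,e^{i\nu t}\,d\nu\Big\}_{t=0}
= K'M^{2}\Delta\epsilon[(\Delta\epsilon)^{2}+3\hbar^{2}\omega^{2}S\coth\tfrac12\beta+(\hbar^{3}\omega^{3}/\Delta\epsilon)S]. \tag{17.5}$$

Further,

$$\bar\epsilon = \Delta\epsilon+(\hbar^{2}\omega^{2}/\Delta\epsilon)S\coth\tfrac12\beta \tag{17.6}$$

and

$$m^{2} = \hbar^{2}\omega^{2}S\{\coth\tfrac12\beta+(\hbar\omega/\Delta\epsilon)
-[\hbar^{2}\omega^{2}/(\Delta\epsilon)^{2}]\,S\coth^{2}\tfrac12\beta\}. \tag{17.7}$$

‡‡‡‡ For simplicity of notation M is considered to be a real
scalar.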
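
The moments (17.3) to (17.5) follow from successive time derivatives of the transform (17.2) at t=0, so they can be checked numerically by finite differences. The sketch below is illustrative Python, not part of the original paper. It assumes units with $\hbar=1$ and $K'M^{2}=1$, writes `nu_ug` for the frequency appearing in the phase factor of (17.2), and takes $\coth\tfrac12\beta$ as a plain input number; all function names are hypothetical.

```python
import cmath
import math


def transform(t, nu_ug, omega, S, coth_half_beta):
    """Right-hand side of (17.2) with K'M^2 = 1 (assumed units)."""
    c = coth_half_beta
    g = -S * (c - 1j * math.sin(omega * t) - c * math.cos(omega * t))
    return cmath.exp(1j * nu_ug * t + g)


def moments_numeric(nu_ug, omega, S, c, h=1e-3):
    """M0, M1, M2 from central finite differences at t = 0 (hbar = 1)."""
    F = lambda t: transform(t, nu_ug, omega, S, c)
    d1 = (F(h) - F(-h)) / (2.0 * h)
    d2 = (F(h) - 2.0 * F(0.0) + F(-h)) / h ** 2
    d3 = (F(2 * h) - 2.0 * F(h) + 2.0 * F(-h) - F(-2 * h)) / (2.0 * h ** 3)
    return (d1 / 1j).real, (-d2).real, (-d3 / 1j).real


def moments_closed(nu_ug, omega, S, c):
    """Closed forms (17.3)-(17.5) with hbar = 1, so delta_eps = nu_ug + S*omega."""
    de = nu_ug + S * omega
    M0 = de
    M1 = de * (de + omega ** 2 * S * c / de)
    M2 = de * (de ** 2 + 3.0 * omega ** 2 * S * c + omega ** 3 * S / de)
    return M0, M1, M2


def mean_and_width_closed(nu_ug, omega, S, c):
    """Mean energy (17.6) and second central moment (17.7), hbar = 1."""
    de = nu_ug + S * omega
    mean = de + omega ** 2 * S * c / de
    width_sq = omega ** 2 * S * (c + omega / de - (omega / de) ** 2 * S * c ** 2)
    return mean, width_sq


if __name__ == "__main__":
    args = (5.0, 1.0, 2.0, 1.5)  # arbitrary illustrative parameters
    print(moments_numeric(*args))
    print(moments_closed(*args))
    print(mean_and_width_closed(*args))
```

With these arbitrary parameters the finite-difference moments agree with the closed forms, and $M_{2}/M_{0}-(M_{1}/M_{0})^{2}$ reproduces (17.7), including the negative $\coth^{2}\tfrac12\beta$ term.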


Equation (17.6) shows that $\bar\epsilon$ shifts to the violet with
rising temperatures. The shift is very small, since
$\hbar^{2}\omega^{2}S/\Delta\epsilon$ is of the order of 0.0015 ev. If $\hbar\omega$ were 0.08 ev
and S=30 this term would be observable. This might
occur in LiF if the transverse optical mode were important
for the interaction. $m^{2}$ now has a constant term
and a term proportional to $\coth^{2}\tfrac12\beta$. The negative sign
on this term means that $m^{2}$ can be negative at very high
temperatures. This anomalous situation arises from
the mathematical formulation and has nothing to do
with actual emission. By returning to (17.1) we see that
$\alpha(\nu)<0$ if $\nu<0$. Since the mathematical techniques
require integration from $-\infty$ to $+\infty$, the integral
$\int_{-\infty}^{\infty}(\nu-\bar\nu)^{2}\nu G(\nu)\,d\nu$ can be negative in spite of the
fact that $(\nu-\bar\nu)^{2}>0$. The author expects that both
additional terms in (17.7) are very small and will not
be detected experimentally.

The form (17.7) is not valid at extremely hig 
temperatures, say 5000°K, if (ñw)?S/(^Ae)? is of © 
order of 10~*. For the F center in KCI this term is 
the order of 8X10, and our interests are limited 
temperatures below 600°K. The above calculations 
most probably valid under these conditions. 
indicate that the assumption that f is indepe 
of v does not have an important influence — 
moments. In the language of Dexter (D1, p. 
have shown when the “narrow-band appr 
is valid 


In the case where f? iremis on ion 3 
the solution i is straightforward but ext | 


— 7 
— OE 


MAF on 
made without any profound theoretical justification.
In (17.8) q is the coordinate associated with the ground
state. The normal mode for the excited state is taken
in the form $q-\Delta q=Q$. Further, we again assume that
the $c_{jk}$'s are very small so that the frequencies in both
states are equal. As before, $f^{1/2}$ is taken as real; further,
our considerations will be limited to terms in $f_{1}^{1/2}$.
Equation (10.6) cannot be used, and we require some
more complex equations given by Lax (L1), namely,


{x| Tull o|x} ={x| Toul x} 


when f is independent of q. 
To establish (17.12) we require some of the properties 
of the x’s. One may show (M7, p. 121 and 358)§§§§ 


(0/09)xv= (w/h){ (30) xXx C3 (0+ 1) xea} (17.13) 
gx = (h/o) {E (0-1) xet (0/2) *xv-1}. (17.14) 
Hence 


(17.12) 


i Mo=hK Av{x| {|x} (17.9) 
M\=hK Av{x| fHu f'—fHy|x} (17.10) 
and 

Mo:hK Av{x| fief 

i —2fH fH +H el. (17.11) 
l Equation (17.10) is equivalent to (10.12) provided 

fi 

i 


{x]¢(0/0q) |x} = —} (17.15) 


and 
al TPV uol = {xl Vu Tix}. (17.16) 


To establish (17.16) we have used (3.26) but placed 
no restriction on the s; hence, (17.16) applies to the 
ease considered in Sec, 10 and is the justification for 
Eq, (10.12), A similar argument has not been found 
for terms in (47,—#,)§, and the technique of Sec. 10 
requires lengthy and complex algebra. 
B Returning to (17.8), we note from (17.13) and 
= (171P that our interest is limited to even terms of the 
gs, Le, such as ¢ or g(@/ ag), not ¢ or 2, 8g. Hence, 
ubstituting (17.8) in (17.9), one obtains 


a Mo=AK fo. (17.17) 


“Had terms in J been included, we would have found 
that Wa is temperature dependen’. Now, substituting 


a £ ee (1740), we obtain 

SA ~My = AK Avtx| fol Viu- Ve) F 
ee ajap e (17.18) 
(17.19) 


= 


de—akdgfolyit cothis) 
à Een] conga, (17.19a) 
and (d) of Fig. 2. The ratio [M(d)-M(a)]/M(a) must
be of the order of 0.1 or much less, probably much less.
The second term of (17.19a) cannot possibly be larger
than 0.001, and we would not expect to detect it
experimentally. Further,


Mo=nK Av{x|fo(Va- Vo) +è ft LTV aq + Valg ‘ 
qT Vu—3q VT +2q(Vu—V 9)" |x}. (£7.20) 


Since 
{x| TV uq]x} ={xlqVuT |x} 


{x|VuTq|x3 ={xlgTVulx} 
= {x|qVuT |x} E (h?/2)w°Aq, 


(17.21a) 


(17.21b) 
we have 


M2=hK Av{x|(Vu-Vo)*fotfoife[2q(Vu— Vo) 


—h'uwrAg||x}. (17.22) 
The third term gives rise to a temperature independent 


factor in m*. Taking the indicated average in (17.22) 
results in 


M =AK{ fol (Ac)? +7?wS coth] 


— Agf f Aho coth38+ha }}. (17.23) 

Returning to (7.3) we obtain 
m= S cotht8—Agq(fi}/ fo!) J. (17.24) 
aa 


To estimate the magnitude of the second term we again 
use Fig. 2. From the definitions of the f’s and Ag one 
obtains 


Agfi}/ fo'=[M(d)—M (a) /M (a). (17.25) 


The maximum correction term that one may expect 
in (17.24) is 0.1. Such small values could barely be 
detected experimentally (Sec. 10). Meyer (M8) 
suggested that this term equals — 10. Such large values 
cannot be explained readily by the theory. 

The conclusion is rather straightforward. The 
assumptions used in arriving at (8.18) are most prob- 
ably oversimplified. However, they do not produce 
major changes in the moments of the distribution or 
in the shapes. Some corrections to the moments have 
been worked out here. They are certainly too small to 
be measured, and the author cannot envisage any
theoretical calculation which will be sufficiently 
accurate to determine them. One may show experi- 
mentally, however, that Ms has a complex temperature 
dependence (K4). The theoretical reason for this is at 
present unknown. The author believes this will come 
out of a more realistic model. 
1. INTRODUCTION


The mechanism of energy transfer from one constituent
to the other in solids, solutions, and
gaseous mixtures has created considerable interest during
recent years. Until very recently these discussions¹⁻³
were confined mostly to the inorganic phosphors in
general, and alkali halides in particular. Along with the
inorganic atoms and molecules, more than a few experimental
results on benzene and benzene derivatives are
also reported in the literature.⁴⁻⁶ For instance, the


1 F. A. Kröger, Some Aspects of the Luminescence of Solids
(Elsevier Publishing Company, Inc., Houston, Texas, 1948).
2 H. W. Leverenz, An Introduction to Luminescence of Solids (John
Wiley & Sons, Inc., New York, 1950).
3 P. Pringsheim, Fluorescence and Phosphorescence (Interscience
Publishers, Inc., New York, 1949).
4 M. Furst and H. Kallmann, Phys. Rev. 94, 503 (1954).
5 M. Furst and H. Kallmann, Phys. Rev. 97, 583 (1955).
6 … and H. Kallmann, Phys. Rev. …, 607 (1955).


fluorescence of indigo vapor is not excited by light of 
wavelength between 2700 to 2400 A, but in the pres- 
ence of aniline vapor it gives violet indigo fluorescence. 
The incident light is first absorbed by the aniline vapor 
and then the absorbed energy is transferred to the 
indigo vapor. Similarly, aniline in presence of benzene 
vapor gives strong fluorescence under weak excitation. 
Zinc radiation at 2500 A produces only a very weak 
fluorescence in aniline vapor extending from 2850 to 
2500 A. In a mixture of aniline and benzene vapor the 
benzene fluorescence is weak though it is easily excited 
by primary light of wavelength 2500 A, while the aniline 
fluoresces strongly. Apparently, the transfer of excita- 
tion energy from the benzene to the aniline molecules 
has high efficiency. These are instances of sensitized 
fluorescence which are commonly described as the 
process whereby an impurity atom or molecule (acti- 
vator) devoid of any appreciable absorption band in a 
given spectral region is made to radiate upon absorp- 
tion of excitation energy in this region by the host 
lattice or an impurity atom or molecule (sensitizer) 
embedded in it. In all cases of sensitization, the problem 
which is of the most fundamental significance is as 
follows: What is the mechanism of propagation of the 
excitation energy from the sensitizer to the activator 
and its ultimate transfer? Where the sensitizer and the 
activator are close to each other, the excitation energy 
does not have to move far and the processes of propaga- 
tion and transfer both take place at nearly the same 
position in the lattice. There are also cases of sensitized 
luminescence where absorption of the exciting energy 
takes place all over the matrix lattice while the lumi- 
nescence transitions occur in or near the few activators. 
These cases are characterized by propagation of energy 
through the lattice over a considerable distance before 
transfer of the excitation energy can occur. There are
four distinct processes by which transfer of the excitation
energy can take place. They are (a) photon absorption
and reabsorption, known as photon cascade; (b)
collision; (c) exciton migration; and (d) electron and
hole migration. We discuss them briefly, bringing out
the conditions to which each is subject. Particular 
emphasis is given to the discussion of anthracene- 
naphthacene and anthracene-naphthalene systems in 
crystals. 


(a) Photon Absorption and Reabsorption 


This mechanism of energy transfer, also called the
photon cascade process, becomes of quantitative importance
in a system if there exists strong overlap
between the emission and the absorption spectra and
if the system has a high absorption coefficient in the
area of overlap, comparable with that at the wavelength
of excitation. The system may be a pure one containing
no foreign constituents, the region of overlap being
between the emission and the absorption spectra of the
constituents of the pure system. If the system is an
impure one, containing impurity atoms or molecules
embedded in it, the region of overlap may be provided
by the emission spectrum of the pure system and the
absorption spectrum of impurity constituents. If we
consider such a crystal system, the part of the primary
molecular fluorescence that is emitted in the region of
overlap is practically reabsorbed by the crystal within
a small fraction of its thickness. This reabsorbed radiation
is re-emitted as fluorescence by the molecules which
absorbed it. A fraction of it suffers reabsorption by
succeeding layers of molecules; these processes of
emission and reabsorption follow each other until all
the excitation energy is either transformed into fluorescence
outside the region of overlap, so as to escape from
the crystal as technical fluorescence, or is thermally
dissipated by internal conversion.

A quantitative formulation of the mechanism was
developed by Birks⁷ to explain the photofluorescent
and scintillation properties of organic materials. Let us
consider a mixed crystal containing solute molecules A
immersed in solvent molecules S. On the photon cascade
theory, the processes shown in Table I occur in the
crystal, assuming that the exciting radiation is absorbed
totally by S and that S does not absorb the fluorescence
radiation of A.

When only processes 2 and 3 (see Table I) compete,
there results the molecular fluorescence of S. Its quantum
efficiency $n_{0s}$ is given by

$$n_{0s} = \frac{1}{1+(K_{is}/K_{fs})}. \tag{1}$$

When the competing processes are described by 2, 3,
4, 5, and 9, the primary fluorescence of S is reabsorbed
and re-emitted and only a part of the primary fluorescence
of S escapes from the crystal as technical fluorescence.
The quantum efficiency $n_{es}$ of this is given by

$$n_{es} = \frac{n_{0s}}{1+(K_{s}/K_{es})+(K_{a}/K_{es})[A]}
\Big\{1+n_{0s}K_{s}+(n_{0s}K_{s})^{2}+\cdots\Big\}
= \frac{n_{0s}}{1+(1-n_{0s})(K_{s}/K_{es})+(K_{a}/K_{es})[A]}. \tag{2}$$

7 J. B. Birks, Phys. Rev. 94, 1957 (1954).
TABLE I. Photon transfer process.

Process                 Relative rate (a)      Description of process
1. S + hν → S*          1                      Excitation by light
2. S* → S + hν_s        K_fs[S*]               Fluorescence of S
3. S* → S               K_is[S*]               Internal quenching of S
4. S + hν_s → S*        K_s K_fs[S*]           Self-absorption of S
5. A + hν_s → A*        K_a K_fs[S*][A]        Absorption of A
6. A* → A + hν_a        K_fa[A*]               Fluorescence of A
7. A* → A               K_ia[A*]               Internal quenching of A
8. A + hν_a → A*        K_a K_fa[A*][A]        Self-absorption of A
9. hν_s →               K_es K_fs[S*]          Escape of hν_s
10. hν_a →              K_ea K_fa[A*]          Escape of hν_a

(a) Here brackets stand for concentrations in molecules per solvent
molecule, S* and A* for excited molecules of S and A, and K = (1/t) for the
probability of a process of decay time t, i.e., the number of events describing
the process that happen per unit time.


In the absence of internal quenching $K_{is}=0$ and
$n_{0s}=n_{0a}=1$, so that (2) is reduced to

$$n_{es} = \frac{1}{1+(K_{a}[A]/K_{es})}. \tag{3}$$

The quantum yield for transfer, i.e., the ratio of the
number of photons transferred per second from S to A
to the number of photons absorbed by S per second,
neglecting nonradiative transitions in S, becomes, in
the case of molecular fluorescence,

$$n_{0T} = K_{a}[A]/K_{s}. \tag{4}$$

The molecular fluorescence quantum efficiency $n_{0a}$ of A
is given by

$$n_{0a} = \frac{1}{1+(K_{ia}/K_{fa})}. \tag{5}$$


The quantum efficiency (nea) of escape of (A) technical 
fluorescence is given by 


NesNOA (Ka/Kes) [A J 


"a TE Gen Ca a 


(6) 


In the absence of internal quenching, $K_{ia}=0$ and
$n_{0s}=n_{0a}=1$ so that we have for (6)

$$n_{ea} = n_{es}(K_{a}/K_{es})[A] = 1-n_{es}
= \frac{(K_{a}/K_{es})[A]}{1+(K_{a}/K_{es})[A]}. \tag{7}$$

Here the quantities $K_{es}$, $K_{s}$, and $K_{a}[A]$ are interrelated
by

$$K_{es}+K_{s}+K_{a}[A] = 1.$$


They are dependent upon the relative absorption
coefficients of S and A for the fluorescence of S and on
the crystal size.
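
In the limit of no internal quenching, Eqs. (3) and (7) split the escaping fluorescence between solvent and solute as a function of the solute concentration. The sketch below is illustrative Python, not part of the original paper. It treats the ratio $K_{a}/K_{es}$ as a single constant parameter `ka_over_kes`, which is an assumption made only for illustration, since the text notes that these quantities depend on the absorption coefficients and the crystal size.

```python
def solvent_technical_efficiency(conc_a, ka_over_kes):
    """n_es of Eq. (3): fraction of quanta escaping as S fluorescence."""
    return 1.0 / (1.0 + ka_over_kes * conc_a)


def solute_technical_efficiency(conc_a, ka_over_kes):
    """n_ea of Eq. (7): fraction of quanta escaping as A fluorescence."""
    x = ka_over_kes * conc_a
    return x / (1.0 + x)


if __name__ == "__main__":
    ratio = 1.0e4  # hypothetical value of K_a/K_es, for illustration only
    for conc in (0.0, 1.0e-5, 1.0e-4, 1.0e-3):
        n_es = solvent_technical_efficiency(conc, ratio)
        n_ea = solute_technical_efficiency(conc, ratio)
        print(conc, n_es, n_ea, n_es + n_ea)
```

The two efficiencies always add to unity in this limit, and they are equal when $[A]=K_{es}/K_{a}$.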
In pure crystal phosphors, processes 1, 2, 3, 4, and 9
are in operation. The molecular quantum efficiency $n_{0}$
(suffix S is now dropped) remains unchanged while the
technical quantum efficiency $n_{e}$ takes the form

$$n_{e} = \frac{n_{0}}{1+(1-n_{0})(K_{s}/K_{e})} = \frac{1}{1+(K_{i}/K_{e}K_{f})}
\qquad [\because\ K_{s}+K_{e}=1]. \tag{8}$$

If internal quenching is not present, the molecular
photofluorescence decay time $(t_{f})_{00}=1/K_{f}$; internal
quenching, if present, reduces it to $(t_{f})_{0}$ given by

$$(t_{f})_{0} = \frac{(t_{f})_{00}}{1+[(t_{f})_{00}/t_{i}]} = n_{0}(t_{f})_{00}. \tag{9}$$

In case of a large thick crystal, stepwise self-absorption
lengthens the duration of the interval between the
moment of primary excitation and the moment of
escape of fluorescence so that the technical photofluorescence
decay time $t_{f}$ is increased to

$$t_{f} = \frac{(t_{f})_{00}}{K_{e}+[(t_{f})_{00}/t_{i}]}. \tag{10}$$

When internal quenching is small enough to make
$(t_{f})_{00}\ll t_{i}$, (8) and (10) may be expressed approximately
as follows:

$$t_{f} = N(t_{f})_{0} \tag{11}$$

and

$$n_{e} = (n_{0})^{N}. \tag{12}$$

Here $N=1/K_{e}$ represents the average number of
steps of molecular emissions passed before escape of
technical fluorescence. The molecular and technical
photofluorescence quantities $(t_{f})_{0}$ and $t_{f}$ can be obtained
from measurements on microcrystalline and
thick crystal specimens and thus N can be determined.
Comparison of the molecular and the technical photofluorescence
spectra leads to a determination of N.

(b) Collision Process 


This mechanism was proposed by Cario and Franck⁸
to account for the disappearance of fluorescence of
a gas due to the presence of foreign gas molecules. In
this process, when an excited atom or molecule collides
during the lifetime of its excited state with a slow electron,
it may give away its excitational energy to the
electron, without any radiation, the products of the
collision being an electron with enhanced kinetic
energy and an unexcited atom or molecule. When the
collision partner is a normal atom or molecule, the excited
atom or molecule may give away a quantum of
energy $(E_{2}-E_{1})$ to the unexcited one and the latter
may take up the energy either as energies of translation or
excitation or both, there being no loss of energy
in the process as radiation. Such radiationless
transfer of excitation energy is called a collision of the
second kind, as distinguished from a collision of the
first kind in which excitation is produced. The mechanism,
demonstrated in gases by Franck, Cario, and
many others, has been shown to take place also in some
organic solutions.⁹

8 G. Cario and J. Franck, Z. Physik 17, 202 (1923).

The quantum-mechanical theories of this transfer
process as formulated by Perrin¹⁰ and Förster¹¹ are
based on resonance between allowed electric dipole
transitions in the sensitizer and the activator. The
resonance theory is extended by Dexter¹² to include
cases of energy transfer in which an activator having a
suitable emission spectrum can be made to fluoresce
with energy transferred from an absorbing sensitizer
despite the fact that the activator is forbidden direct
absorption of energy. The transfer mechanisms of
importance are by overlap of the electric dipole fields of
the sensitizer and the activator, overlap of the dipole
field of the former and the quadrupole field of the latter,
and exchange effects, producing, respectively, sensitization
of about $10^{3}$-$10^{4}$, about $10^{2}$, and about 30 lattice
sites in a crystal surrounding each sensitizer. Dexter's
theory is applicable to impurity sensitization both in
solid and liquid systems. He adopts the physical model
of a crystal in which the sensitizing impurity atoms or
ions, of atomic concentration $x_{s}$, and the activators, of
atomic concentration $x_{a}$, are randomly arranged at
suitable lattice sites, without mutual interaction. The
time-dependent perturbation method of quantum
mechanics gives the probability of transition of a
system from its initial to final state as

$$(2\pi/\hbar)\,|(H_{1})|^{2}\rho_{E}, \tag{13}$$

where $\hbar$ is Planck's constant divided by $2\pi$, $(H_{1})$ the
matrix element of the perturbation to the Hamiltonian,
and $\rho_{E}$ the density of final states. In this case the initial
state $\Psi_{1}$ is identified with the configuration of the system
in which the sensitizer S is in the excited state $\psi_{s}'$
while the activator A is in its ground state $\psi_{a}$, and the
final state $\Psi_{2}$ is the configuration in which S is in its
ground state while A is excited to the state $\psi_{a}'$. Thus the
probability that the energy of excitation is transferred
from a particular sensitizer S to a particular activator
A has the form

$$P_{sa} = (2\pi/\hbar)\,\rho_{E}\Big|\int\Psi_{1}^{*}H_{1}\Psi_{2}\,d\tau\Big|^{2} \tag{14}$$

as long as the states $\Psi_{1}$ and $\Psi_{2}$ are of the same energy.
To take account of the lack of definition of the initial
and final levels of S or A caused by lattice vibrations,
the wave functions are normalized on an energy scale
and the density of states factor $\rho_{E}$ is included within the
normalization parameters.

9 J. Franck and R. Livingston, Revs. Modern Phys. 21, 505 (1949).
10 F. Perrin, J. phys. radium 7, 1 (1936).
11 Th. Förster, Ann. Physik 2, 55 (1948); Z. Elektrochem. 53, 93 (1949).
12 D. L. Dexter, J. Chem. Phys. 21, 836 (1953).

It is convenient to normalize
the initial states of S and A ($\psi_{s}'$ and $\psi_{a}$) and to express
by the functions $p_{s}'(w_{s}')$ and $p_{a}(w_{a})$ the probabilities
that S is in the particular energy state denoted by
$w_{s}'$ and that A is in the state $w_{a}$, so that


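$$\int|\psi_{s}'(w_{s}')|^{2}\,d\tau = \int|\psi_{a}(w_{a})|^{2}\,d\tau = 1,\qquad
\int_{0}^{\infty}p_{s}'(w_{s}')\,dw_{s}' = \int_{0}^{\infty}p_{a}(w_{a})\,dw_{a} = 1. \tag{15}$$

The final states of S and A ($\psi_{s}$ and $\psi_{a}'$) are normalized
on the energy scale by the following:

$$\frac{1}{\Delta w}\int_{w}^{w+\Delta w}dw_{s}\int|\psi_{s}(w_{s})|^{2}\,d\tau = 1,\qquad
\frac{1}{\Delta w}\int_{w}^{w+\Delta w}dw_{a}'\int|\psi_{a}'(w_{a}')|^{2}\,d\tau = 1. \tag{16}$$

The properly antisymmetrized initial and final state
wave functions describing S and A can now be written as

$$\Psi_{1}(w_{s}',w_{a}) = \frac{1}{\sqrt2}
\begin{vmatrix}\psi_{s}'(\mathbf{r}_{1},w_{s}') & \psi_{s}'(\mathbf{r}_{2},w_{s}')\\
\psi_{a}(\mathbf{r}_{1},w_{a}) & \psi_{a}(\mathbf{r}_{2},w_{a})\end{vmatrix},\qquad
\Psi_{2}(w_{s},w_{a}') = \frac{1}{\sqrt2}
\begin{vmatrix}\psi_{s}(\mathbf{r}_{1},w_{s}) & \psi_{s}(\mathbf{r}_{2},w_{s})\\
\psi_{a}'(\mathbf{r}_{1},w_{a}') & \psi_{a}'(\mathbf{r}_{2},w_{a}')\end{vmatrix}, \tag{17}$$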

where only two electrons are involved in the transitions.
According to the Franck-Condon principle, an electronic
transition in which the electronic excitation
energy is lost to the lattice has small probability of
occurrence, because of the condition of energy equality
of the states $\Psi_{1}$ and $\Psi_{2}$. Under these conditions the
transition probability $P_{sa}$ should contain a Dirac delta
function $\delta(E_{1}-E_{2})$ as a factor, where $E_{1}=w_{s}'-w_{s}$ and
$E_{2}=w_{a}'-w_{a}$ are the excitation energy on S in the state
$\Psi_{1}$ and that on A in the state $\Psi_{2}$, respectively. The total
probability of energy transfer from S to A becomes

$$P_{sa} = \Big(\frac{2\pi}{\hbar}\Big)\sum_{I}\sum_{F}(g_{s}'g_{a})^{-1}
\int dw_{a}'\int dw_{s}\int dw_{a}\,p_{a}(w_{a})\int dw_{s}'\,p_{s}'(w_{s}')\,
|(H_{1}(w_{s}',w_{a};w_{s},w_{a}'))|^{2}\,
\delta[(w_{s}'-w_{s})-(w_{a}'-w_{a})], \tag{18}$$

where $g_{s}'$ and $g_{a}$ are the degeneracies of the level of S
and of the level of A, respectively.

Calculating by means of the $\delta$ function and substituting
$E=E_{1}=E_{2}$, we have

$$P_{sa} = \Big(\frac{2\pi}{\hbar}\Big)\sum_{I}\sum_{F}(g_{s}'g_{a})^{-1}
\int dE\int dw_{a}\,p_{a}(w_{a})\int dw_{s}'\,p_{s}'(w_{s}')\,
|(H_{1}(w_{s}',w_{a};w_{s}'-E,w_{a}+E))|^{2}. \tag{19}$$
A 9 
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The interaction between A and S can be expressed as 
the sum of all the Coulomb interactions of the outer 
electrons and the core of A with those of S, properly 
reduced by the dielectric constant K of the medium. 
Expansion of this sum in a Taylor series around the ` 
vector R, which separates the nuclei of A and S, 
reduces the interaction Hamiltonian to the form 


T Z x 2 
A(R)= (=) {Fs-Fa—3 (Fa: R) (Fa B)/R} 


3e? 3 
2KRt/ i= 


+10(X VZ/R?) (xaVa2s+Xa2a}s F YaFavs) 


+E D [(R/R)—SRER,/R] 


itj 
XxX [- Tas Tei (20) 


2raita;tsil}; 


> m¥som refers to all electrons on S measured 


Bakar 


Electric Dipole-Dipole Interaction 


where f= 
from its nucleus and so with ra= 


The first curly brackets in Eq. (20) are the dipole- 
dipole interaction term. Inserting this into Eq. (17), 
one can have 


2r et 
Paad- (2) Eg (——) eo 
XdE f dWapalWa) f dw: ps (ws’) 
X (Fs) Fa) — 3 (Ts): R) | ((ra)R). | 5 


where the matrix element, (7,) stands for 


E) ists 


and so‘Wwith (ra). Average of the absolute squar of tl 
matrix element in Eq. (21) over all possible oriex 
of R is (2/3)| (rs)|?| (ra) |”, where | (r)|?= | (a) |2- 
+ |(z)|? so that 


(21) E 


(Fs (ws, w gy 


P,.a(dd)= LZ | dE 


a TER 1 2 


xl fo b 


D md 
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By using the usual expressions for Einstein’s A and P Xa—> 
i ix i 40-5 40-4 4073 
coefficients to evaluate the matrix elements contained 


here in terms of the experimentally measurable quanti- 
ties such as oscillator strengths, absorption coefficients, 
and decay times, Dexter reduces Eq. (22) to the form, 


t 

a 

q 3hic'Qa f E Nt ff(E)E LE] 
| mior a (22 as 
|  4nRon'r,\ KiS. EA 
H i 
j 


with the absorption data for A and the emission data 
for S or to the form, 


i 

í hce? QaQs f gs & \4 f(E) fal Ee 

oo o (5 0, 
1 E AT n?RE \g,’/ \K38, E? 

iy (24) 


eer neeneen ne 


which also contains the emission data for S, where
$Q_{s}$, $Q_{a}$ = areas under the absorption bands of S, A;
$\tau_{s}$ = decay time for S emission; $\mathcal{E}$ = electric field at an
isolated atom; $\mathcal{E}_{c}$ = electric field within the crystal;
K = dielectric constant of the medium; $f_{s}(E)$ = shape
of the observed S-emission band normalized so that
$\int f_{s}(E)\,dE=1$; $F_{a}(E)=\sigma_{a}(E)/Q_{a}$; $\sigma_{a}(E)$ = absorption
cross section of energy E by A, measurable at low concentrations
of A, as the absorption coefficient per unit
impurity (A) density in the medium.

This result is similar to Förster's¹¹ expression for the
probability of (d-d) transfer between systems S and A
at a distance $R_{sa}$ apart,


FIG. 1. Quantum yield for dd transfer as a function of the reduced
concentration y of the activator. The upper abscissa represents
the numerical values of the atomic concentration $x_{a}$ in the
typical case of NaCl crystals (after Dexter).
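
The curve of Fig. 1 can be evaluated numerically from the closed form quoted as Eq. (28), $\bar n_{T}=y\{\mathrm{Ci}(y)\sin y-\mathrm{Si}(y)\cos y+(\pi/2)\cos y\}$. The sketch below is illustrative Python, not part of the original paper or of Dexter's. It relies on the standard identity $\mathrm{Ci}(y)\sin y-[\mathrm{Si}(y)-\pi/2]\cos y=\int_{0}^{\infty}e^{-ys}(1+s^{2})^{-1}ds$ for the sine and cosine integrals; evaluating the integral through the substitution $s=\tan\theta$ and the power series used for Ci and Si (adequate only for moderate y) are implementation choices made here.

```python
import math

EULER_GAMMA = 0.5772156649015329


def average_yield_numeric(y, n=20000):
    """y * integral_0^inf exp(-y s)/(1+s^2) ds, using s = tan(theta).

    The substitution turns the integral into integral_0^{pi/2} exp(-y tan(theta)) d(theta),
    which is evaluated with the midpoint rule.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += math.exp(-y * math.tan(theta))
    return y * total * h


def sine_integral(y, terms=40):
    """Power series for Si(y); adequate for moderate y (say y < 5)."""
    return sum((-1) ** k * y ** (2 * k + 1) / ((2 * k + 1) * math.factorial(2 * k + 1))
               for k in range(terms))


def cosine_integral(y, terms=40):
    """Power series for Ci(y), y > 0; adequate for moderate y (say y < 5)."""
    series = sum((-1) ** k * y ** (2 * k) / ((2 * k) * math.factorial(2 * k))
                 for k in range(1, terms))
    return EULER_GAMMA + math.log(y) + series


def average_yield_closed(y):
    """Eq. (28): y {Ci(y) sin y - Si(y) cos y + (pi/2) cos y}."""
    return y * (cosine_integral(y) * math.sin(y)
                - sine_integral(y) * math.cos(y)
                + 0.5 * math.pi * math.cos(y))


def average_yield_small_y(y):
    """Small-y form quoted in the text: y (pi/2 + y ln y - 0.4228 y)."""
    return y * (0.5 * math.pi + y * math.log(y) - 0.4228 * y)


if __name__ == "__main__":
    for y in (0.01, 0.1, 0.5, 2.0):
        print(y, average_yield_numeric(y), average_yield_closed(y), average_yield_small_y(y))
    print(50.0, average_yield_numeric(50.0))
```

The output shows the linear rise at small y and the approach to unity at large y that are described for Fig. 1.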
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may be set equal to unity and one obtains 


~ 


ñr =y(C:(y) siny— S:(y) cosy+ (r/2) cosy}. (28) 


This function is shown in Fig. 1. For concentrations 


2 

u= 100a, (25) small enough that y&1, fr varies as y(r/2+y lny 
TND Rsa’ —0.4228y) so that for small xa, #7 increases linearly 
3 with concentration. For concentrations high enough to 

where, M=6.06X10, *=wave number in cm™, make y>1, 3r. 
i ee f.(#)=molar emission intensity of S at %, pa(#)=molar The temperature dependence of the transfer prob- 
aa extinction coefficient of A at 7. ability can be understood from Eqs. (23) and (24). 
Ef i The transfer probability thus depends upon the Generally Qa, Qs, or Ts are not very dependent on tem- 
LB strengths of the individual transitions as determined by perature so that the main temperature dependence is 
i i decay times, absorption band areas Q, and the energy determined by the temperature dependence of the 
| f overlap of the emission and absorption bands of A energy overlap integral /` f.(E)Fa(E)/E*. Both fs and 
sah i and S. Neglecting nonradiative transfers in S and F, generally possess temperature broadening. In the 
he transfers from S to S, the following expression for the event of perfect overlapping at all temperatures, the 
j quantum yield for transfer is obtained, transfer probability decreases with increasing“tempera- 


Mp ture. If, however, the absorption and emission bands 
r= peate/ (1+ Peat). (26) have well separated centers, temperature broadéning of 


H | By writing b= (PsaT:)?v, where v is defined as 


(4/3)rR3, one obtains nr= (8°)/ (+8). Let Ct be 
the number of density of lattice sites that can accommo- 
_ date A and let vo be the volume excluded by the presence 


the bands acts to produce greater overlap, and there- 
fore grater transfer probability. At low temperatures 
the broadening being negligible as compared to natural 
zero-point widths, the transfer probability should not 


of the sensitizer. Writing y=«aC*8 one finds the aver- depend on temperature. 


| age yield as 7 RS AE Electric Dipole-Quadrupole Interaction 
} i jo —Xa vo) — 
| = Ar=y exp (za vo) {y7 ia ae a (n/ iS comp, OD When S makes an allowed dipole transition and A a 
i A +Ei(y) siny—i(¥) Cosy YI» prohibited quadrupole transition to calculate the trans- 
| te and S; are cosine and sine integral functions. fer probability, one can insert in Eq. (19) the absolute 
el e t 


i ions xa 1% if C+=2.26X10” cm? square of the matrix element (Hı) computed from the 
=ê ° For ponten Mns 1 0-2 cm™ the factors exp (xaCtvo) second bracket of Eq. (20) and averaged over all 
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orientations of R. Inserting 


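$$\left(|(H_1)|^2\right)_{\rm av} = \frac{3e^4\alpha}{4K^2R^8}\,|(r_s)|^2\,|(N_a)|^2$$

in Eq. (19), where $\alpha = 1.266$ and $|(N_a)|^2 = |(x_a^2)|^2 + |(y_a^2)|^2 + |(z_a^2)|^2 + 2|(x_ay_a)|^2 + 2|(y_az_a)|^2 + 2|(z_ax_a)|^2$, one obtains the following expression for the transfer probability:

$$P_{sa}(dq) = \frac{3\pi e^4\alpha}{2\hbar K^2R^8g_s'g_a}\sum\sum\int dE\left[\int dw_s'\,p_s'(w_s')\,|(r_s(w_s', w_s' - E))|^2\right]\left[\int dw_a\,p_a(w_a)\,|(N_a(w_a, w_a + E))|^2\right]. \qquad (29)$$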


Relating, as before, the bracketed quantities to either emission or absorption data when measurable or significant, for A or S, Dexter obtains

$$P_{sa}(dq) = \frac{135\pi\alpha\hbar^9c^8}{4n^6R^8\tau_s\tau_a}\,\frac{g_a'}{g_a}\left(\frac{\mathcal{E}}{K^{1/2}\mathcal{E}_c}\right)^4\int\frac{f_s(E)F_a(E)}{E^8}\,dE. \qquad (30)$$

If $\tau_a$ is not measurable or has no significance, one can use the expression

$$P_{sa}(dq) = \frac{135\alpha\hbar^6c^6Q_a}{4\pi n^4R^8\tau_s}\left(\frac{\mathcal{E}}{K^{1/2}\mathcal{E}_c}\right)^4\int\frac{f_s(E)F_a(E)}{E^6}\,dE. \qquad (31)$$

The ratio that $P_{sa}(dq)$ bears to $P_{sa}(dd)$ is calculated from Eqs. (30) and (23) to be $(45\alpha/4\pi^2)(\lambda/R)^2\,\tau_a(d)/\tau_a(q)$. Since a quadrupole radiative transition has a probability of the order of $(a/\lambda)^2$ times that of a dipole radiative transition, one can put $\tau_a(d)/\tau_a(q) \approx (a/\lambda)^2$, so that $P_{sa}(dq):P_{sa}(dd)$ is of the order of $(a/R)^2$. Thus, resonance transfer can occur from a sensitizer to a nearby activator having a quadrupole transition in the proper frequency interval, whereas direct absorption of radiation by the activator takes place with a probability only ~$10^{-7}$ of that in an impurity with an allowed transition.

After defining $\gamma$ by the relation $P_{sa}(dq) = \gamma^{8/3}v^{-8/3}\tau_s^{-1}$, and putting $y = \gamma C^\dagger x_a$, one finds

$$\bar\eta_T = y\exp(x_aC^\dagger v_0)\left\{y^{-1}[\exp(-x_aC^\dagger v_0) - 1] + \int_0^\infty\frac{e^{-yt}}{1 + t^{8/3}}\,dt\right\} \qquad (32)$$

and

$$\bar\eta_T \cong y\int_0^\infty\frac{e^{-yt}}{1 + t^{8/3}}\,dt. \qquad (33)$$

Fig. 2. Quantum yield for dq transfer as a function of the reduced concentration y of the activator, as in Fig. 1. The upper abscissa scale shows atomic concentrations in the typical case of NaCl crystals. The variation of $\bar\eta_T$ with concentration is more rapid in the dq case than in the dd case (after Dexter).


This function is shown in Fig. 2. Since $P_{sa}(dq)$ depends on R more strongly than $P_{sa}(dd)$, $\bar\eta_T$ depends upon $x_a$ more rapidly in the dq case than in the dd case.

The function $F_a(E)$, which is the absorption shape function of A, cannot be experimentally determined because the absorption would not be observed.
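The concentration dependence of the dd and dq yields can be checked numerically. The following is a minimal sketch, assuming the excluded-volume factor is set to unity, so that both yields take the common form y∫exp(-yt)/(1+t^p)dt, with p = 2 for dd transfer (which reduces to the closed form of Eq. (28)) and p = 8/3 for dq transfer as in Eq. (33). The function names, the midpoint quadrature, and the series cutoffs are illustrative choices, not part of Dexter's treatment.

```python
import math

EULER_GAMMA = 0.5772156649015329


def si_ci(y, terms=40):
    """Power series for Si(y) and Ci(y); adequate for 0 < y up to about 10."""
    si = 0.0
    ci = EULER_GAMMA + math.log(y)
    for k in range(terms):
        n = 2 * k + 1
        si += (-1) ** k * y ** n / (n * math.factorial(n))
    for k in range(1, terms):
        n = 2 * k
        ci += (-1) ** k * y ** n / (n * math.factorial(n))
    return si, ci


def yield_dd_closed(y):
    """Closed form of Eq. (28): dipole-dipole yield versus reduced concentration y."""
    si, ci = si_ci(y)
    return y * (ci * math.sin(y) - si * math.cos(y) + 0.5 * math.pi * math.cos(y))


def yield_integral(y, p, n=20000):
    """y * integral_0^inf exp(-y t) / (1 + t**p) dt by the midpoint rule.

    p = 2 gives the dd yield, p = 8/3 the dq yield of Eq. (33).
    The substitution t = s / (1 - s) maps the range onto 0 < s < 1.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        t = s / (1.0 - s)
        total += math.exp(-y * t) / ((1.0 + t ** p) * (1.0 - s) ** 2)
    return y * h * total


if __name__ == "__main__":
    for y in (0.01, 0.1, 1.0, 10.0):
        print(y, yield_integral(y, 2.0), yield_integral(y, 8.0 / 3.0))
```

Evaluating both exponents on the same grid of y values makes it easy to compare how quickly the two yields rise with concentration.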


Exchange Effects and Other Forbidden Transitions 


Considering only the effects of electron spin, one can write out explicitly the matrix element of $H_1$ with $\Psi(\mathbf{r}, \sigma) = \varphi(\mathbf{r})\chi(\sigma)$, where the $\chi(\sigma)$ are the spin wave functions:

$$(H_1) = \int\varphi_s'^*(r_1)\varphi_a^*(r_2)\,H_1\,\varphi_s(r_1)\varphi_a'(r_2)\,d\tau\;\chi_s'^*(\sigma_1)\chi_a^*(\sigma_2)\chi_s(\sigma_1)\chi_a'(\sigma_2) - \int\varphi_s'^*(r_1)\varphi_a^*(r_2)\,H_1\,\varphi_a'(r_1)\varphi_s(r_2)\,d\tau\;\chi_s'^*(\sigma_1)\chi_a^*(\sigma_2)\chi_a'(\sigma_1)\chi_s(\sigma_2). \qquad (34)$$

The first term in (34) is the Coulomb term [...] before, and the second integral is an exchange integral with $H_1 = e^2/Kr_{12}$. If the $\chi$'s are left [...], the exchange integral can be understood as representing the electrostatic interaction between the two charge clouds $Q'(r_1)$ [...] and $Q(r_2)$ [...]. Each function dies off [...] from S or A and each p[...] the integral $\int Q'(r_1)(1/r_{12})Q(r_2)\,d\tau_{12}$ may be of sizeable value (R being small) even though the overlap integral $\int Q(r)\,d\tau$, which enters in normalization, is small enough to be neglected.

The transfer probability by the exchange mechanism may be expressed as

$$P_{sa}(ex) = (2\pi/\hbar)\,Z^2\int f_s(E)F_a(E)\,dE, \qquad (35)$$

where $Z^2$, which contains the integral $\int Q'(r_1)(1/r_{12})Q(r_2)\,d\tau_{12}$ [the full expression is not legible in the source], cannot be directly connected to optical experimental quantities.

The exchange mechanism has an efficiency in energy transfer slightly less than the dq transfer.


Electric Dipole-Magnetic Dipole Interaction 


The transfer mechanism arising from an electric dipole-magnetic dipole interaction may be treated by including magnetic fields in the Hamiltonian. The most important term in an $H_1$ so constructed is $(e\hbar/2mc)\,\mathbf{H}\cdot(\mathbf{r}_a\times\nabla_a + i\boldsymbol\sigma_a)$, where $\mathbf{H}$ is the magnetic intensity at A produced by motion of charge at S, the largest term in $\mathbf{H}$ being of the order $(e/c)\,\mathbf{R}\times\mathbf{v}_s/R^3$ or $(e\hbar/mc)\,\mathbf{R}\times\nabla_s/R^3$. Thus $(H_1)$ has the magnitude $(e^2\hbar^2/2m^2c^2R^2)(\nabla_s)$ or, because $(\nabla_s) = mE(r_s)/\hbar^2$, one finds

$$|(H_1(em))|^2/|(H_1(dd))|^2 \sim (e^2/\hbar c)^2E^2/(e^2/R)^2. \qquad (36)$$

Thus for the nearest neighbors $P_{sa}(em)$ is of the order of $10^{-n}$ [exponent illegible in the source] times $P_{sa}(dd)$, small enough to be neglected.


Host Sensitization 


Most of the discussion of impurity sensitization applies also to host sensitization, i.e., sensitization of an activator by the lattice itself in which it is embedded. In this case $x_s$, being equal to unity, is not small enough to forbid S-S transfer. When S has an allowed transition the transfer mechanism from S to S would be a (d-d) process and, depending on the degree of forbiddenness of transitions in A, the transfer from S to A would be either a dd process, a competition between dq and exchange, or exchange.

Transference of excitation energy by dipole-dipole interaction should be carefully distinguished from a cascade phenomenon. In the latter process, where a photon emitted by S is reabsorbed by A, the quantum [yield for trans]fer depends not only on the concentration [...] but also on the size and shape of the system, as [...]. [The quantum] yield for transfer by the cascade [process] can be [written] as follows in terms of the crystal dimension,

[The expression for $\eta_T(c)$ is not legible in the source.]

where A, v, $f_s(E)$, C, and $\sigma(E)$ are as defined before and the average is taken over the linear dimension $l$ of the sample. $1 - \eta_T(c)$ is the probability that a photon emitted by S will escape from the system and is then equal to $\eta_{es}$. The quantity $C(A)\,l$ represents the number of impurity molecules A per cm$^2$ projected on a plane perpendicular to the path of the photon. If we compare, as regards quantum yield, a spherical system containing a concentration C(A) of A with the same system spread out in a thin film, the yield will be greater in the former case than in the latter if the transfer takes place by the cascade process, whereas it will be independent of the shape and size of the sample if the transfer takes place by dipole-dipole interaction.
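The shape dependence of the cascade process can be illustrated with a rough estimate. The sketch below is not the expression of the text, which is illegible in the source; it assumes a single-pass Beer-Lambert reabsorption probability 1 - exp(-σ C l) and takes the sphere radius and the film thickness as the characteristic path lengths. The cross section, concentration, volume, and film area are hypothetical values chosen only for illustration.

```python
import math


def reabsorbed_fraction(sigma_cm2, conc_cm3, path_cm):
    """Beer-Lambert probability that an emitted photon is absorbed by A along a path."""
    return 1.0 - math.exp(-sigma_cm2 * conc_cm3 * path_cm)


def sphere_radius(volume_cm3):
    """Radius of a sphere of the given volume."""
    return (3.0 * volume_cm3 / (4.0 * math.pi)) ** (1.0 / 3.0)


# Hypothetical illustrative values, not taken from the source.
SIGMA = 1.0e-17      # absorption cross section of A, cm^2
CONC = 1.0e17        # concentration of A, cm^-3
VOLUME = 1.0         # sample volume, cm^3
FILM_AREA = 100.0    # area over which the same volume is spread, cm^2

sphere_path = sphere_radius(VOLUME)   # characteristic path in the sphere
film_path = VOLUME / FILM_AREA        # film thickness as the characteristic path

cascade_sphere = reabsorbed_fraction(SIGMA, CONC, sphere_path)
cascade_film = reabsorbed_fraction(SIGMA, CONC, film_path)

if __name__ == "__main__":
    print("sphere:", cascade_sphere)
    print("film:  ", cascade_film)
```

Under these assumptions the compact sample reabsorbs far more of the sensitizer emission than the film does, whereas a resonance-transfer yield that depends only on concentration would be the same for both shapes.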


(c) Exciton Migration 


Exciton or excitation wave, a concept first introduced by Frenkel¹³ to mean quick transfer of excitation energy from one constituent to another of a crystal lattice, is a product of the mathematical problem of calculating the excited states of a lattice of interacting atoms or molecules. The physical model used by Frenkel, Peierls,¹⁴ and subsequently by Slater and Shockley¹⁵ was simple. It consists of a system of identical atoms, each containing one electron, arranged in a simple crystalline array. The electron spin is neglected and both the lowest and the first excited states are assumed to possess no degeneracy. Let atoms be located at the points defined by

$$\mathbf{r}(n) = n_1\boldsymbol\tau_1 + n_2\boldsymbol\tau_2 + n_3\boldsymbol\tau_3, \qquad (37)$$

where the $\boldsymbol\tau_i$ are the primitive translations of the lattice and the $n_i$ range over all integral values. Let $\psi_n$ and $\psi_n'$ denote the normal and the excited wave functions of the particular electron at $\mathbf{r}(n)$ and assume that there is so little overlapping of the wave functions on different atoms that the $\psi$ are practically the same as the atomic wave functions. Consequently, $\psi_n$ and $\psi_n'$ are orthogonal to each other and to the wave functions of electrons on other atoms. A wave function for the lowest state of the entire crystalline array may be expressed as a determinant of the form

$$\Psi_0 = (N!)^{-1/2}\begin{vmatrix}\psi_1(r_1) & \cdots & \psi_1(r_N)\\ \vdots & & \vdots\\ \psi_N(r_1) & \cdots & \psi_N(r_N)\end{vmatrix}, \qquad (38)$$

where N is the total number of electrons and atoms. The mean energy of this state is $E_0 = \int\Psi_0^*H\Psi_0\,d\tau$, where H is the Hamiltonian of the entire system.

Let us now consider the wave function $\Psi_f$ of the entire system for the state in which one atom is excited from the state $\psi_f$ to the state $\psi_f'$:

$$\Psi_f = (N!)^{-1/2}\begin{vmatrix}\psi_1(r_1) & \cdots & \psi_1(r_N)\\ \vdots & & \vdots\\ \psi_f'(r_1) & \cdots & \psi_f'(r_N)\\ \vdots & & \vdots\\ \psi_N(r_1) & \cdots & \psi_N(r_N)\end{vmatrix}. \qquad (39)$$

13 J. Frenkel, Phys. Rev. 37, 1276 (1931); Physik. Z. Sowjetunion 9, 158 (1936).
14 R. Peierls, Ann. Physik 13, 905 (1932).
15 J. C. Slater and W. Shockley, Phys. Rev. 50, 705 (1936).

We cannot regard this system of wave functions as proper representations of the excited states of the crystal because they do not make

$$E_{fg} = \int\Psi_f^*H\Psi_g\,d\tau \qquad (40)$$

vanish when $f \neq g$. The N-dimensional matrix constructed from the $E_{fg}$ can be diagonalized by finding those linear combinations $\Psi'$ of the $\Psi_f$ which are of the form

$$\Psi' = \sum_f a_f\Psi_f, \qquad (41)$$

where the $a_f$ satisfy the following equation

$$\sum_g a_g(E_{fg} - E'\delta_{fg}) = 0. \qquad (42)$$

Since crystal symmetry makes $E_{fg}$ dependent only upon the difference between the g and the f values, the N equations (42) can be reduced to the same form if we substitute

$$a_f = a_0e^{2\pi i\mathbf{k}\cdot\mathbf{r}(f)}, \qquad (43)$$

where $\mathbf{k}$ is a vector in the reciprocal lattice of the crystal. This reduces (42) to

$$E_k' = E_{ff} + \sum_{s\neq 0}E_se^{2\pi i\mathbf{k}\cdot\mathbf{r}(s)}, \qquad (44)$$

where $E_s = E_{m,m+s}$, m and s being arbitrary integer sets.

After substituting Eq. (43) in (41), we have the normalized wave function

$$\Psi_k = N^{-1/2}\sum_f e^{2\pi i\mathbf{k}\cdot\mathbf{r}(f)}\,\Psi_f. \qquad (45)$$

A wave function of this type represents the fact that the locality where the atom is excited moves about the lattice with a wave number k. In other words, the excited state is not localized on the particular atom that is excited, but is handed over from one constituent of the lattice to its neighbors.

By virtue of the assumption of weak overlapping we assume that $E_s$ vanishes for all but the nearest neighbors. If $E_s$ has the value $\epsilon$ in this case, (44) reduces to

$$E_k' = E_{ff} + \epsilon\sum_{\mathbf{r}}e^{2\pi i\mathbf{k}\cdot\mathbf{r}}, \qquad (46)$$

where $\mathbf{r}$ is to range over the nearest neighbors. For a cubic lattice with lattice constant a, Eq. (46) takes the simple form

$$E_k' = E_{ff} + 2\epsilon(\cos 2\pi k_xa + \cos 2\pi k_ya + \cos 2\pi k_za). \qquad (47)$$

This means that the excited levels constitute a band having a spread of the order of magnitude $\epsilon$. It can be shown that the wave of excitation moves with a group velocity,

$$\mathbf{v} = (1/h)\,\mathrm{grad}_kE, \qquad (48)$$

and also that it carries no current. We may thus look upon the excitation wave as if it were an uncharged particle created by exciting the crystal and capable of moving about the lattice. This concept is due to Frenkel, who called the wave of excitation an exciton.
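The band of Eq. (47) and the group velocity of Eq. (48) are easy to evaluate directly. The following sketch assumes a simple cubic lattice with nearest-neighbor coupling only; the function names and all numerical values are hypothetical. It also checks, for a one-dimensional ring of atoms, that the plane-wave coefficients of Eq. (43) satisfy the eigenvalue condition of Eq. (42).

```python
import cmath
import math


def band_energy(k, a, e_ff, eps):
    """Exciton energy of Eq. (47) for wave vector k = (kx, ky, kz)."""
    return e_ff + 2.0 * eps * sum(math.cos(2.0 * math.pi * ki * a) for ki in k)


def group_velocity(k, a, eps, h):
    """Components of (1/h) grad_k E, Eq. (48), applied to Eq. (47)."""
    return tuple(-4.0 * math.pi * a * eps * math.sin(2.0 * math.pi * ki * a) / h
                 for ki in k)


def ring_residual(n_sites, m, e_ff, eps):
    """Largest violation of Eq. (42) by the coefficients of Eq. (43) on a 1D ring.

    The ring has unit lattice constant and k = m / n_sites; only nearest
    neighbors are coupled, with matrix element eps.
    """
    k = m / n_sites
    coeff = [cmath.exp(2j * math.pi * k * f) for f in range(n_sites)]
    e_k = e_ff + 2.0 * eps * math.cos(2.0 * math.pi * k)
    worst = 0.0
    for f in range(n_sites):
        neighbors = coeff[(f - 1) % n_sites] + coeff[(f + 1) % n_sites]
        h_a = e_ff * coeff[f] + eps * neighbors
        worst = max(worst, abs(h_a - e_k * coeff[f]))
    return worst


if __name__ == "__main__":
    a, e_ff, eps, h = 1.0, 3.0, 0.1, 1.0   # hypothetical units
    top = band_energy((0.0, 0.0, 0.0), a, e_ff, eps)
    bottom = band_energy((0.5, 0.5, 0.5), a, e_ff, eps)
    print("band width:", top - bottom)
    print("velocity:", group_velocity((0.1, 0.2, 0.3), a, eps, h))
    print("ring residual:", ring_residual(12, 5, e_ff, eps))
```

With these conventions the full width of the band is 12ε, consistent with a spread of the order of ε, and the group velocity vanishes at the zone center and at the zone corner.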

It is instructive to look into Wannier's¹⁶ model of treating the excitation bands in insulators because of its simplicity. If an electron is removed from the highest filled band of an insulator, a positive charge, or positive hole, results. This positive hole may be filled up by an electron from an adjacent lattice constituent, and thus the positive hole may be able to roam about freely in an undistorted lattice. Though the Bloch picture endows the excited electron with freedom of movement independently of the positive hole, each should attract the other with a force that is Coulomb-like at large distances. It has been shown by Wannier that the excitation bands arise, at the discrete levels of the hydrogen atom, because of the motion of the electron and hole about one another in closed states, and that the different levels in a band behave as if they were different translational levels of an excited hydrogen atom. This picture allowed Wannier to obtain a set of simple equations from which to derive the wave functions and energy levels of an exciton.

16 G. H. Wannier, Phys. Rev. 52, 191 (1937).

The migration of excitation energy by the exciton method should be carefully distinguished from that by dipole-dipole interaction. When excitons are generated in a crystal, the excitation energy is propagated through the lattice with a given wave-number vector k, and the total wave function of the crystal describes a moving state of excitation energy of previously defined momentum, but indeterminate coordinates. The interaction Hamiltonian which describes the moving state of excitation energy is in form the same as that which determines the probability of transfer in a quantum mechanical collision process. The fundamental difference between them is that the former process is the result of interaction between an excited atom or molecule and all other like units, while the latter process results from interaction between one excited sensitizer and a normal activator. The latter is thus a one-step process which may take place over a distance of many angstroms in a neutral solvent, while the former may be regarded as a stepwise propagation of excitation energy from molecule to molecule, covering many successive independent acts, which terminates by one of a number of possible events. The motion of an exciton may be terminated by trapping at an impurity, by decay with characteristic emission of the solvent, or by a radiationless transition. The transfer of energy to an impurity by exciton migration obeys a concentration dependence that is entirely different from that in the case of the single-step process of collision.

The theory of exciton propagation in an insulating crystal has been extended by Heller and Marcus¹⁷ to include cases where appreciable overlapping of the electronic wave functions of nearby crystal constituents is absent. Whereas in the primary works of Frenkel, and later Slater and Shockley, the ease of exciton propagation was ascribed to the overlap of the excited state wave functions of the neighboring crystal constituents, Heller and Marcus have shown that very good propagation may result, notwithstanding poor overlap, from the coupling produced by the dipole-dipole terms of the interaction Hamiltonian which give rise to the van der Waals interactions in second-order perturbation theory. When one of the lattice constituents is excited to such a state that transition between it and the ground state is associated with a large oscillator strength, the strong coupling between this excited constituent and its identical neighbors in the ground state allows the migration of energy through the lattice. This happens because the translational symmetry of the lattice permits solution of the corresponding Schrödinger equation in the form of moving waves of excitation, analogous to Frenkel's exciton waves. One of their important results is that even in an idealized cubic crystal there exists anisotropy in the propagation of excitons. Two interesting cases of exciton propagation are distinguished, those in which the induced dipole moment $\mu$ is parallel and perpendicular, respectively, to the vector k of the exciton wave (Fig. 3). Provided there is no interference from polar modes of lattice vibration, exciton waves of the type $(\mu, k) = \pi/2$ are excited by absorption of light waves, since in that case the electric vector and the propagation vector for light quanta are normal to each other. Conversion of excitons from the type $(\mu, k) = \pi/2$ into the type $(\mu, k) = 0$ and vice versa may result from scattering collisions with phonons or other imperfections in the lattice. It is of interest to note that the excitons for which $(\mu, k) = \pi/2$ have an energy increasing with k for all prominent directions of propagation, namely (100), (110), and (111) in a simple cubic lattice, and thus these excitons created by light waves behave as if they possess a positive mass. Those excitons which are polarized parallel to k behave in the reverse way and as if they possess a negative mass. The magnitude m' of the effective mass of an exciton is given by

[The expression for m' in terms of m, $R_0$, f, $n_0$, and $a_0$ is not legible in the source.]

where m is the electronic mass, f is the oscillator strength of the transition between the ground state and excited state, $a_0$ is the equivalent Bohr radius for the excited atom, and $R_0 = (3\pi/4n_0)^{1/3}$, $n_0$ being the


17 W. R. Heller and A. Marcus, Phys. Rev. 84, 809 (1951).

[...] the quicker is the propagation of radiationless migration of excitation energy from one atom or complex to another atom or complex, attributed by Cario, Franck, and Förster to dipole coupling. The speed of migration varies inversely as the sixth power of their separation. In the case of exciton propagation, by virtue of the very high density of states, the stationary state cannot be located on one particular excited atom at a time, and so the appropriate model is much like the electron wave packet in metals, in which collisions changing $\mu$ and k will take place during the lifetime of the excitations. Following this approach one can think of a typical maximum diffusion distance d of an exciton wave packet as given by

$$d < (v_{\rm thermal}^2\,\tau_{\rm thermal}\,\tau_{\rm optical})^{1/2}$$

[The numerical estimate that follows in the source, which takes $\tau_{\rm optical} = 10^{-8}$ sec, is otherwise illegible.]

The total length of the path traversed may be of the order of centimeters. The exciton may be deprived of its energy long before the optical lifetime has expired by a competing process.

All these conclusions are reached on the basis of the assumption of a small amount of overlapping of the excited wave functions of the electrons. In certain ionic and molecular crystals in which the electron wave functions are strong at the widely spaced lattice sites and weak in between, this assumption is an appropriate qualitative description of the state of affairs.

Fig. 3. The energy of exciton waves as a function of wave number (after Heller and Marcus). The upper curves refer to excitons for which the electric dipole moment is perpendicular to the direction of propagation, whereas the lower family refers to excitons characterized by polarization parallel to the wave number vector k. d is $(2\pi n_0)^{-1/3}$, where $n_0$ is the density of ions contributing to the exciton propagation. The energy unit is $\mu_0^2n_0$, where $\mu_0$ is the electric dipole moment of the excited state. Curves A, B, C in the upper diagram and A', B', C' in the lower are for k vectors in the (100), (110), and (111) directions, respectively. Integral approximations in which the excitons behave isotropically are shown in curves I and I'.

(d) Electron and Hole Migration


Production of electrons and holes and their migration were first suggested by Mott and Gurney¹⁸ to account for electronic behavior in insulating crystals. Such a crystal is characterized by a full band of energy levels, named the valence band, separated by a finite energy gap W from an empty conduction band (Fig. 4). If absorption of a quantum of radiation raises an electron to the conduction band, the resulting vacancy in the valence band, called a positive hole, may be filled up by an electron from another constituent whose full band lies above the starting band of the absorption process, as in Fig. 5(a), and thus may be caught in a trap. When the free electron from the conduction band finally recombines with its hole trapped in the new host, there results a new emission. It may also happen that the excited electron temporarily falls into an intermediate vacant band below the conduction band, as in Fig. 5(b), and the hole becomes mobile through the lattice until it recombines with its trapped counterpart with the emission of a new band. The energy set free by the trapping of the electron or its hole may be emitted as infrared radiation or dissipated as heat. The traps of the electrons or holes are localized, the energy levels being produced by an activator or by lattice flaws or cracks. Such crystals exhibit photoconductivity because of the mobility of an electron in the conduction band or because of its positive holes in the valence band.

Fig. 4. Schematic representation of the filled and unfilled levels. (Labels in the figure, from top to bottom: vacant conduction band, forbidden band, filled valence band.)

When energy transfer takes place over a long distance, each of these processes requires a transport mechanism:

(1) In a quantum mechanical collision process, the collision partners themselves may become vehicles of long distance transportation of energy.

(2) The excitation energy may travel through the lattice by the migration of an excited electron and its hole freed from each other.

(3) In the exciton mechanism, the transport of energy through the lattice takes place in the shape of an excitation wave.

(4) Photon absorption is not a localized process but may occur over larger distances.

Although the same mechanism may account for energy transport and its transfer, such as in CaWO4-Sm (mechanism 3) and the ZnS luminophors (mechanism 2), this is not indispensable. It is not improbable that excitation energy is transported by, say, migration of electrons and holes, whereas the final transfer is effected by, say, a collision process.

Fig. 5. The excited electron falls into an intermediate vacant band below the conduction band.

18 N. F. Mott and R. W. Gurney, Electronic Processes in Ionic Crystals (Oxford University Press, New York, 1949).
2. FLUORESCENCE TRANSFER IN ORGANIC 
MIXED CRYSTALS 


In 1949 Franck and Livingston² summarized the facts on fluorescence in organic mixed crystals brilliantly. Since then many more publications have appeared and considerable new results are now at hand,
so it seems desirable to discuss the subject afresh. The 
present discussion is confined to organic crystals in 
general and naphthacene in anthracene in particular 
for the following reasons: (a) a considerable amount of 
experimental results and theoretical discussion have 
been published on it which have made it a matter of 
controversy; (b) we are studying the matter in this 
laboratory and some interesting quantitative results 
have been obtained by us and our collaborators. 


(a) Exciton and Absorption 


Interest in transfer of energy from one molecule to 
another in an aromatic solid seems to grow continuously 
from the anthracene-naphthacene problem. Many 
workers” have observed that in the pure crystalline 
state naphthacene hardly fluoresces, whereas dissolved 
in solid anthracene or chrysene, it fluoresces brilliantly 
with its characteristic yellow-green color and the 
quenching of host crystal fluorescence. The quantum 
yield of its photofluorescence increases from 0.04 in a 
pure crystal to about 1 in anthracene.” Bowen” sug- 
gested that the mechanism of transfer of excitation 
energy from anthracene to naphthacene molecules em- 
bedded in the anthracene lattice possesses a close re- 
semblance to that observed in a typical inorganic 
phosphor, and that the energy of excitation travels by 
the exciton mass. On the other hand, Ganguly’s*! 
investigations into the absorption and fluorescence phenomena of these aromatic hydrocarbons under different physical conditions, particularly the dependence of the latter upon the wavelength of exciting radiation, suggest that the exciton mechanism is unnecessary in this case and that the enhanced fluorescence of naphthacene could be related to the excitation of the naphthacene molecules by way of absorption of the exciting radiations as well as the fluorescent light of anthracene. This suggestion was based on the following observations:

(i) When excited by light of wavelength from 4030 to 4350 A, a green anthracene crystal or a green chrysene crystal emits the characteristic naphthacene fluorescence, except that it is shifted toward the red due to the solvent effect. Naphthacene molecules absorb these exciting wavelengths but the anthracene lattice does not.

19 Dufraisse and Horclois, Bull. soc. chim. 1888 (1936).
20 E. J. Bowen, Nature 142, 1081 (1938).
21 S. C. Ganguly, Nature 151, 3841 (1943).
22 S. C. Ganguly, Nature 153, 652 (1944).
23 E. J. Bowen and A. H. Williams, Trans. Faraday Soc. 3[...] (1939).
24 S. C. Ganguly, J. Chem. Phys. 13, 128 (1945); In[...] J. Phys. 19, 22B (1945); Doctoral thesis, Calcutta University.


(ii) Since the absorption region of naphthacene overlaps the emission region of anthracene (3650 to 5000 A),
the naphthacene molecules absorb not merely the ex- 
citing radiation 3650, but must also absorb the fluores- 
cence radiations of anthracene.” 

(iii) There is positive evidence for photon absorption
by naphthacene molecules followed by photon emission, 
and lack of evidence for the exciton mechanism. 


Almost simultaneously Bowen and his co-workers²⁵⁻²⁸ were investigating the effect of naphthacene concentration, particle size, etc., on the intensities of anthracene and naphthacene fluorescence. They observed the following.

(i) At a concentration of 10 [exponent illegible] naphthacene molecule per anthracene molecule, the green naphthacene fluorescence was visible, and at 10 [exponent illegible] naphthacene molecules per anthracene molecule it grew strong. Lack of molecular proximity and orientation in liquid solution inhibited the transfer, so that a green crystal of anthracene with a low concentration of naphthacene, when dissolved in benzene, did not give the characteristic green fluorescence of naphthacene under the same excitation.

(ii) The curves (Fig. 6) relating the intensities of anthracene and naphthacene fluorescence to the concentration of naphthacene in an anthracene crystal could be represented by $\eta = 1/(1 + kx)$ and $1 - \eta = kx/(1 + kx)$, where $\eta$ = quantum efficiency of fluorescence of the host lattice; k = a constant for a pair of compounds in the host and guest relation, excited by a particular radiation; and x = guest concentration in g per g of the host. For solid solutions of anthracene in naphthalene k was found to be $5 \times 10$ [exponent illegible] when excited by 2900 to 3200 A radiations at 20°C; for naphthacene in anthracene crystal excited by 2900 to 3200 A, k was found to be $2 \times 10^5$ at 20°C.

[...] 153, 663 (1944).
E. J. Bowen [...] Mikiewicz, Nature 159, 706 (1947).
[...]ley, Nature 164, 572 (1949).

(iii) The curves of intensities of fluorescence vs 
naphthacene concentrations in anthracene crystal do 
not differ greatly from those in solutions precipitated 
from one solution. When separate mixtures of the two 
are taken, there is produced a great difference. 

(iv) The intensity ratio of the 0-1 and 0-2 bands in the fluorescence of anthracene remains unaltered for all concentrations of naphthacene in anthracene crystals.

(v) The intensity of the 0—1 band compared to that 
of the 0—2 band in the fluorescence of anthracene in- 
creases as the particle size is diminished and finally 
approaches that for dilute solution in benzene. 
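The concentration law quoted in observation (ii) can be tabulated directly. The sketch below evaluates the host efficiency 1/(1 + kx) and its complement for the value k = 2 × 10^5 reported for naphthacene in anthracene; the function names and the list of concentrations are illustrative assumptions.

```python
def host_yield(k, x):
    """Relative host fluorescence, 1 / (1 + k x), for guest concentration x (g per g)."""
    return 1.0 / (1.0 + k * x)


def transferred_fraction(k, x):
    """Complement of the host yield, k x / (1 + k x)."""
    return k * x / (1.0 + k * x)


def half_quenching_concentration(k):
    """Guest concentration at which the host fluorescence falls to one half."""
    return 1.0 / k


K_NAPHTHACENE_IN_ANTHRACENE = 2.0e5  # value quoted in the text for 20 deg C

if __name__ == "__main__":
    for x in (1e-7, 1e-6, 5e-6, 1e-5, 1e-4):  # illustrative concentrations
        print(x,
              host_yield(K_NAPHTHACENE_IN_ANTHRACENE, x),
              transferred_fraction(K_NAPHTHACENE_IN_ANTHRACENE, x))
```

On this relation the host fluorescence is reduced to one half at x = 1/k, that is, at about 5 × 10^-6 g of naphthacene per g of anthracene.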


These observations were regarded by Bowen and his 
co-workers as evidence supporting the exciton mecha- 
nism in these crystals. Later it was shown by Birks’ that 
the observations (i) and (ii) could not lead to this un- 
ambiguous conclusion. 

Lipsett and Dekker's investigations are of interest in that, apart from using an improved technique of measurement, they studied band by band the spectral intensity of fluorescence under x-ray excitation emitted by solid solutions of naphthacene, 1,2,5,6-dibenzanthracene in naphthalene and in anthracene as a function of concentration. The relative band intensity of impurity fluorescence increased with the rise of concentration in the host lattice, reached its maximum, and then fell off in the manner shown in Fig. 7 for 1,2,5,6-dibenzanthracene in naphthalene. Considering the high efficiencies of energy transfer at concentrations as low as $1.01 \times 10$ [exponent illegible] mole naphthacene per mole naphthalene and $2.93 \times 10$ [exponent illegible] mole naphthacene per mole anthracene for maximum transfer, they discarded absorption and re-emission as unlikely and regarded the exciton mechanism as more probable. Subsequently, they observed that the intensity of fluorescence under x-ray excitation of anthracene molecules embedded in the naphthalene lattice increased steadily with concentration after the naphthalene fluorescence was quenched. They attributed this complex behavior to a vigorous fluorescence of anthracene molecules by themselves in addition to the transferred excitation energy from naphthalene molecules to anthracene molecules by way of the exciton. The anthracene bands at 3505 and 3700 A arose primarily from the transferred excitation energy and those at 3895, 4200 A, etc., from a combination of transferred excitation energy and molecular fluorescence.

Fig. 7. Intensity of the 4080 and 4250 A bands of 1,2,5,6-dibenzanthracene fluorescence relative to the 3410 A naphthalene band as a function of concentration (after Lipsett and Dekker).

F. R. Lipsett and A. J. Dekker, Nature 173, 736 (1954).

Recent investigations of fluorescence of solid solutions of nine aromatic hydrocarbons in anthracene and pyrene by Northrop and Simpson have supplied significant quantitative information as regards the transfer of excitation energy from solvent to solute molecule. The degree of energy transfer is discussed in terms of two experimentally determined quantities, namely, the quenching factor and the transfer efficiency, defined by

$$Q = (S - S')/S \quad\text{and}\quad T = I/(S - S'),$$

where S = integrated emission of pure solvent molecules, S' = integrated emission of a pure solvent containing solute molecules, and I = integrated emission of solute molecules. These investigators found (Fig. 8) that for anthanthrene dissolved in anthracene the quenching factor is proportional to concentration up to 10 [exponent illegible] M and at higher concentration becomes less than what is expected from the proportionality relation. The behavior of transfer efficiency with increasing concentration c is given in Table II. Thus all the quanta quenched in anthracene fluorescence do not [reappear as] anthanthrene fluorescence. [One sentence is illegible in the source.] For molecules of smaller size studied by them, such as [...], pentacene, perylene, etc., T was greater at 10 [exponent illegible] M than at 10 [exponent illegible] M. Larger molecules such as [...] showed the reversed tendency.

Table II. (Column headings and most entries are illegible in the source; the surviving row reads: 9, 157, 078, 0.81, 093, 102, 093, 075, 030.)

The concentration dependence of Q has been quantitatively explained by Northrop and Simpson on the basis of exciton migration. When the matrix lattice of the solvent is not contaminated with impurity molecules, a solvent molecule may return to the ground state by emitting fluorescence, by a nonradiative transition causing internal quenching, or by transferring its energy to a like neighbor. Let $\alpha$, $\beta$, and $\gamma$ be the probabilities per second of these processes, respectively. [...] through the crystal and [...] its life it is potentially able to cause fluorescence, they obtained the total probability of emission of [...] as $s = \alpha/(\alpha + \beta)$. [In the presence of] foreign molecules of concentration c, there is a probability, q, say, that the exciton may be captured during its propagation by an adjacent foreign molecule. This [changes] the probability of emission by a solvent [molecule to]

$$s' = \alpha/(\alpha + \beta + \gamma cq),$$

where s and s' are proportional to S and S'. Therefore,

$$Q = \frac{S - S'}{S} = \frac{s - s'}{s} = \frac{\gamma cq}{\alpha + \beta + \gamma cq} \approx \frac{\gamma cq}{\alpha + \beta},$$

which predicts that Q is proportional to c in agreement with the experimental results.
+ The investigations of Northrop and Simpson point 
out that there is no correlation between quenching and 
4 the overlap between solvent emission and impurity 
FIG. 8. Dependence of the quenching factor Q on the concentration
of impurity (after Northrop and Simpson).


2 D. C. Northrop and O. Simpson, Proc. Roy. Soc. (London) 
A234, 136 (1956).
absorption. There is very little overlap in the case of iso-violanthrene
embedded in the anthracene lattice, and yet iso-violanthrene quenches
anthracene fluorescence strongly, with Q=11 and 31 at C=[illegible] and
[illegible] M, respectively. Naphthacene has a very strong absorption
overlapping the emission of pyrene, but no quenching or transfer was
observed at high concentration. These facts are of great importance as they
rule out the possibility of energy transfer by collision processes. The
probability of energy transfer by collision is proportional to
$\int_0^\infty E(\nu)A(\nu)\,d\nu$, where E(ν) and A(ν) are the
fluorescence spectrum of the excited molecule and the absorption spectrum
of the acceptor molecule.
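
The exciton-migration account of quenching attributed above to Northrop and Simpson can be checked numerically. The following is a minimal Python sketch, assuming the reading s = a/(a+b) and s' = a/(a+b+γcq) with Q = (s-s')/s'. The function names are hypothetical and the rate values are arbitrary illustrative numbers, not data from the source.

```python
def solvent_emission(a, b, gamma=0.0, c=0.0, q=0.0):
    """Probability that an excitation ends as solvent fluorescence.

    a, b, gamma: probabilities per second of fluorescence, internal
    quenching, and transfer to a like neighbor (assumed symbols);
    c: impurity concentration; q: capture probability per transfer.
    """
    return a / (a + b + gamma * c * q)


def quenching_factor(a, b, gamma, c, q):
    """Q = (s - s') / s', which reduces algebraically to gamma*c*q/(a+b)."""
    s = solvent_emission(a, b)                     # pure solvent
    s_prime = solvent_emission(a, b, gamma, c, q)  # solvent with impurity
    return (s - s_prime) / s_prime


if __name__ == "__main__":
    # Illustrative (assumed) values only.
    a, b, gamma, q = 1.0e8, 2.0e7, 1.0e12, 0.5
    for c in (1.0e-6, 2.0e-6, 4.0e-6):
        print(f"c = {c:.1e}   Q = {quenching_factor(a, b, gamma, c, q):.6f}")
```

Doubling c doubles the printed Q, which is the proportionality the text describes. In this sketch the departure from proportionality at higher concentration does not appear, since q is held constant.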
One of the glaring features of early investigations of energy transfer is
that the question of whether transference of excitation energy by way of
the exciton in aromatic crystals is feasible, in view of the weak
intermolecular binding in them, was not attacked by a consideration of the
nature of electronic states in these crystals. In spite of the works of
Heller and Marcus, some had the idea that exciton propagation required
strong intermolecular coupling. As there existed at that time little
evidence of such intermolecular coupling in these crystals, the possibility
of exciton propagation was often ruled out.

Theoretical works on the electronic states of molecular crystals, begun not
long ago, are all based on the elementary theory of excited states of a
lattice of interacting atoms or molecules given by Frenkel, later by Slater
and Shockley, and improved by Peierls to include the effect of lattice
vibrations. One of the principal results of this theory is that in aromatic
hydrocarbon crystals there occur, beneath the conduction bands,
nonconducting excited states called exciton bands. Just as the conduction
bands are analogous to ionization levels of atoms or molecules, these
excitation bands in a crystal are analogous to the excited states of free
atoms or molecules and are characterized by the fact that optical
excitation to these bands is accompanied by propagation of this state of
excitation from one lattice constituent to another, by what is called an
"exciton."

Nonconducting excited states, or exciton states as they are generally
called, arise in two ways. In weakly bound solids (molecular crystals such
as anthracene, etc.), the exciton bands correspond closely to the lowest
excited states of the constituent atoms or molecules in the free state. In
strongly bound ionic or covalent crystals, where the constituents have lost
their individuality, they arise in the ionic case from the removal of an
electron from one atom to a neighbor, or by the sharing of electrons with
neighbors in the covalent case. An application of Frenkel's exciton theory
to the calculation of energy states of molecular crystals was made first by
Davydov. He envisaged a weak coupling model of a molecular crystal
characterized by an intermolecular interaction energy small compared to the
intramolecular energy, so that crystallization does not appreciably affect
the electronic structure of a molecule. This is required by the fact that
the broad features of the visible and ultraviolet spectra are but little
changed by the process of crystallization. One of the most significant
results reached by his calculations is that from each excited level of a
free molecule there arises a band of energy levels in the crystal. And,
more important, corresponding to a transition [illegible]
nondegeneracy of the free molecular levels. This results in multiplicity of
lines corresponding to one optical transition in the free molecule. This
effect, called Davydov splitting, must be carefully distinguished from the
two other well-known effects of crystal field. One of these is the
splitting of levels that are degenerate in the free molecule by the crystal
field. This effect has been discussed by Bethe. It has found wide
application in the interpretation of the spectra of rare earth crystals.
The other is the occurrence of crystal-field-induced transitions which are
altogether forbidden in the free molecules.

We now discuss the extent to which the exciton 
effects occur in the crystals of aromatic hydrocarbons 
in accordance with theoretical predictions. 


(a) Benzene Crystal 


The benzene crystal being the simplest of the aromatic crystals, its
importance in interpreting the nature of the solid state of the aromatic
hydrocarbons is unique. The electronic spectra have been investigated both
experimentally and theoretically. The first transition in solid benzene was
first observed by Kronenberger. The spectrum exhibited some degree of fine
structure not accountable without assuming some crystal effects, either
Davydov splitting or combination with lattice modes. Recently Broude et al.
have investigated the absorption spectrum of single crystals of benzene
using polarized light and detected two perpendicularly polarized lines,
separated by 25 cm⁻¹ and corresponding to the lowest excited singlet state
of the free molecule. Inasmuch as these workers identified the crystal axes
by comparing the intensities of light absorbed by the crystal in these
directions, it is desirable that their results should be confirmed by
similar experiments with crystals in which the axes are independently
determined. The fluorescence spectrum of solid benzene at -253°C and at
-180°C, investigated by Kronenberger³² and Starkiewicz,³⁴ exhibits
frequency intervals which have been tentatively ascribed to combination
with lattice modes. These experimental observations are understood best by
comparison with theoretical predictions on the electronic spectra of the
benzene crystal made by Davydov, Winston,³⁵ and recently by Fox and
Schnepp³⁶ on the assumption that the lower excited states in the free
molecule are converted into exciton states in the crystal. According to
Davydov's calculations the state B1u of the free molecule gives rise to the
representations A1u (inactive) and B2u (b polarized), and the molecular
state represented by B2u to B1u (a polarized) and B3u (c polarized).
Winston confirmed


32 A. Kronenberger, Z. Physik 63, 494 (1930).

33 Broude, Medvedev, and Prikhotjko, J. Exptl. Theoret. Phys.
(U.S.S.R.) 21, 665 (1951).

34 J. Starkiewicz, Compt. rend. soc. polon. phys. 4, 201 (1929).

35 H. Winston, J. Chem. Phys. 19, 156 (1951).

36 D. Fox and O. Schnepp, Phys. Rev. 96, 1196 (1954); J. Chem.
Phys. 23, 767 (1955).
these results and also found that the forbidden 0-0 transition of the free
molecule is not prohibited in the crystal. Fox and Schnepp calculated the
exciton band energies and band widths in the benzene crystal corresponding
to the B1u and the B2u states of the free molecule, to which dipole
transitions from the ground state are allowed. They used the x-ray
determination of the benzene crystal structure made by Cox and Smith,³⁷
which is more recent, and exciton wave functions more general than those
used by Davydov. Their results are tabulated in Tables III and IV for
exciton functions constructed from molecular states of symmetry B1u and
B2u. The foregoing results belie our expectation of identifying the B1u and
B2u bands from polarization measurement, because in each band the
transitions are a, b, and c polarized in the same order of energy. The band
widths are 720 and 58 cm⁻¹. The two optically active levels observed by
Broude et al. with a separation


TABLE III. Symmetry and energy of states in the B1u band
and polarization and intensity of transition from the ground state
to these states.

Representation   Polarization of   Band splitting   Intensity
in D2h           transition        in cm⁻¹          ratio
B2u              b                 +180             8×10⁻⁴
B3u              c                 -180             17×10⁻⁴
A1u              forbidden         +360             0
B1u              a                 -360             29×10⁻⁴


TABLE IV. Symmetry and energy of translationally invariant
states in the B2u band and polarization of transition from the
ground state to these states.

Representation   Polarization of
in D2h           transition        cm⁻¹
B3u              c                 -7
B2u              b                 +29
B1u              a                 -29
A1u              forbidden         +7

of 25 cm⁻¹ confirm the splitting qualitatively though
not quantitatively and thus lend experimental support
to the exciton nature of these bands in the benzene
crystal.


(b) Naphthalene Crystal


Quantum mechanical treatment of the naphthalene crystal on the basis of the
exciton theory has been carried out by Davydov and more recently by McClure
and Schnepp.³⁸ The results are that the free molecular energy levels are
split by crystal forces, each into α and β components such that transitions
from the ground state to them are perpendicularly polarized. The exciton
theory intensity figures for these transitions in terms of free molecular
intensity are given in Table V.

The a- and b-axis polarized absorption spectra of

37 E. G. Cox and J. A. S. Smith, Nature 173, 75 (1954).

38 D. S. McClure and O. Schnepp, J. Chem. Phys. 23, 1575
(1955).
TABLE V.

Symmetry                   Intensity
Molecular    F. group      a axis     b axis
B3u          Au(β)         0          0.045
B3u          Bu(α)         0.19       0
B2u          Au(α)         (0)        0.76
B2u          Bu(β)         [illegible]


TABLE VI.

Upper          0-0 band, cm⁻¹                   F. group
electronic                                      splitting
state          a pol      b pol      gas        cm⁻¹
B3u            31 476     31 642     32 021     166          462
B2u            33 783     33 610     35 910     170          [illegible]
                          or                    or           or
                          33 460                320          2290

single crystals of naphthalene were investigated by Prikhotjko,³⁹ Craig and
Lyons, and more recently by McClure and Schnepp. The latter workers made an
attempt to interpret the observed structure of the a- and b-polarized
spectra at 20, 77, and 298°K using single crystals of naphthalene of
thickness 10 mm to 0.5 μ. Table VI depicts the observed effects of
crystallization on the electronic transitions to the upper states in a
naphthalene molecule. They are displacement and dichroic splitting of the
0-0 band in both transitions. The observed polarization behavior of the
crystal absorption is described and compared with the exciton theory
predictions in Table VII.

The fluorescence spectrum of the naphthalene crystal at 20°K, investigated
in great detail by Obreimov and Sabaldas and more recently by McClure and
Schnepp, is polarized along the a and b axes of the crystal. Factor group
splitting of vibrational levels, absent in the mixed crystal, is evident in
the doubling of the line 1384 cm⁻¹ in the one and two vibrational quanta
and of some


TABLE VII.

Tempera- Exciton theory i 

Region of ture of Measured predictions 

Observer study cm7! study Is:Iaratio Ag—Bu Ag—Biu l 

Craigand 37000 Room 45 5:1 

Lyons 3150 temp. 

37000 20°K 45 5:1 USM 1:4.2 E 

3150 i 

0—0 band 20°K 100:1 - 

wi 

Craig 37 000 2-3:1 F- 

a 33780 ae 

a l-a a k 

McClure 37 000 Qualitative i q 
and 33780  20°K agreement x 

Schnepp - with above 


a 


39 A. Prikhotjko, J. Phys. U.S.S.R. 8, 257 (

U.S.S.R. Ser. Fiz 12, 499 (1948). 
TABLE VIII.

Wave number in cm⁻¹            Vibrational wave number in cm⁻¹
Pure           Mixed           Pure           Mixed
crystal        crystal         crystal        crystal         Splitting
31 060.8       31 551.6        0              (0)
29 681.2       30 174.2        1379.6         1377.4          8.2
29 673.0                       1387.8
28 294.9       28 799.0        2765.9         2752.6          6.2
28 288.7                       2772.1


combination lines involving 1384 cm⁻¹ in the pure
crystal. Measurements on the 1384 series are entered in
Table VIII.

Thus the gross features of the absorption and the 
fluorescence spectra satisfy the predictions of the 
exciton theory for molecular crystals. 
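
The internal arithmetic of Table VIII can be checked with a short script. This is a sketch only, assuming that each vibrational wave number is the 0-0 line position minus the line position, and that the splitting is the separation of the two pure-crystal doublet components. The pure-crystal entry 28 294.9 cm⁻¹ is used for the upper two-quantum component because it is the value consistent with the tabulated interval 2765.9 cm⁻¹ and splitting 6.2 cm⁻¹. The names are hypothetical.

```python
PURE_ORIGIN = 31060.8   # pure-crystal 0-0 line, cm^-1
MIXED_ORIGIN = 31551.6  # mixed-crystal 0-0 line, cm^-1

# Each entry: ((pure-crystal doublet components), mixed-crystal line), cm^-1.
SERIES = [
    ((29681.2, 29673.0), 30174.2),  # one quantum of the 1384 cm^-1 mode
    ((28294.9, 28288.7), 28799.0),  # two quanta
]


def interval(origin, line):
    """Vibrational interval: origin minus line position, to 0.1 cm^-1."""
    return round(origin - line, 1)


def summarize(series=SERIES):
    """Recompute the vibrational intervals and doublet splittings."""
    rows = []
    for (upper, lower), mixed in series:
        rows.append({
            "pure": (interval(PURE_ORIGIN, upper), interval(PURE_ORIGIN, lower)),
            "mixed": interval(MIXED_ORIGIN, mixed),
            "splitting": round(upper - lower, 1),
        })
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(row)
```

The doublet appears only in the pure-crystal columns, which is the factor group splitting the text describes as absent in the mixed crystal.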


(c) Anthracene Crystal 


The energy levels of the anthracene crystal were first treated by Davydov
as an application of his theory of light absorption in molecular crystals.
The absorption spectrum has been investigated by Obreimov and Prikhotjko at
20°K, by Craig and Hobbins, Lyons, and Sidman. The fluorescence spectrum
has been studied by Obreimov, Prikhotjko, and Sabaldas at 20°K, Pesteil and
Barbaron at 14°K, and Sidman at 4°K. Craig and Hobbins, following Davydov,
calculated the splitting, shift, and polarization ratio associated with an
electronic transition in the anthracene crystal. Davydov's theory predicts
that the second absorption system of anthracene will be split in the
crystal by 16 000 cm⁻¹ if it were due to a long axis transition and by
1000 cm⁻¹ if due to a short axis transition. Craig and Hobbins investigated
the a- and b-polarized absorption spectra of the crystal and observed that
the peak of the b spectrum in the second system occurred at 2680 A, so that
for the Ag-B3u (long axis) assignment the peak of the a spectrum should
fall at 1900 A. Though they could not detect it, they observed a continuous
rise of absorption until the spectrum was cut off at 2300 A in a crystal of
thickness 0.1 μ, so that the peak was expected beyond 2300 A. Recently,
Lyons reported that experiments with polarized light in the vacuum
ultraviolet located the split components at 2200 A and at 1890 A, and that
the absorption is greater perpendicular to the b axis than parallel to it.
This allows one to interpret the 1890 A


1, J. plys. radium 15, 92 (1954). 
03,9 1956). 
absorption as the a component of the vapor Ag-B3u
transition polarized along the long molecular axis.
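
The expected position of the a component follows from converting between wavelength and wave number. The sketch below assumes only the standard relation that the wave number in cm⁻¹ equals 10⁸ divided by the wavelength in angstroms, and places the a component above the b peak in energy, as the expected peak near 1900 A implies. The function names are hypothetical.

```python
def angstrom_to_wavenumber(wavelength_angstrom):
    """Convert a wavelength in angstroms to a wave number in cm^-1."""
    return 1.0e8 / wavelength_angstrom


def wavenumber_to_angstrom(wavenumber_cm):
    """Convert a wave number in cm^-1 to a wavelength in angstroms."""
    return 1.0e8 / wavenumber_cm


def split_partner(peak_angstrom, splitting_cm):
    """Wavelength of the component lying splitting_cm above the given peak."""
    upper = angstrom_to_wavenumber(peak_angstrom) + splitting_cm
    return wavenumber_to_angstrom(upper)


if __name__ == "__main__":
    b_peak = 2680.0  # observed peak of the b spectrum, in angstroms
    print(f"long axis  (16 000 cm^-1): {split_partner(b_peak, 16000.0):.0f} A")
    print(f"short axis ( 1 000 cm^-1): {split_partner(b_peak, 1000.0):.0f} A")
```

Under these assumptions the long axis assignment puts the a component near 1876 A, in line with the roughly 1900 A expected and the 1890 A component reported by Lyons, while the short axis assignment would put it near 2610 A.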

On the basis of the exciton theory, Craig and Hobbins calculated by
first-order perturbation the polarization ratio of the a-b polarized
spectrum in the second system and obtained theoretically molar extinction
coefficients in the crystal from the solution values. The predicted value
was 9100±500 for the b-crystal absorption, taking the solution extinction
coefficient as 232 000, while the experimental value obtained by Bree and
Lyons was 9450. The polarization ratio (Ib:Ia) 1.8:1, observed by Craig and
Hobbins for the first absorption system in anthracene and later confirmed
by Ganguly et al.,⁴⁹ was quantitatively interpreted by Craig, on the basis
of exciton theory modified by crystal-induced mixing of two excited
electronic states, to be due to Ag-B2u. This interpretation leads to a
value 21 000 of molar extinction coefficient, while the experimental value
is 22 800 as measured by Bree and Lyons.

Thus, the observed behavior of electronic transition 
in the anthracene crystal confirms the predictions of the 
exciton theory. Calculations of Craig and Hobbins were 
made by considering the dipole-dipole interaction between
molecules not more than 20 A apart. Craig
showed that the influence of distant molecules made no
difference in their calculated results because the interaction
summation converges within a distance of 20 A
in the anthracene crystal. Fox and Yatsiv⁵⁰ also
critically discussed the dependence of the calculated 
splitting and shift on the shape, size, and direction of 
propagation of the exciton wave packet. They found 
that the observed splitting could not lead to an un- 
ambiguous long axis assignment for the second absorp- 
tion system since by reasonable assumptions as to the 
shape and size of exciton wave packet, one may inter- 
pret the observed splitting on the basis of a short 
axis assignment while by these assumptions one may 
interpret the observed frequency shift as confirming 
long axis assignment. 

The long wavelength absorption and fluorescence have been investigated in
greater detail in the ab plane at 4°K by Sidman. The absorption spectrum is
found to consist of at least two regions, one which extends from 25 400 to
30 000 cm⁻¹ in both the a- and b-polarized spectra and the other which
extends downward from 25 400 to [illegible] cm⁻¹ and appears only in the b
polarization. The former, which consists of strong, broad absorption bands,
the first band peak being located at 25 400±20 cm⁻¹ by Sidman, at
25 380 cm⁻¹ by Obreimov and Prikhotjko,⁵¹ and at 25 350 cm⁻¹ by Craig and
Hobbins, is interpreted as electronic vibrational transition to the two
translationally allowed levels of the free exciton band derived principally
from the La, B2u state of an

49 Ganguly, Chaudhury, and Mukherjee, 44th Session, Indian
Science Congress (1957).

50 D. Fox and S. Yatsiv, J. Chem. Phys. 24, 1103 (1956).

51 I. W. Obreimov and A. F. Prikhotjko, Physik. Z. Sowjetunion
11, 948 (1936).
anthracene molecule. The latter region, which consists of many weak bands
and lines such as those at 24 809, 24 836, and 24 989 cm⁻¹ detected by
Sidman, is assigned by Sidman to transitions from the ground electronic
state to the trapped exciton states of the crystal, predicted by Frenkel
and Davydov. In addition to the phenomenon of propagation of a free
exciton, Frenkel considered the case where the crystal lattice may suffer
distortion in the neighborhood of a migrating exciton. His result is that
the exciton may be trapped at the site of the local distortion produced by
it, if the energy of the trapped exciton is less than that of the free
exciton. The low intensity of absorption in this region is caused by a
nonvertical Franck-Condon transition of the lattice, since the orientation
of the molecule in the lattice is different in the upper and lower
electronic states.

The fluorescence spectrum as observed by Sidman at 4°K consists of broad
bands superimposed with sharp structures. The fluorescence spectrum has
several origins, principally below 24 975 cm⁻¹, which do not coincide with
the origin of strong absorption at 25 400 cm⁻¹. This gap of several hundred
cm⁻¹ between the origins of fluorescence and absorption has been verified
by all workers and is significant. It shows that radiationless transference
of excitation energy from free exciton levels to trapped exciton levels is
rapid, so that no fluorescence takes place from the free exciton levels at
25 400 cm⁻¹. This gap between the origins of absorption and fluorescence
has been found to exist in crystalline naphthalene and phenanthrene and is
similarly accounted for on the basis of the trapped exciton theory of
Frenkel. Sidman also points out that calculations of Craig applied to
Frenkel's prediction show that the relative shift of electronic transition
in vapor, mixed, and pure crystals can be interpreted as a semi-qualitative
proof of the existence of trapped excitons.


(d) Naphthacene Crystal 


The first absorption system of tetracene single crystals under light
polarized parallel and perpendicular to the b axis of the crystal was
investigated by Achis. The maxima in one spectrum are displaced from those
in the other, so that Davydov splitting is confirmed. There seems to be no
published absorption data on the second and third systems. Bree and Lyons
recently investigated the polarized ultraviolet spectrum of tetracene
single crystals by the photoconductance method and confirmed splitting in
the first absorption system as shown in Fig. 14. The flatness of the
photoconductance curve in the second and the third absorption systems does
not give any information about splitting there. The absorption and
photoconductance of the first
A. S. Davydov, Izvest. Akad. Nauk S.S.S.R. Ser. Fiz. 12, 608

5 A. Y. Achis, quoted by A. S. Davydov, Akad. Nauk Pamyati
S. I. Vavilov, 197 (1952).
5 A. Bree and L. E. Lyons, J. Chem. Phys. 22, 1630 (1954).
TABLE IX.

Peak parallel to       Aᵃ   19 235       20 715       22 320
b direction            Pᵇ   19 160±50    20 750±50    22 270±300
Peak perpendicular     A    19 810       21 145       22 545
to b direction         P    19 860±50    21 190±50    (22 370±400)
Splitting between      A    575          430          225
components             P    700±100      440±100

ᵃ A = absorption spectra.
ᵇ P = photoconductance curve.


system are entered in Table IX. Recently, the absorption and fluorescence
spectra of tetracene in solid substitutional solutions at 20°K have been
investigated by Sidman. The lowest excited singlet state at 20 246 cm⁻¹ has
been found to be B2u, La, since the transition with the ground state is
polarized along the short axis of the molecule.


(e) Phenanthrene Crystal 


The a- and b-polarized absorption and fluorescence spectra of single
crystals of phenanthrene have been investigated at 20°K by McClure. The
absorption spectra show that the strong a-polarized 28 613 cm⁻¹ band and
the strong b-polarized 28 660 cm⁻¹ band act as if they are the two
components of the 0-0 band split by intermolecular resonance. The former
belongs to the Bu representation and the latter to the Au representation of
the factor group C2h of the crystal, so that the factor group splitting is
47 cm⁻¹. The vibrational frequency 1378 cm⁻¹, a C-C stretching mode, also
becomes split into 1370 and 1386 cm⁻¹ in the crystal. The total width of
the exciton band is 280 cm⁻¹, the band beginning at 28 540 cm⁻¹ and
terminating at 28 820 cm⁻¹.

The weak absorption bands at 28 315 and 28 408 cm⁻¹ on the red side of the
0-0 band are not regarded by McClure as a part of the exciton band. Similar
bands also occur in the crystals of anthracene and stilbene as observed by
Sidman. The fluorescence originates at or near the weak absorption band at
28 315 cm⁻¹. Sidman has recently interpreted these low lying emitting
levels as the trapped exciton bands of Frenkel.


(f) Hexamethyl Benzene Crystal 


The molecular plane in the hexamethyl benzene crystal lies almost parallel
to the ab plane of the crystal, the b axis of the crystal being in the
plane of the benzene ring. The absorption spectra of the hexamethyl benzene
crystal under polarized light were investigated by [illegible] et al. They
concluded that the [illegible]
in the plane of the benzene ring, though they report a definite absorption
perpendicular to the ring planes, amounting to about ten percent of
parallel absorption at room temperature. A detailed investigation of the
absorption parallel and perpendicular to the ring plane was recently made
by Schnepp and McClure at high resolution at 20°K in the 2800 A region. The
hexamethyl benzene crystal contains one molecule per unit cell, and because
the excitation exchange energy between the neighboring molecules should be
small, one can expect that the spectrum should be free from all the
complications resulting from the presence of two or more molecules per unit
cell and that the spectrum exhibits molecular properties and not crystal
properties. The analysis shows that the 0-0 transition placed at
35 136 cm⁻¹ is much stronger under light polarized perpendicular to the
benzene ring than under light polarized in the plane. The in-plane
absorption is due to a forbidden transition. Detailed quantum mechanical
calculations show that the observed behavior of the 0-0 band under
polarized light is caused not by crystal effects but by the destruction of
molecular symmetry by the unsymmetrical positions of the methyl groups in
the S6 structure. This distortion gives rise to weak perturbation of the π
electron state by the perpendicular non-π electron states.

The fluorescence spectra, investigated by Schnepp 
and McClure at medium resolution, exhibit polariza- 
tion of bands but not as complete as in absorption. 
As the 0-0 band, calculated from the fluorescence
spectrum by use of Raman frequency, is located at
35 140±8 cm⁻¹, it is concluded that both absorption
and fluorescence originate from 35 136 cm⁻¹. If this is
so, then the crystalline band in this crystal must be 
either very narrow or the allowed level of the band must 
lie very near the bottom of the band. 

Thus nonconducting excited states or exciton states 
do occur in most of the insulating aromatic crystals 
investigated as a result of excitation in the fundamental 
absorption band. The excited regions—called excitons— 
are believed to be mobile and capable of transporting 
energy through a crystal. The existing experimental 
evidence for exciton migration in these crystals rests 
mainly on indirect observations. 

The work of Apker and Taft shows that excitons are able to create and
destroy F centers. Exciton migration has been shown by Northrop and Simpson
to account quantitatively for the fluorescence quenching of aromatic
hydrocarbons containing impurities; Sidman has presented evidence, derived
from differences in the absorption and fluorescence spectra of anthracene,
which implies that excitons can be trapped [illegible]. All these
experiments depend on the interaction of excitons with particular centers
of [illegible] migration of excitons through the anthracene lattice has
been recently demonstrated by the investigation of Simpson.⁶⁰ He devised a
method which enabled him to measure the exciton motion directly. The
observed diffusion length in the case of anthracene is found to be 460 A.
In an isotropic medium this would correspond to a root-mean-square
displacement of 1120 A between the points of origin and decay of an
exciton.
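
The step from a diffusion length of 460 A to a root-mean-square displacement of about 1120 A can be reproduced under one common convention. The sketch below assumes, as an illustration rather than as Simpson's stated definition, that the diffusion length is L = √(Dτ) and that the mean-square displacement in d dimensions is 2dDτ. The function name is hypothetical.

```python
import math


def rms_displacement(diffusion_length, dimensions=3):
    """RMS displacement for isotropic diffusion.

    Assumes L = sqrt(D * tau) and <r^2> = 2 * dimensions * D * tau,
    so that the RMS displacement is sqrt(2 * dimensions) * L.
    """
    return math.sqrt(2 * dimensions) * diffusion_length


if __name__ == "__main__":
    length_angstrom = 460.0  # diffusion length quoted for anthracene
    print(f"{rms_displacement(length_angstrom):.0f} A")
```

With this convention the three-dimensional result is about 1127 A, close to the 1120 A quoted in the text; the small difference may reflect rounding or a slightly different definition in the original work.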

Dexter⁶¹ points out that a phosphor containing luminescence centers can be
used as a detector of excitons. Thus it is possible to detect the diffusion
of excitons from an activator-free crystal into a contiguous crystal, if
the latter contains centers capable of trapping the excitons. One surface
of a thin specimen is covered by an exciton detector; the opposite face of
the specimen is illuminated in the fundamental absorption band. The
luminescence emitted by the detector is derived partly from excitation by
that fraction of the primary radiation not absorbed in the specimen and
partly from excitons originating in the specimen. A separate measurement of
the transmission of the specimen suffices to separate the two
contributions, and hence to determine the flux of excitons into the
detector.

The allowance for reabsorption of anthracene fluorescence by naphthacene
was found by measuring the emission from specimens which were made with the
anthracene and detector deposited on separate glass plates, and then put
together with a small air gap between. In this arrangement transfer by
exciton diffusion was suppressed.


4. INFLUENCE OF CRYSTAL SIZE ON THE SPECTRAL 
NATURE AND DECAY TIME OF FLUORESCENCE 


Though the matters discussed in the previous section 
bear testimony to the exciton nature of the electronic 
bands and to the migration of excitation energy by way 
of excitons, some experimental facts give evidence of 
energy migration by photon reabsorption. In this case 
efficiency of fluorescence escape from aromatic crystals 
shows a dependence upon crystal size, in contradiction 
with predictions of exciton theory. 

The long-known fact that overlap of emission and absorption spectra of a
system is apt to alter the spectral character of fluorescence, due to
self-absorption and re-emission, is given proper consideration in the case
of organic crystals by the investigations of Kortum and Finckh,⁶² Bowen and
Lawley,⁶³ and Little and Birks.⁶⁴ They show that the transmitted
photofluorescence spectrum of an organic crystal is critically influenced
by crystal thickness. That this fact cannot be ignored is evident from the
fluorescence spectrograms of anthracene, trans-stilbene, para-terphenyl,
and diphenylacetylene for thick crystals recorded by Koski,⁶⁵ Koski and
Thomas,⁶⁶ and Sampson.⁶⁷


60 O. Simpson, Proc. Roy. Soc. (London) A238, 402 (1956-1957).

61 D. L. Dexter, J. Chem. Phys. 21, 836 (1953).

62 Kortum and B. Finckh, Z. physik. Chem. (Leipzig) B52, [illegible].

63 E. J. Bowen and P. D. Lawley, Nature 164, 572 (1949).

64 W. A. Little and J. B. Birks, South African Association for the
Advancement of Science Congress, Cape Town, South Africa
(July, 1952).

Birks and Little studied the spectral change in the emission of each of
these compounds as they changed the fluorescent specimen from single
crystals of 1-cm thickness to microcrystalline layers deposited on a thin
glass plate by evaporation of a dilute solution of the compound in xylene,
and the mode of observation from transmission to reflection. The
fluorescence spectrum of anthracene (Fig. 9) gets extended into the
ultraviolet as far as 3700 A, and about 80% of this primary fluorescence
suffers reabsorption in a one-centimeter thick crystal. These observations
were carefully re-examined by Birks and Wright, who used improved technique
and made relative quantum intensity measurements on one-centimeter thick
crystal specimens and very thin microcrystalline layers deposited on quartz
plates. They observed that the microcrystalline fluorescence spectrum
corresponds to the complete molecular emission; 76% of the primary
molecular emission from a 1-cm thick anthracene crystal could not escape
reabsorption, so that the probability of escape of molecular fluorescence,
K, was 0.24. In a thick trans-stilbene crystal 40% of the molecular
fluorescence was reabsorbed.
These investigations lend support to the photon 
cascade theory formulated by Birks and later on
supported by Fowler and Roos. Based on the fact that
the absorption coefficient of an organic crystal is high, 
exceeding 10⁵ cm⁻¹ in the first absorption band system of
anthracene, and that there is overlap between the emis- 
sion and absorption spectra, this theory proposes that 
in a thick crystal, the part of the primary molecular 
fluorescence which is emitted in the region of absorp- 
tion is practically reabsorbed in a path length of a few 
wavelengths. This reabsorbed radiation is re-emitted as 
fluorescence, a fraction of which is again absorbed by succeeding layers,
these processes of absorption and emission following each other until all
the excitation energy either escapes from the crystal or is dissipated
thermally by internal conversion. Thus the intermolecular energy transfer
takes place by expansion of the excitation column, not by the mechanism of
exciton migration but by the absorption and reabsorption of the primary
fluorescence. This is an extension of Ganguly's theory of transfer of
energy from the anthracene lattice to the naphthacene molecules embedded in
it through absorption of anthracene fluorescence by naphthacene molecules.
But our recent measurement of the coefficient of absorption of naphthacene
shows that the probability of naphthacene excitation by absorption of
fluorescence of anthracene is remote.
The photon cascade theory seems to be strengthened 
by surprisingly few existing photofluorescence decay 
time measurements of organic crystals. Liebson” found 
that for anthracene the photofluorescence decay time
t_f changed from 2.0±0.5 mμsec for a 4% solution in
benzene to 17 mμsec for a crystal. For terphenyl it
increased from 2.5±0.5 mμsec for a 4% solution to 11
mμsec for a crystal. For stilbene and diphenylacetylene
crystals, they measured t_f to be 3.1 and 2.5 mμsec,
respectively. The effect of crystal size on t_f was investigated
by Birks and Little, who used the phase-shift
method and observed that t_f for thick single crystals
exceeded by many times that for microcrystalline
specimens. The data for anthracene and terphenyl
represent two extreme cases. Whereas in the former
t_f increases from 3.5±1.0 mμsec for microcrystalline
specimens to 14±2.0 mμsec for 1-cm^3 single crystals,
in the latter it changes from 3.5±1.0 mμsec for powder
to 3.8±1.0 mμsec for 2-mm thick single crystals. Accounting
for the observed increase in decay times in crystals
by self-absorption and re-emission of the primary
molecular fluorescence, they calculate the decay time
t_f of technical fluorescence at 12.6±3.5 mμsec, which
agrees, within experimental error, with the experimental
value 14±2.0 mμsec. They calculated the decay time
(t_f)_0 of primary molecular fluorescence to be 3.1±0.8
mμsec, in agreement with the experimental value
3.5±1 mμsec. Bose et al. have observed that under
x-ray excitation the decay times of anthracene and 
naphthacene emissions in a green anthracene crystal 
isolated by suitable filters were practically identical. 

S. H. Liebson, Nucleonics 10, 41 (1952).
J. B. Birks and W. A. Little, Proc. Phys. Soc. (London) A66, 921 (1953).
H. N. Bose et al., Proc. Natl. Inst. Sci. India ..., 397 (...).

Splendid as the success of the photon cascade theory
seems to be in interpreting the energy transfer mechanism
in pure aromatic crystals, it is not so in mixed
crystals. In fact, in mixed crystals it is more qualitative
than quantitative and must be considered with caution.
The basic fact utilized by the theory, viz., that the
primary photons emitted in the region of the first absorption
band system of a crystal are reabsorbed completely
in this region due to very high absorption coefficients
(10^4 to 10^5), is violated in the case of energy
transfer from the host lattice to impurity molecules in
mixed crystals. In spite of very high absorption coefficients
of impurity molecules the effective absorption by
a mixed crystal in the region of impurity absorption is
much short of being total, unlike in pure crystals, 
because of the very low concentrations (10-'—10-°) 
mole solute/mole solvent at which the transfer of 
energy is almost total. The fluorescent spectrum studied 
by reflection, instead of transmission, to eliminate the 
small absorption effect of a thin microcrystalline layer 
of anthracene containing a low percentage of naphthacene
deposited from xylene solution on a fused quartz plate 
exhibits the naphthacene fluorescence bands in high 
intensity.”® The result of drawing a parallelism between 
absorption effects due to the host lattice and impurity 
molecules is evident from the following considerations. 
The relation

\[ q_x = \frac{1}{1 + (K_y/K_{ex})[Y]}, \]

where q_x = technical fluorescence quantum efficiency of
the host crystal X, K_y = probability of photon capture by
the impurity Y, K_ex = probability of photon escape in X
crystal fluorescence, and [Y] = concentration of Y, derived
on the photon cascade theory in the absence of
internal quenching, is similar to N_x = 1/(1+k[Y]),
where N_x = technical quantum efficiency of X-crystal
fluorescence, k = exciton capture probability of Y
relative to that of X, derivable from the exciton theory.
In both of them the important quantity is the energy 
transfer parameter k which determines the efficiency of 
the process. The value of k determined by Bowen et al. 
from the experimental N_x vs [Y] curve for anthracene-
contaminated naphthalene crystals excited by 2900 to
3000 A was 5×10^4 and the value obtained by Wright,
who used 2540 A as the exciting radiation, was 4×10^4.
In the relation K_ex + K_x + K_y[Y] = 1, where K_x = photon
capture probability of X, given by the cascade
theory, putting [Y] = 2.5×10^-5 mole anthracene per
mole naphthalene for equal intensities of the components
and assuming the most favorable but experimentally
unrealized condition K_y = 1, we have K_ex + K_x
+ 2.5×10^-5 = 1. As the theory assumes no interaction
between the host and the guest molecules, K_x should not
differ from its value in a pure crystal. Substituting
K_x = 0.9565 from scintillation data, which always give a
somewhat larger value, we have K_ex = 4.3475×10^-2 and
k = K_y/K_ex = 23, far away from the experimental value. The
support for the theory in mixed crystals is Bowen
et al.'s result that the intensities of anthracene
and naphthacene fluorescence components become
equal at 1.8×10^-5 g naphthacene/g anthracene in
anthracene-naphthacene crystals and at 9×10^-5 g/g in
solution, giving a ratio 0.2. The explanation that this
difference results from the reduction of the effective
absorption coefficient in solutions due to random alignment
by a factor 0.25 relative to that in crystals where
the Y molecules align parallel to X molecules applies to 
all such crystals and is not supported by the experi- 
mental facts that light in a double refracting crystal is 
bifurcated into two perpendicularly polarized com- 
ponents and that the absorption coefficients of a sub- 
stance such as anthracene in solution are greater”? than 
those in crystal by a factor ~8. Moreover, the disagreement
by 25% is glaring.
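
The arithmetic of the preceding paragraph can be checked with a short sketch. It assumes that the cascade transfer parameter is k = K_y/K_ex, which is how the figure 23 appears to have been obtained, and it takes the experimental value of Bowen et al. as 5×10^4 (the exponent is read from context). The function name is illustrative only.

```python
# Photon-cascade balance for a mixed crystal: K_ex + K_x + K_y*[Y] = 1.
# Numbers are those quoted in the text; the function name is hypothetical.

def cascade_transfer_parameter(k_x, k_y, conc_y):
    """Return (K_ex, k), assuming the transfer parameter is k = K_y / K_ex."""
    k_ex = 1.0 - k_x - k_y * conc_y  # probability that a photon escapes
    return k_ex, k_y / k_ex


# K_x = 0.9565 (scintillation data), K_y = 1 (most favorable case),
# [Y] = 2.5e-5 mole anthracene per mole naphthalene.
k_ex, k_cascade = cascade_transfer_parameter(k_x=0.9565, k_y=1.0, conc_y=2.5e-5)

k_experiment = 5e4  # Bowen et al.; exponent assumed from context

print(f"K_ex = {k_ex:.6f}")
print(f"k from cascade theory = {k_cascade:.1f}")
print(f"experimental k exceeds it by a factor of about {k_experiment / k_cascade:.0f}")
```

Under these assumptions the cascade value of k falls short of the experimental one by roughly three orders of magnitude, which is the discrepancy the text calls attention to.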


5. SCINTILLATIONS FROM ORGANIC CRYSTALS 


The scintillation property of fluorescent organic crys- 
tals has made them of great use in scintillation counting 
of nuclear radiations, but there seems to have been little 
attempt to understand its nature until the recent work 
of Taylor et al., Franzen et al., Hopkins, Birks, etc.78-82 
We review the fundamental facts in the scintillation 
process because any satisfactory theory of the fluores- 
cent process in these crystals must not ignore them. 
That the energy transfer mechanism in organic crystals 
under photon or corpuscular excitation is qualitatively 
independent of the mode of excitation seems to be borne 
out in Birks’ experiments. Using crystals of naphthalene
containing different concentrations of anthracene
he investigated their scintillation response to alpha 
particles. The intensities of naphthalene (N) and 
anthracene (A) emissions were measured separately by 
introducing suitable filters between the crystals and 
the photo tube. His plots of (N) and (A) fluorescence 
intensities, relative to pure crystal, against A concentrations,
show that not only the N fluorescence is
progressively suppressed with A concentration but also 
the quantity suppressed in N reappears in A (Fig. 10). 
The experimental curves were excellently reproduced 
in the theoretical curves obtained from the relation 
n = 1/(1+kx) derived by him from the exciton theory
of energy transfer, where n = relative intensity of radiations,
k = the exciton capture probability of an anthracene
molecule relative to that of a naphthalene molecule,
x = anthracene concentration in moles per mole of
naphthalene; with k = 43 for α particles. These curves are
similar to those obtained by Bowen et al. and Wright
et al. in the case of photofluorescence of anthracene in
naphthalene with k = 5×10^4.

Fig. 10. Fluorescence emission from the mixed crystals
of naphthalene containing anthracene as a function of
anthracene concentration. A and N represent anthracene and
naphthalene fluorescence, respectively.
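
As a rough illustration of the relation n = 1/(1+kx), the sketch below tabulates the relative host (N) and guest (A) intensities against anthracene concentration. It assumes, as the text states qualitatively, that the intensity lost from N reappears in A, and it uses the photofluorescence value k = 5×10^4 (exponent read from context). The function names are hypothetical.

```python
# Relative host and guest fluorescence from the exciton-theory relation
# n = 1 / (1 + k*x). All names here are illustrative, not from the source.

def host_intensity(k, x):
    """Relative naphthalene (N) intensity at anthracene mole fraction x."""
    return 1.0 / (1.0 + k * x)


def guest_intensity(k, x):
    """Relative anthracene (A) intensity, assuming what N loses A gains."""
    return 1.0 - host_intensity(k, x)


def half_quench_concentration(k):
    """Concentration at which the host emission is reduced to one half."""
    return 1.0 / k


k_photo = 5e4  # photofluorescence value quoted in the text (exponent assumed)

for x in (1e-6, 1e-5, 2e-5, 1e-4, 1e-3):
    n_host = host_intensity(k_photo, x)
    n_guest = guest_intensity(k_photo, x)
    print(f"x = {x:.0e}   N = {n_host:.3f}   A = {n_guest:.3f}")

print(f"half-quench concentration = {half_quench_concentration(k_photo):.1e}")
```

With this k the host emission is halved near x = 2×10^-5 mole per mole, the same order as the concentration quoted earlier for equal intensities of the two components.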

A similar relation was derived by Birks and Black
on the exciton theory to account successfully for the irreparable
deterioration in anthracene scintillation response
produced by a prolonged intense α-particle bombardment
so as to cause a permanent molecular damage, as
reported by Birks and Wright. They regard the
damaged molecules as quenching centers just as quenching
impurity molecules in a host lattice and found a
value 1000 for k which is the exciton capture probability 
of a damaged molecule relative to an undamaged 
anthracene molecule. 

The investigations of the scintillation response S of
anthracene to protons of E<17 Mev by Franzen et al.,
Taylor et al., and Frey et al., to electrons of E from
1 kev to 3 Mev by Jentschke et al. and Hopkins, to
α particles of E<2.1 Mev by Birks and Taylor et al.,
and to deuterons of E<11 Mev by Taylor et al., show that
the response S of anthracene to an ionizing particle of
energy E and residual range r depends on the nature of
the particle and is proportional to E above a certain
energy level. Measurements of the relative response of
other organic crystals such as naphthalene, stilbene,
terphenyl, etc., and of organic solutions to different
particles show that they behave in the same manner 
as anthracene so that the behavior of anthracene can be 
regarded as typical of the organic substances studied. 
The S-E curves of anthracene for different ionizing
radiations are shown in Fig. 11(a). Birks’ investigations
deserve special mention in that, unlike others,
he compared the scintillation response S of anthracene
and other crystals to different ionizing particles by
considering not the S-E curve but the variation of the
specific fluorescence dS/dr (expressed in arbitrary units
of volts/cm air equivalent) with the specific energy
loss dE/dr (measured in Mev/cm air equivalent), where
r (measured in air equivalent) is the residual range of the
particle within the crystal. He showed that the nonlinear
portion of the S-E curves was due to the nonlinear
variation of the specific fluorescence with the
specific energy loss. At low values of dE/dr (electrons
of E>125 kev) dS/dr increases linearly with dE/dr, in
correspondence to S being proportional to E. At high values
of dE/dr (α particles of E<5 Mev) dS/dr is practically
constant so that the scintillation S is proportional to
the residual range r of the particle rather than to its
energy E. This linearity in the S-r curves for anthracene,
terphenyl, and stilbene was found to hold good
for r above 8 mm. The dS/dr vs dE/dr curve for anthracene
and the S-r curve for stilbene are shown [Fig. 11(b)
and Fig. 12].

Fig. 11. (a) Relative scintillation intensity (S) of anthracene,
produced by different ionizing particles, as a function
of their energy (E) (after Jentschke et al.). (b) Specific
fluorescence vs specific energy loss for anthracene (after
Birks).

The exciton theory was extended by Birks® to ac- 
count for the foregoing experimental facts. The passage 
of an ionizing particle through the crystal produces 
along its path a local concentration of damaged or 
ionized molecules and he regards them as quenching
agents for the excitons produced by the ionizing particle.
The linear density of excitons produced and the
local concentration of nonradiative molecules at any
point on the particle track are both proportional to the
specific energy loss of the particle and can be equated
to A(dE/dr) and B(dE/dr) molecules per
molecule, respectively. Denoting by K the exciton
capture probability of a damaged molecule relative to
an undamaged one, he deduces the relation,

\[ \frac{dS}{dr} = \frac{A\,(dE/dr)}{1 + KB\,(dE/dr)}, \tag{49} \]

which is reduced to dS/dr = A(dE/dr) at low values of
dE/dr and to dS/dr = A/KB = constant at high values
of dE/dr, in agreement with observed facts. The experimental
values of dE/dr and dS/dr used for substitution
in Eq. (49) were obtained from the observations of
Birks, Curie, Livingston and Bethe, and Bethe
and gave A = 82.5, KB = 7.15 in anthracene.

Fig. 12. Scintillation response of stilbene to α particles as a function
of their air range r (after King and Birks).
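
The two limiting forms of Eq. (49) can be seen numerically in the following minimal sketch, which uses the anthracene constants quoted above (A = 82.5, KB = 7.15) with dE/dr in Mev/cm air equivalent and dS/dr in the arbitrary units of the text. The function name and the sample values of dE/dr are illustrative assumptions.

```python
# Birks' relation, Eq. (49): dS/dr = A*(dE/dr) / (1 + KB*(dE/dr)).
# A and KB are the anthracene values quoted in the text; the function
# name and the sample energy-loss values are hypothetical.

A = 82.5
KB = 7.15


def specific_fluorescence(de_dr, a=A, kb=KB):
    """Specific fluorescence dS/dr for a given specific energy loss dE/dr."""
    return a * de_dr / (1.0 + kb * de_dr)


for de_dr in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0):
    ds_dr = specific_fluorescence(de_dr)
    linear = A * de_dr  # low-loss limit, dS/dr = A*(dE/dr)
    print(f"dE/dr = {de_dr:8.3f}   dS/dr = {ds_dr:8.4f}   linear limit = {linear:10.4f}")

print(f"saturation value A/KB = {A / KB:.3f}")
```

At small dE/dr the computed values follow the linear limit, so S is proportional to E; at large dE/dr they level off toward A/KB, so S becomes proportional to the range r.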

The variation of dS/dr with dE/dr calculated from
Eq. (49) agrees well with the experimental curve obtained
for anthracene by Jentschke et al., as shown in Fig. 11.
The S-E curves for electrons, protons, α particles, etc.,
calculated from Eq. (49) are in excellent agreement with
those obtained experimentally.

To account for the observed nonlinearity in the S-r
curves for r<8 mm, Birks introduced a multiplying
factor φ in Eq. (49) in consideration of the different
behavior of excitons produced near the crystal surface;
in his expression for φ, a_0 = mean free path of an exciton
and E_i(r/a_0) = exponential integral. The experimental
φ-r curve for electrons in anthracene agrees well with
the theoretical one.

The absolute efficiency of scintillation, i.e., the total
energy of fluorescence emission/total energy of scintillation
particles, of anthracene was measured by Birks
and Szendrei to be 3.76 (±0.07)% for thick anthracene
crystals at room temperature, excited by fast electrons.
This corresponds to an energy expenditure
E = 70.5 (±3.8) ev per fluorescence photon. The cascade
theory formulated by Birks gave E = 68 ev per photon,
in agreement with the experimental value.

Liebson investigated the temperature dependence
of the scintillation response S and the scintillation decay
time t for anthracene from 103° to 283°K. His results
show that a linear relationship between log10 S and t
holds good. Birks worked out the relationship between
S and t on the photon cascade theory as log S = log A
+ Bt, where A is a constant and B = (log q_0)/t_0, q_0 being the
quantum efficiency of photon emission occurring in
cascade and t_0 the molecular decay time. Substituting the
experimental value of B from the graph and q_0 = 0.9,
Birks found t_0 = 3.8 mμsec for the molecular decay time
of anthracene at room temperature, in excellent agreement
with the experimental value 3.5±1.0 mμsec
obtained by Birks and Little.

Sangster and Irvine” have studied the scintillation 
and photofluorescence behavior of a large number of 
crystalline organic compounds and the utility of Birks’s 
photon cascade theory of the scintillation process has 
been demonstrated. 


6. PHOTOCONDUCTIVITY OF ORGANIC CRYSTALS 


Byke and Bork, and Volmer observed that anthracene
crystals begin to photoconduct under
irradiation with wavelengths shorter than 4000 A.
Later works of Putseiko, Akamatu and Inokuchi,
Vartanyan, Inokuchi, Eley et al., Mette and Pick,
Northrop and Simpson, and Lyons and Morris
have confirmed that not only photoconduction but 
also semiconduction takes place in some aromatic 
crystals. This photoconductivity effect raises the ques- 
tion whether production and migration of holes have 
anything to do with the transport of excitation energy 
through the anthracene lattice, even though want of
any evidence of ionization in the anthracene lattice led
Franck and Livingston to discard the electron migration
mechanism. Recent studies of the photoconductivity
of anthracene have been made by Carswell, Carswell
and Lyons, Chynoweth and Schneider, and Bree,
Carswell, and Lyons. Carswell and Lyons’ experimental
arrangement uses a single anthracene crystal
sublimed in a CO2 atmosphere, mounted with Aquadag
on platinum electrodes and irradiated with monochromatic
radiation. Their important observations are:
(a) the plot of photocurrent vs frequency of exciting
radiations closely reproduces the optical absorption
spectrum (Fig. 13); (b) the ratio of the current produced
with light polarized parallel to the b-crystal axis
to that parallel to the a-crystal axis was 1.5, using a
mercury 3650 A light source; (c) the photocurrent is
proportional to the intensity of light; (d) quantum
efficiency is approximately 10^-4, under his experimental
conditions, constant over the near ultraviolet, and
nearly equal to the fluorescence quantum efficiency;
(e) the intentional incorporation of naphthacene as an
impurity in the anthracene lattice so as to change the
relative heights of the 5325 A and 4425 A fluorescent
bands to 37:1 apparently affects neither the photocurrent
nor the time constant. The investigations of
Chynoweth and Schneider are of great significance in
that their results support the adoption in organic
crystals of the conventional energy band picture usually
associated with inorganic crystals. The fact that irradiation
of anthracene crystals within the fundamental
absorption spectrum cannot liberate electrons completely,
combined with their observation that the
majority of the carriers are positive holes, suggests to
them the following mechanism: light absorption raises
the electrons, either directly or through the intermediate
process of exciton formation, into a metastable level
from which they can either recombine with their holes,
emitting fluorescence, or are trapped in a relatively
stable level. Under the action of the field the holes
drift through the lattice, but the drift velocity depends
upon hole traps of different depths. Under steady-state
conditions the rate of creation of carriers equals the
rate of recombination between positive holes in the full
band and the trapped electrons, as also the rate of
trapping of carriers equals the rate of detrapping.

Fig. 13. (1) Solution absorption spectrum of anthracene. (2)
Crystal absorption spectrum of anthracene. (3) Spectral photocurrent
curve in anthracene crystal. Plot of photocurrent vs frequency
of exciting radiations showing reproduction of the optical
absorption spectrum (after Carswell).

Fig. 14. Photocurrent in naphthacene single crystal as a function
of frequency, in two ranges of wavelengths. In 1, light is
polarized ∥b; in 2, ⊥b; in 3, unpolarized. In the lower diagram
“∥, ⊥” are with respect to a direction which makes 12° with the
b crystallographic axis (after Bree and Lyons).

If this mechanism is to be adopted for the emission 
of characteristic fluorescence of anthracene, it remains 
to be seen how it explains the want of reproduction 
of some photoconductivity phenomena in corresponding 
fluorescent phenomena as noted in (d) above. Leaving 
aside the question of transport of energy through the 
lattice of anthracene, the observation (e) made by 
Carswell suggests that immediate transfer of energy to 
naphthacene or pentacene in an impure anthracene
crystal takes place by a mechanism other than hole
migration.

The correspondence between the optical absorption
spectrum and the photoconductance curve of an anthracene
crystal, as observed above, led Bree and Lyons
to study the ultraviolet polarized spectrum of tetracene
single crystals by the photoconductance method. They
observed that the first system is definitely b-axis polarized
and exhibits Davydov splitting (Fig. 14) as discussed
in Sec. 3. Lyons and Morris investigated the
spectral dependence of photocurrent in naphthalene,
phenanthrene, diphenyl, and p-terphenyl and compared
it with the solution absorption spectra of the
compounds. The photoconductivity takes place in the
absorption region and not outside it. The measurements
of Riehl and of Bree, Carswell, and Lyons suggest that
there is no true bulk photocurrent in anthracene and
that the photoconductivity in anthracene is a surface
phenomenon. This led Lyons to explain the spectral
dependence of photoconductivity in a surface cell on
the assumption that the light energy absorbed by the
crystal creates excitons, and that only those excitons
which migrate from their points of origin to the surface
layer can generate charge carriers. The number n of charge
carriers so generated is given by Eq. (50), in which
h is the optical density of a molecular
layer, I_0 the intensity of the incident light, q is the
probability that an exciton crosses a layer of molecules,
and K a constant depending upon the specimen. If
h is small, neglecting higher powers of h, n = K I_0 h/(1-q),
and the photocurrent is given by

\[ i = \frac{K V I_0 h}{1 - q}, \]

where V is the applied voltage. The photocurrent is linearly dependent
on (a) the optical density of the crystal, (b) the incident
light intensity, and (c) the applied voltage. The linear
dependence on the incident light intensity and on the
applied voltage was demonstrated by the investigations
of Carswell and Lyons and Chynoweth and
Schneider, respectively. Lyons showed that, in the
case of the tetracene crystal, the plot of optical density
of the crystal against photocurrent was a straight line
and thus exhibited the predicted dependence.

Fig. 15. Wavelength dependence of photocurrent in sandwich
cell; (a) illuminated electrode positive, (b) illuminated electrode
negative, (c) approximate edge of photosensitivity in surface cell
(after Compton et al.). The observed effect produced by reversal
of polarity of electrodes indicates that the charge carriers are
mostly positive holes.

A recent detailed investigation of surface and bulk
photoconductivity of anthracene was undertaken by
Compton, Schneider, and Waddington. They used
three types of crystals, namely, large scintillation
crystals, thin layer crystals grown from dimethylformamide
solutions, and sublimation flakes, and practically
got rid of the space-charge effect, observed by
earlier workers, by using electrodes of colloidal graphite,
evaporated metals, and silver paste to secure better
contacts with the crystal. To investigate the bulk
photocurrent, they constructed a sandwich cell fitted
with a guard ring on the surface separating the two
electrodes. [One of the electrodes, called the front
electrode, which was transparent, conducting and
receiving the exciting light, was connected to a stabilized
source of power; the other electrode, called the rear
electrode, was connected to a microammeter.] If the
guard ring is earthed, surface current is prevented from
reaching the rear electrode.

Fig. 16. (a) Photocurrent as a function of field strength (surface
cell); (b) photocurrent as a function of applied voltage (sandwich
cell), (a) illuminated electrode positive, (b) illuminated electrode
negative, crystal thickness 2 m (ordinate scale values are multiplied
by ten) (after Compton et al.).

Fig. 17. Photocurrent as a function of light intensity: (a) surface
cell and (b) sandwich cell (after Compton et al.). [Abscissa:
log (light intensity) in arbitrary units.]

Fig. 18. Photocurrent as a function of temperature
(after Compton et al.). [Ordinate: log (photocurrent); abscissa:
1/T × 10^3; curves labeled with activation energies 4940 and
4975 cal mole^-1.]

Besides confirming the reported spectral dependence
of surface photoconductivity and that the photocurrent
is mostly caused by the mobility of positive holes, they
arrived at the following results:

(1) Anthracene crystals exhibit a true bulk photoconductivity
which is not affected by the nature of
ambient gases. The bulk photocurrent in an anthracene
sandwich cell is much less than the photocurrent in a
surface cell under comparable conditions. The spectral
dependence of bulk photocurrent is unlike that of surface
photocurrent in that the maximum photocurrent
occurs on the long wavelength side of the anthracene
absorption band (Fig. 15).
(2) The surface and bulk photocurrent in anthracene,
generally nonohmic, exhibits an ohmic dependence on
applied voltage at very low field strength (less than 60 v cm^-1
in a surface cell) and increases at a rate higher than
linear at higher field strengths. The photocurrent, i,
showed no sign of saturation, even with 25 kv cm^-1 in a
surface and 20 kv cm^-1 in a sandwich cell (Fig. 16).
Plots of V/i against V in the case of a surface cell were
straight lines, indicating that i is proportional not to V,
as demanded by Lyons’ theory, but to V/(a-bV).
(3) In the surface cell the photocurrent shows a
linear dependence on the light intensity except at
very high level of intensity, whereas in a sandwich cell
it varies as the square root of the light intensity
(Fig. 17).
(4) The photocurrent displays an interesting dependence
on crystal orientation; it is greatest in the b
direction, about ten times smaller in the a direction,
and one-hundred times so in the c direction. This indicates
that the charge carriers are more mobile in the b
direction than in the a and c directions by factors 10
and 100, respectively.
(5) The photocurrent studied in a surface cell or in
a sandwich cell over a temperature range from 55 or
70°C to -130°C in an atmosphere of pure dry argon
follows the relation

\[ i = i_0\, e^{-E/kT}, \]

where E = 0.17±0.03 ev in a surface cell and ...
ev for a sandwich cell (Fig. 18).
(6) The photocurrent in an anthracene surface cell
exhibits a decay time, which is less than 30X10% sec. 
The decay shows a tail of some seconds, which is about 
10-3 times the original photocurrent. As this varies 
markedly from crystal to crystal, the tail is most likely 
due to flaws.

(7) When tetracene or acridine is incorporated into 

an anthracene crystal in the approximate concentration 
10-* mole per mole of anthracene, it is found that the
voltage dependence of photocurrent is unaltered, its 
intensity dependence is linear in the surface cell, and 
the photocurrent produced by excitation in the 3800 A 
system is reduced. 
(8) There is little evidence for the existence of trapping
levels in an anthracene crystal, as postulated by
many workers. Irradiation with infrared light produces 
very small effect on surface photoconductivity. The 
electrical analog of the thermoluminescent glow curve 
shows that the concentration of traps of depth from 
0.15 to 0.5 ev is less than 108 cm~ in a pure anthracene 
surface or sandwich cell, and this is not dependent on 
the nature of the ambient gas. These experimental 
results are explained on the following model: The 
carriers of charge, which are positive holes, are gener- 
ated in the bulk as well as on the surface, the carrier 
mobility being greater in the latter case than in the 
former. Lyons’ calculations can be applied with the 
modification that q denotes the probability of charge 
carrier crossing a molecular layer. The photocurrent 
thus remains proportional to the optical density as 
observed by Lyons but the smaller the optical density 
is, the greater is the proportion of carriers formed at 
a good depth below the surface. This gives rise to the 
long wavelength tail of surface current in thick crystals 
and becomes of great importance in a sandwich cell. 
The photocurrent in the bulk is recombination limited, 
i.e., the charge carriers (positive holes) end their life 
by recombination before reaching the electrodes. 

In the surface cell these carriers migrate to the 
surface along the shortest route, but because the surface 
in a real crystal is not perfectly flat, this route at many 
points will not be perpendicular to the field direction. 
Thus, the probability q of a charge carrier crossing a 
molecular layer is affected by V and if the phenomenon 
is averaged over the distance between electrodes, q can
be expressed as q = a + bV. The equation for i becomes

\[ i = \frac{K V I_0 h}{1 - a - bV} = \frac{K V I_0 h}{a' - bV}, \]

which explains the experimental fact that V/i depends
linearly on V. The mechanism of formation of charge
carriers in anthracene and other organic crystals is not
clearly understood. Though photoconductivity is produced
by absorption of light, it is not a primary product
of light absorption, because the first excited states of
an anthracene crystal are exciton states where the electrons
and holes, created by the act of light absorption, migrate
remaining bound to each other and produce no
photocurrent. The experimental facts that the quantum
efficiency for photoconductivity is of the order of 10^-4,
and that the photocurrent is profoundly influenced by
impurities (as observed by Carswell and Lyons and
Compton et al.), and reduced by neutron irradiation
(as observed by the latter group), cannot be explained
on such an assumption.
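
The statement that V/i depends linearly on V when q = a + bV can be illustrated with the sketch below. The relation i = K V I_0 h/(1 - a - bV) is the one discussed above for the surface cell; every numerical value used here (K, I_0, h, a, b and the voltages) is a hypothetical choice made only for illustration, not a measured quantity.

```python
# Surface-cell photocurrent with a voltage-dependent crossing probability
# q = a + b*V, so that i = K*V*I0*h / (1 - a - b*V).
# All parameter values below are hypothetical.

def surface_photocurrent(v, k, i0, h, a, b):
    """Photocurrent i for applied voltage v; requires 0 <= a + b*v < 1."""
    q = a + b * v
    if not 0.0 <= q < 1.0:
        raise ValueError("crossing probability q = a + b*V must lie in [0, 1)")
    return k * v * i0 * h / (1.0 - q)


params = dict(k=1.0, i0=1.0, h=0.01, a=0.5, b=1e-5)
voltages = [1000.0, 2000.0, 3000.0, 4000.0]

ratios = [v / surface_photocurrent(v, **params) for v in voltages]
for v, ratio in zip(voltages, ratios):
    print(f"V = {v:6.0f}   V/i = {ratio:.4f}")

# Equal steps in V give equal steps in V/i, i.e. a straight line whose
# slope is -b/(K*I0*h) and whose intercept is (1 - a)/(K*I0*h).
steps = [r2 - r1 for r1, r2 in zip(ratios, ratios[1:])]
print("successive differences of V/i:", [round(s, 6) for s in steps])
```

The current itself rises faster than linearly with V in this sketch, while V/i falls on a straight line, which is the behavior reported for the surface cell.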

In many photoconductors in which infrared irradia- 
tion has little effect on the photocurrent, optical and 
thermal formation of charge carriers requires identical 
energies and the absorption edge occurs corresponding 
to the energy E in the equation for semiconductivity 
σ = σ_0 e^{-E/kT}. This relation is obeyed by the aromatic
hydrocarbons such as anthracene, naphthacene, penta- 
cene, etc. If the generation of charge carriers is not a 
primary product of light absorption in anthracene, one 
cannot expect that the energy parameter E would have
the energy value corresponding to the absorption edge.
In fact, the energy 3.2 ev associated with an exciton in
the 3800 A system is greater than the activation energy of
semiconductance 1.63 ev as found by many workers
from the temperature dependence of semiconductivity.
In all the aromatic hydrocarbons investigated, the
measured activation energies are lower than the fundamental
absorption energy ¹E. This produces a difficult
situation, because the theory of conductivity leads one
to suppose that all conduction bands must have energies
greater than ¹E. Any doubt as to whether the values
of the energy parameter E derived from semiconductivity
are really activation energies is removed by the
fact that there is good agreement between the impurity
activation energy parameter ε and the triplet state
energies ³E of free molecules. This led Northrop and
Simpson to conclude that in the absence of an applied 
electric field, the number of electrons populating the 
conduction band is too small to account for the ob- 
served conductivity. Conductivity is due to the fact that 
there is greater population of electrons in the low lying 
bound states and there is a finite probability of ioniza- 
tion from the bound excited states due to the applied 
field. The mechanism is described by them as dissocia- 
tion of the exciton in an applied field. 

The exponential dependence of photoconductivity
on temperature suggests that the charge carriers are
formed through a molecular triplet so that for the
actual separation of the uncoupled electrons, thermal
energy is required. This explains the energy parameter
E as an activation energy having the same value, 0.17 ev, in
both types of cell.

H. J. Zinzer, Z. Naturforsch. 11a, 306 (1956).

The question of the origin of the charge carriers has
been somewhat controversial and much of the controversy
derives from the effects of gases on the photocurrent.
It has been observed by many that the surface
current is profoundly enhanced due to the presence
of adsorbed gases such as O2, SO2, etc. This led Bree
and Lyons to suggest that oxygen serves as a catalyst
in the formation of a triplet level which reacts to form
the biradical intermediate AOO. When the AOO dissociates
into ions, charges are liberated and the positive
holes move through the lattice. This scheme of formation
of charge carriers is, however, not properly equipped
to explain the origin of bulk photocurrent established 
by Compton e/ al. who regard fluorescence and photo- 
conductivity as processes competing with each other 
when an exciton ends its life. This is supported by 
experimental investigations of Zinzer!™® who found that 
prolonged a-ray bombardment reduces the fluorescence 
and the surface photocurrent and by those of Compton 
el al. who observe that neutron bombardment reduces 
both bulk and surface photocurrent as well as fluores- 
cence. Compton eż al. observe that the incorporation of 
tetracene into anthracene lattice also reduces the photo- 
current in the 3800 system of anthracene, though 
Carswell and Lyons find that introduction of 10-* M 
tetracene into anthracene increases the photocurrent, 
especially in the region of the second electronic transi- 
tion (40 000 cm). Many workers!” observed that in- 
corporation of tetracene or acridine into the anthracene 
lattice suppresses or quenches the fluorescence. The 
quantum efficiency of photoconductivity of anthracene 
is 1. The former figure may mean that the range of 
charge carriers is small compared with the electrode 
separation which is suported by the observed lack of 
saturation in the photocurrent. A knowledge of the 
carrier mobility is required to convert detected currents 
into quantum efficiency of charge formation, and it 
remains to be seen whether there is quantum equiva- 
lence between absorption and photoconductivity. 


7. SOME NEW RESULTS 


The contents of Secs. 3, 4, and 5 seem to indicate
that, in most of the aromatic crystals investigated, the
energy of excitation can migrate away from the place of
excitation in the form of excitons, and that if the exciton
is annihilated deep in such a crystal with the emission
of fluorescence radiation, the emitted radiation may be
absorbed in a photon cascade process by the crystal,
which has an intense absorption band in the region of
emission. But these sections do not furnish any detailed
information regarding the terminus of the energy
transport, and it is not known how the final stage of
the transfer is executed in a pure crystal, nor in an
impure one. Polarization properties of crystal absorption
and emission are expected to yield valuable information
in this regard. If the crystal fluorescence is
due to the reverse of photon absorption, fluorescence
should give the same polarization ratio as absorption,
provided both correspond to the same electronic
transition. Experimental observations show that the
polarization ratios of the absorption (3800 A) and the
fluorescence (corresponding to the same electronic
transition) spectra are very different.


The polarization ratio for absorption of light in the c′
direction of an anthracene crystal, when c′ is perpendicular
to the a and b crystallographic axes, has been
measured by Craig and Hobbins to be 1.8:1 and by
Ganguly, Chaudhury, and Mukherjee to be 31.5:1.
The ratio of fluorescence intensity along the same
directions, observed in the c′ direction, has been
determined by Ferguson and Schneider to have the
value 5:1, and the value obtained by Chaudhury, who
corrected for all probable errors, is 4.3:1. Ganguly and
Chaudhury calculated on the basis of an oriented gas
model that the polarization ratio (I_b/I_a) is given by
the square of the cotangent of the angle which the
active molecular axis (projected on the ab face) makes
with the b axis of the crystal. Craig and Hobbins showed
that first-order perturbation methods applied to
the exciton theory give the same result. For an anthracene
crystal it is 7.8:1, which is reduced to 6.5:1 on
making allowance for torsional oscillations. Ferguson
and Schneider argue that if the value 1.8:1 for absorption
is accounted for, as was done by Craig, by assuming
a crystal-induced mixing of the first two excited
states of an anthracene molecule, the value of 4-5:1
for fluorescence, being nearer the exciton theory
prediction, implies that the fluorescence is emitted from a
molecule situated at a lattice dislocation where the
symmetry elements of the lattice are not maintained.
The different values of I_b/I_a for absorption and
fluorescence may be due to the fact that fluorescence is
emitted from a molecule situated at a place where the
principal directions of transition are rotated with regard
to those of crystal absorption. Thus, these facts seem to
support the idea that an exciton in an anthracene crystal
ends its life by being captured by a molecule situated at
a dislocation, from where the fluorescence is emitted.
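
The oriented gas relation quoted above, I_b/I_a = cot²θ, can be explored numerically. The following is a minimal sketch, not taken from the original work; the function names are hypothetical, and θ is assumed to be the angle between the projected active molecular axis and the b axis, as stated in the text.

```python
import math

def polarization_ratio(theta_deg: float) -> float:
    """Oriented gas estimate I_b/I_a = cot^2(theta), theta in degrees."""
    theta = math.radians(theta_deg)
    return 1.0 / math.tan(theta) ** 2

def angle_from_ratio(ratio: float) -> float:
    """Inverse relation: angle (degrees) that would give the ratio I_b/I_a."""
    return math.degrees(math.atan(1.0 / math.sqrt(ratio)))

if __name__ == "__main__":
    # Ratios quoted in the text for theory, fluorescence, and absorption.
    for ratio in (7.8, 6.5, 4.3, 1.8):
        print(f"I_b/I_a = {ratio}: theta = {angle_from_ratio(ratio):.1f} deg")
```

Read this way, each quoted ratio corresponds to an effective orientation of the emitting or absorbing transition relative to the b axis, which is the sense in which a rotated transition direction at a dislocation would change the observed ratio.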

When naphthacene molecules are present in an anthracene
crystal that is excited by ultraviolet radiation, the
naphthacene molecules may fluoresce by any one of the
processes described in Sec. 1. If the photon cascade
process is responsible, the decay time for fluorescence of
anthracene would be different from the decay time of
fluorescence of naphthacene in a crystal containing
naphthacene as impurity. Unfortunately, there are hardly
any useful data on decay times of fluorescence in mixed
crystals, except those of Bose, which are not free from
ambiguity. In this laboratory we have measured the
optical densities of an anthracene crystal containing
naphthacene molecules and calculated the effect of
naphthacene absorption on anthracene fluorescence,
according to the formula


$$\frac{f_t}{f_0} = \frac{\mu_1 t}{(\mu_1+\mu_2)t}\,\frac{1-\exp[-(\mu_1+\mu_2)t]}{1-\exp(-\mu_1 t)},$$

where f_t is the backward fluorescence intensity of
anthracene if there is absorption, f_0 is the same for no
absorption; μ1 t is the optical density of this crystal at
the exciting wavelength λ1, μ2 t that at the emitted
wavelength λ2. In our measurements, t=9.9X10% cm,
μ1 t ~ 10 at λ1=3650 A; μ2 t ~ 1 at λ2=4898 A, where the
absorption intensity of naphthacene molecules is the
greatest in the fluorescence region of anthracene, and
f_t/f_0=1. Thus absorption cannot be held responsible
for transfer of energy from anthracene to naphthacene.

Ferguson and W. G. Schneider, J. Chem. Phys. 25, 780
118 N. K. Chaudhury, Z. Physik 151, 93-105 (1958).

It has also been observed in this laboratory that
when naphthacene molecules are incorporated within
the anthracene lattice, not only do the absorption and
fluorescence spectra of naphthacene molecules show
polarization, but also the polarization ratio of the
absorption spectra of these molecules remains unaltered
from that of the pure anthracene crystal. This means that
the naphthacene molecules are embedded in the anthracene
lattice as substitutes for anthracene molecules, and also
that they are coupled to the anthracene lattice in the
same way as anthracene molecules themselves. Thus it
may be inferred that if the fluorescence of anthracene oc- 
curs at molecules situated in imperfect regions of crystal 
lattice, so it is for naphthacene fluorescence. The trans- 
fer of excitation energy from anthracene to naphthacene 
molecules is half complete at a concentration of
c=1.8×10⁻⁵ mole naphthacene per mole anthracene. If
the naphthacene molecules occur at lattice dislocations
at the same concentration as anthracene molecules,
which is quite reasonable in view of the foregoing facts,
the probability that a naphthacene molecule receives
the excitation energy, and emits it in the form of
fluorescence, should be at least 5×10⁴ times as great as
that of an anthracene molecule.
When the intensity ratio of the polarized fluorescence 
band at 4450A of an anthracene crystal along the 
b axis to that along the a axis is plotted against different 
concentrations of naphthacene in the crystal, it is 
observed that the ratio at first gradually increases, 
attains a maximum value, then decreases with increase 
of concentration of naphthacene. This increase of
intensity ratio with increasing concentration of
naphthacene in the low-concentration region excludes the
absorption process. But the fall of the intensity ratio of
fluorescence of anthracene in the high-concentration region
of naphthacene is difficult to explain if the absorption
process is excluded altogether.


8. SUMMARY


Though our knowledge of energy transfer in aromatic
crystals has increased considerably, it has not reached
the state of maturity required for a definite formulation
of a general rule. Many aspects of the problem remain
untouched. Except for Bowen et al. and Birks
Because a quantum mechanical resonance transfer
process is independent of crystal size and shape, the
importance of such measurements in distinguishing the
photon cascade from resonance transfer cannot be
overemphasized. More and more careful measurements
would be worthwhile. There exist few data on the effi- 
ciency of energy transfer from a host molecule to an 
impurity as a function of the exciting wavelength. It is
also not known whether naphthacene fluoresces with
the same efficiency by transferred excitation energy as
by direct absorption, though in a recent paper Simpson
has shown that the intensity of naphthacene fluorescence
excited by absorption of fluorescence of anthracene
is smaller than that under exciton excitation.
Because of complete quenching of the fluorescence of 
anthracene, it cannot be concluded whether the fluores- 
cence efficiency is or is not the same in both the cases. 
Careful study of the results shows that the agreement
between the predictions of the exciton theory of the
molecular organic crystal and the observed spectroscopic
data is only over-all and not in detail. Again,
cascade theory cannot explain many experimental
results, though it gives satisfactory solutions to the
problem of the mechanism of energy transfer in the pure
crystal. This shows that either data of greater precision
are needed or some modification of the theories must be
introduced.
Measurements of absorption coefficients have not been
made in many of these crystals, pure or impure, with
the result that values of the absorption coefficients in
solutions are quoted in support of a theory in solids.
The importance of temperature in emission and absorption
cannot be denied, but it has been almost completely
ignored. More data are required on photoconductivity.
Results obtained by different workers on the same
crystals not only differ but are sometimes contradictory;
how the mobility is influenced by various factors needs to
be known for a clearer understanding of the subject.
It is expected that the study of temperature effect on 
the efficiency of fluorescence, coefficients of absorption, 
energy transfer, and polarization will reveal facts of far- 
reaching consequences. We have undertaken some of 
the studies in this laboratory. 


Note added in proof.—After the review had been sent for
publication, some new works having some bearing on the subject
matter of the review appeared. They are briefly described below.

Electronic states of aromatic crystals (Sec. 3).—The near
ultraviolet polarized absorption spectra of some aromatic crystals
have been investigated: of pyrene by Ferguson at room temperature
and at 77°K, of 1,3,5-trichlorobenzene by Schnepp at 42°K, of
hexachlorobenzene by Schnepp and Kopelman at liquid nitrogen
and liquid helium temperature, and of durene by Schnepp and
McClure at 20°K. Whereas in the case of pyrene Davydov
splitting could be detected as a separation of 18 cm⁻¹ between the
a- and b-polarized components, the a component being of higher
energy, in 1,3,5-trichlorobenzene, hexachlorobenzene, and durene
crystals no such splitting could be detected. If present, the
splitting must not be greater than 5 cm⁻¹ in the trichlorobenzene
and hexachlorobenzene and 3 cm⁻¹ in the durene crystal. In the
case of the hexachlorobenzene crystal, the energy of separation
between the two K=0 levels in the exciton band has been
calculated to be 0.7 cm⁻¹.

Thus it appears that at least one of the low-lying excited
electronic states of the pyrene crystal has the nature of exciton
states, though in the case of the 1,3,5-trichlorobenzene,
hexachlorobenzene, and durene crystals any such conclusion would
be too hasty in view of the paucity of experimental data.

Photoconductivity of organic crystals (Sec. 6).—In a recent
paper¹²⁵ Northrop and Simpson have published values of
photocurrents of pure hydrocarbon crystals. They observed that
impurity molecules embedded in pure crystals which quench the
fluorescence of the pure substance reduce the photocurrent in the
same ratio. Their other observations regarding the dependence of
photocurrent on light intensity, applied voltage, etc., are in
agreement with previous publications. From these observations they
propose that the interaction of two excitons is required to produce
a single ionized molecule. Thus the production of charge carriers
is explained. Compton et al.¹²⁶ studied the photocurrent of
anthracene crystals before and after neutron bombardment and
concluded that conductivity was greatly reduced on bombardment.
They further observed that the dependence of photocurrent on
wavelength and intensity of incident light remained unaltered after
bombardment. Before bombardment the photocurrent was markedly
non-ohmic; after bombardment it was ohmic up to a field of 25 000
volts cm⁻¹.
change of photocurrent when polarity of electrode is reversed,
but this asymmetry disappears after irradiation. Kommandeur and
Schneider studied the photoconductivity in greater detail with
very pure specimens of anthracene crystals and obtained results
very different from previous ones. They observed that the maximum
value of photocurrent corresponds to the minima of the
absorption spectra. They also observed that the intensity
dependence of photocurrent changes with wavelength, field direction,
and even with the magnitude of the applied field. These authors
finally concluded that the spectral response and the voltage and
intensity dependence of photocurrent depend on the source and
treatment of the crystals used, i.e., on the density of
imperfections of the crystals.

125 D. C. Northrop and O. Simpson, Proc. Roy. Soc. (London)
A244, 377 (1958).

126 Compton, Schneider, and Waddington, J. Chem. Phys. 28,
741 (1958).
1. INTRODUCTION


This paper reviews and extends the theory of
irreversible thermodynamics. The irreversible
behavior of a system driven by externally applied
forces has been studied extensively, but attention has
been focused primarily on the first-order term in the
driven response. Here we also consider the higher-order
terms in the driven response and the random fluctuations,
or noise, occurring during an irreversible process.
In addition to the well-known relations between the
linear response and the equilibrium fluctuations, several
new relations are proved involving the nonlinear
response, the driven noise, and the equilibrium
fluctuations.
The method of analysis is statistical mechanical and
general, neither assuming a specific model nor postulating
Markoffian behavior. The purposes of the analysis
are thermodynamic; that is, to investigate interrelationships
among macroscopically observable characteristics
of systems undergoing irreversible processes. In this
sense the aim should be clearly differentiated from those
other approaches which might be characterized as
kinetic or statistical mechanical rather than thermodynamic.

The most direct approach to the problem of irrever- 
sibility is the kinetic approach, in which a specific 
model is immediately introduced. The essential features 
of the model may be expressed in terms of molecular 
collision probabilities, giving rise to the Boltzmann 
equation, or to some similarly detailed kinetic equation. 
This is the standard method of “transport theory,” and
it is the method characteristic of the theory of the solid
state.

A considerably more general approach is one which
we term the irreversible statistical mechanical approach.
The purpose there is to develop a general formalism,
analogous to the partition sum algorithm of equilibrium
statistical mechanics, which would provide a systematic
recipe for the calculation of any macroscopically
observable characteristic of a system undergoing an
irreversible process. No specific model is invoked; the aim
is rather to provide a general formalism into which any
particular model could be substituted to obtain explicit
results. The irreversible statistical mechanical approach
has not been completely successful as yet, but one type
of partial result has been exploited widely. In this type
of result the driven response of a system is obtained as
a perturbation expansion in the applied forces. The
various order response terms are typically expectation
values of (multiple) commutators, taken with respect
to the equilibrium system. It is, of course, hoped that a
general algorithm for the computation of equilibrium
commutator forms then will be developed to complete
the general formalism.

The third approach is the thermodynamic approach 
which we adopt here. Although the statistical mechan- 
ical formalism is used to describe the motion, our 
purpose is not to compute either the response functions 
or the value of any quantities characterizing the 
equilibrium system. Our purpose is rather to explore 
the general interrelationships among different types of 
response functions and the equilibrium fluctuations, 
insisting, however, that the quantities so related each 
be macroscopically observable. Thus, for example, the 
equilibrium commutator forms in terms of which sta- 
tistical mechanics expresses various response functions 
are not true observables of the equilibrium system. In 
order to give thermodynamic significance to statistical
mechanical results, it is therefore necessary to re-express
such quantities in terms of macroscopically observable
symmetrized equilibrium forms, or anticommutators.
Three general classes of irreversible thermodynamic
results have previously been obtained for first-order
processes: (a) relationships between off-diagonal elements
of the admittance: these are the Onsager reciprocity,¹
and its extension to non-Markoffian systems²;
(b) the relationship between the first-order response and
the second correlation moments of the equilibrium
fluctuations: this is the so-called fluctuation-dissipation
theorem; (c) the relationship between the path
distribution function for a driven system and the
equilibrium fluctuations.

The extensions of the theory which are developed
here, and the general structure of irreversible
thermodynamics, are summarized in the diagram in Fig. 1.
The quantities appearing at the vertices of the diagrams
denote the macroscopic observables, while the connecting
lines indicate the existence of thermodynamic relations.
The arrowheads refer to the direction in
which specific relationships are developed in this paper,
the numbers along the lines indicating the sections in
which the various proofs appear.

The nature of the observables of interest can be made
clear by the following considerations. Let $\langle q(t)\rangle$ denote
the expectation value of the variable q at time t in a
system driven by externally applied forces; that is, the
driven response of the variable q. Further, let $\langle q(t)\rangle$ be
expanded in powers of the applied forces. Then we label
the first-order term in the response $\langle q(t)\rangle^{(1)}$, the
second-order term $\langle q(t)\rangle^{(2)}$, etc. The zeroth-order term
$\langle q(t)\rangle^{(0)} = \langle q\rangle^{(0)}$ is simply the average value of q in
the equilibrium system. We similarly obtain a more detailed
description of the driven system by introducing the
expectation value $\langle q(t_1)q(t_2)\rangle$ of the product of the
variable q at time $t_1$ with its value at time $t_2$; this is the
second correlation moment of the random fluctuations,
or noise, in the driven system. Again it is possible to
expand $\langle q(t_1)q(t_2)\rangle$ in powers of the applied forces. We
label the first-order term in the driven noise
$\langle q(t_1)q(t_2)\rangle^{(1)}$, the second-order term
$\langle q(t_1)q(t_2)\rangle^{(2)}$, and so on. The zeroth-order term
$\langle q(t_1)q(t_2)\rangle^{(0)} = \langle q\,q(t_2-t_1)\rangle^{(0)}$ characterizes
the spontaneous fluctuations in the equilibrium
system. Similarly, it is possible to consider third- and
higher-order driven correlation moments.

There exists a definite hierarchy of irreversible
thermodynamic relationships. The left-hand diagram
represents the fluctuation-dissipation theorem, between the
first-order response $\langle q(t)\rangle^{(1)}$ and the equilibrium second
moment $\langle q\,q(t)\rangle^{(0)}$. The middle diagram indicates the
triplet of relationships which exists among the second-order
response $\langle q(t)\rangle^{(2)}$, the first-order term
$\langle q(t_1)q(t_2)\rangle^{(1)}$ in the driven second moment (noise),
and the third moment $\langle q\,q(t_1)q(t_2)\rangle^{(0)}$ of the
spontaneous equilibrium fluctuations. The right-hand diagram
indicates the cycle of interrelationships which may be presumed
to exist among the next appropriate group of observables,
although we do not consider this case explicitly.

In Secs. 2 to 4 the general statistical mechanical
description of the time-evolution of a driven system is
briefly reviewed. Sections 5 to 8 are devoted to a
review of the existing first-order theory of irreversible
thermodynamics. In Secs. 9 to 14 the general theory of
irreversible thermodynamics is extended to the second-order
response and the driven noise. Sections 13 and 14
are devoted to the irreversible thermodynamics of
step-driven processes. In Secs. 15 and 16 the question
of path distribution functions is considered, which may
be regarded as the fundamental quantities of irreversible
thermodynamics in the sense that all macroscopic
quantities are derivable therefrom.


2. THE TIME EVOLUTION OF DRIVEN OPERATORS


We consider a system, of which the unperturbed
Hamiltonian is $H^{(0)}$, in interaction with a number of
external driving systems or signal generators. The
Hamiltonian of the composite system may typically be
represented by

$$H = H^{(0)} + \sum_i F_i Q_i + H_{sg}, \tag{1}$$

where $Q_i = Q_i(q,p)$ is a function of the coordinates q
and momenta p of the system of interest, $F_i = F_i(q',p')$
is a function of the coordinates q′ and momenta p′ of
the ith signal generator, and $H_{sg}$ denotes the
Hamiltonians of the signal generators.

Whereas the dissipative system possesses a large 
number of degrees of freedom and a quasi-continuous 
spectrum of energy eigenvalues, the signal generators 
have relatively few degrees of freedom and an extremely 
high degree of excitation. Thus, the coordinates of the 
signal generators, and consequently the $F_i(q',p')$, are
essentially classical functions of the time. Ignoring the
term $H_{sg}$ as being irrelevant to the system of interest,
the perturbed Hamiltonian assumes the form

$$H = H^{(0)} + \sum_i F_i(t)\,Q_i. \tag{2}$$

We adopt the interpretation that the Hermitian
operators $Q_i(q,p)$ correspond to thermodynamic extensive
parameters of the system of interest, while the
$F_i(t)$ represent the conjugate intensive parameters
imposed upon the system by the various signal generators.
Thus, $Q_i(q,p)$ might be the operator corresponding
to the position of a movable piston (volume), the total
number of particles, or the magnetic moment of the
system. The respective imposed intensive parameter
would then be the pressure, the electrochemical potential,
or the applied magnetic field.

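If the system is in equilibrium, with the temperature
T, before the forces are applied, the expectation value
of an operator $Q_i$ at time t is

$$\langle Q_i(t)\rangle = \mathrm{Tr}\,\rho^{(0)} Q_i(t), \tag{3}$$

where $\rho^{(0)}$ is the initial (unperturbed) canonical density
operator

$$\rho^{(0)} = \exp[-\beta H^{(0)}]/\mathrm{Tr}\exp[-\beta H^{(0)}], \qquad \beta = 1/kT, \tag{4}$$

and where the Heisenberg operator $Q_i(t)$ is defined by

$$Q_i(t) = U^{\dagger}(t)\,Q_i\,U(t). \tag{5}$$

The unitary time-evolution operator U(t) satisfies the
Schrödinger equation

$$\dot U(t) = \frac{1}{i\hbar}\Big[H^{(0)} + \sum_i F_i(t)\,Q_i\Big]U(t). \tag{6}$$

The differential equation (6) is equivalent to the
integral equation

$$U(t) = \exp\Big[-\frac{iH^{(0)}t}{\hbar}\Big]\Big\{1 + \frac{1}{i\hbar}\sum_i\int_{-\infty}^{t}dt_1\,F_i(t_1)\,Q_i^{(0)}(t_1)\exp\Big[\frac{iH^{(0)}t_1}{\hbar}\Big]U(t_1)\Big\}, \tag{7}$$

the correctness of which can be verified by differentiation.
$Q_i^{(0)}(t)$ represents the unperturbed Heisenberg
operator

$$Q_i^{(0)}(t) = \exp\Big[\frac{iH^{(0)}t}{\hbar}\Big]\,Q_i\,\exp\Big[-\frac{iH^{(0)}t}{\hbar}\Big]. \tag{8}$$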
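
The verification by differentiation mentioned for Eq. (7) can be written out explicitly. The following is a sketch only; the symbols $U_0$ and $V$ are shorthand introduced here for illustration, and the sign conventions are assumed to be those of a Hamiltonian $H = H^{(0)} + \sum_i F_i(t)Q_i$.

```latex
% Shorthand (assumed): U_0(t) = \exp[-iH^{(0)}t/\hbar],  U(t) = U_0(t)\,V(t).
% The integral equation then reads
V(t) = 1 + \frac{1}{i\hbar}\sum_i \int_{-\infty}^{t} dt_1\, F_i(t_1)\, Q_i^{(0)}(t_1)\, V(t_1),
\qquad Q_i^{(0)}(t) = U_0^{\dagger}(t)\, Q_i\, U_0(t).
% Differentiating with respect to the upper limit,
\dot V(t) = \frac{1}{i\hbar}\sum_i F_i(t)\, Q_i^{(0)}(t)\, V(t),
% and therefore
\dot U = \dot U_0 V + U_0 \dot V
       = \frac{1}{i\hbar} H^{(0)} U
         + \frac{1}{i\hbar}\sum_i F_i(t)\, U_0 Q_i^{(0)}(t) U_0^{\dagger}\, U
       = \frac{1}{i\hbar}\Big[ H^{(0)} + \sum_i F_i(t)\, Q_i \Big] U .
```

The last line is the differential equation for U(t), and the boundary condition $V(-\infty)=1$ expresses that the system is unperturbed in the distant past.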


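The iterative solution of Eq. (7) is

$$U(t) = \exp\Big[-\frac{iH^{(0)}t}{\hbar}\Big]\sum_{n=0}^{\infty}\Big(\frac{1}{i\hbar}\Big)^{n}\sum_{ij\cdots k}\int_{-\infty}^{t}dt_1\int_{-\infty}^{t_1}dt_2\cdots\int_{-\infty}^{t_{n-1}}dt_n\,F_i(t_1)F_j(t_2)\cdots F_k(t_n)\,Q_i^{(0)}(t_1)Q_j^{(0)}(t_2)\cdots Q_k^{(0)}(t_n). \tag{9}$$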


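The perturbation expansion of $Q_i(t)$ follows from
Eqs. (5) and (9). The zeroth-order term is simply the
unperturbed operator $Q_i^{(0)}(t)$. The first-order term
$Q_i^{(1)}(t)$ is

$$Q_i^{(1)}(t) = \Big(-\frac{1}{i\hbar}\Big)\sum_j\int_{-\infty}^{t}dt_1\,F_j(t_1)\,[Q_j^{(0)}(t_1),\,Q_i^{(0)}(t)]. \tag{10}$$

The second-order term $Q_i^{(2)}(t)$ is

$$Q_i^{(2)}(t) = \Big(-\frac{1}{i\hbar}\Big)^{2}\sum_{jk}\Big\{\int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t_1}dt_2F_k(t_2)\,Q_k^{(0)}(t_2)Q_j^{(0)}(t_1)Q_i^{(0)}(t) - \int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t}dt_2F_k(t_2)\,Q_j^{(0)}(t_1)Q_i^{(0)}(t)Q_k^{(0)}(t_2) + \int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t_1}dt_2F_k(t_2)\,Q_i^{(0)}(t)Q_j^{(0)}(t_1)Q_k^{(0)}(t_2)\Big\}. \tag{11}$$

The second term may be rewritten as

$$\begin{aligned}
\sum_{jk}\int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t}dt_2F_k(t_2)\,Q_j^{(0)}(t_1)Q_i^{(0)}(t)Q_k^{(0)}(t_2)
&= \sum_{jk}\Big\{\int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t_1}dt_2F_k(t_2)\,Q_j^{(0)}(t_1)Q_i^{(0)}(t)Q_k^{(0)}(t_2) \\
&\qquad + \int_{-\infty}^{t}dt_1F_j(t_1)\int_{t_1}^{t}dt_2F_k(t_2)\,Q_j^{(0)}(t_1)Q_i^{(0)}(t)Q_k^{(0)}(t_2)\Big\} \\
&= \sum_{jk}\Big\{\int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t_1}dt_2F_k(t_2)\,Q_j^{(0)}(t_1)Q_i^{(0)}(t)Q_k^{(0)}(t_2) \\
&\qquad + \int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t_1}dt_2F_k(t_2)\,Q_k^{(0)}(t_2)Q_i^{(0)}(t)Q_j^{(0)}(t_1)\Big\}.
\end{aligned} \tag{12}$$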

In the last step we have inverted the order of integration
in the second integral and interchanged the
dummy indices j, k and the dummy times $t_1$, $t_2$. Using
this result, Eq. (11) can be written in the form

$$Q_i^{(2)}(t) = \Big(-\frac{1}{i\hbar}\Big)^{2}\sum_{jk}\int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t_1}dt_2F_k(t_2)\,[Q_k^{(0)}(t_2),[Q_j^{(0)}(t_1),Q_i^{(0)}(t)]]. \tag{13}$$

Examination of Eqs. (10) and (13) clearly indicates
the general form of the nth-order term $Q_i^{(n)}(t)$ in the
driven Heisenberg operator $Q_i(t)$. Thus, for example,
the third-order term is

$$Q_i^{(3)}(t) = \Big(-\frac{1}{i\hbar}\Big)^{3}\sum_{jkl}\int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t_1}dt_2F_k(t_2)\int_{-\infty}^{t_2}dt_3F_l(t_3)\,[Q_l^{(0)}(t_3),[Q_k^{(0)}(t_2),[Q_j^{(0)}(t_1),Q_i^{(0)}(t)]]]. \tag{14}$$


This section follows the perturbation formulation
given by R. Kubo. However, because we are later
concerned with Q operators which are intrinsically
time-dependent, we choose to examine the time evolution
of $Q_i(t)$ rather than of $\rho(t)$.
3. THE MACROSCOPIC RESPONSE AND DRIVEN
CORRELATION MOMENTS


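The first-order term in the driven response $\langle Q_i(t)\rangle$ of
the thermodynamic variable corresponding to the
operator $Q_i$ is the equilibrium expectation value of the
first-order operator term, so that

$$\langle Q_i(t)\rangle^{(1)} = \mathrm{Tr}\,\rho^{(0)}Q_i^{(1)}(t) = \Big(-\frac{1}{i\hbar}\Big)\sum_j\int_{-\infty}^{t}dt_1\,F_j(t_1)\,\langle[Q_j^{(0)}(t_1),Q_i^{(0)}(t)]\rangle^{(0)}, \tag{15}$$

where the bracket $\langle\ \rangle^{(0)}$ denotes an expectation value
with respect to the equilibrium ensemble.
Similarly, the second-order term in the response is

$$\langle Q_i(t)\rangle^{(2)} = \Big(-\frac{1}{i\hbar}\Big)^{2}\sum_{jk}\int_{-\infty}^{t}dt_1F_j(t_1)\int_{-\infty}^{t_1}dt_2F_k(t_2)\,\langle[Q_k^{(0)}(t_2),[Q_j^{(0)}(t_1),Q_i^{(0)}(t)]]\rangle^{(0)}. \tag{16}$$

The form of the higher-order terms is clear from Eqs.
(15) and (16).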

The spontaneous fluctuations in an equilibrium
ensemble are characterized by the second correlation
moments in time between each pair of variables, although
all higher moments are required as well for a
complete description. The second equilibrium correlation
moment is

$$\langle\tfrac12[Q_i^{(0)}(t),Q_j^{(0)}(t+\tau)]_+\rangle^{(0)} = \mathrm{Tr}\,\rho^{(0)}\,\tfrac12[Q_i^{(0)}(t),Q_j^{(0)}(t+\tau)]_+, \tag{17}$$

where the bracket $[\ ,\ ]_+$ denotes a symmetrized
operator product, or anticommutator.

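A driven ensemble also exhibits fluctuations about
its average motion, which in general differ from the
equilibrium fluctuations. The fluctuations in a driven
ensemble are characterized by

$$\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle = \mathrm{Tr}\,\rho^{(0)}\,\tfrac12[Q_i(t),Q_j(t+\tau)]_+. \tag{18}$$

Using the result (10) for $Q_i^{(1)}(t)$, the first-order term
$\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ in the driven second moment
(18) becomes

$$\begin{aligned}
\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}
&= \langle\tfrac12[Q_i^{(0)}(t),Q_j^{(1)}(t+\tau)]_+\rangle^{(0)} + \langle\tfrac12[Q_i^{(1)}(t),Q_j^{(0)}(t+\tau)]_+\rangle^{(0)} \\
&= \Big(-\frac{1}{i\hbar}\Big)\sum_k\Big\{\int_{-\infty}^{t+\tau}dt_1F_k(t_1)\,\langle\tfrac12[Q_i^{(0)}(t),[Q_k^{(0)}(t_1),Q_j^{(0)}(t+\tau)]]_+\rangle^{(0)} \\
&\qquad + \int_{-\infty}^{t}dt_1F_k(t_1)\,\langle\tfrac12[[Q_k^{(0)}(t_1),Q_i^{(0)}(t)],Q_j^{(0)}(t+\tau)]_+\rangle^{(0)}\Big\}.
\end{aligned} \tag{19}$$

Upon decomposition of the first integral according to
$\int_{-\infty}^{t+\tau} = \int_{-\infty}^{t} + \int_{t}^{t+\tau}$,
Eq. (19) becomes

$$\begin{aligned}
\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}
&= \Big(-\frac{1}{i\hbar}\Big)\sum_k\Big\{\int_{-\infty}^{t}dt_1F_k(t_1)\,\langle[Q_k^{(0)}(t_1),\tfrac12[Q_i^{(0)}(t),Q_j^{(0)}(t+\tau)]_+]\rangle^{(0)} \\
&\qquad + \int_{t}^{t+\tau}dt_1F_k(t_1)\,\langle\tfrac12[Q_i^{(0)}(t),[Q_k^{(0)}(t_1),Q_j^{(0)}(t+\tau)]]_+\rangle^{(0)}\Big\}.
\end{aligned} \tag{20}$$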
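
The regrouping of terms that leads to Eq. (20) rests on the fact that a commutator acts on a product like a derivative. As a short worked check (the letters A, B, C are shorthand introduced here only for illustration, standing for $Q_k^{(0)}(t_1)$, $Q_i^{(0)}(t)$, and $Q_j^{(0)}(t+\tau)$, respectively):

```latex
[A, BC] = [A,B]\,C + B\,[A,C], \qquad
[A, CB] = [A,C]\,B + C\,[A,B],
% adding the two lines and dividing by two,
[A, \tfrac12 [B,C]_+ ]
  = \tfrac12 \big[\,[A,B],\, C\,\big]_+ \;+\; \tfrac12 \big[\,B,\, [A,C]\,\big]_+ .
```

For force times earlier than t both operators of the pair are affected, so the two anticommutator terms combine into a single commutator with the symmetrized product; for force times between t and t+τ only the later operator is affected, which leaves the second kind of term standing alone.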


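The higher-order terms in the perturbation expansion
of the driven second moment can be written in analogous
fashion, although the expressions involved become
rapidly more complicated. The second-order term is

$$\begin{aligned}
\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(2)}
&= \Big(-\frac{1}{i\hbar}\Big)^{2}\sum_{kl}\Big\{\int_{-\infty}^{t}dt_1F_k(t_1)\int_{-\infty}^{t_1}dt_2F_l(t_2)\,\langle[Q_l^{(0)}(t_2),[Q_k^{(0)}(t_1),\tfrac12[Q_i^{(0)}(t),Q_j^{(0)}(t+\tau)]_+]]\rangle^{(0)} \\
&\qquad + \int_{t}^{t+\tau}dt_1F_k(t_1)\int_{-\infty}^{t}dt_2F_l(t_2)\,\langle[Q_l^{(0)}(t_2),\tfrac12[Q_i^{(0)}(t),[Q_k^{(0)}(t_1),Q_j^{(0)}(t+\tau)]]_+]\rangle^{(0)} \\
&\qquad + \int_{t}^{t+\tau}dt_1F_k(t_1)\int_{t}^{t_1}dt_2F_l(t_2)\,\langle\tfrac12[Q_i^{(0)}(t),[Q_l^{(0)}(t_2),[Q_k^{(0)}(t_1),Q_j^{(0)}(t+\tau)]]]_+\rangle^{(0)}\Big\}.
\end{aligned} \tag{21}$$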


The nth-order term in the driven response can be
written directly in terms of the (n−1)st-order term in
the driven commutator $\langle[Q_j(t_1),Q_i(t)]\rangle^{(n-1)}$.

$$\langle Q_i(t)\rangle^{(n)} = \frac{1}{n}\Big(-\frac{1}{i\hbar}\Big)\sum_j\int_{-\infty}^{t}dt_1\,F_j(t_1)\,\langle[Q_j(t_1),Q_i(t)]\rangle^{(n-1)}. \tag{22}$$

For n=1 this expression is identical to Eq. (15). For
n=2 it can be obtained directly from Eq. (16) and the
commutator analog of Eq. (19). For any order it is a
direct consequence of the iterative nature of the
perturbation expansion, and furnishes a clear picture of
the essential structure of the motion.


4. STEP-DRIVEN PROCESSES 


The preceding two sections were concerned with the
motion of a driven system for which the forces are
arbitrary functions of the time. In order to illustrate
the formalism in a simple manner, we also consider
step-driven processes in particular.

A step-driven process is defined as one for which the
generalized forces in the distant past have increased
slowly from zero to some constant value. This constant
force remains applied until t=0, at which time it is
suddenly removed and the system is allowed to relax
into its equilibrium configuration. For a step-driven
process the first- and second-order responses reduce to

$$\langle Q_i(t)\rangle^{(1)} = -\frac{1}{i\hbar}\sum_j F_j\int_{-\infty}^{0}dt_1\,\langle[Q_j^{(0)}(t_1),Q_i^{(0)}(t)]\rangle^{(0)}, \tag{23}$$

$$\langle Q_i(t)\rangle^{(2)} = \Big(-\frac{1}{i\hbar}\Big)^{2}\sum_{jk}F_jF_k\int_{-\infty}^{0}dt_1\int_{-\infty}^{t_1}dt_2\,\langle[Q_k^{(0)}(t_2),[Q_j^{(0)}(t_1),Q_i^{(0)}(t)]]\rangle^{(0)}. \tag{24}$$


When the indicated time integrations are performed
in Eqs. (23) and (24), the contributions from the infinite
time limits pose certain difficulties, which are related to
the approach to equilibrium. This matter is examined
in Appendix A. It is also possible to approach the
question of step-driven processes from the following
alternate point of view, which circumvents these
difficulties. We expect the applied forces $F_j$ to bring the
system to a new equilibrium configuration at t=0,
characterized by a density operator $\rho(0)$ having the
generalized canonical form appropriate to an ensemble
in contact with a set of reservoirs with constant
intensive parameters $F_j$.

$$\rho(0) = \exp\{-\beta[H^{(0)}+\textstyle\sum_j F_jQ_j]\}\big/\mathrm{Tr}\exp\{-\beta[H^{(0)}+\textstyle\sum_j F_jQ_j]\}. \tag{25}$$


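At t=0 the interaction with the external systems is
removed. Since for t>0 the Hamiltonian is simply
$H^{(0)}$, the response $\langle Q_i(t)\rangle$ during a step-driven process is
given by

$$\langle Q_i(t)\rangle = \mathrm{Tr}\,\rho(0)\,Q_i^{(0)}(t) = \mathrm{Tr}\exp\{-\beta[H^{(0)}+\textstyle\sum_j F_jQ_j]\}\,Q_i^{(0)}(t)\big/\mathrm{Tr}\exp\{-\beta[H^{(0)}+\textstyle\sum_j F_jQ_j]\}. \tag{26}$$

In order to expand Eq. (26), we first perform the
well-known expansion of the operator

$$A(\beta) = \exp\{-\beta[H^{(0)}+\epsilon H^{(1)}]\},$$

where $\epsilon H^{(1)}$ denotes the perturbation Hamiltonian. $A(\beta)$
satisfies the integral equation

$$A(\beta) = \exp[-\beta H^{(0)}] - \epsilon\int_0^{\beta}d\lambda\,\exp[-(\beta-\lambda)H^{(0)}]\,H^{(1)}A(\lambda), \tag{27}$$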
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the iterative solution of which is A 


= DY 


o 8 
-A(6)=expl—BH] E (~9” m dH (— ih) 
n=0 0 


AL 
xf dio H O (—ih)o) 
i 0 


An-1 


x f AnH O(— iha) (28) 
0 


where H® (—7h)\y)=exp—i4 JH™ exp[—A H]. 

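Replacing $\epsilon H^{(1)}$ by the perturbation Hamiltonian
$\sum_j F_jQ_j$ in Eq. (28) yields the quantity appearing in the
numerator of Eq. (26).

$$\exp\{-\beta[H^{(0)} + \sum_j F_jQ_j]\} = \exp[-\beta H^{(0)}]\sum_{n=0}^{\infty}(-1)^n \sum_{jk\cdots l} F_jF_k\cdots F_l \int_0^{\beta} d\lambda_1\, Q_j^{(0)}(-i\hbar\lambda_1) \int_0^{\lambda_1} d\lambda_2\, Q_k^{(0)}(-i\hbar\lambda_2)\cdots\int_0^{\lambda_{n-1}} d\lambda_n\, Q_l^{(0)}(-i\hbar\lambda_n). \tag{29}$$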
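As a numerical illustration of the expansion (28), the following sketch compares the exact operator $\exp\{-\beta[H^{(0)}+\epsilon H^{(1)}]\}$ with the series truncated after the $n=1$ term. The 3x3 matrices, the value of $\beta$, and the helper names are illustrative assumptions, not quantities from the text. If the truncation is correct, the residual should scale as $\epsilon^2$.

```python
import numpy as np

def expm_hermitian(M, s):
    """Return exp(s*M) for a Hermitian matrix M via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(s * w)) @ V.conj().T

def first_order_boltzmann(H0, H1, beta, eps, n_nodes=24):
    """exp(-beta*H0) [1 - eps * int_0^beta dl exp(l*H0) H1 exp(-l*H0)]."""
    x, wts = np.polynomial.legendre.leggauss(n_nodes)
    lam = 0.5 * beta * (x + 1.0)        # map nodes from [-1, 1] to [0, beta]
    wts = 0.5 * beta * wts
    integral = sum(w * expm_hermitian(H0, l) @ H1 @ expm_hermitian(H0, -l)
                   for w, l in zip(wts, lam))
    return expm_hermitian(H0, -beta) @ (np.eye(len(H0)) - eps * integral)

# Hypothetical test matrices (assumptions for illustration only).
H0 = np.diag([0.0, 1.0, 2.5])
H1 = np.array([[0.0, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.0]])
beta = 1.0

def truncation_error(eps):
    """Norm of (exact operator) - (series through first order in eps)."""
    exact = expm_hermitian(H0 + eps * H1, -beta)
    return np.linalg.norm(exact - first_order_boltzmann(H0, H1, beta, eps))

err_small, err_large = truncation_error(1e-3), truncation_error(1e-2)
print(err_small, err_large / err_small)   # ratio near 100 indicates O(eps^2)
```

Increasing $\epsilon$ tenfold should raise the residual roughly a hundredfold, which is the signature of a neglected second-order term.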


The expansion of the denominator of Eq. (26) can be simplified by
employing a technique, due to Nakajima, which reduces all multiple
temperature integrals by one order. Consider the trace of the operator
$A(\beta)$ defined in the foregoing. Differentiating
$\mathrm{Tr}\,A(\beta) = \mathrm{Tr}\,\exp\{-\beta[H^{(0)}+\epsilon H^{(1)}]\}$
with respect to $\epsilon$, we obtain

$$\frac{\partial}{\partial\epsilon}\,\mathrm{Tr}\,A(\beta) = -\beta\,\mathrm{Tr}\,\exp\{-\beta[H^{(0)}+\epsilon H^{(1)}]\}\,H^{(1)}. \tag{30}$$

We now expand the quantity $\exp\{-\beta[H^{(0)}+\epsilon H^{(1)}]\}$
according to Eq. (27), integrate this expression with respect to
$\epsilon$, and substitute $\sum_j F_jQ_j$ for $\epsilon H^{(1)}$ to
obtain the denominator of Eq. (25).


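$$\mathrm{Tr}\,\exp\{-\beta[H^{(0)} + \sum_j F_jQ_j]\} = \mathrm{Tr}\,\exp[-\beta H^{(0)}]\left\{1 + \beta\sum_{n=1}^{\infty}\frac{(-1)^n}{n}\sum_{jk\cdots l} F_j\cdots F_kF_l \int_0^{\beta} d\lambda_1\cdots\int_0^{\lambda_{n-2}} d\lambda_{n-1}\, Q_j^{(0)}(-i\hbar\lambda_1)\cdots Q_k^{(0)}(-i\hbar\lambda_{n-1})\,Q_l\right\}. \tag{31}$$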
We now substitute the expansions (29) and (31) into Eq. (26) and
collect terms corresponding to each order in the perturbation. The
results for the first few orders in the expansion of
$\langle Q_i(t)\rangle$ are found to be

$$\langle Q_i(t)\rangle^{(1)} = -\sum_j F_j\left[\int_0^{\beta} d\lambda\, \langle Q_j^{(0)}(-i\hbar\lambda)\,Q_i^{(0)}(t)\rangle^{(0)} - \beta\,\langle Q_j\rangle^{(0)}\langle Q_i\rangle^{(0)}\right], \tag{32}$$
The classical forms of the foregoing equations are easily obtained and
constitute a series of thermodynamic relationships. Letting $q_i(t)$
denote the time-dependent classical variable corresponding to the
Heisenberg operator $Q_i(t)$ and replacing traces by integrals
$\int d\Gamma$ over phase space, the classical step-driven response is,
from Eq. (26),

$$\langle q_i(t)\rangle = \int d\Gamma\, \rho(0)\,q_i(t) = \frac{\int d\Gamma\, \exp\{-\beta[H^{(0)} + \sum_j F_jq_j]\}\,q_i(t)}{\int d\Gamma\, \exp\{-\beta[H^{(0)} + \sum_j F_jq_j]\}} \tag{35}$$

or

$$\langle q_i(t)\rangle = \frac{\langle \exp(-\beta\sum_j F_jq_j)\,q_i(t)\rangle^{(0)}}{\langle \exp(-\beta\sum_j F_jq_j)\rangle^{(0)}}. \tag{36}$$


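Since all quantities appearing in Eq. (36) are now classical
functions, the perturbation expansion can be carried out in a
straightforward way. Thus, the first few order terms in the classical
step-driven response $\langle q_i(t)\rangle$ are

$$\langle q_i(t)\rangle^{(1)} = -\beta\sum_j F_j\left[\langle q_jq_i(t)\rangle^{(0)} - \langle q_j\rangle^{(0)}\langle q_i\rangle^{(0)}\right], \tag{37}$$

$$\langle q_i(t)\rangle^{(2)} = \tfrac12\beta^2\sum_{jk} F_jF_k\left[\langle q_jq_kq_i(t)\rangle^{(0)} - \langle q_jq_k\rangle^{(0)}\langle q_i\rangle^{(0)}\right] + \beta\sum_j F_j\langle q_j\rangle^{(0)}\langle q_i(t)\rangle^{(1)}, \tag{38}$$

$$\langle q_i(t)\rangle^{(3)} = -\tfrac16\beta^3\sum_{jkl} F_jF_kF_l\left[\langle q_jq_kq_lq_i(t)\rangle^{(0)} - \langle q_jq_kq_l\rangle^{(0)}\langle q_i\rangle^{(0)}\right] - \tfrac12\beta^2\sum_{jk} F_jF_k\langle q_jq_k\rangle^{(0)}\langle q_i(t)\rangle^{(1)} + \beta\sum_j F_j\langle q_j\rangle^{(0)}\langle q_i(t)\rangle^{(2)}. \tag{39}$$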


Equation (36) is the first truly thermodynamic relationship we have
developed up to this point. It expresses the step-driven response
$\langle q_i(t)\rangle$ in terms of the quantity
$\langle \exp(-\beta\sum_j F_jq_j)\,q_i(t)\rangle^{(0)}$, which
characterizes the spontaneous equilibrium fluctuations in an
operationally significant way. In particular,

$$\langle \exp(-\beta\sum_j F_jq_j)\,q_i(t)\rangle^{(0)}$$

represents the second equilibrium correlation moment between the
quantity $\exp(-\beta\sum_j F_jq_j)$ at time zero and the quantity
$q_i$ at time $t$.

The first-order term (37) in the expansion of Eq. (36) will be
recognized as a classical form of the so-called
fluctuation-dissipation theorem, which relates the first-order
response $\langle q_i(t)\rangle^{(1)}$ to the equilibrium second
correlation moment $\langle q_jq_i(t)\rangle^{(0)}$ between the
variables $q_j$ and $q_i(t)$. We discuss the quantum-mechanical form
of the fluctuation-dissipation theorem in the following three
sections, considering the step-driven case specifically in Sec. 7.

Similarly, Eqs. (38), (39), etc., relate the second- and higher-order
terms in the classical step-driven response $\langle q_i(t)\rangle$ to
appropriate higher equilibrium fluctuation moments. The
quantum-mechanical form of these relationships is presented in Sec. 13.
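The classical relations (35) and (37) can be checked numerically at $t=0$, where $q_i(t)$ reduces to $q_i$. The sketch below assumes a single coordinate with a hypothetical anharmonic Hamiltonian $H^{(0)} = q^2/2 + q^4/4$; the values of $\beta$ and $F$ are likewise illustrative. It compares the exact canonical average with the first-order term.

```python
import numpy as np

beta = 2.0
q = np.linspace(-8.0, 8.0, 20001)          # phase-space grid (one coordinate)
H0 = 0.5 * q**2 + 0.25 * q**4              # assumed anharmonic test Hamiltonian

def average(f, F=0.0):
    """Canonical average of f(q) with weight exp{-beta [H0 + F q]}."""
    w = np.exp(-beta * (H0 + F * q))
    return np.sum(f * w) / np.sum(w)       # grid spacing cancels in the ratio

F = 1e-3
exact = average(q, F)                                        # Eq. (35) at t = 0
first_order = -beta * F * (average(q * q) - average(q)**2)   # Eq. (37) at t = 0
print(exact, first_order)
```

Because this $H^{(0)}$ is even in $q$, the second-order term (38) vanishes, and the two numbers should differ only at order $F^3$.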


5. EQUILIBRIUM FLUCTUATIONS AND THE
FIRST-ORDER RESPONSE


The linear theory of irreversibility is reviewed in this and the two
following sections, following quite closely the formulation of
R. Kubo. The pattern of this development suggests the method of
extension to nonlinear processes, and yields a relation between
commutators and anticommutators to which we make frequent reference.

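It is convenient to characterize the first-order response by the
aftereffect function $\phi_{ij}^{(1)}(t)$, which is the response
$\langle Q_j(t)\rangle^{(1)}$ to a $\delta$-function force $F_i$
applied at $t=0$. That is, by definition,

$$\langle Q_j(t)\rangle^{(1)} = \sum_i \int_{-\infty}^{t} dt_1\, F_i(t_1)\,\phi_{ij}^{(1)}(t-t_1). \tag{40}$$

Thus, writing the equilibrium commutator in Eq. (15) in the equivalent
form $\langle [Q_i,\, Q_j^{(0)}(t-t_1)]_-\rangle^{(0)}$, we identify

$$\phi_{ij}^{(1)}(t) = -\frac{1}{i\hbar}\,\langle [Q_i^{(0)},\, Q_j^{(0)}(t)]_-\rangle^{(0)}. \tag{41}$$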


The aftereffect function $\phi_{ij}^{(1)}(t)$ exhibits significant
symmetry properties with respect to reversal of the time $t$ and of an
applied magnetic vector potential $\mathbf{A}$. The classical
quantities $q_i$ are assumed to be even functions of the particle
velocities; explicitly indicating the dependence on $\mathbf{A}$, the
operators $Q_i(\mathbf{A})$ then satisfy the relationship

$$Q_i(-\mathbf{A}) = Q_i^{*}(\mathbf{A}). \tag{42}$$

The unperturbed Hamiltonian $H^{(0)}(\mathbf{A})$ and its
eigenfunctions also satisfy Eq. (42).

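Because the response $\langle Q_j(t)\rangle^{(1)}$ must itself be
real, it follows that $\phi_{ij}^{(1)}(t)$ is real.

A second property of $\phi_{ij}^{(1)}(t)$ is

$$\phi_{ij}^{(1)}(-t) = -\phi_{ji}^{(1)}(t). \tag{43}$$

Introducing the transformation $t \to -t$ in Eq. (41), we have

$$\phi_{ij}^{(1)}(-t) = -\frac{1}{i\hbar}\,\langle [Q_i^{(0)},\, Q_j^{(0)}(-t)]_-\rangle^{(0)}. \tag{44}$$

The $t$ dependence can be transferred to the operator $Q_i$ by
performing a unitary transformation with $\exp\{i[H^{(0)}t/\hbar]\}$,
proving Eq. (43).

Further, $\phi_{ij}^{(1)}(t)$ is odd under simultaneous reversal of
time and magnetic field.

$$\phi_{ij}^{(1)}(-t;\, -\mathbf{A}) = -\phi_{ij}^{(1)}(t;\, \mathbf{A}). \tag{45}$$

According to Eq. (42),
$Q_j^{(0)}(-t;\, -\mathbf{A}) = Q_j^{(0)*}(t;\, \mathbf{A})$; similarly
$\rho^{(0)}(-\mathbf{A}) = \rho^{(0)*}(\mathbf{A})$, so that
$\phi_{ij}^{(1)}(-t;\, -\mathbf{A}) = -\phi_{ij}^{(1)*}(t;\, \mathbf{A})$.
Invoking the reality of $\phi_{ij}^{(1)}(t;\, \mathbf{A})$ yields the
property (45).

Properties (43) and (45) also imply

$$\phi_{ij}^{(1)}(t;\, -\mathbf{A}) = \phi_{ji}^{(1)}(t;\, \mathbf{A}). \tag{46}$$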


We now establish the fundamental relationship that exists between the
equilibrium commutator $\langle [Q_i,\, Q_j^{(0)}(t)]_-\rangle^{(0)}$,
characterizing the first-order response, and the equilibrium
anticommutator
$\Psi_{ij}^{(1)}(t) = \langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)}$,
characterizing the spontaneous equilibrium fluctuations. The
motivation for so doing is that the latter quantity is a true
macroscopic observable of the equilibrium system, while the former is
not.

Consider the equilibrium second moment $\Psi_{ij}^{(1)}(t)$. Since the
analysis is carried out in the spectral representation, it is
convenient to define the operators $Q_i$ such that
$\Psi_{ij}^{(1)}(t)$ has no constant component, thus avoiding the
attendant $\delta$-function singularity appearing in its Fourier
transform. As discussed in Appendix A, the time-independent portion of
$\Psi_{ij}^{(1)}(t)$ is

$$\lim_{t\to\infty} \langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)} = \langle Q_i\rangle^{(0)}\langle Q_j\rangle^{(0)}. \tag{47}$$

Consequently, we assume that the operators $Q_i$ are defined such that
$\langle Q_i\rangle^{(0)} = 0$.
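$\Psi_{ij}^{(1)}(t)$ can be written

$$\Psi_{ij}^{(1)}(t) = \tfrac12\langle Q_iQ_j^{(0)}(t) + Q_j^{(0)}(t)Q_i\rangle^{(0)} = \tfrac12\langle Q_iQ_j^{(0)}(t) + Q_i^{(0)}(-i\hbar\beta)\,Q_j^{(0)}(t)\rangle^{(0)}. \tag{48}$$

The second term on the right has been obtained by inserting
$\exp[+\beta H^{(0)}]$ in front of $Q_i$ and cyclically permuting the
operators in the trace. That is,

$$\langle Q_j^{(0)}(t)\,Q_i\rangle^{(0)} = \mathrm{Tr}\,\rho^{(0)}Q_j^{(0)}(t)\,Q_i = \mathrm{Tr}\,\rho^{(0)}\exp[\beta H^{(0)}]\,Q_i\exp[-\beta H^{(0)}]\,Q_j^{(0)}(t), \tag{49}$$

and invoking the definition (8) for $Q_i^{(0)}(-i\hbar\beta)$ gives
Eq. (48). We decompose Eq. (48) into a double summation over matrix
elements in the unperturbed energy representation

$$\Psi_{ij}^{(1)}(t) = \tfrac12\sum_l\sum_m \rho(E_l)\{1 + \exp[\beta(E_l - E_m)]\}\,\langle E_l|Q_i|E_m\rangle\langle E_m|Q_j|E_l\rangle \exp\!\left[\frac{i(E_m - E_l)t}{\hbar}\right], \tag{50}$$

where $\rho(E_l) = e^{-\beta E_l}/\sum_l e^{-\beta E_l}$ and
$\langle E_l|Q_i|E_m\rangle$ is the matrix element of $Q_i$ between
the eigenstates of $H^{(0)}$ having the eigenvalues $E_l$ and $E_m$.
In virtue of the quasi-continuous spectrum of energy eigenvalues, the
double summation appearing in Eq. (50) can be replaced by a double
integration over energy eigenvalues.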


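$$\Psi_{ij}^{(1)}(t) = \tfrac12\int_{-\infty}^{\infty} dE_l \int_{-\infty}^{\infty} dE_m\, \rho(E_l)\,n(E_l)\,n(E_m)\{1 + \exp[\beta(E_l - E_m)]\}\,\langle E_l|Q_i|E_m\rangle\langle E_m|Q_j|E_l\rangle \exp\!\left[\frac{i(E_m - E_l)t}{\hbar}\right], \tag{51}$$

where $n(E)$ is the density-of-states function.

We obtain the Fourier transform $G_{ij}^{(1)}(\omega)$ of
$\Psi_{ij}^{(1)}(t)$ by introducing

$$E_m = E_l + \hbar\omega. \tag{52}$$

Thus Eq. (51) becomes

$$\Psi_{ij}^{(1)}(t) = \langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)} = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\,G_{ij}^{(1)}(\omega), \tag{53}$$

where

$$G_{ij}^{(1)}(\omega) = \left(\frac{\pi}{2}\right)^{1/2}\hbar\,[1 + \exp(-\beta\hbar\omega)]\int_{-\infty}^{\infty} dE\, \rho(E)\,n(E)\,n(E+\hbar\omega)\,\langle E|Q_i|E+\hbar\omega\rangle\langle E+\hbar\omega|Q_j|E\rangle; \tag{54}$$

$G_{ij}^{(1)}(\omega)$ is the spectrum of the spontaneous equilibrium
fluctuations.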

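We obtain the Fourier transform of the aftereffect function
$\phi_{ij}^{(1)}(t)$ in an analogous way. Equation (41) can be
rewritten

$$\phi_{ij}^{(1)}(t) = -\frac{1}{i\hbar}\,\langle Q_iQ_j^{(0)}(t) - Q_j^{(0)}(t)Q_i\rangle^{(0)} = -\frac{1}{i\hbar}\,\langle Q_iQ_j^{(0)}(t) - Q_i^{(0)}(-i\hbar\beta)\,Q_j^{(0)}(t)\rangle^{(0)}. \tag{55}$$

Decomposing the equilibrium expectation value into a double integral
over matrix elements in the unperturbed energy representation and
introducing the transformation (52), we obtain the result

$$\phi_{ij}^{(1)}(t) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\,L_{ij}^{(1)}(\omega), \tag{56}$$

where

$$L_{ij}^{(1)}(\omega) = i(2\pi)^{1/2}\,[1 - \exp(-\beta\hbar\omega)]\int_{-\infty}^{\infty} dE\, \rho(E)\,n(E)\,n(E+\hbar\omega)\,\langle E|Q_i|E+\hbar\omega\rangle\langle E+\hbar\omega|Q_j|E\rangle \tag{57}$$

is clearly the Fourier transform of $\phi_{ij}^{(1)}(t)$.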
Comparison of Eqs. (54) and (57) shows that the Fourier transform
$G_{ij}^{(1)}(\omega)$ of the equilibrium second moment
$\Psi_{ij}^{(1)}(t)$ is related to the Fourier transform
$L_{ij}^{(1)}(\omega)$ of the aftereffect function
$\phi_{ij}^{(1)}(t)$ according to

$$i\omega\,G_{ij}^{(1)}(\omega) = E^{(1)}(\omega;\beta)\,L_{ij}^{(1)}(\omega), \tag{58}$$

where

$$E^{(1)}(\omega;\beta) \equiv \frac{\hbar\omega}{2}\coth\frac{\beta\hbar\omega}{2} \;\longrightarrow\; \frac{1}{\beta} \quad (\beta \to 0). \tag{59}$$

The universal function $E^{(1)}(\omega;\beta)$ is uniquely
quantum-mechanical in origin and corresponds to a slight smearing out
of the microscopic contributions to the macroscopic response at
extremely high frequencies ($\gtrsim 10^{13}$ cps at room
temperature). In the classical limit $\beta \to 0$,
$E^{(1)}(\omega;\beta) \to (1/\beta)$ as indicated.
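To make the scale of the quantum correction concrete, the following sketch evaluates the function of Eq. (59) in SI units and forms the ratio $\beta E^{(1)}(\omega;\beta)$, which equals unity in the classical regime. The function name `E1` and the two sample frequencies are illustrative choices, not taken from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J / K

def E1(omega, beta):
    """E^(1)(omega; beta) = (hbar*omega/2) * coth(beta*hbar*omega/2)."""
    x = 0.5 * beta * HBAR * omega
    if x == 0.0:
        return 1.0 / beta            # limiting value as omega -> 0
    return 0.5 * HBAR * omega / math.tanh(x)

beta_room = 1.0 / (K_B * 300.0)      # room temperature, 300 K

# beta * E1 is 1 in the classical regime and grows as beta*hbar*omega/2 above it.
ratio_low = E1(2 * math.pi * 1e9, beta_room) * beta_room     # 1 GHz
ratio_high = E1(2 * math.pi * 1e14, beta_room) * beta_room   # 100 THz
print(ratio_low, ratio_high)
```

At 1 GHz the ratio is indistinguishable from unity, while at 100 THz it is several times larger, consistent with a crossover near $kT/h$ at room temperature.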

Equation (58) is the spectral statement of the fundamental
relationship which exists between the first-order response and the
spontaneous equilibrium fluctuations. It provides directly the basis
for the fluctuation-dissipation theorem, several alternate forms of
which have been developed. We discuss these in the following two
sections.

The result (58) has also been obtained by Kubo using
function-theoretical arguments rather than the matrix approach
employed here.


6. THE ADMITTANCE AND THE FLUCTUATION- 
DISSIPATION THEOREM 


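We rephrase the results of the preceding section in the familiar terms
afforded by the admittance matrix.

We define $\alpha_j(\omega)$ and $\gamma_i(\omega)$ as the Fourier
transforms of the first-order "current" and force, respectively,

$$\frac{d}{dt}\langle Q_j(t)\rangle^{(1)} = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\,\alpha_j(\omega), \tag{60}$$

$$F_i(t) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\,\gamma_i(\omega). \tag{61}$$

We further define the admittance matrix elements $Y_{ij}(\omega)$ by

$$\alpha_j(\omega) = \sum_i \gamma_i(\omega)\,Y_{ij}(\omega), \tag{62}$$

whence, from the definition (40), it follows that

$$Y_{ij}(\omega) = i\omega\int_0^{\infty} dt\, \phi_{ij}^{(1)}(t)\,e^{-i\omega t}. \tag{63}$$

By Eq. (63) the symmetry properties appropriate to $Y_{ij}(\omega)$
follow immediately from the symmetry properties of
$\phi_{ij}^{(1)}(t)$. The reality of $\phi_{ij}^{(1)}(t)$ implies that
the real and imaginary parts of $Y_{ij}(\omega)$ are even and odd,
respectively, under the transformation $\omega \to -\omega$.

$$Y_{ij}(-\omega) = Y_{ij}^{*}(\omega). \tag{64}$$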
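The reality condition (64) is easy to confirm numerically from the definition (63). The sketch below assumes a hypothetical real aftereffect function $\phi(t) = e^{-t}\sin 2t$, chosen only because its transform is known in closed form, and evaluates the integral by the trapezoidal rule.

```python
import numpy as np

def phi(t):
    """Assumed real aftereffect function (illustrative only)."""
    return np.exp(-t) * np.sin(2.0 * t)

def admittance(omega, t_max=40.0, n=400001):
    """Y(omega) = i*omega * int_0^inf dt phi(t) exp(-i*omega*t), Eq. (63)."""
    t = np.linspace(0.0, t_max, n)
    f = phi(t) * np.exp(-1j * omega * t)
    dt = t[1] - t[0]
    integral = dt * (np.sum(f) - 0.5 * (f[0] + f[-1]))   # trapezoidal rule
    return 1j * omega * integral

omega = 1.7
Y_plus, Y_minus = admittance(omega), admittance(-omega)

# Closed form for this particular phi: int_0^inf e^{-st} sin(2t) dt = 2/(s^2 + 4).
Y_exact = 1j * omega * 2.0 / ((1.0 + 1j * omega) ** 2 + 4.0)
print(Y_plus, Y_minus, Y_exact)
```

The real parts of `Y_plus` and `Y_minus` agree and the imaginary parts are opposite, as Eq. (64) requires for any real $\phi(t)$.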


The symmetry property (46) of $\phi_{ij}^{(1)}(t)$ with respect to
reversal of the applied magnetic field implies a corresponding
symmetry of $Y_{ij}(\omega)$.

$$Y_{ij}(\omega;\, -\mathbf{A}) = Y_{ji}(\omega;\, \mathbf{A}). \tag{65}$$

Equation (65) represents the extension of the original Onsager
reciprocity to all frequency components of the admittance matrix
elements.

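We now rewrite the spectral relationship between the first-order
response and the equilibrium fluctuations in terms of the
$Y_{ij}(\omega)$ by decomposing Eq. (58) into its symmetric and
antisymmetric parts with respect to the indices $ij$.

$${}^{(s,a)}G_{ij}^{(1)}(\omega) = \frac{E^{(1)}(\omega;\beta)}{i\omega}\,\tfrac12\left[L_{ij}^{(1)}(\omega) \pm L_{ji}^{(1)}(\omega)\right], \tag{66}$$

where the superscripts $(s)$ and $(a)$ denote the symmetric and
antisymmetric parts, respectively. However, using the time-reversal
symmetry property (43) of $\phi_{ij}^{(1)}(t)$, it follows from
Eq. (63) for $Y_{ij}(\omega)$ and Eq. (56) for $L_{ij}^{(1)}(\omega)$
that these quantities are related according to

$$\tfrac12\left[L_{ij}^{(1)}(\omega) + L_{ji}^{(1)}(\omega)\right] = -\frac{2i}{(2\pi)^{1/2}}\,\frac{\mathrm{Re}^{(s)}Y_{ij}(\omega)}{\omega}, \tag{67}$$

$$\tfrac12\left[L_{ij}^{(1)}(\omega) - L_{ji}^{(1)}(\omega)\right] = \frac{2}{(2\pi)^{1/2}}\,\frac{\mathrm{Im}^{(a)}Y_{ij}(\omega)}{\omega}. \tag{68}$$

$\mathrm{Re}^{(s)}Y_{ij}(\omega)$ is the real (symmetric) part of
$Y_{ij}(\omega)$ and $\mathrm{Im}^{(a)}Y_{ij}(\omega)$ the imaginary
(antisymmetric) part. We note from Eq. (65) that
$\mathrm{Re}^{(s)}Y_{ij}(\omega)$ and $\mathrm{Im}^{(a)}Y_{ij}(\omega)$
are even and odd, respectively, with respect to reversal of the vector
potential $\mathbf{A}$.

Substituting the relations (67) and (68) into Eq. (66), we obtain the
results

$${}^{(s)}G_{ij}^{(1)}(\omega) = -\left(\frac{2}{\pi}\right)^{1/2} E^{(1)}(\omega;\beta)\,\frac{\mathrm{Re}^{(s)}Y_{ij}(\omega)}{\omega^2}, \tag{69}$$

$${}^{(a)}G_{ij}^{(1)}(\omega) = -i\left(\frac{2}{\pi}\right)^{1/2} E^{(1)}(\omega;\beta)\,\frac{\mathrm{Im}^{(a)}Y_{ij}(\omega)}{\omega^2}. \tag{70}$$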


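Since $G_{ij}^{(1)}(\omega)$ is the spectrum of the equilibrium second
moment $\langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)}$, the
Fourier transforms of Eqs. (69) and (70) are

$${}^{(s)}\langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)} = -\frac{2}{\pi}\int_0^{\infty} d\omega\, \cos\omega t\; E^{(1)}(\omega;\beta)\,\frac{\mathrm{Re}^{(s)}Y_{ij}(\omega)}{\omega^2}, \tag{71}$$

$${}^{(a)}\langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)} = \frac{2}{\pi}\int_0^{\infty} d\omega\, \sin\omega t\; E^{(1)}(\omega;\beta)\,\frac{\mathrm{Im}^{(a)}Y_{ij}(\omega)}{\omega^2}. \tag{72}$$

The symmetric part
${}^{(s)}\langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)}$ of the
equilibrium second moment with respect to $ij$ is even with respect to
reversal of the vector potential $\mathbf{A}$, while the antisymmetric
part ${}^{(a)}\langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)}$ is
an odd function of $\mathbf{A}$. Further, the antisymmetric part
vanishes in the absence of an applied magnetic field.

Equations (69) and (70) or (71) and (72) constitute the familiar
spectral statement of the fluctuation-dissipation theorem. In the
classical limit [see Eq. (59)], Eqs. (71) and (72) reduce to the
Nyquist forms

$${}^{(s)}\langle q_iq_j(t)\rangle^{(0)} = -\frac{2}{\pi\beta}\int_0^{\infty} d\omega\, \cos\omega t\; \frac{\mathrm{Re}^{(s)}Y_{ij}(\omega)}{\omega^2}, \tag{73}$$

$${}^{(a)}\langle q_iq_j(t)\rangle^{(0)} = \frac{2}{\pi\beta}\int_0^{\infty} d\omega\, \sin\omega t\; \frac{\mathrm{Im}^{(a)}Y_{ij}(\omega)}{\omega^2}. \tag{74}$$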


If we put $t=0$ in the Nyquist relation (73), the left-hand member
represents the total noise intensity. It is of interest to note that
this form of the equation for $t=0$ also follows from the
Kramers-Kronig or dispersion relations, together with the results of
equilibrium fluctuation theory. The Kramers-Kronig formulas relate the
real and imaginary parts of the admittance matrix element
$Y_{ij}(\omega)$ in consequence of the general requirement of
causality. In Appendix B we consider this connection between the
dispersion relations and the spectral form of the
fluctuation-dissipation theorem.

It is sometimes convenient to characterize the spontaneous equilibrium
fluctuations in terms of a set of hypothetical intensive quantities
$F_i$ rather than the extensive quantities $Q_i$. These hypothetical
forces are associated with the fluctuating $Q_i$ in the same formal
way as real forces are associated with the average first-order driven
response. That is, by analogy with Eq. (62), the fluctuating force is
so defined that the product of its Fourier transform with
$Y(\omega)/i\omega$ yields the Fourier transform of the fluctuating
extensive parameter.

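Consider a one-dimensional system with a single force $F(t)$ and
corresponding operator $Q$. The spectrum ${}^{F}G^{(1)}(\omega)$ of
the second moment $\langle FF(t)\rangle^{(0)}$ of the equilibrium
force fluctuation is defined by

$$\langle FF(t)\rangle^{(0)} = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\;{}^{F}G^{(1)}(\omega). \tag{75}$$

Using Eq. (62), together with Eq. (69) for $G^{(1)}(\omega)$, we
obtain

$${}^{F}G^{(1)}(\omega) = \frac{\omega^2G^{(1)}(\omega)}{|Y(\omega)|^2} = -\left(\frac{2}{\pi}\right)^{1/2} E^{(1)}(\omega;\beta)\,\frac{\mathrm{Re}\,Y(\omega)}{|Y(\omega)|^2} = -\left(\frac{2}{\pi}\right)^{1/2} E^{(1)}(\omega;\beta)\,\mathrm{Re}\,Z(\omega), \tag{76}$$

where $\mathrm{Re}\,Z(\omega) = \mathrm{Re}\,Y(\omega)/|Y(\omega)|^2$
is the real, or dissipative, part of the complex impedance function.
Thus, $\langle FF(t)\rangle^{(0)}$ is given by

$$\langle FF(t)\rangle^{(0)} = -\frac{2}{\pi}\int_0^{\infty} d\omega\, \cos\omega t\; E^{(1)}(\omega;\beta)\,\mathrm{Re}\,Z(\omega), \tag{77}$$

which is the one-dimensional form of the fluctuation-dissipation
theorem (71), expressed in terms of the force fluctuations.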
Similarly, it is shown that the symmetric and antisymmetric parts of
the second moment $\langle F_iF_j(t)\rangle^{(0)}$ of the equilibrium
force fluctuations for a multidimensional system are given by

$${}^{(s)}\langle F_iF_j(t)\rangle^{(0)} = -\frac{2}{\pi}\int_0^{\infty} d\omega\, \cos\omega t\; E^{(1)}(\omega;\beta)\,\mathrm{Re}^{(s)}Z_{ij}(\omega), \tag{78}$$

$${}^{(a)}\langle F_iF_j(t)\rangle^{(0)} = \frac{2}{\pi}\int_0^{\infty} d\omega\, \sin\omega t\; E^{(1)}(\omega;\beta)\,\mathrm{Im}^{(a)}Z_{ij}(\omega). \tag{79}$$

The symmetry properties of the elements $Z_{ij}(\omega)$ of the
complex impedance matrix are identical to those of $Y_{ij}(\omega)$.

The fluctuation-dissipation theorem of Eqs. (71) and (72) or (78) and
(79), which establishes a quantitative relationship between a
dissipative process and appropriate equilibrium fluctuations, can be
given the following intuitive interpretation. A dissipative process
can be conveniently considered to involve the interaction between the
dissipative system and a source system or signal generator. As
mentioned at the beginning of Sec. 2, the dissipative system is
characterized by a large number of degrees of freedom and is capable
of absorbing energy when acted upon by an imposed force. In
equilibrium it exhibits random fluctuations of its variables.

The source system, on the other hand, which provides the imposed
forces and delivers energy to the dissipative system, is characterized
by relatively few degrees of freedom and a high degree of excitation.
Examples of such systems might be a classical pendulum or polyatomic
molecule. When isolated from the dissipative system and given some
internal energy, the source system may be regarded as having a sort of
internal coherence.

If the source system is now connected to the dissipative system, this
internal coherence is destroyed, the periodic motion vanishes, and the
energy is sapped away, until finally the source system is left with
only the random disordered energy $1/\beta$ characteristic of thermal
equilibrium. This loss of coherence within the source system may be
regarded as being caused by the random fluctuations generated by the
dissipative system and acting back upon the source system itself. The
dissipation therefore appears as the macroscopic consequence of the
disordering effect of the random equilibrium fluctuations, and, as
such, is necessarily quantitatively related to the fluctuations.

An interesting analogy is furnished by the historical development of
the theory of spontaneous radiation from excited atoms. After the
initial development of quantum mechanics, it was found impossible to
compute the spontaneous transition probabilities for an isolated
excited atom, and this dissipative process appeared to be outside the
existing structure of dynamics. With the advent of quantum
electrodynamics, however, the dissipation could be computed, and it
was found that the spontaneous transitions could be consistently
considered to be induced by the random fluctuations of the
electromagnetic field in the vacuum. In this case, the excited atom
plays the role of the source system, and the vacuum plays the role of
the dissipative system.


7. THE FIRST-ORDER RESPONSE—TEMPORAL 
REPRESENTATION 


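A particularly useful temporal form of the fluctuation-dissipation
theorem, due to R. Kubo, is obtained by taking the Fourier transform
of the basic spectral relationship (58) between
$L_{ij}^{(1)}(\omega)$ and $G_{ij}^{(1)}(\omega)$. According to
Eq. (56) this yields the aftereffect function $\phi_{ij}^{(1)}(t)$ in
the form

$$\phi_{ij}^{(1)}(t) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} d\omega\, e^{i\omega t}\,\frac{i\omega\,G_{ij}^{(1)}(\omega)}{E^{(1)}(\omega;\beta)}. \tag{80}$$

Letting

$$\frac{1}{E^{(1)}(\omega;\beta)} = \int_{-\infty}^{\infty} dt\, e^{-i\omega t}\,\Gamma(t)$$

and noting from Eq. (53) that

$$i\omega\,G_{ij}^{(1)}(\omega) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} dt\, e^{-i\omega t}\,\frac{d}{dt}\langle \tfrac12[Q_i,\, Q_j^{(0)}(t)]_+\rangle^{(0)},$$

Eq. (80) becomes

$$\phi_{ij}^{(1)}(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega \int_{-\infty}^{\infty} dt'\, \Gamma(t') \int_{-\infty}^{\infty} dt''\, \frac{d}{dt''}\langle \tfrac12[Q_i,\, Q_j^{(0)}(t'')]_+\rangle^{(0)}\, e^{i\omega(t-t'-t'')}. \tag{81}$$

Invoking the $\delta$-function property of

$$\int_{-\infty}^{\infty} d\omega\, e^{i\omega(t-t'-t'')},$$

we obtain the result

$$\phi_{ij}^{(1)}(t) = -\frac{1}{i\hbar}\,\langle [Q_i,\, Q_j^{(0)}(t)]_-\rangle^{(0)} = \int_{-\infty}^{\infty} dt'\, \Gamma(t-t')\,\frac{d}{dt'}\langle \tfrac12[Q_i,\, Q_j^{(0)}(t')]_+\rangle^{(0)}, \tag{82}$$

where

$$\Gamma(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\, \frac{e^{i\omega t}}{E^{(1)}(\omega;\beta)}. \tag{83}$$

The evaluation of the function $\Gamma(t)$ has previously been carried
out by R. Kubo, and is presented in Appendix C.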

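Equation (82) constitutes the temporal statement of the
fluctuation-dissipation theorem. Although equivalent to the spectral
form of Eq. (58), it presents the basic relationship between
equilibrium commutators and anticommutators in a more explicit
fashion.

The result (82) can of course be substituted back into Eq. (40) for
$\langle Q_j(t)\rangle^{(1)}$ to yield the first-order response
directly in terms of the equilibrium correlation moment.

$$\langle Q_j(t)\rangle^{(1)} = -\sum_i \int_{-\infty}^{t} dt_1\, F_i(t_1) \int_{-\infty}^{\infty} dt_1'\, \Gamma(t_1-t_1')\,\frac{\partial}{\partial t_1'}\langle \tfrac12[Q_i^{(0)}(t_1'),\, Q_j^{(0)}(t)]_+\rangle^{(0)}. \tag{84}$$

In obtaining (84) from (40) and (82), we have introduced the
transformation $t' = t - t_1'$ and made use of the identity

$$\frac{\partial}{\partial t'}\langle \tfrac12[Q_i,\, Q_j^{(0)}(t')]_+\rangle^{(0)} = -\frac{\partial}{\partial t_1'}\langle \tfrac12[Q_i^{(0)}(t_1'),\, Q_j^{(0)}(t)]_+\rangle^{(0)}.$$

In the classical limit $\beta \to 0$, Eq. (84), in virtue of the
$\delta$-function property of $\Gamma(t)$, reduces to

$$\langle q_j(t)\rangle^{(1)} = -\beta\sum_i \int_{-\infty}^{t} dt_1\, F_i(t_1)\,\frac{\partial}{\partial t_1}\langle q_i(t_1)\,q_j(t)\rangle^{(0)}. \tag{85}$$

The first-order response during a step-driven process can be obtained
directly from Eq. (84) by introducing the step-function forces defined
in Sec. 4.

$$\langle Q_j(t)\rangle^{(1)} = -\sum_i F_i \int_{-\infty}^{0} dt_1 \int_{-\infty}^{\infty} dt_1'\, \Gamma(t_1-t_1')\,\frac{\partial}{\partial t_1'}\langle \tfrac12[Q_i^{(0)}(t_1'),\, Q_j^{(0)}(t)]_+\rangle^{(0)}. \tag{86}$$

Integrating by parts and putting $\langle Q_i\rangle^{(0)} = 0$ gives

$$\langle Q_j(t)\rangle^{(1)} = \sum_i F_i \int_{-\infty}^{0} dt_1 \int_{-\infty}^{\infty} dt_1'\, \frac{\partial\Gamma(t_1-t_1')}{\partial t_1'}\,\langle \tfrac12[Q_i^{(0)}(t_1'),\, Q_j^{(0)}(t)]_+\rangle^{(0)} \tag{87}$$

or, performing the $t_1$-integration,

$$\langle Q_j(t)\rangle^{(1)} = -\sum_i F_i \int_{-\infty}^{\infty} dt_1'\, \Gamma(t_1')\,\langle \tfrac12[Q_i^{(0)}(t_1'),\, Q_j^{(0)}(t)]_+\rangle^{(0)}. \tag{88}$$

In the classical limit $\beta \to 0$, it reduces to the classical
first-order result, Eq. (37), with $\langle q_j\rangle^{(0)} = 0$.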


[Fig. 2. Field applied to the sample by condenser plates.]


In the important case of transport processes one is interested in
driven generalized currents rather than in conventional thermodynamic
parameters. The special considerations required to treat currents, and
particularly to treat the steady state, have been carefully discussed
elsewhere. Before concluding the discussion of the linear theory of
irreversibility, we indicate briefly the manner of formulation of the
theory in terms of driven currents.

We consider the case of electrical conduction, for example, by
assuming the specific Hamiltonian

$$H = H^{(0)} + \sum_i \mathcal{E}_i(t)\,Q_i = H^{(0)} + e\sum_i \mathcal{E}_i(t)\sum_{\nu} x_{i\nu}, \tag{89}$$

where $\mathcal{E}_i(t)$ is the applied electric field in the $i$th
direction, and $x_{i\nu}$ is the $i$th displacement component of the
$\nu$th charged particle. The component $J_i$ of the current operator
$\mathbf{J}$ is simply the time derivative of the operator $Q_i$.

$$J_i = \dot{Q}_i = e\sum_{\nu} \dot{x}_{i\nu}. \tag{90}$$

In our interpretation heretofore we would have visualized the
Hamiltonian (89) as applying to the physical situation illustrated in
Fig. 2. The field is applied to the sample by condenser plates which
are not in physical contact with the sample. The state asymptotically
approached after imposition of a step-function force is one with zero
current. An alternative interpretation arises if we formally impose
periodic boundary conditions on the particle wave functions in the
system. A step-function applied force then leads asymptotically to a
steady-state current. The formalism is essentially unchanged, but the
trace of any operator implies a summation over an entirely different
Hilbert space than has been implied heretofore.
The first-order current response $\langle J_j(t)\rangle^{(1)}$ is
given, from Eq. (15), by

$$\langle J_j(t)\rangle^{(1)} = \sum_i \int_{-\infty}^{t} dt_1\, \mathcal{E}_i(t_1)\;{}^{J}\phi_{ij}^{(1)}(t-t_1), \tag{91}$$

where the current after-effect function ${}^{J}\phi_{ij}^{(1)}(t)$ is

$${}^{J}\phi_{ij}^{(1)}(t) = -\frac{1}{i\hbar}\,\langle [Q_i,\, J_j^{(0)}(t)]_-\rangle^{(0)}. \tag{92}$$

The fluctuation-dissipation theorem relating the first-order current
response to the second moment
$\langle \tfrac12[J_i,\, J_j^{(0)}(t)]_+\rangle^{(0)}$ of the
spontaneous equilibrium current fluctuations is readily obtained using
the techniques employed previously in connection with the extensive
parameter displacements $Q_i$. The results, in the spectral and
temporal representations, respectively, are

$${}^{(s)}\langle \tfrac12[J_i,\, J_j^{(0)}(t)]_+\rangle^{(0)} = -\frac{2}{\pi}\int_0^{\infty} d\omega\, \cos\omega t\; E^{(1)}(\omega;\beta)\,\mathrm{Re}^{(s)}Y_{ij}(\omega), \tag{93}$$

$${}^{(a)}\langle \tfrac12[J_i,\, J_j^{(0)}(t)]_+\rangle^{(0)} = \frac{2}{\pi}\int_0^{\infty} d\omega\, \sin\omega t\; E^{(1)}(\omega;\beta)\,\mathrm{Im}^{(a)}Y_{ij}(\omega), \tag{94}$$

$${}^{J}\phi_{ij}^{(1)}(t) = -\int_{-\infty}^{\infty} dt'\, \Gamma(t-t')\,\langle \tfrac12[J_i,\, J_j^{(0)}(t')]_+\rangle^{(0)}. \tag{95}$$

Equations (93) and (94) follow immediately from Eqs. (71) and (72), if
we replace the operators $Q_i$ by their time derivatives $J_i$, which
simply removes the factor $(1/\omega)^2$ in the integrand. Similarly,
Eq. (95) corresponds to Eq. (82) for $\phi_{ij}^{(1)}(t)$.

Finally, the above analysis of driven currents can be justified by
another consideration, which is perhaps more physical than the
artifice of periodic boundary conditions applied to the system in
Fig. 2. We consider a time-dependent magnetic field $\mathcal{H}(t)$
imposed axially through a toroidal conductor, as shown in Fig. 3. The
induced current in the toroid will be driven by a tangential electric
field $\mathcal{E}(t) = -\dot{A}(t)$, $A(t)$ being the vector
potential associated with $\mathcal{H}(t)$. The Hamiltonian
appropriate to this situation is

$$H(t) = H^{(0)} + A(t)\,J, \tag{96}$$

where $J$ is the operator corresponding to the electrical current
around the toroid.

[Fig. 3. Time-dependent magnetic field imposed axially through a toroidal conductor.]

The first-order response $\langle J(t)\rangle^{(1)}$ to the
perturbation $A(t)J$ is then

$$\langle J(t)\rangle^{(1)} = -\frac{1}{i\hbar}\int_{-\infty}^{t} dt_1\, A(t_1)\,\langle [J^{(0)}(t_1),\, J^{(0)}(t)]_-\rangle^{(0)}. \tag{97}$$

$\langle J(t)\rangle^{(1)}$ can be rewritten in terms of the induced
electric field $\mathcal{E}(t)$ by integrating Eq. (97) by parts.

$$\langle J(t)\rangle^{(1)} = -\frac{1}{i\hbar}\,A(t)\int_{-\infty}^{t} dt_1\, \langle [J^{(0)}(t_1),\, J^{(0)}(t)]_-\rangle^{(0)} - \frac{1}{i\hbar}\int_{-\infty}^{t} dt_1\, \mathcal{E}(t_1)\int_{-\infty}^{t_1} dt_2\, \langle [J^{(0)}(t_2),\, J^{(0)}(t)]_-\rangle^{(0)}, \tag{98}$$

where we have let $\dot{A}(t_1) = -\mathcal{E}(t_1)$.

The term involving $\mathcal{E}(t)$ is the physically interesting one,
the integrated term corresponding simply to the accumulation of
magnetic field required to sustain the driving electric field. By
ignoring the latter term, the physical situation is precisely that
which would obtain if the process were driven by a battery placed in
the circuit rather than by the magnetic field $\mathcal{H}(t)$. For
this case, $\langle J(t)\rangle^{(1)}$ reduces to

$$\langle J(t)\rangle^{(1)} = -\frac{1}{i\hbar}\int_{-\infty}^{t} dt_1\, \mathcal{E}(t_1)\int_{-\infty}^{t_1} dt_2\, \langle [J^{(0)}(t_2),\, J^{(0)}(t)]_-\rangle^{(0)}. \tag{99}$$

By Eq. (90) we identify

$$Q^{(0)}(t_1) = \int_{-\infty}^{t_1} dt_2\, J^{(0)}(t_2). \tag{100}$$

Therefore,

$$\langle J(t)\rangle^{(1)} = -\frac{1}{i\hbar}\int_{-\infty}^{t} dt_1\, \mathcal{E}(t_1)\,\langle [Q^{(0)}(t_1),\, J^{(0)}(t)]_-\rangle^{(0)}, \tag{101}$$

which is identical with Eq. (91) for the case of one-dimensional
electrical conduction.


8. APPLICATIONS OF THE FIRST-ORDER THEORY 


Several applications of the foregoing first-order theory are now
mentioned briefly.

In their original paper on the fluctuation-dissipation theorem, Callen
and Welton discussed the relation of that theorem to the energy
density in an isotropic radiation field. The impedance of a charged
particle driven by a periodic electric field exhibits a dissipative
term arising from the radiation damping force. According to the
fluctuation-dissipation theorem (78), this implies the existence of a
random fluctuating electric field exerted by the vacuum on the free
particle. The energy density of this fluctuating field is found to be
just the familiar Planck radiation density.

Van Vliet has recently employed the fluctuation-dissipation theorem to
discuss the equilibrium charge carrier fluctuations in semiconducting
materials. A simple admittance matrix corresponding to a linear RC
network is introduced, the resistances being expressed in terms of
transition rates between different groups of carrier levels. The
fluctuation-dissipation theorem thereby yields the second moments of
the equilibrium carrier fluctuations in terms of the thermal
generation-recombination process. The charge carrier fluctuations in
turn give rise to a contribution to the driven noise, to which further
reference is made in Sec. 12.

In addition to the fluctuation-dissipation theorem and the spectral
reciprocity, Kubo points out that general proofs of certain sum rules
can be obtained from irreversible thermodynamic considerations. Thus,
for the case of electrical conductivity in a system of interacting
particles in an applied magnetic field, he finds that the frequency
integrals of $\mathrm{Re}^{(s)}Y_{ij}(\omega)$ and
$\omega\,\mathrm{Im}^{(a)}Y_{ij}(\omega)$ are given by


Cn, 
—-6 ijy ( 102) 
Mr 


2 7? 
= il dw Re® F; lo 
TYo “ T 


ebr 


Ic.. (103) 


2 i) 

-f dw w Im™ Y;;(w) => 3 
a Yq r MPC 

$n_r$, $m_r$, and $e_r$ are the number, mass, and charge,
respectively, of the $r$th type of particle, and $\mathcal{H}_z$ is
the $z$-directed applied magnetic field. Analogous sum rules can be
derived for the magnetic susceptibility matrix.

H. Mori has applied the fluctuation-dissipation theorem to the
analysis of transport processes in fluids. The coupling between the
slow macroscopic relaxation of the system and the rapid microscopic
fluctuations is shown to be responsible for the dissipation. Thus, the
coefficients of viscosity, thermal conductivity, and diffusion can be
computed in terms of the equilibrium fluctuations of the thermodynamic
fluxes.


9. THE SECOND-ORDER RESPONSE 
IN A GENERAL PROCESS 

In Secs. 5 through 8 the first-order theory of irreversible
thermodynamics was reviewed, showing the relationship of
$\langle Q_i(t)\rangle^{(1)}$ to the equilibrium fluctuations.
Sections 9 through 14 are devoted to an extension of the
fluctuation-dissipation concept to the driven second moment
$\langle \tfrac12[Q_i(t),\, Q_j(t+\tau)]_+\rangle$ and to the second-
and higher-order terms in the driven response $\langle Q_i(t)\rangle$.
A number of interrelationships among these quantities and the
equilibrium fluctuations are established.


2 K, M. Van Vliet, Phys. Rev. i 
8 H., Mori, Phys. Rev. 112, í 
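[Fig. 4. The $t_1t_2$ plane divided into sectors.]

Consider the second-order response $\langle Q_i(t)\rangle^{(2)}$. From
Eq. (16),

$$\langle Q_i(t)\rangle^{(2)} = \tfrac12\left(-\frac{1}{i\hbar}\right)^{2}\sum_{jk}\Big\{\int_{-\infty}^{t} dt_1\, F_j(t_1)\int_{-\infty}^{t_1} dt_2\, F_k(t_2)\,\langle [Q_k^{(0)}(t_2),\, [Q_j^{(0)}(t_1),\, Q_i^{(0)}(t)]_-]_-\rangle^{(0)} + \int_{-\infty}^{t} dt_1\, F_j(t_1)\int_{t_1}^{t} dt_2\, F_k(t_2)\,\langle [Q_j^{(0)}(t_1),\, [Q_k^{(0)}(t_2),\, Q_i^{(0)}(t)]_-]_-\rangle^{(0)}\Big\}, \tag{104}$$

where we have inverted the order of integration in the second term.
The second-order after-effect function $\phi_{ijk}^{(2)}(t_1,t_2)$ is
defined by rewriting Eq. (104) as

$$\langle Q_i(t)\rangle^{(2)} = \tfrac12\sum_{jk}\Big[\int_{-\infty}^{t} dt_1\, F_j(t_1)\int_{-\infty}^{t_1} dt_2\, F_k(t_2)\,\phi_{kji}^{(2)}(t_1-t_2,\, t-t_1) + \int_{-\infty}^{t} dt_1\, F_j(t_1)\int_{t_1}^{t} dt_2\, F_k(t_2)\,\phi_{jki}^{(2)}(t_2-t_1,\, t-t_2)\Big], \tag{105}$$

whence

$$\phi_{ijk}^{(2)}(t_1,t_2) = \left(-\frac{1}{i\hbar}\right)^{2}\langle [Q_i,\, [Q_j^{(0)}(t_1),\, Q_k^{(0)}(t_1+t_2)]_-]_-\rangle^{(0)}. \tag{106}$$

$\phi_{ijk}^{(2)}(\tau,\, t-\tau)$ is the response
$\langle Q_k(t)\rangle$ at time $t\,(>0)$ to a $\delta$-function force
$F_i$ applied at time zero, and a $\delta$-function force $F_j$
applied at time $\tau\,(>0)$. Consequently, the first term in
Eq. (105) characterizes the contribution to
$\langle Q_i(t)\rangle^{(2)}$ arising from the application of
$F_k(t_2)$ prior to $F_j(t_1)$, while the second term characterizes
the contribution arising from the application of $F_k(t_2)$ subsequent
to $F_j(t_1)$.

Although Eq. (105) defines $\phi_{ijk}^{(2)}(t_1,t_2)$ only for
positive $t_1$ and $t_2$, we accept Eq. (106) as the formal definition
of $\phi_{ijk}^{(2)}(t_1,t_2)$ for arbitrary values of $t_1$ and
$t_2$. The symmetry properties of $\phi_{ijk}^{(2)}(t_1,t_2)$ permit
us to evaluate this function for arbitrary times in terms of its
measured values for positive times.

We first observe that $\phi_{ijk}^{(2)}(t_1,t_2)$ is invariant with
respect to simultaneous reversal of all times and the applied magnetic
vector potential $\mathbf{A}$, the argument being identical to that
given below Eq. (45).

$$\phi_{ijk}^{(2)}(-t_1,\, -t_2;\, -\mathbf{A}) = \phi_{ijk}^{(2)}(t_1,\, t_2;\, \mathbf{A}). \tag{107}$$

Further, $\phi_{ijk}^{(2)}(t_1,t_2)$ can be written in either of the
following forms.

$$\phi_{ijk}^{(2)}(t_1,t_2) = \left(-\frac{1}{i\hbar}\right)^{2}\{-\langle [Q_j^{(0)}(t_1),\, [Q_k^{(0)}(t_1+t_2),\, Q_i]_-]_-\rangle^{(0)} - \langle [Q_k^{(0)}(t_1+t_2),\, [Q_i,\, Q_j^{(0)}(t_1)]_-]_-\rangle^{(0)}\}$$
$$\phantom{\phi_{ijk}^{(2)}(t_1,t_2)} = \left(-\frac{1}{i\hbar}\right)^{2}\{\langle [Q_k^{(0)}(t_1+t_2),\, [Q_i,\, Q_j^{(0)}(t_1)]_+]_+\rangle^{(0)} - \langle [Q_j^{(0)}(t_1),\, [Q_k^{(0)}(t_1+t_2),\, Q_i]_+]_+\rangle^{(0)}\}. \tag{108}$$

These forms follow by writing out all terms in the double commutator
of Eq. (106) and appropriately regrouping. Consider the double
commutator form of Eq. (108), which states that

$$\phi_{ijk}^{(2)}(t_1,\, t_2) + \phi_{jki}^{(2)}(t_2,\, -t_1-t_2) + \phi_{kij}^{(2)}(-t_1-t_2,\, t_1) = 0. \tag{109}$$

This relationship corresponds to Eq. (43) for $\phi_{ij}^{(1)}(t)$.

We now return to our observation that the symmetries (107) and (109)
permit us to evaluate $\phi_{ijk}^{(2)}(t_1,t_2)$ for arbitrary times
from its measured value for positive times. Consider the $t_1t_2$
plane shown in Fig. 4, which we have divided into sectors. The value
of $\phi_{ijk}^{(2)}(t_1,t_2;\, \mathbf{A})$ in sector I is obtained
by direct measurement.

The value in sector II (for which $t_1 > -t_2 > 0$) is obtained by
noting that

$$\phi_{ijk}^{(2)}(t_1,\, t_2;\, \mathbf{A}) = -\phi_{ikj}^{(2)}(t_1+t_2,\, -t_2;\, \mathbf{A}).$$

Sector III is determined by rewriting Eq. (109), interchanging the
indices $ik$ in the second term and reversing the vector potential
$\mathbf{A}$ in the third term.

$$\phi_{ijk}^{(2)}(t_1,\, t_2;\, \mathbf{A}) - \phi_{jik}^{(2)}(-t_1,\, t_1+t_2;\, \mathbf{A}) + \phi_{kij}^{(2)}(t_1+t_2,\, -t_1;\, -\mathbf{A}) = 0. \tag{110}$$

Then for $t_2 > -t_1 > 0$ the second and third terms are measurable,
thereby determining the value of the first term in sector III.
Equation (107) reflects the known values into the remaining
half-plane.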
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We now discuss the relationship which exists between 
pij ® (t1,2) and the equilibrium fluctuations. Consider 
~, the form of Eq. (108), for ise (int), involving the 
double anticommiutators. This can be rewritten 
a 


1 2 
disk (tut) = a( ==) [Wis (—t1— bs, h) 
S: mn 


a 


— FiO, aA (ati) 
where 


Vise (tayls) = 3(L0;, LQ; (4), Qi. (ttt) J]. 
(112 


Since W; jx. (41,42) is the equilibrium expectation value 
of a symmetrized product of the operators Q; 0; (t), 
and Q;,©(ti+é2), it is a third correlation moment 
among the equilibrium fluctuations of the variables cor- 
responding to these operators. 

However, referring back to Eq. (108) to identify the two third equilibrium correlation moments in Eq. (111), we see that each involves precisely the same operators at precisely the same times, although the order of symmetrization is different! The two distinct third moments correspond in principle to different ways of measuring the correlation, as can be seen from the following general considerations.

Since there is only one way of symmetrizing a product of two noncommuting operators, it is possible to write a unique quantum-mechanical expression for the second equilibrium correlation moment [see Eq. (17)]. However, quantum mechanics furnishes no such unique a priori prescription for symmetrizing a product of three (or more) operators.¹⁴ Thus, for example, in Eq. (112) we introduced the equilibrium symmetrized quantity $\hat\Psi_{ijk}^{(0)}(t_1,t_2)$ containing four permutations of the operator product $Q_iQ_j^{(0)}(t_1)Q_k^{(0)}(t_1+t_2)$, while we can also construct the fully symmetrized form $\Psi_{ijk}^{(0)}(t_1,t_2)$ containing all six permutations.


$$\Psi_{ijk}^{(0)}(t_1,t_2)=\tfrac16\langle Q_iQ_j^{(0)}(t_1)Q_k^{(0)}(t_1+t_2)+Q_j^{(0)}(t_1)Q_k^{(0)}(t_1+t_2)Q_i+Q_k^{(0)}(t_1+t_2)Q_iQ_j^{(0)}(t_1)\rangle^{(0)}+(\text{complex conjugate}). \qquad (113)$$


Each possible symmetrized arrangement corresponds to some particular experimental measurement. Consider an experiment in which three detectors monitor the variables $Q_i$, $Q_j$, and $Q_k$, and feed their signals into a counter. Appropriate time delay circuits are inserted between the $Q_i$ and $Q_j$ detectors and the counter such that the counter makes a single measurement of the desired product. Since all three signals are handled in a completely symmetrical fashion, this experiment measures the fully symmetrized operator product $\Psi_{ijk}^{(0)}(t_1,t_2)$.


14 J. R. Shewell, Am. J. Phys. 27, 16 (1959).
Alternately, suppose the $Q_k$ signal and the delayed $Q_j$ signal are fed into a multiplier, and the multiplied signal is fed into the counter along with the delayed $Q_i$ signal. The counter makes a simultaneous measurement of the product of its two inputs. In this case $Q_j$ and $Q_k$ are treated symmetrically, as are their product and $Q_i$, and the experiment measures the quantity $\hat\Psi_{ijk}^{(0)}(t_1,t_2)$ of Eq. (112).

In the classical limit (zero order in $\hbar$) the two correlation moments become identical, as discussed later in Sec. 13.
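The difference between the four-permutation and six-permutation symmetrizations can be made concrete with a small numerical sketch. The example below is an illustration only and is not taken from the paper: it assumes a two-level system, uses Pauli matrices as stand-ins for the three operators, ignores the time arguments, and picks an arbitrary density matrix `RHO`. All function names are hypothetical.

```python
# Toy comparison of two symmetrization rules for a triple operator product.
# Operators are 2x2 matrices stored as nested lists (standard library only).
from itertools import permutations


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Multiply a 2x2 matrix by the scalar c."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def expect(rho, op):
    """Expectation value Tr(rho op)."""
    m = matmul(rho, op)
    return m[0][0] + m[1][1]


def anti(a, b):
    """Anticommutator ab + ba."""
    return add(matmul(a, b), matmul(b, a))


def four_permutation_moment(rho, a, b, c):
    """(1/4) <[a, [b, c]_+]_+>: the nested-anticommutator form."""
    return expect(rho, scale(0.25, anti(a, anti(b, c)))).real


def six_permutation_moment(rho, a, b, c):
    """(1/6) sum over all six orderings of <x y z>: the fully symmetrized form."""
    total = 0
    for x, y, z in permutations((a, b, c)):
        total += expect(rho, matmul(matmul(x, y), z))
    return (total / 6).real


I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

# Assumed state with <sigma_x> = 0.6 (unit trace, positive).
RHO = add(scale(0.5, I2), scale(0.3, SX))

if __name__ == "__main__":
    print(four_permutation_moment(RHO, SX, SY, SY))
    print(six_permutation_moment(RHO, SX, SY, SY))
```

For the noncommuting choice (σx, σy, σy) the two prescriptions give different numbers, whereas for mutually commuting (diagonal) operators they coincide, which mirrors the statement that the distinction disappears in the classical limit.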

Returning to Eq. (111), we see that this expression constitutes a thermodynamic relationship between the second-order response, characterized by the second-order aftereffect function $\phi_{ijk}^{(2)}(t_1,t_2)$, and the equilibrium fluctuations, characterized by the difference between the two operationally distinct third correlation moments $\hat\Psi_{ijk}^{(0)}(-t_1-t_2,\,t_1)$ and $\hat\Psi_{kij}^{(0)}(t_2,\,-t_1-t_2)$.


10. FIRST-ORDER NOISE IN A GENERAL PROCESS 


In this section we consider the first-order term $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ in the driven second moment, establishing its relationship to the equilibrium fluctuations. The relationship between the first-order driven noise and the second-order response is also discussed.

The relationship of the first-order driven second moment to the equilibrium fluctuations is conveniently approached using Eq. (20) for $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$. The quantity

$$-(1/2i\hbar)\langle[Q_k^{(0)}(t_1),[Q_i^{(0)}(t),Q_j^{(0)}(t+\tau)]_+]_-\rangle^{(0)}$$

appearing in the first term of Eq. (20) is the noise response $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ to an applied force $F_k(t)=\delta(t-t_1)$, $t_1<t<t+\tau$. As such, it characterizes the noise contribution arising from the application of $F_k(t_1)$ prior to time $t$. This noise response function can be readily symmetrized and is the quantity of primary physical interest.

On the other hand, the quantity

$$-(1/2i\hbar)\langle[Q_i^{(0)}(t),[Q_k^{(0)}(t_1),Q_j^{(0)}(t+\tau)]_-]_+\rangle^{(0)}$$

appearing in the second term of Eq. (20) is the noise response to the force $F_k(t)=\delta(t-t_1)$, $t<t_1<t+\tau$. As such it characterizes the noise contribution arising from the application of $F_k(t_1)$ in the time interval $t$ to $t+\tau$, during which the noise is being measured. We denote this function by $\theta_{kij}(t_1-t,\,t+\tau-t_1)$. Thus,


$$\theta_{kij}(t_1,t_2)=-\frac{1}{2i\hbar}\langle[Q_i,[Q_k^{(0)}(t_1),Q_j^{(0)}(t_1+t_2)]_-]_+\rangle^{(0)}. \qquad (114)$$


As we shall see, $\theta_{kij}(t_1,t_2)$ is not of particular interest; we discuss it briefly at the end of this section in connection with the second-order response.
st equilibrium expect 




invoking the basic relationship (82) between commutators and anticommutators. Replacing $t$ by $(t-t_1)$ and introducing the transformation $t_1'=t-t'$ gives Eq. (82) in the form


1 . 
OY (tx) SOX) ORDY ; 
th 


5 -f OEE AOON (115) 


If we replace the operator $Q_i$ by the anticommutator $\tfrac12[Q_i,Q_j^{(0)}(\tau)]_+$, this becomes


1 Or 
E ORE, ODO), OOUT) O 
2ih 


= -f dty'T (ih) GCA ® (4), 
~ X[0: (1), GOAI. 


The result (116) can be substituted directly into Eq. (20), along with the definition (114) of $\theta_{kij}(t_1,t_2)$, to yield the first-order driven noise in the form


G[O.(), O;(t+7) 1) 


(116) 


=-5 J diy (ts) J dT (h— h’) 
X41. (4), LOO), GOUJO 
ED f dli F(t) Oir; (tit, i+r— t). (117) 
kw; 


As discussed in the preceding section, the symmetrized equilibrium expectation value

$$\langle\tfrac12[Q_k^{(0)}(t_1'),\tfrac12[Q_i^{(0)}(t),Q_j^{(0)}(t+\tau)]_+]_+\rangle^{(0)}$$

is one macroscopically observable form of the third correlation moment among the spontaneous equilibrium fluctuations of the variables corresponding to the operators $Q_k^{(0)}(t_1')$, $Q_i^{(0)}(t)$, and $Q_j^{(0)}(t+\tau)$.

A thermodynamic relationship between the intensity of the first-order driven noise and the equilibrium fluctuations follows immediately from Eq. (117). Letting $\tau=0$, the second term in Eq. (117) vanishes,


ae ee a 


H0: O) 03 (9) J- @) a a 


oa A 1 a(t!) ° 
Aa p k 29. ~~ e 


the a driven noise 


x 
a 


intensity $\langle\tfrac12[Q_i(t),Q_j(t)]_+\rangle^{(1)}$ in terms of the equilibrium third fluctuation moment

$$\langle\tfrac12[Q_k^{(0)}(t_1'),\tfrac12[Q_i^{(0)}(t),Q_j^{(0)}(t)]_+]_+\rangle^{(0)}.$$


Returning to the more general quantity $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$, for most purposes one is interested in measuring this driven noise only for processes in which the imposed forces are slowly varying over the time interval during which the noise is being measured. We therefore assume that $F_k(t_1)\approx\text{constant}=F_k$ in the interval $t$ to $(t+\tau)$ although, of course, $F_k(t_1)$ is arbitrary for $t_1<t$. Further, we decompose $F_k(t_1)$ into two components according to

$$F_k(t_1)=F_k+\Delta F_k(t_1). \qquad (119)$$


The contribution to $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ arising from the time-dependent force component $\Delta F_k(t_1)$ is


D2 f anart) f dty'T (i— h’) 
FLO. (4), LOO), GOH) 


(120)

since the integral $\int_t^{t+\tau}dt_1\,\Delta F_k(t_1)$ vanishes. It is therefore directly expressible in terms of the equilibrium fluctuations.

The contribution arising from the constant force component $F_k$, on the other hand, is just the first-order term in the perturbation expansion of the driven second moment with respect to a constant applied force. However, the application of a constant force $F_k$ implies simply a change in the corresponding (equilibrium) intensive parameter associated with the system. Hence, this contribution can also be regarded as a macroscopically observable characteristic of the equilibrium system. We denote it by

$$\sum_kF_k\,\frac{\partial}{\partial F_k}\langle\tfrac12[Q_i,Q_j^{(0)}(\tau)]_+\rangle^{(0)}, \qquad (121)$$

where the derivative is evaluated at $F_k=0$.


Inserting the quantities (120) and (121) into Eq. 
(117), we obtain the result 


LO:(4), Olr) 1) 
A a$ f de aP) f An(n) 
(FLO. (H), LOO, GOUJO 


a 
HP BLO 01 (7) 14) . ame 


a 
a 


Equation (122) constitutes a thermodynamic relationship between $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ and the indicated
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macroscopically observable characteristics of the egui- ; (t4+-7))0 
librium fluctuations. i toa, ie), ae 
We further note that (0/0F;)(3[0:,0;(r)],) can a E 7 
be re-expressed in terms of the nonlinear BO of = |-0 f dh AF x(t) (år (h)g: Og; 7) 
the> system by invoking the fluctuation-dissipation H 
theorem, Eqs. (71) and (72). In order to keep the 2 2 1 dV i;(w) 
ws notation simple, we consider explicitly the case of no = uk f dw coswr— Re L (126) 
applied magnetic field, for which (3[0,,0;(r)],) TÊ “o o? k í 
reduces to ; We return finally t i i ior A 
; y to a brief discussion of the rela- KA 
9 GOGOA] tionship of the noise response function 6, (4,l2), 
defined in Eq. (114), to the second-order response. 
2 p” Re Yi;(w) This relationship also stems from the basic relationship 
p= f dw coswrE® (w; 8)-———-. (123) between equilibrium commutators and anticommu- 
To CH tators given in Eq. (82). Replacing the operator Q; in 


A 


Just as $\langle\tfrac12[Q_i,Q_j^{(0)}(\tau)]_+\rangle^{(0)}$ is a function of the applied forces $F_k$, $Y_{ij}(\omega)$ also in general depends upon $F_k$. Consequently, Eq. (122) can be written as


(4(0:(2), O;(t+7) 4) 
f dh'T(h—h') 


X (CALO (1), COO, 03 (+7) TO 
OY ;;(w) 


= -Í dh AF ;,(t;) 
k 


= 


2 f2 1 
° ò -- f dw coswrE® (w; 8)— Re . (124) 
To wo 


oF 


The physical significance of the derivative $\partial Y_{ij}(\omega)/\partial F_k$ can be regarded as arising from the nonlinearity of the system in the following way. Most physical systems are nonlinear. Nevertheless, for sufficiently small deviations from a given "operating point" (corresponding to a constant applied force $F_k$) the linear approximation is adequate. As a consequence of the nonlinearity of the system, however, the admittance matrix must in general be a function of the "operating point." The quantity $\partial Y_{ij}(\omega)/\partial F_k$ specifies the first-order contribution to this dependence on $F_k$.

Equation (124) is an alternate thermodynamic expression for $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ to that given in Eq. (122), expressing the first-order driven noise in terms of the equilibrium third moment $\langle\tfrac12[Q_k^{(0)}(t_1'),\tfrac12[Q_i^{(0)}(t),Q_j^{(0)}(t+\tau)]_+]_+\rangle^{(0)}$ and the second-order response function $\partial Y_{ij}(\omega)/\partial F_k$.

In the classical limit, Eqs. (122) and (124) reduce, 


respectively, to 


(gi q3(¢-+7)) 


=x] -6 f dh AF (h) ilt OHT) 


a 


A 


ð . , 
+F,—(qig;(r)) }, (125) 
OF; 
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Eq. (82) by the commutator — (1/14) LQ; Q: OTt) ]- k 3 
and noting Eqs. (106) and (114) for pix” (t,t2) and i 
Bix (t1,f2), respectively, we find that 


2 ð 

bin st) = f dty'T(t,—ty')——8 ize (th ,t2). (127) 
=> ôtı 

The spectral form of Eq. (127) is found to be 


1 2 o A 
655. (l,l) =— f don f dwĘe”1teivats a 
2 —o 


LT Yan 


ijk P (w1,2) 


L 
XED (w; p) (128) 


w1 


where Li (w1,w2) is the double Fourier transforms of 
bin (tite) defined by 


1 «wo a2 
f don f dace iteiozta 


Qij” (ht) =— 
E —20 ` 


Qa 
X Lij” (1,02). (129) 


Equations (128) and (129) have the following formal implications. Using Eq. (114) for $\theta_{kij}(t_1,t_2)$, Eq. (19) for $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ can be written in the form


GLO, GUI d 
j > diy F .(t1)9 521% (hi — tr, i— k) 


t+r 
+f dh F r(t) ir (hat, t+r—t) |. (130) 


Since Eqs. (127) and (128) express the function $\theta_{kij}(t_1,t_2)$ in terms of the macroscopically observable second-order aftereffect function $\phi_{ijk}^{(2)}(t_1,t_2)$, Eq. (130) permits us to compute $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ from suitable measurements of the second-order response.


11. FLUCTUATION SYMMETRY 


In the preceding two sections we established the basic interrelationship among the second-order response, the first-order second moment, and the equilibrium third moment.

In order better to appreciate this basic interrelationship, it is of interest to examine its consistency from the point of view of symmetry. For this purpose it is convenient to represent the response $\langle Q_i(t)\rangle$ to a set of applied forces $F_j(t)$ by the symbolic expansion


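$$\langle Q_i(t)\rangle=\chi_i^{(0)}+\sum_j\mathfrak{F}_j\chi_{ji}^{(1)}+\sum_{jk}\mathfrak{F}_j\mathfrak{F}_k\chi_{jki}^{(2)}+\cdots. \qquad (131)$$

The response functions $\chi$ are suitable combinations of the aftereffect functions defined previously, while the $\mathfrak{F}_j$ are integral operators, linear functionals of the forces $F_j(t)$, acting on the $\chi$'s. $\chi_i^{(0)}$ denotes the equilibrium expectation value $\langle Q_i\rangle^{(0)}$. Similarly, the driven second moment $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle$ can be represented by the expansion

$$\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle=\zeta_{ij}^{(0)}+\sum_k\mathfrak{F}_k\zeta_{kij}^{(1)}+\sum_{kl}\mathfrak{F}_k\mathfrak{F}_l\zeta_{klij}^{(2)}+\cdots, \qquad (132)$$

where the $\zeta$ are suitable noise response functions, and the $\mathfrak{F}$'s are appropriate integral operators.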

The physical symmetry of many systems is such that reversal of all forces simply reverses all responses, $\langle Q_i(t)\rangle\to-\langle Q_i(t)\rangle$. For such systems only the odd terms in Eq. (131), and only the even terms in Eq. (132), can exist.

The physical symmetry referred to above also has obvious implications for the equilibrium fluctuations. Consider particularly the third equilibrium correlation moment $\langle q_iq_j(t_1)q_k(t_2)\rangle^{(0)}$. This moment is defined in terms of an integral involving the equilibrium joint probability distribution $W_3(q_i;\,q_j,t_1;\,q_k,t_2)$. For many systems the physical symmetry implies that this joint probability distribution is unchanged if each of the $q$'s is replaced by its negative. For such a "fluctuation-symmetric" system, all odd equilibrium correlation moments vanish.
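The vanishing of odd moments under the inversion symmetry can be checked directly on a toy joint distribution. The sketch below is illustrative only: the discrete distribution `ASYMMETRIC`, the helper names, and the restriction to a single set of three values per sample are assumptions made for the example, not quantities from the paper.

```python
# Toy check: a joint distribution invariant under q -> -q has zero third moment.

def third_moment(joint):
    """<q_i q_j q_k> for a discrete joint distribution {(qi, qj, qk): probability}."""
    return sum(p * qi * qj * qk for (qi, qj, qk), p in joint.items())


def symmetrize(joint):
    """Average a distribution with its mirror image under (qi, qj, qk) -> (-qi, -qj, -qk)."""
    sym = {}
    for (qi, qj, qk), p in joint.items():
        for key in ((qi, qj, qk), (-qi, -qj, -qk)):
            sym[key] = sym.get(key, 0.0) + 0.5 * p
    return sym


# Hypothetical asymmetric distribution (probabilities sum to one).
ASYMMETRIC = {
    (1, 1, 1): 0.4,
    (-1, 1, 1): 0.1,
    (1, -1, -1): 0.2,
    (-1, -1, -1): 0.3,
}

if __name__ == "__main__":
    print(third_moment(ASYMMETRIC))              # nonzero for the asymmetric case
    print(third_moment(symmetrize(ASYMMETRIC)))  # zero up to rounding
```

The asymmetric distribution plays the role of a rectifier-like system with a nonvanishing equilibrium third moment, while its symmetrized counterpart corresponds to a fluctuation-symmetric system.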

The relations which were proven among the second-order response, the first-order noise, and the equilibrium third moments indicate that the physical symmetries referred to in the two preceding paragraphs are equivalent, as we might intuitively expect. A system which is fluctuation-symmetric, with no odd equilibrium moments, exhibits no second-order response, and no first-order noise.

A homogeneous system, symmetric under spatial inversion, is fluctuation-symmetric with respect to its transport properties. In such systems, electron, phonon, or other currents can exhibit no first-order noise. With fluctuation-symmetric systems, it is necessary to go to second order to obtain a contribution to the driven noise. Such second-order driven noise can be appreciable and, in fact, is easily observed in semiconductors. By analogy with Eqs. (111), (124), and (127), the second-order noise may be presumed to depend generally in some complicated way upon the third-order response and the equilibrium fourth moment. In Sec. 12, we give a limited discussion of one contribution to second-order noise.

On the other hand, there are many systems which do not obey fluctuation symmetry and which therefore may exhibit first-order driven noise. Rectifiers, for example, because of their pronounced asymmetry with respect to current flow, necessarily possess significant equilibrium third moments.

Finally, it is possible for a system to be fluctuation-symmetric with respect to some of its variables but not with respect to others. Thus, for example, a p-n junction is fluctuation-symmetric with respect to current in the plane of the junction but fluctuation-asymmetric with respect to current perpendicular to this plane. Another example would be a bulk solid, which we have previously mentioned as being fluctuation-symmetric with respect to its transport properties. Such a system would not in general be fluctuation-symmetric with respect to its thermodynamic extensive variables such as energy, volume, or the number of particles in the conduction band.


12. SECOND-ORDER DRIVEN NOISE 


Although we do not undertake a complete discussion 
of the second-order noise in this paper, it is of interest 
to indicate how the general theory would apply in a 
specific physical situation. As an example, we consider 
the steady-state thermal generation-recombination 
noise in semiconductors. 

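We consider explicitly the second-order driven current noise $\langle\tfrac12[J_i(t),J_j(t+\tau)]_+\rangle^{(2)}$. In accordance with the recipe developed in Sec. 3 for computing driven second moments in terms of the equilibrium system, we have that

$$\langle\tfrac12[J_i(t),J_j(t+\tau)]_+\rangle^{(2)}=\langle\tfrac12[J_i^{(0)}(t),J_j^{(2)}(t+\tau)]_+\rangle^{(0)}+\langle\tfrac12[J_i^{(1)}(t),J_j^{(1)}(t+\tau)]_+\rangle^{(0)}+\langle\tfrac12[J_i^{(2)}(t),J_j^{(0)}(t+\tau)]_+\rangle^{(0)}. \qquad (133)$$

Although the first and third terms also contribute to $\langle\tfrac12[J_i(t),J_j(t+\tau)]_+\rangle^{(2)}$, we focus attention on the term $\langle\tfrac12[J_i^{(1)}(t),J_j^{(1)}(t+\tau)]_+\rangle^{(0)}$, which is subject to clear physical interpretation.

The first-order driven current operator $J_i^{(1)}(t)$ is obtained in accordance with the discussion at the end of Sec. 7.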


1 £ A 
TAD (W) = a 2 f URIA OCRA L (134) 


—% 


In the steady state, the electric field $\mathcal{E}_k(t)$ is constant,




and Eq. (134) reduces to 


HOO 
J; D=- Z8 k epli | 
ore ih ka ` h 


HOL 


2 xf TOONANE Eia | (135) 


` > 
where we have let $t_1'=t_1-t$, and extracted the resulting $t$ dependence of the commutator as indicated.

We define the unperturbed Heisenberg operator $\sigma_{ki}^{(0)}(t)$ corresponding to the $ki$th element of the conductivity matrix by rewriting Eq. (135) as

$$J_i^{(1)}(t)=\sum_k\mathcal{E}_k\,\sigma_{ki}^{(0)}(t), \qquad (136)$$

whence
1 HOL 
cui (= —— exp: | 
5 ih h 
0 H®} 
xf dhl ® (H, Ji | SE | (137) 
rp 1 


Using Eq. (136) for $J_i^{(1)}(t)$, the contribution $\langle\tfrac12[J_i^{(1)}(t),J_j^{(1)}(t+\tau)]_+\rangle^{(0)}$ to the second-order steady-state current noise becomes

$$\langle\tfrac12[J_i^{(1)},J_j^{(1)}(\tau)]_+\rangle^{(0)}=\sum_{kl}\mathcal{E}_k\mathcal{E}_l\langle\tfrac12[\sigma_{ki}^{(0)},\sigma_{lj}^{(0)}(\tau)]_+\rangle^{(0)}, \qquad (138)$$

where we have set $t=0$ in virtue of time stationarity. The quantity $\langle\tfrac12[\sigma_{ki}^{(0)},\sigma_{lj}^{(0)}(\tau)]_+\rangle^{(0)}$ can be interpreted as the second correlation moment between the spontaneous equilibrium fluctuations of the conductivity matrix elements $\sigma_{ki}$ and $\sigma_{lj}(\tau)$.

The correlation function $\langle\tfrac12[\sigma,\sigma(t)]_+\rangle^{(0)}$ is easily calculated in the case of a simple semiconductor for which the conductivity is given by $\sigma=ne^2\tau_s/m^*$, where $n$ is the equilibrium carrier concentration, $m^*$ is the effective mass, and $\tau_s$ denotes a simple relaxation time for the scattering mechanisms. Assuming that


D 


n(t) =n! "e (139) 


where $\tau_g$ denotes a relaxation time associated with thermal charge carrier generation and recombination, the second-order term $\langle J^{(1)}J^{(1)}(t)\rangle^{(0)}$ in the steady-state current noise is


(0) § e? 


(JOT (f))\O = Gi 


eiT, 
= \Oe-ttag (140) 
m* 


which is a well-known result.¹⁵


15 See for example W. Shockley, Electrons and Holes in Semiconductors (D. Van Nostrand Company, Inc., Princeton, New Jersey, 1950).

There exist in general other contributions to the second-order steady-state noise, as can be seen from Eq. (133). Here we have sought only to establish the connection of the general formalism with the usual model treatment of semiconductor noise.


13. HIGHER-ORDER STEP-DRIVEN RESPONSE 


Whereas we have previously shown that the second-order response in a general process is characterized by the difference of two equilibrium third moments, the classical result of Eq. (38) suggests the possibility of establishing a more conventional quantum relationship in the case of step-driven processes. Although measurement of two distinct equilibrium third moments is still required to determine the second-order step-driven response, we shall find that for this simple class of processes the relationship is in close formal analogy to the first-order fluctuation-dissipation theorem. Further, the uniquely quantum-mechanical effects are more easily visualized in this case.

It is convenient to consider $\langle Q_i(t)\rangle^{(2)}$ as given in the form of Eq. (33). Rewriting this expression so as to indicate explicitly both contributions from a given pair of forces $F_j$, $F_k$, and assuming that $\langle Q\rangle^{(0)}=0$,


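$$\langle Q_i(t)\rangle^{(2)}=\tfrac12\sum_{jk}F_jF_k\int_0^{\beta}d\lambda_1\int_0^{\lambda_1}d\lambda_2\,\bigl[\langle Q_j^{(0)}(-i\hbar\lambda_1)Q_k^{(0)}(-i\hbar\lambda_2)Q_i^{(0)}(t)\rangle^{(0)}+\langle Q_k^{(0)}(-i\hbar\lambda_1)Q_j^{(0)}(-i\hbar\lambda_2)Q_i^{(0)}(t)\rangle^{(0)}\bigr]. \qquad (141)$$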


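Inserting $\exp[\pm\beta H^{(0)}]$ in front of the operator $Q_i^{(0)}(t)$ and permuting the operators cyclicly, the integrand of the second term in Eq. (141) can be rewritten

$$\langle Q_k^{(0)}(-i\hbar\lambda_1)Q_j^{(0)}(-i\hbar\lambda_2)Q_i^{(0)}(t)\rangle^{(0)}=\langle Q_i^{(0)}(t)Q_k^{(0)}(-i\hbar\lambda_1+i\hbar\beta)Q_j^{(0)}(-i\hbar\lambda_2+i\hbar\beta)\rangle^{(0)}. \qquad (142)$$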


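Thus, inverting the order of integration, and making the successive transformations $\lambda_2'=\beta-\lambda_1$, $\lambda_1'=\beta-\lambda_2$,

$$\int_0^{\beta}d\lambda_1\int_0^{\lambda_1}d\lambda_2\,\langle Q_k^{(0)}(-i\hbar\lambda_1)Q_j^{(0)}(-i\hbar\lambda_2)Q_i^{(0)}(t)\rangle^{(0)}=\int_0^{\beta}d\lambda_1\int_0^{\lambda_1}d\lambda_2\,\langle Q_i^{(0)}(t)Q_k^{(0)}(+i\hbar\lambda_2)Q_j^{(0)}(+i\hbar\lambda_1)\rangle^{(0)}. \qquad (143)$$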


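Inserting Eq. (143) into Eq. (141), we thereby obtain $\langle Q_i(t)\rangle^{(2)}$ in the form

$$\langle Q_i(t)\rangle^{(2)}=\tfrac12\sum_{jk}F_jF_k\int_0^{\beta}d\lambda_1\int_0^{\lambda_1}d\lambda_2\,\bigl[\langle Q_j^{(0)}(-i\hbar\lambda_1)Q_k^{(0)}(-i\hbar\lambda_2)Q_i^{(0)}(t)\rangle^{(0)}+\langle Q_i^{(0)}(t)Q_k^{(0)}(+i\hbar\lambda_2)Q_j^{(0)}(+i\hbar\lambda_1)\rangle^{(0)}\bigr]. \qquad (144)$$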
It is further convenient to define the second-order step-response function $\Phi_{jki}^{(2)}(t)$ by rewriting Eq. (144) as

$$\langle Q_i(t)\rangle^{(2)}=\tfrac12\sum_{jk}F_jF_k\,\Phi_{jki}^{(2)}(t), \qquad (145)$$

whence

$$\Phi_{jki}^{(2)}(t)=\int_0^{\beta}d\lambda_1\int_0^{\lambda_1}d\lambda_2\,\bigl[\langle Q_j^{(0)}(-i\hbar\lambda_1)Q_k^{(0)}(-i\hbar\lambda_2)Q_i^{(0)}(t)\rangle^{(0)}+(\text{c.c.})\bigr]. \qquad (146)$$

The notation (c.c.) is used to indicate the complex conjugate of the first term. $\Phi_{jki}^{(2)}(t)$ represents the second-order response to unit step-function forces $F_j$ and $F_k$.

We now proceed to analyze $\Phi_{jki}^{(2)}(t)$ by decomposing the equilibrium expectation values appearing in Eq. (146) into appropriate summations over matrix elements in the unperturbed energy representation. This technique is similar to that employed in Sec. 5 in connection with $\phi_{ij}^{(1)}(t)$, and yields


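$$\Phi_{jki}^{(2)}(t)=\int_0^{\beta}d\lambda_1\int_0^{\lambda_1}d\lambda_2\sum_{lmn}\rho(E_l)\langle E_l|Q_j|E_m\rangle\langle E_m|Q_k|E_n\rangle\langle E_n|Q_i|E_l\rangle\,e^{\lambda_1(E_l-E_m)}e^{\lambda_2(E_m-E_n)}\exp\!\left[\frac{i(E_n-E_l)t}{\hbar}\right]+(\text{c.c.}), \qquad (147)$$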


where $\langle E_l|Q_j|E_m\rangle$ is the matrix element of $Q_j$ between the eigenstates of $H^{(0)}$ having the energy eigenvalues $E_l$ and $E_m$, and $\rho(E_l)=e^{-\beta E_l}/\sum_le^{-\beta E_l}$.

Performing the indicated temperature integration, and replacing the triple summation by a triple integral over energy eigenvalues, this becomes


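$$\Phi_{jki}^{(2)}(t)=\int dE_l\int dE_m\int dE_n\,\rho(E_l)\,n(E_l)\,n(E_m)\,n(E_n)\left[\frac{1}{(E_l-E_m)(E_l-E_n)}+\frac{e^{\beta(E_l-E_n)}}{(E_m-E_n)(E_l-E_n)}-\frac{e^{\beta(E_l-E_m)}}{(E_m-E_n)(E_l-E_m)}\right]\langle E_l|Q_j|E_m\rangle\langle E_m|Q_k|E_n\rangle\langle E_n|Q_i|E_l\rangle\exp\!\left[\frac{i(E_n-E_l)t}{\hbar}\right]+(\text{c.c.}), \qquad (148)$$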


where $n(E_l)$ is the energy density-of-states function. Introducing the transformations

$$E_l=E,\qquad E_m=E+\hbar\omega_1,\qquad E_n=E+\hbar\omega_2, \qquad (149)$$

we finally obtain $\Phi_{jki}^{(2)}(t)$ in the form

$$\Phi_{jki}^{(2)}(t)=\frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega_1\int_{-\infty}^{\infty}d\omega_2\,e^{i\omega_2t}\,N_{jki}^{(2)}(\omega_1,\omega_2), \qquad (150)$$

where
N ji (ww?) 
| 1  exp(—/Bw) ee) 
= FP 
W1W2 @1(w1— wo) @2(wW2—w1) 
1 = exp(+/8w) 
Xenilorir)+| Te 
(OSOD wiılwı— w2) 
exp(+hBw)] 
lente —we) (151) 
wo(w2—w) 


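and

$$g_{jki}(\omega_1,\omega_2)=\int dE\,\rho(E)\,n(E)\,n(E+\hbar\omega_1)\,n(E+\hbar\omega_2)\,\langle E|Q_j|E+\hbar\omega_1\rangle\langle E+\hbar\omega_1|Q_k|E+\hbar\omega_2\rangle\langle E+\hbar\omega_2|Q_i|E\rangle. \qquad (152)$$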


In order to relate the second-order step-driven response to the equilibrium fluctuations, we undertake a similar spectral analysis of the equilibrium correlation moment of the variables corresponding to the operators $Q_j$, $Q_k^{(0)}(t_1)$, and $Q_i^{(0)}(t_1+t_2)$. However, as discussed at the end of Sec. 9, there exist several equally valid, operationally distinct quantum-mechanical expressions for a given equilibrium third moment. Thus, the quantity $\hat\Psi_{jki}^{(0)}$ defined in Eq. (112) corresponds to one particular set of experimental conditions, while the fully symmetrized equilibrium form $\Psi_{jki}^{(0)}$ defined in Eq. (113) is appropriate to a different experimental arrangement. We find, in fact, that measurement of both $\Psi_{jki}^{(0)}(t_1,t_2)$ and $\hat\Psi_{jki}^{(0)}(t_1,t_2)$ is required for a complete experimental determination of $\Phi_{jki}^{(2)}(t)$, except in the classical limit.
Equation (113) for $\Psi_{jki}^{(0)}(t_1,t_2)$ is conveniently rewritten

$$\Psi_{jki}^{(0)}(t_1,t_2)=\tfrac16\langle Q_jQ_k^{(0)}(t_1)Q_i^{(0)}(t_1+t_2)+Q_j^{(0)}(-i\hbar\beta)Q_k^{(0)}(t_1)Q_i^{(0)}(t_1+t_2)+Q_j^{(0)}(-i\hbar\beta)Q_k^{(0)}(t_1-i\hbar\beta)Q_i^{(0)}(t_1+t_2)\rangle^{(0)}+(\text{c.c.}), \qquad (153)$$

where we have inserted $\exp[\pm\beta H^{(0)}]$ at suitable positions in the second and third terms and permuted the operators cyclicly.

As in the case of $\Phi_{jki}^{(2)}(t)$, the quantity $\Psi_{jki}^{(0)}(t_1,t_2)$ can be decomposed into a triple integral over matrix elements in the unperturbed energy representation. Introducing the transformations (149) and letting $\omega_1\to-\omega_1$, $\omega_2\to-\omega_2$ in the (c.c.) term, we obtain the expression

$$\Psi_{jki}^{(0)}(t_1,t_2)=\frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega_1\int_{-\infty}^{\infty}d\omega_2\,e^{i\omega_1t_1}e^{i\omega_2t_2}\,G_{jki}^{(0)}(\omega_1,\omega_2), \qquad (154)$$

where
» > 
Gjiri ® (w1,02) 
h? 


TA [1+ exp(— Abw) +exp(— hbw:)] 


gins (ww) -+[1-+exp (+h Bo) 


+exp(+hBw2) Jg” (~w —w2)}, 


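(155)

$g_{jki}(\omega_1,\omega_2)$ having been given previously in Eq. (152). $G_{jki}^{(0)}(\omega_1,\omega_2)$ is the double Fourier transform of the equilibrium third moment $\Psi_{jki}^{(0)}(t_1,t_2)$.

Similar analysis of the equilibrium third moment $\hat\Psi_{jki}^{(0)}(t_1,t_2)$ of Eq. (112) yields the result

$$\hat\Psi_{jki}^{(0)}(t_1,t_2)=\frac{1}{2\pi}\int_{-\infty}^{\infty}d\omega_1\int_{-\infty}^{\infty}d\omega_2\,e^{i\omega_1t_1}e^{i\omega_2t_2}\,\hat G_{jki}^{(0)}(\omega_1,\omega_2), \qquad (156)$$

where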
x Gj (w1,W2) 
h 
=7 (Fep (neonli 
+[1+exp(+ Bor) gi" (wn —w2)}. (157) 


We now discuss the relation of the second-order step-driven response to the equilibrium fluctuations using the spectral quantities $N_{jki}^{(2)}(\omega_1,\omega_2)$, $G_{jki}^{(0)}(\omega_1,\omega_2)$, and $\hat G_{jki}^{(0)}(\omega_1,\omega_2)$ obtained above. In order to indicate the formal analogy of this relationship to the first-order fluctuation-dissipation theorem, it is convenient to define the functions $r_{jki}^{(2)}(\omega_1,\omega_2)$ and $f_{jki}^{(0)}(\omega_1,\omega_2)$.


1  exp(—/Bw:) exp(— =] 


Tii (Aw) = 
A PER ww: wlw w) w2(w2—«) 


a 


X giri(wiw2), (158) 


fii (1,02) = Trtep(- hbwi)+exp(— Abw) | 

Xgiri(wr,o2), (159) 
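in terms of which Eqs. (151) and (155) can be rewritten

$$N_{jki}^{(2)}(\omega_1,\omega_2)=r_{jki}^{(2)}(\omega_1,\omega_2)+r_{jki}^{(2)*}(-\omega_1,-\omega_2), \qquad (160)$$

$$G_{jki}^{(0)}(\omega_1,\omega_2)=f_{jki}^{(0)}(\omega_1,\omega_2)+f_{jki}^{(0)*}(-\omega_1,-\omega_2). \qquad (161)$$

Comparison of Eqs. (158) and (159) shows that $r_{jki}^{(2)}(\omega_1,\omega_2)$ and $f_{jki}^{(0)}(\omega_1,\omega_2)$, which characterize the second-order step-driven response and the equilibrium fluctuations, respectively, in the sense of Eqs. (160) and (161), are related according to

$$f_{jki}^{(0)}(\omega_1,\omega_2)=E^{(2)}(\omega_1,\omega_2;\beta)\,r_{jki}^{(2)}(\omega_1,\omega_2), \qquad (162)$$

where

$$E^{(2)}(\omega_1,\omega_2;\beta)=\frac{\hbar^2}{6}\;\frac{1+\exp(-\hbar\beta\omega_1)+\exp(-\hbar\beta\omega_2)}{\dfrac{1}{\omega_1\omega_2}+\dfrac{\exp(-\hbar\beta\omega_1)}{\omega_1(\omega_1-\omega_2)}+\dfrac{\exp(-\hbar\beta\omega_2)}{\omega_2(\omega_2-\omega_1)}}\;\xrightarrow[\hbar\to0]{}\;\frac{1}{\beta^2}. \qquad (163)$$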

Equation (162) strongly suggests itself as a direct extension to second-order processes of the fluctuation-dissipation theorem of Eq. (58), which relates the Fourier transform $L_{ij}^{(1)}(\omega)$ of the first-order aftereffect function $\phi_{ij}^{(1)}(t)$ to the Fourier transform $G_{ij}^{(0)}(\omega)$ of the equilibrium second fluctuation moment. The universal function $E^{(2)}(\omega_1,\omega_2;\beta)$ is the second-order analog of $E^{(1)}(\omega;\beta)$. It is uniquely quantum-mechanical in origin, corresponding to a slight smearing out of the microscopic contributions to the macroscopic response at extremely high frequencies. In the classical limit, $E^{(2)}(\omega_1,\omega_2;\beta)\to(1/\beta^2)$ as indicated.
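The classical limit quoted above can be checked numerically. The sketch below assumes that $E^{(2)}$ has the form $(\hbar^2/6)$ times the ratio of $1+e^{-\hbar\beta\omega_1}+e^{-\hbar\beta\omega_2}$ to the bracket $1/\omega_1\omega_2+e^{-\hbar\beta\omega_1}/\omega_1(\omega_1-\omega_2)+e^{-\hbar\beta\omega_2}/\omega_2(\omega_2-\omega_1)$, which is a reading of Eq. (163) from the garbled scan rather than a verified transcription. The function name and the sample parameter values are hypothetical.

```python
import math


def e2(omega1, omega2, beta, hbar):
    """Assumed form of the second-order universal function E^(2)(w1, w2; beta).

    Requires omega1 != omega2 and both nonzero, since the bracket in the
    denominator is written with explicit 1/(w1 - w2) factors.
    """
    x1 = hbar * beta * omega1
    x2 = hbar * beta * omega2
    numerator = (hbar ** 2 / 6.0) * (1.0 + math.exp(-x1) + math.exp(-x2))
    denominator = (
        1.0 / (omega1 * omega2)
        + math.exp(-x1) / (omega1 * (omega1 - omega2))
        + math.exp(-x2) / (omega2 * (omega2 - omega1))
    )
    return numerator / denominator


if __name__ == "__main__":
    beta = 1.3
    # As hbar shrinks, beta**2 * E2 should approach 1 if the assumed form is right.
    for hbar in (1.0, 0.1, 0.01, 0.001):
        print(hbar, beta ** 2 * e2(1.0, 2.5, beta, hbar))
```

With the assumed form, the product $\beta^2E^{(2)}$ tends to unity as $\hbar\to0$, consistent with the limit $1/\beta^2$ stated in the text; at finite $\hbar$ the deviation from unity is the quantum correction.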

Whereas Eq. (58) constitutes a true thermodynamic relationship, however, Eq. (162) does not. That is, the function $f_{jki}^{(0)}(\omega_1,\omega_2)$ is not macroscopically observable, although its sum with $f_{jki}^{(0)*}(-\omega_1,-\omega_2)$ is, as indicated by Eq. (161).

The specific way in which quantum-mechanical effects enter the picture is evident by first considering what happens in the classical limit. For this case, Eqs. (151), (155), and (157) reduce to


202 


B 
N jx: (ww) =, Boeloe) Hena an —we) | 


=2E (ww) =G iri ® (1,2). (164) 


The functions $G_{jki}^{(0)}(\omega_1,\omega_2)$ and $\hat G_{jki}^{(0)}(\omega_1,\omega_2)$ have become equivalent, and measurement of either completely determines $N_{jki}^{(2)}(\omega_1,\omega_2)$. The temporal form of Eq. (164) is obtained immediately from Eqs. (150), (154), and (156).

$$\Phi_{jki}^{(2)}(t)=\beta^2\,\Psi_{jki}^{(0)}(0,t). \qquad (165)$$

Substitution of this result into Eq. (145) for $\langle Q_i(t)\rangle^{(2)}$ yields the classical result given previously in Eq. (38).

We decompose $g_{jki}(\omega_1,\omega_2)$ into its real and imaginary, even and odd parts with respect to simultaneous reversal of $\omega_1$ and $\omega_2$.


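$$g_{jki}(\omega_1,\omega_2)=\mathrm{Re}\,g_{jki}^{e}(\omega_1,\omega_2)+i\,\mathrm{Im}\,g_{jki}^{e}(\omega_1,\omega_2)+\mathrm{Re}\,g_{jki}^{o}(\omega_1,\omega_2)+i\,\mathrm{Im}\,g_{jki}^{o}(\omega_1,\omega_2). \qquad (166)$$

The superscripts $e$ and $o$ denote the even and odd part, respectively, under the transformation $\omega_1\to-\omega_1$, $\omega_2\to-\omega_2$. Using this decomposition, the classical relationship (164) becomes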


K 


4282 
N 5xi (w1,w2) Base Dg irilw w2) 


t 


+2 Img jx:(w1,2) | =3°E 5 (ww) 5 (167) 


Thus, in the classical limit we are concerned only with $\mathrm{Re}\,g_{jki}^{e}(\omega_1,\omega_2)$ and $\mathrm{Im}\,g_{jki}^{o}(\omega_1,\omega_2)$, both of which are determined by experimental knowledge of the complex quantity $G_{jki}^{(0)}(\omega_1,\omega_2)=\hat G_{jki}^{(0)}(\omega_1,\omega_2)$.

In the general quantum case, however, $N_{jki}^{(2)}(\omega_1,\omega_2)$ depends upon all four components of $g_{jki}(\omega_1,\omega_2)$ because of the uniquely quantum-mechanical spreading introduced by the quantities


1 pace st fe) — 


w2(w2—«) 

appearing in Eq. (151). For this case $G_{jki}^{(0)}(\omega_1,\omega_2)$ and $\hat G_{jki}^{(0)}(\omega_1,\omega_2)$ are no longer equivalent, and Eqs. (155) and (157) constitute two independent complex expressions which can be solved for the four components of $g_{jki}(\omega_1,\omega_2)$. Because of the quantum-mechanical interference among the components of $g_{jki}(\omega_1,\omega_2)$, measurement of both equilibrium third moments $\Psi_{jki}^{(0)}(t_1,t_2)$ and $\hat\Psi_{jki}^{(0)}(t_1,t_2)$ is necessary to determine completely the function $N_{jki}^{(2)}(\omega_1,\omega_2)$ and, consequently, by Eq. (150), the second-order step-driven response $\Phi_{jki}^{(2)}(t)$.

It may be presumed that an analogous quantum-mechanical analysis of the third- and higher-order terms in the step-driven response can be made, although we do not attempt to carry out this laborious program here. Instead we simply refer to the classical relation between the step-driven response and the equilibrium fluctuations, given previously in Eqs. (36) through (39).


14. STEP-DRIVEN NOISE 


@1(@1—w»2) 


For step-driven processes, the perturbation expansion of the driven noise $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle$ simplifies considerably, exhibiting a strong formal similarity to the step-driven response $\langle Q_i(t)\rangle$. We can, therefore, use the techniques employed previously for analyzing the step-driven response to discuss the relationship between the step-driven noise and the equilibrium fluctuations.

Consider Eq. (117) for the first-order driven noise $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$. For step-function forces, the unsymmetrized term vanishes, and Eq. (117) reduces, upon integration, to


GEO, DW je 
tes r D Fy f dir (t) GTO (4), 
OL Aled 
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(0), (168) which is equivalent to Eq. 
Equation (168) can also be obtained from Eq. (88) for the first-order step-driven response $\langle Q_i(t)\rangle^{(1)}$ by replacing the operator $Q_i$ by the anticommutator $\tfrac12[Q_i,Q_j^{(0)}(\tau)]_+$. Equation (168) relates the first-order step-driven noise $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(1)}$ to the third equilibrium correlation moment

$$\langle\tfrac12[Q_k^{(0)}(t'),\tfrac12[Q_i^{(0)}(t),Q_j^{(0)}(t+\tau)]_+]_+\rangle^{(0)}.$$

In the classical limit it reduces to

$$\langle q_i(t)q_j(t+\tau)\rangle^{(1)}=-\beta\sum_kF_k\langle q_kq_i(t)q_j(t+\tau)\rangle^{(0)}. \qquad (169)$$


The formal similarity between Eqs. (88) and (168) also carries over into the higher-order terms. Thus, in the case of step-function forces, Eq. (21) for $\langle\tfrac12[Q_i(t),Q_j(t+\tau)]_+\rangle^{(2)}$ reduces to


(2(0.(0), O;(¢+7) 4) 


= (-=) z Piri f an f asco ces, 


th] k 
XLO. (A), 2005), Q5 +7) 4 LI) 


(170)

which also follows from Eq. (24) for the second-order step-driven response $\langle Q_i(t)\rangle^{(2)}$, if we replace $Q_i$ by $\tfrac12[Q_i,Q_j^{(0)}(\tau)]_+$. Consequently, the analysis of the previous section for the second-order step-driven response can be applied directly to the second-order step-driven noise. Similarly, it may be assumed that the third- and higher-order terms can also be treated on an equivalent basis.

The consequences of the above formal resemblance can be most simply demonstrated in the classical limit. Consider Eq. (36) for the full step-driven response $\langle q_i(t)\rangle$. Replacing the operator $Q_i$ by the anticommutator $\tfrac12[Q_i,Q_j^{(0)}(\tau)]_+$, which corresponds to replacing the classical variable $q_i$ by the product $q_iq_j(\tau)$, Eq. (36) becomes

$$\langle q_i(t)q_j(t+\tau)\rangle=\frac{\langle\exp(-\beta\sum_kF_kq_k)\,q_i(t)q_j(t+\tau)\rangle^{(0)}}{\langle\exp(-\beta\sum_kF_kq_k)\rangle^{(0)}}. \qquad (171)$$


Equation (171) expresses the step-driven noise $\langle q_i(t)q_j(t+\tau)\rangle$ wholly in terms of the third correlation moment $\langle\exp(-\beta\sum_kF_kq_k)\,q_i(t)q_j(t+\tau)\rangle^{(0)}$ of the spontaneous equilibrium fluctuations.

It is possible to summarize the results of this section and the preceding one in the following intuitively appealing way. We first differentiate Eq. (36) for $\langle q_i(t)\rangle$ with respect to $F_j$, evaluating the result at $F_1, F_2, \cdots=0$.

$$\frac{\partial}{\partial F_j}\langle q_i(t)\rangle=-\beta\langle q_jq_i(t)\rangle^{(0)}, \qquad (172)$$

which is equivalent to Eq. (37).
Taking the second derivative of $\langle q_i(t)\rangle$ with respect to $F_j$ and $F_k$, we obtain

$$\frac{\partial^2}{\partial F_j\,\partial F_k}\langle q_i(t)\rangle=\beta^2\langle q_jq_kq_i(t)\rangle^{(0)}, \qquad (173)$$

which is equivalent to Eq. (38).

On the other hand, differentiating Eq. (171) for $\langle q_j(t)q_i(t+\tau)\rangle$ with respect to $F_k$, we obtain

$$\frac{\partial}{\partial F_k}\langle q_j(t)q_i(t+\tau)\rangle=-\beta\langle q_kq_j(t)q_i(t+\tau)\rangle^{(0)}, \qquad (174)$$

which is equivalent to Eq. (169). Letting $t=0$ and replacing $\tau$ by $t$, this becomes

$$\frac{\partial}{\partial F_k}\langle q_jq_i(t)\rangle=-\beta\langle q_kq_jq_i(t)\rangle^{(0)}. \qquad (175)$$

Equations (173) and (175) can be combined in the form

$$\frac{\partial^2}{\partial F_j\,\partial F_k}\langle q_i(t)\rangle=-\beta\frac{\partial}{\partial F_k}\langle q_jq_i(t)\rangle=\beta^2\langle q_jq_kq_i(t)\rangle^{(0)}, \qquad (176)$$

which constitutes a triple relationship among the second-order response, the first-order noise, and the equilibrium third moment for a step-driven process. It is apparent that further differentiation of Eqs. (36) and (171) would yield a whole hierarchy of analogous higher-order thermodynamic relationships.

Although they apply literally only to step-driven 
process, Eqs. (172) and (176) exhibit most of the essen- 
tial elements of the more general theory. Therefore, the 
results presented above characterize the general struc- 
ture of irreversible thermodynamics. 

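Finally, we compare our results for a step-driven process with the results of time-independent equilibrium fluctuation theory. Letting $t=0$, Eqs. (172) and (176) reduce to

$$\frac{\partial}{\partial F_j}\langle q_i\rangle = -\beta\,\langle q_i q_j\rangle^{(0)}, \tag{177}$$

$$\frac{\partial^2}{\partial F_j\,\partial F_k}\langle q_i\rangle = -\beta\,\frac{\partial}{\partial F_k}\langle q_i q_j\rangle = \beta^2\,\langle q_i q_j q_k\rangle^{(0)}. \tag{178}$$

Equations (177) and (178) are precisely those which can be derived for a generalized canonical ensemble using standard equilibrium fluctuation theory.¹⁶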
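The equilibrium relations (177) and (178) can be checked numerically for a single classical variable. The following is only a sketch under stated assumptions: the anharmonic energy function, the value of $\beta$, and the grid are hypothetical choices made for illustration, and $q$ is measured from its unperturbed equilibrium mean, so that the moments on the right-hand sides are central moments.

```python
import numpy as np

BETA = 1.0                                # inverse temperature (assumed units)
Q_GRID = np.linspace(-6.0, 6.0, 20001)    # uniform grid for the variable q

def weights(force):
    """Normalized canonical weights exp[-beta (H0 + F q)] on the grid."""
    # Hypothetical unperturbed energy: an asymmetric anharmonic well.
    h0 = 0.5 * Q_GRID**2 + 0.1 * Q_GRID**3 + 0.05 * Q_GRID**4
    w = np.exp(-BETA * (h0 + force * Q_GRID))
    return w / w.sum()                    # uniform grid, so the spacing cancels

def mean_q(force):
    """Canonical average of q in the presence of a constant force F."""
    return float((weights(force) * Q_GRID).sum())

def central_moments():
    """Second and third central moments in the unperturbed (F = 0) ensemble."""
    w = weights(0.0)
    dq = Q_GRID - (w * Q_GRID).sum()
    return float((w * dq**2).sum()), float((w * dq**3).sum())

# Central finite differences for the first and second derivatives at F = 0.
h = 1e-3
first = (mean_q(h) - mean_q(-h)) / (2 * h)
second = (mean_q(h) - 2 * mean_q(0.0) + mean_q(-h)) / h**2
m2, m3 = central_moments()

print(first, -BETA * m2)       # Eq. (177): the two numbers should agree
print(second, BETA**2 * m3)    # Eq. (178): likewise
```

The agreement is limited only by the finite-difference step, since the identities hold exactly for any normalized Boltzmann weight.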


15. THE PATH DISTRIBUTION FUNCTION 


Recently one of us has discussed the first-order term in the nonequilibrium path distribution function $W_1(q,t)$.⁸ This function specifies the probability that the macroscopic variable corresponding to the operator $Q$ has the value $q$ in the driven ensemble at the time $t$. In this section we review the theory of the path distribution function and its application to the first-order problem discussed in Secs. 5 through 7. We also discuss the path distribution function for a step-driven process.

16 See for example R. F. Greene and H. B. Callen, Phys. Rev. 83, 1231 (1951).

In order to keep the notation simple, we restrict ourselves initially to a one-dimensional process. The distribution function $W_1(q,t)$ can be expressed in terms of its characteristic function $K_1(\nu,t)$,

$$W_1(q,t) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} d\nu\, e^{i\nu q}\,K_1(\nu,t), \tag{179}$$

whence $K_1(\nu,t)$ is given by

$$K_1(\nu,t) = \frac{1}{(2\pi)^{1/2}}\int_{-\infty}^{\infty} dq\, e^{-i\nu q}\,W_1(q,t) = \frac{1}{(2\pi)^{1/2}}\,\mathrm{Tr}\,\rho^{(0)} e^{-i\nu Q(t)}, \tag{180}$$

where $e^{-i\nu Q(t)} = U^{\dagger}(t)\,e^{-i\nu Q}\,U(t)$. Then

$$W_1(q,t) = \mathrm{Tr}\,\rho^{(0)}\,\delta(q-Q(t)) = \langle\delta(q-Q(t))\rangle, \tag{181}$$

where $\delta(q-Q(t))$ is the driven $\delta$-function operator $U^{\dagger}(t)\,\delta(q-Q)\,U(t)$, which selects from $\rho^{(0)}$ the appropriate contributions to $W_1(q,t)$.

Since $W_1(q,t)$ is just the expectation value of $\delta(q-Q)$ in the driven system at time $t$, the theory developed previously for the driven response $\langle Q(t)\rangle$ can be applied directly to the path distribution function. Thus, for example, the first-order term $W_1^{(1)}(q,t)$ is obtained in symmetrized form from Eq. (88) for $\langle Q_i(t)\rangle^{(1)}$ by simply replacing $Q_i$ by $\delta(q-Q)$.
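$$W_1^{(1)}(q,t) = -\int_{-\infty}^{t} dt_1\,F(t_1)\int_{-\infty}^{\infty} dt_1'\,\Gamma(t_1-t_1')\,\langle\tfrac{1}{2}[\dot Q(t_1'-t),\,\delta(q-Q)]_+\rangle^{(0)}; \tag{182}$$

the $t$ dependence in the integrand has been transferred to $Q$ by performing a unitary transformation with $\exp[+iH^{(0)}t/\hbar]$.

The quantity $\langle\tfrac{1}{2}[\dot Q(t_1'-t),\,\delta(q-Q)]_+\rangle^{(0)}$ is interpreted in the following way. Since the classical analog of the operator $\delta(q-Q)$ is simply the $\delta$ function $\delta(q-q')$, we can calculate the equilibrium correlation moment of the variables corresponding to the operators $\dot Q(t_1'-t)$ and $\delta(q-Q)$ according to

$$\langle\tfrac{1}{2}[\dot Q(t_1'-t),\,\delta(q-Q)]_+\rangle^{(0)} = \int_{-\infty}^{\infty} dq'\,\delta(q-q')\,W_1^{(0)}(q')\,\langle\dot Q(t_1'-t)\rangle_{q'}^{(0)}, \tag{183}$$

where $W_1^{(0)}(q)$ is the equilibrium probability distribution, and $\langle\dot Q(t_1'-t)\rangle_{q}^{(0)}$ denotes the equilibrium expectation value of the variable corresponding to the operator $\dot Q$ at time $(t_1'-t)$ conditional on the variable corresponding to $Q$ having the value $q$ at time zero. The $q'$ integration can be performed immediately to yield

$$\langle\tfrac{1}{2}[\dot Q(t_1'-t),\,\delta(q-Q)]_+\rangle^{(0)} = W_1^{(0)}(q)\,\langle\dot Q(t_1'-t)\rangle_{q}^{(0)}. \tag{184}$$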

Substituting the result (184) back into Eq. (182) we obtain

$$W_1^{(1)}(q,t) = -W_1^{(0)}(q)\int_{-\infty}^{t} dt_1\,F(t_1)\int_{-\infty}^{\infty} dt_1'\,\Gamma(t_1-t_1')\,\langle\dot Q(t_1'-t)\rangle_{q}^{(0)}. \tag{185}$$


The corresponding first-order term $W_1^{(1)}(q_1,\ldots,q_n;t)$ in the path distribution function for an $n$-dimensional process can be developed in a completely analogous way. In place of the single operator $\delta(q-Q)$, we introduce a symmetrized form of the product $\delta(q_1-Q_1)\cdots\delta(q_n-Q_n)$ of $\delta$-function operators. The result is

$$W_1^{(1)}(q_1,\ldots,q_n;t) = -W_1^{(0)}(q_1,\ldots,q_n)\sum_{j=1}^{n}\int_{-\infty}^{t} dt_1\,F_j(t_1)\int_{-\infty}^{\infty} dt_1'\,\Gamma(t_1-t_1')\,\langle\dot Q_j(t_1'-t)\rangle_{q_1\cdots q_n}^{(0)}. \tag{186}$$

$W_1^{(0)}(q_1,\ldots,q_n)$ is the simultaneous equilibrium probability distribution for the variables $q_1,\ldots,q_n$, while $\langle\dot Q_j(t_1'-t)\rangle_{q_1\cdots q_n}^{(0)}$ denotes the equilibrium expectation value of the variable corresponding to $\dot Q_j$ at time $(t_1'-t)$ conditional on the variables corresponding to $Q_1,\ldots,Q_n$ having the values $q_1,\ldots,q_n$ at time zero. In the classical limit, Eq. (186) reduces to


$$W_1^{(1)}(q_1,\ldots,q_n;t) = -\beta\,W_1^{(0)}(q_1,\ldots,q_n)\sum_{j=1}^{n}\int_{-\infty}^{t} dt_1\,F_j(t_1)\,\langle\dot Q_j(t_1-t)\rangle_{q_1\cdots q_n}^{(0)}. \tag{187}$$


Equation (186) constitutes a generalized statement of the fluctuation-dissipation theorem, expressing $W_1^{(1)}(q_1,\ldots,q_n;t)$ in terms of the equilibrium probability distribution $W_1^{(0)}(q_1,\ldots,q_n)$ and the equilibrium conditional expectation value $\langle\dot Q_j(t_1'-t)\rangle_{q_1\cdots q_n}^{(0)}$. Since all of the previous theorems regarding the first-order problem can be derived from Eq. (186), this form may be considered the fundamental relationship for the theory of irreversibility.

We consider next the full path distribution function for a step-driven process. The step-driven response $\langle q_i(t)\rangle$ is written in terms of $W_1(q_1,\ldots,q_n;t)$ as

$$\langle q_i(t)\rangle = \int_{-\infty}^{\infty} dq_1\cdots\int_{-\infty}^{\infty} dq_n\,q_i\,W_1(q_1,\ldots,q_n;t). \tag{188}$$

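According to Eq. (36), however, $\langle q_i(t)\rangle$ can also be written

$$\langle q_i(t)\rangle = \int_{-\infty}^{\infty} dq_1\cdots\int_{-\infty}^{\infty} dq_n\,q_i\,W_1^{(0)}(q_1,\ldots,q_n)\,\frac{\langle\exp[-\beta\sum_j F_j q_j(-t)]\rangle_{q_1\cdots q_n}^{(0)}}{\langle\exp(-\beta\sum_j F_j q_j)\rangle^{(0)}}. \tag{189}$$

The $t$ dependence of the time-stationary quantity $\langle\exp(-\beta\sum_j F_j q_j)\,q_i(t)\rangle^{(0)}$ has been translated into the exponential $\exp(-\beta\sum_j F_j q_j)$. The quantity $\langle\exp[-\beta\sum_j F_j q_j(-t)]\rangle_{q_1\cdots q_n}^{(0)}$ denotes the equilibrium expectation value of $\exp(-\beta\sum_j F_j q_j)$ at time $-t$ conditional on all variables having the values $q_1,\ldots,q_n$ at time zero.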

Since Eqs. (188) and (189) must be identical, it follows that the step-driven path distribution function $W_1(q_1,\ldots,q_n;t)$ is given, in the classical limit, by

$$W_1(q_1,\ldots,q_n;t) = W_1^{(0)}(q_1,\ldots,q_n)\,\frac{\langle\exp[-\beta\sum_j F_j q_j(-t)]\rangle_{q_1\cdots q_n}^{(0)}}{\langle\exp(-\beta\sum_j F_j q_j)\rangle^{(0)}}. \tag{190}$$

Equation (190) expresses the path distribution function, characterizing the time evolution of a step-driven ensemble, in terms of the equilibrium probability distribution $W_1^{(0)}(q_1,\ldots,q_n)$, together with the conditional expectation value $\langle\exp[-\beta\sum_j F_j q_j(-t)]\rangle_{q_1\cdots q_n}^{(0)}$.

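The significance of this result for the path distribution function can be made more apparent by rewriting Eq. (36) for $\langle q_i(t)\rangle$ in the form

$$\langle q_i(t)\rangle = \int_{-\infty}^{\infty} dq_1\cdots\int_{-\infty}^{\infty} dq_n\,W_1^{(0)}(q_1,\ldots,q_n)\,\frac{\exp(-\beta\sum_j F_j q_j)\,\langle q_i(t)\rangle_{q_1\cdots q_n}^{(0)}}{\langle\exp(-\beta\sum_j F_j q_j)\rangle^{(0)}}. \tag{191}$$
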
However, letting $t=0$ in Eq. (189), we find that the initial perturbed equilibrium probability distribution $W_1(q_1,\ldots,q_n;0)$ is given by

$$W_1(q_1,\ldots,q_n;0) = W_1^{(0)}(q_1,\ldots,q_n)\,\frac{\exp(-\beta\sum_j F_j q_j)}{\langle\exp(-\beta\sum_j F_j q_j)\rangle^{(0)}}. \tag{192}$$

Inserting this result in Eq. (191), we obtain

$$\langle q_i(t)\rangle = \int_{-\infty}^{\infty} dq_1\cdots\int_{-\infty}^{\infty} dq_n\,W_1(q_1,\ldots,q_n;0)\,\langle q_i(t)\rangle_{q_1\cdots q_n}^{(0)}. \tag{193}$$

Equation (193) shows explicitly how the step-driven response $\langle q_i(t)\rangle$ is built up from the regression of the equilibrium fluctuations, characterized by the equilibrium conditional expectation value $\langle q_i(t)\rangle_{q_1\cdots q_n}^{(0)}$, and weighted according to the initial perturbed distribution $W_1(q_1,\ldots,q_n;0)$. Although this result is precisely that which we might intuitively expect, it has
often been pointed out in the literature?:® that there is 
no clear a priori justification for identifying the be- 
havior of a system undergoing an irreversible process 
with the spontaneous equilibrium fluctuations in this 
way. The equilibrium fluctuations are microscopic in 
nature and generally on an extremely small scale,
whereas the macroscopic response functions measured 
in the laboratory are normally orders of magnitude 
larger. Nevertheless, the proof of the assumption that 
macroscopic processes follow the same laws of regression 
as the equilibrium fluctuations is provided by Eq. (193). 
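
As an illustration of Eq. (193), the sketch below evaluates the step-driven response for a single classical variable in a harmonic well. Everything model-specific is an assumption introduced here and not taken from the text: the spring constant `K`, the relaxation time `TAU`, the exponential regression law $\langle q(t)\rangle_q^{(0)} = q\,e^{-t/\tau}$, and the numerical values. For this Gaussian model the response built from Eq. (193) should coincide with the first-order expression $-\beta F\langle q\,q(t)\rangle^{(0)}$ of Eq. (172).

```python
import numpy as np

# Hypothetical model parameters (not from the paper).
BETA, K, TAU, F = 2.0, 1.5, 0.7, 0.05

q = np.linspace(-8.0, 8.0, 40001)

# Equilibrium distribution W1^(0)(q) for the energy K q^2 / 2, on a uniform grid.
w_eq = np.exp(-0.5 * BETA * K * q**2)
w_eq /= w_eq.sum()

# Initial perturbed distribution, Eq. (192): W1^(0)(q) exp(-beta F q), normalized.
w_init = w_eq * np.exp(-BETA * F * q)
w_init /= w_init.sum()

def conditional_mean(t):
    """Assumed regression of an equilibrium fluctuation that starts at q."""
    return q * np.exp(-t / TAU)

def response(t):
    """Step-driven response from Eq. (193)."""
    return float((w_init * conditional_mean(t)).sum())

def first_order_response(t):
    """First-order response -beta F <q q(t)>^(0), as in Eq. (172)."""
    return float(-BETA * F * (w_eq * q * conditional_mean(t)).sum())

for t in (0.0, 0.3, 1.0):
    print(t, response(t), first_order_response(t), -(F / K) * np.exp(-t / TAU))
```

For a non-Gaussian equilibrium distribution the two expressions would differ at second order in the force, which is the content of Eqs. (173) and (176).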


16. THE JOINT PATH DISTRIBUTION FUNCTION 


Just as a more detailed description of the equilibrium behavior can be obtained by introducing joint probability distributions containing two or more times, it is possible to describe in greater detail the evolution of a driven ensemble by introducing joint path distribution functions. In this final section we extend the theory of the previous section to include the joint path distribution $W_2(q,t;q',t')$, which specifies the probability that the variable corresponding to the operator $Q$ has the value $q$ at time $t$ and the value $q'$ at time $t'$ in a driven ensemble. We consider explicitly the case of a single variable, although the extension of the theory to multidimensional processes is quite straightforward.

$W_2(q,t;q',t')$ can be expressed in terms of its characteristic function $K_2(\nu,t;\nu',t')$,


$$W_2(q,t;q',t') = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\nu\int_{-\infty}^{\infty} d\nu'\,e^{i\nu q}\,e^{i\nu' q'}\,K_2(\nu,t;\nu',t'), \tag{194}$$

where $K_2(\nu,t;\nu',t')$ is given by

$$K_2(\nu,t;\nu',t') = \frac{1}{2\pi}\int_{-\infty}^{\infty} dq\int_{-\infty}^{\infty} dq'\,e^{-i\nu q}\,e^{-i\nu' q'}\,W_2(q,t;q',t') = \frac{1}{2\pi}\,\langle e^{-i\nu q(t)}\,e^{-i\nu' q(t')}\rangle. \tag{195}$$


According to the discussion of Sec. 3, the quantum-mechanical form of the driven second moment $K_2(\nu,t;\nu',t')$ is

$$K_2(\nu,t;\nu',t') = \frac{1}{2\pi}\,\langle\tfrac{1}{2}[e^{-i\nu Q(t)},\,e^{-i\nu' Q(t')}]_+\rangle, \tag{196}$$

so that Eq. (194) for $W_2(q,t;q',t')$ becomes

$$W_2(q,t;q',t') = \frac{1}{(2\pi)^2}\int_{-\infty}^{\infty} d\nu\int_{-\infty}^{\infty} d\nu'\,\langle\tfrac{1}{2}[e^{i\nu(q-Q(t))},\,e^{i\nu'(q'-Q(t'))}]_+\rangle = \langle\tfrac{1}{2}[\delta(q-Q(t)),\,\delta(q'-Q(t'))]_+\rangle. \tag{197}$$

Equation (197) states that $W_2(q,t;q',t')$ is just the driven correlation moment between the $\delta$-function operators $\delta(q-Q(t))$ and $\delta(q'-Q(t'))$. Therefore, the treatment of driven second moments developed throughout the preceding sections of this paper is immediately applicable to the joint path distribution function.

We limit ourselves here to a discussion of the joint path distribution function for a classical step-driven process. The driven second moment $\langle q(t)q(t')\rangle$ can be written in terms of $W_2(q,t;q',t')$ as

$$\langle q(t)\,q(t')\rangle = \int_{-\infty}^{\infty} dq\int_{-\infty}^{\infty} dq'\,q\,q'\,W_2(q,t;q',t'). \tag{198}$$



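On the other hand, according to Eq. (171), $\langle q(t)q(t')\rangle$ can be computed according to

$$\langle q(t)\,q(t')\rangle = \int_{-\infty}^{\infty} dq\int_{-\infty}^{\infty} dq'\,q\,q'\,W_2^{(0)}(q;q',t-t')\,\frac{\langle e^{-\beta F q(-t')}\rangle_{q;\,q',(t-t')}^{(0)}}{\langle e^{-\beta F q}\rangle^{(0)}}, \tag{199}$$

where $W_2^{(0)}(q;q',t-t')$ is the (time-stationary) equilibrium joint probability distribution and $\langle e^{-\beta F q(-t')}\rangle_{q;\,q',(t-t')}^{(0)}$ denotes the equilibrium expectation value of $e^{-\beta F q}$ at time $-t'$ conditional on $q$ at time zero and $q'$ at time $(t-t')$.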
From Eqs. (198) and (199) it follows that the step-driven joint path distribution function is given, in the classical limit, by

$$W_2(q,t;q',t') = W_2^{(0)}(q;q',t-t')\,\frac{\langle e^{-\beta F q(-t')}\rangle_{q;\,q',(t-t')}^{(0)}}{\langle e^{-\beta F q}\rangle^{(0)}}. \tag{200}$$

Thus, $W_2(q,t;q',t')$ is related in a particularly simple way to the equilibrium joint probability $W_2^{(0)}(q;q',t-t')$ and the equilibrium conditional expectation value $\langle e^{-\beta F q(-t')}\rangle_{q;\,q',(t-t')}^{(0)}$.
Finally, we rewrite Eq. (171) for $\langle q(t)q(t')\rangle$ so as to further emphasize the relationship of this quantity to the regression of equilibrium fluctuations:

$$\langle q(t)\,q(t')\rangle = \int_{-\infty}^{\infty} dq\,W_1^{(0)}(q)\,\frac{e^{-\beta F q}\,\langle q(t)\,q(t')\rangle_q^{(0)}}{\langle e^{-\beta F q}\rangle^{(0)}}. \tag{201}$$

$\langle q(t)q(t')\rangle_q^{(0)}$ denotes the second equilibrium correlation moment between $q(t)$ and $q(t')$, conditional on the value $q$ at time zero. However, according to Eq. (192), $W_1^{(0)}(q)\,e^{-\beta F q}/\langle e^{-\beta F q}\rangle^{(0)}$ is just the initial ($t=0$) perturbed equilibrium distribution function $W_1(q,0)$. Hence, Eq. (201) becomes

$$\langle q(t)\,q(t')\rangle = \int_{-\infty}^{\infty} dq\,W_1(q,0)\,\langle q(t)\,q(t')\rangle_q^{(0)}. \tag{202}$$

This result shows explicitly how the step-driven second moment $\langle q(t)q(t')\rangle$ is built up from the regression of the equilibrium fluctuations, characterized by the conditional equilibrium second moment $\langle q(t)q(t')\rangle_q^{(0)}$ and weighted according to the initial perturbed distribution $W_1(q,0)$.


APPENDIX A 


In this appendix we compare a system driven from $t\to-\infty$ by the step-function forces defined in Sec. 4 to the subsequent motion of a system characterized at $t=0$ by the generalized canonical density operator $\rho(0)$ of Eq. (25).

Consider the first-order term $\langle Q_i(t)\rangle^{(1)}$ in the response during a step-driven process, as given by Eq. (23). The equilibrium expectation value appearing in this expression can be written


(£05 (4), OL) 3 
TOQ (40: (2) : 
—Q; (i) exp[+BH]0; (tr)) 


= (0; (A) QO exp ROU) a 
T en Kexpl—BHO]Q() (A-1) 


laJ 
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This form can be further rewritten as follows. 


VECKA (2) _) 
$ r- 


Xexp[—AH JQ; (N) 


c 


B 
--f dH O, Q;® (hih) ]-0:® (D). (A-2) 


0 


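Noting that $[H^{(0)},\,Q_j^{(0)}(t_1-i\hbar\lambda)] = -i\hbar\,\dot Q_j^{(0)}(t_1-i\hbar\lambda)$, we have

$$\langle[Q_j^{(0)}(t_1),\,Q_i^{(0)}(t)]\rangle^{(0)} = i\hbar\int_0^{\beta} d\lambda\,\langle\dot Q_j^{(0)}(t_1-i\hbar\lambda)\,Q_i^{(0)}(t)\rangle^{(0)}. \tag{A-3}$$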


Inserting the result (A-3) into Eq. (23) and performing 
the time integration, we obtain 


8 
(OO) =a Ff NLO; (— ih): ® (DY) 
= im (Q;® (a= thAQi(D) J. (A-4) 


The contribution to (A-4) from the $t_1\to-\infty$ limit is evaluated by taking


B 
lim f AKO; (h — ih): (NV 
—2 0 


ti 


ito 
= lim -f anf dty(Q; (t:—2thd1) 0; (HYV. 
0 0 


T—n T 
(A-5) 


Because of the factor 1/T, the oscillatory part of the 
integrand gives no contribution, and we are left with 


B 
lim f AQ; (ih): ()) 
tioa o ra 


A 
= f AKO0;))™=B(0,0;) (A-6) 
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where O, denotes the diagonal portion of the operator 
Q: with respect to the unperturbed Hamiltonian H. 

Using the result (A-6), Eq. (A-4) for (Q;(¢)) 
becomes 


B 
(0) =-E r| J a@ocimyoome 


a 


ve inserted expl-H®] in the second ^ -6000 an 


Ka \ 
-< ‘where We Fady ermuted the operators ċyclicly. 
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IRREVERSIBLE 


The paradoxical contrast between Eq. (A-7) and 
Eq. (32) for.(Q:(t))® has been discussed by R. Kubo,’ 
who suggests that the former equation refers to an 
adiabatic system, whereas the latter refers to an iso- 
thermal system, so that the two need not be equal. 

However, the same difficulty is found to arise even in an equilibrium isothermal system. Thus, in the limit $t\to\infty$, the equilibrium correlation moment


: 3L0:,0;° O14)” 
becomes 
lim BLO,0OL)=OON (A-8) 
where the limit is evaluated as in Eq. (A-6). In order for the ensemble to be ergodic, however, in the sense that as $t\to\infty$ the quantities involved become completely uncorrelated, we require
lim(3LQ:,05 QI) =Q. (A-9) 
We believe that the resolution of this problem lies in the following interpretation. Throughout the discussion we consider an ensemble in continual interaction with a temperature reservoir. This should be contrasted with the interpretation adopted by Kubo, which is that the interaction with the temperature reservoir is removed at the moment of imposition of the applied forces, the ensemble thereafter being adiabatic. In our interpretation the Hamiltonian $H^{(0)}$ therefore contains a term corresponding to the interaction with a temperature reservoir, which we have not indicated explicitly for reasons to be explained momentarily. The additional interaction term induces incoherent transitions among the states of the system such that the ensemble "forgets" the details of its previous behavior after a sufficiently long time. This insures that the ensemble satisfies the ergodic requirement stated in Eq. (A-9).
The justification for not indicating the interaction term explicitly in calculating the driven response is as follows. It is always possible to choose the term of interaction with the temperature reservoir to be so small that, for times comparable to those in which we are interested, the disordering effects arising from this source are negligible. Since Eq. (32) gives the first-order response $\langle Q_i(t)\rangle^{(1)}$ corresponding to an ensemble chosen so as to be in (generalized) canonical equilibrium up to $t=0$, it must therefore yield the appropriate evolution of $\langle Q_i(t)\rangle^{(1)}$ for any finite time $t>0$. However, in the limit $t\to\infty$, the effects of the continued temperature interaction manifest themselves, regardless of the strength of the interaction. In this limit Eq. (32) reduces to


0:0) 3 -E POT O-N] GAO 
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Here itis necessary to take explicit í 
interaction with the temperature reservoir. ‘This can be 
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accomplished, according to a comparison of Eqs. (A-8) 
and (A-9), by replacing the quantity $\bar Q_j$ by the average value $\langle Q_j\rangle$. The foregoing arguments apply as well to the time evolution of the full step-driven response $\langle Q_i(t)\rangle$ given in Eq. (26), in the expansion of which Eq. (32) is the first-order term.

In order to make Eq. (A-7) consistent with our interpretation, we recall that the quantity $\bar Q_j$ appearing in it arose from the evaluation of a $t_1\to-\infty$ limit. Again we take explicit account of the temperature interaction in this limit by replacing $\bar Q_j$ by $\langle Q_j\rangle$. Thus, Eq. (A-7) becomes identical to Eq. (32). If we evaluate Eq. (24) for $\langle Q_i(t)\rangle^{(2)}$ and the corresponding higher-order terms in the response $\langle Q_i(t)\rangle$ during a step-driven process, replacing quantities of the type $\bar Q_j$ by $\langle Q_j\rangle$ whenever they appear, we obtain expressions identical to Eqs. (33), (34), etc. The technique for accomplishing this is essentially an iteration of that employed in putting Eq. (23) into the form (A-7).


APPENDIX B 


The causal nature of a linear process finds expression in the well-known Kramers-Kronig dispersion formulas relating the real and imaginary parts of the complex admittance matrix elements.

We first indicate the proof of these relations. Consider the Fourier transform of Eq. (63) for $Y_{ij}(\omega)$.


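$$\int_{-\infty}^{\infty} d\omega\,e^{i\omega t}\,\frac{Y_{ij}(\omega)}{i\omega} = \int_{-\infty}^{\infty} d\omega\int_0^{\infty} dt'\,e^{i\omega(t-t')}\,\phi_{ij}^{(1)}(t') = 2\pi\int_0^{\infty} dt'\,\delta(t-t')\,\phi_{ij}^{(1)}(t'). \tag{B-1}$$

Therefore,

$$\int_{-\infty}^{\infty} d\omega\,e^{i\omega t}\,\frac{Y_{ij}(\omega)}{i\omega} = \begin{cases} 2\pi\,\phi_{ij}^{(1)}(t) & \text{for } t>0 \\ 0 & \text{for } t<0, \end{cases} \tag{B-2}$$

whence it follows that $Y_{ij}(\omega)/i\omega$ can have poles only in the upper half of the complex $\omega$ plane.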

For a function Y;;(w)/iw which is everywhere analytic 
in the lower half of the w plane, Cauchy’s integral 


theorem states that 


Falo) as R ¥i;(0’) 
iw ir iw (w’ — w) 
e P p” A (w’) 
=- | wes 
. mY iw (w — uw) 


where the complex “integration is taken around the 
contour, shown in Fig. 5, and P denotes ‘a Cauchy 
principal value. Decomposing Y;;(w)*into its real and 


Account of thee imaginary parts Yij@)=Re ¥i;)+iIm Ya) andon 
equating the real and*imaginary parts of Eq. 3), we os 3 
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contour mR in the complex @ plane. 
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Imw w -plane Ima a-plane 
= ce Re w 
Fic. 5 SS 
2 
O Rea 

obtain The integral is to be taken along the contotr shown in 
I y Fig. 6. The integrand has simple poles at a;= (2l+1)ir, 
Re Zo == fas S a Voa) (B-4) 1=0, +1, --:. The residue of the integrand at the 
‘eGo =) pole is obtained by letting a=a;+Aa, multiplying the 


Im Falo) P r° 7 Re Yil’) 
= loy = 


w (w!—w) 


Equations (B-4) and (B-5) are the Kramers-Kronig dispersion relations.

We use Eq. (B-5), together with the results of equilibrium fluctuation theory, to derive the $t=0$ form of the fluctuation-dissipation theorem, Eq. (73). Letting $\omega=0$, the quantity $[\mathrm{Im}\,Y_{ij}(\omega)/\omega]$ becomes simply the capacitance $[\partial\langle q_i\rangle/\partial F_j]$, so that Eq. (B-5) assumes the form


(B-5) 
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However, for a generalized canonical ensemble in 
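contact with a series of reservoirs with constant intensive parameters $\ldots, F_i, F_j, \ldots$, the equilibrium second moment $\langle q_i q_j\rangle^{(0)}$ is given by¹⁶

$$\langle q_i q_j\rangle^{(0)} = -\frac{1}{\beta}\,\frac{\partial\langle q_i\rangle}{\partial F_j}. \tag{B-7}$$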


Substitution of Eq. (B-7) into Eq. (B-6) yields the 
result 
P p° Relo) 
(9:93) = —— f dom 
TB J cr w? 


which is identical to Eq. (73) with t set=0. 


(B-8) 


APPENDIX C 


In this Appendix, following Kubo,’ we evaluate the 
universal function $\Gamma(t)$, defined in Eq. (83). For $t>0$, this can be calculated by performing a contour integration around the upper half of the complex $\omega$ plane, while for $t<0$, the integration is taken around the lower half plane. We consider explicitly the case $t>0$, since the $t<0$ calculation proceeds in an identical way.

Letting $\alpha=\hbar\beta\omega$, Eq. (83) can be rewritten as


A 


integrand by Aa, and taking the limit as Aœ —> Q Thus 
residue at a; 


Aa |: 
= lim —exp|]1 
MAD ap 


(a+ Aa) 1 1— elarta) 
hB eoe 
2 -expl— (2/-+1)mt/he] 
e o TET 
2 exp[ — (21+1)rt/h6] 
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Equation (C-1) ls now evaluated by taking 


Aa 


(C-2) 


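$$\Gamma(t) = 2\pi i\sum_{l=0}^{\infty}(\text{residue at }\alpha_l) = \frac{4}{\pi\hbar}\sum_{l=0}^{\infty}\frac{\exp[-(2l+1)\pi t/\hbar\beta]}{2l+1}. \tag{C-3}$$

It is convenient to take the time derivative of Eq. (C-3) before performing the summation.

$$\frac{d\Gamma(t)}{dt} = -\frac{4}{\hbar^2\beta}\sum_{l=0}^{\infty}\exp[-(2l+1)\pi t/\hbar\beta] = -\frac{4}{\hbar^2\beta}\,\frac{\exp(-\pi t/\hbar\beta)}{1-\exp(-2\pi t/\hbar\beta)} = -\frac{2}{\hbar^2\beta}\,\mathrm{csch}\,\frac{\pi t}{\hbar\beta}. \tag{C-4}$$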


In performing the summation in Eq. (C-4) we have made use of the expansion $\sum_{l=0}^{\infty} x^l = 1/(1-x)$. We now integrate Eq. (C-4) with respect to $t$, which yields finally the desired result,

$$\Gamma(t) = \frac{2}{\pi\hbar}\,\ln\coth\frac{\pi t}{2\hbar\beta}. \tag{C-5}$$

The corresponding result for $t<0$ is identical to Eq. (C-5), except that $t$ is replaced by $-t$. Thus, for all $t$, we can write that

$$\Gamma(t) = \frac{2}{\pi\hbar}\,\ln\coth\frac{\pi|t|}{2\hbar\beta}.$$
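
The closed form (C-5) can be compared numerically with the residue series of Eq. (C-3). The sketch below is an illustrative check only; the function names, the default units $\hbar=\beta=1$, and the truncation lengths are choices made here. The last function integrates the series term by term over all $t$, which gives $\beta$; this is consistent with $\Gamma(t)$ acting as $\beta\,\delta(t)$ in the classical limit used in Eq. (187).

```python
import math

def gamma_closed(t, beta=1.0, hbar=1.0):
    """Closed form (C-5), written with |t| so that it holds for either sign of t."""
    x = math.pi * abs(t) / (2.0 * hbar * beta)
    return (2.0 / (math.pi * hbar)) * math.log(1.0 / math.tanh(x))

def gamma_series(t, beta=1.0, hbar=1.0, terms=500):
    """Truncated residue sum of Eq. (C-3)."""
    a = math.pi * abs(t) / (hbar * beta)
    total = sum(math.exp(-(2 * l + 1) * a) / (2 * l + 1) for l in range(terms))
    return (4.0 / (math.pi * hbar)) * total

def gamma_area(beta=1.0, hbar=1.0, terms=200000):
    """Integral of the series over all t, term by term (both signs of t)."""
    s = sum(1.0 / (2 * l + 1) ** 2 for l in range(terms))
    return 2.0 * (4.0 / (math.pi * hbar)) * (hbar * beta / math.pi) * s

for t in (0.05, 0.3, -1.2):
    print(t, gamma_closed(t), gamma_series(t))
print(gamma_area())   # approaches beta as the number of terms grows
```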
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Review of High-Temperature Rotating-Plasma 


Experiments


John M. Wilcox
Lawrence Radiation Laboratory, University of California, Berkeley, California
THE major experimental effort in the controlled thermonuclear reaction program has been based upon three different methods of plasma confinement: the pinch approach, the stellarator approach, and the magnetic-mirror approach.¹ Among the other approaches being investigated is the rotating-plasma system. By means of crossed electric and magnetic fields in a geometry having cylindrical symmetry, a charged particle can be induced to rotate about the axis of symmetry. The new degree of freedom associated with this rotation may offer new possibilities for containing and heating plasmas. Early work [...] was done at Oak Ridge.² We discuss [...] experimental approaches that are being actively [...] and the Berkeley⁵ and Moscow⁶ ion [...] experiments.


BASIC CONCEPTS


We first discuss a few concepts which are important in these experiments, and then describe the experiments individually. We will examine the [...] mirror enhancement with a rotating plasma, and the possible formation of anode sheaths.

The well-known "electric drift velocity" is imparted to a charged particle in the presence of crossed electric and magnetic fields. For a singly charged particle we have

$$\mathbf{V}_E = c(\mathbf{E}\times\mathbf{B})/B^2, \tag{1}$$

or

$$V_E = 10^8\,E/B, \tag{2}$$


1 For a review of the United States' program see Amasa Bishop, The U. S. Program in Controlled Fusion (Addison-Wesley Publishing Company, Reading, Massachusetts, 1958).

2 See reference 1, p. 67.

3 Anderson, Baker, Bratenahl, Furth, Kunkel, and Stone, Second International Conference on Peaceful Uses of Atomic Energy, Geneva, 1958, paper No. 373; Anderson, Baker, Bratenahl, Furth, Ise, Kunkel, and Stone, "Study and use of rotating plasma," University of California Radiation Laboratory Rept. UCRL-8067 (December, 1957).

4 Boyer, Hammel, Longmire, Nagle, Ribe, and Riesenfeld, Second International Conference on Peaceful Uses of Atomic Energy, Geneva, 1958, paper No. 2383.

5 Wilcox, Gow, and Smith, Bull. Am. Phys. Soc. Ser. II, 4, 55 (1959); "The ion magnetron," University of California Radiation Laboratory Rept. UCRL-8579.

6 M. S. Yoffe (private communication); E. E. Yushmanov, Plasma Physics and the Problems of Controlled Thermonuclear Reactions, Vol. IV (to be published in translation by Pergamon Press, New York).

7 L. Spitzer, Physics of Fully Ionized Gases (Interscience Publishers, Inc., New York, 1956).

where the electric drift velocity $V_E$ is in cm/sec, the electric field $E$ is in volts/cm, and the magnetic field $B$ is in gauss. Thus, for example, in an electric field of 3 kv/cm and a magnetic field of 10 kilogauss, a deuteron will have a velocity of $3\times10^7$ cm/sec and thus an energy of 1 kev. The electric drift velocity is perpendicular to both the electric and magnetic fields. It does not depend on the mass of the particle, and it is in the same direction for positive and negative particles. Thus in an infinite plane geometry, ions and electrons have the same velocity and direction and there is no net current. In a cylindrical geometry, however, the particles move with a radius of curvature $R$, and a centrifugal force term must be considered, so that the balance of forces on a particle is now

$$e[\mathbf{E} + (1/c)\,\mathbf{V}_D\times\mathbf{B}] + mV_D^2\,\mathbf{R}/R^2 = 0, \tag{3}$$

where $\mathbf{R}$ is the radius vector of the particle position, $m$ is the mass of the particle, $e$ is the algebraic value of the charge, and $c$ is the velocity of light. This leads to
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(4) 


Ions and electrons now move with different velocities, and an electric current $j_D$ flows. Since the $\mathbf{j}_D\times\mathbf{B}$ force of the magnetic field is opposing the centrifugal force of the rotating particles, the plasma "leans" on the magnetic lines, and if the plasma rotational energy is sufficient the magnetic lines bow outward, thus weakening the field near the axis. The diamagnetic drift current density is

$$j_D = (Ne/c)(V_i - V_e), \tag{5}$$

where $N$ is the particle density (electrostatic forces will insure approximate equality of ion and electron densities), $V_i$ is obtained from Eq. (3) applied to ions, and $V_e$ is obtained from Eq. (3) applied to electrons. Since the centrifugal force on the electrons is small compared with the centrifugal force on the ions, we have
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Fic. 1. Diagram of Homopolar device. 


same direction as the external solenoidal current that gives rise to $B$, $j_D$ is a diamagnetic current. For the case mentioned in the foregoing with an outward electric field of 3 kv/cm and a magnetic field of 10 kilogauss, and with a deuteron density of $10^{14}$/cm³ and a radius of curvature of 10 cm, the drift-current density is 34 amp/cm².

A plasma that is rotating in equilibrium with an applied electric field $E$ can be compared with a condenser that is in equilibrium with an applied electric field; however, in the rotating plasma the energy is stored in the form of kinetic energy of the moving particles. We can derive the dielectric constant $k$ of the plasma by equating this kinetic-energy density to the increased energy density of the electric field,

$$\tfrac{1}{2}\rho V_E^2 = (k-1)E^2/8\pi, \tag{7}$$

where $\rho$ is the density of the plasma, and where use has been made of Eq. (2), so that the dielectric constant of the rotating plasma is

$$k = 1 + (4\pi\rho c^2/B^2). \tag{8}$$

This number can be very large in a practical situation; for the case mentioned in the foregoing the dielectric constant is $3.7\times10^4$. The effect of the displacement of the magnetic lines has not been included in this derivation.
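
The numerical examples quoted in this section can be reproduced with a short calculation. The sketch below is illustrative only: the physical constants are standard rounded values, the function names are introduced here, and the ion drift speed is taken as the smaller root of the radial force balance $V = V_E + V^2/(\omega_c R)$, which is one reading of Eq. (3) for an outward electric field and is an assumption of this sketch (electrons are taken to move at $V_E$). The current density is evaluated directly in practical units as $N e (V_i - V_e)$ with $e$ in coulombs.

```python
import math

C_LIGHT = 2.998e10        # velocity of light, cm/sec
E_ESU = 4.803e-10         # elementary charge, esu
E_COULOMB = 1.602e-19     # elementary charge, coulomb
M_DEUTERON = 3.344e-24    # deuteron mass, g
ERG_PER_EV = 1.602e-12

def drift_velocity(e_field_v_per_cm, b_gauss):
    """Eq. (2): electric drift velocity in cm/sec."""
    return 1.0e8 * e_field_v_per_cm / b_gauss

def kinetic_energy_ev(mass_g, speed_cm_per_sec):
    """Kinetic energy of one particle in electron volts."""
    return 0.5 * mass_g * speed_cm_per_sec**2 / ERG_PER_EV

def dielectric_constant(number_density, mass_g, b_gauss):
    """Eq. (8): k = 1 + 4 pi rho c^2 / B^2 with rho = N m."""
    rho = number_density * mass_g
    return 1.0 + 4.0 * math.pi * rho * C_LIGHT**2 / b_gauss**2

def ion_drift_with_centrifugal(v_e, mass_g, b_gauss, radius_cm):
    """Smaller root of V = V_E + V^2 / (omega_c R) (assumed form of Eq. (3))."""
    omega_r = E_ESU * b_gauss / (mass_g * C_LIGHT) * radius_cm
    return 0.5 * omega_r * (1.0 - math.sqrt(1.0 - 4.0 * v_e / omega_r))

def drift_current_density(number_density, v_ion, v_electron):
    """Drift-current density in amp/cm^2."""
    return number_density * E_COULOMB * (v_ion - v_electron)

v_e = drift_velocity(3000.0, 1.0e4)                       # 3 kv/cm, 10 kilogauss
energy = kinetic_energy_ev(M_DEUTERON, v_e)
k = dielectric_constant(1.0e14, M_DEUTERON, 1.0e4)
v_ion = ion_drift_with_centrifugal(v_e, M_DEUTERON, 1.0e4, 10.0)
j_d = drift_current_density(1.0e14, v_ion, v_e)

print(v_e, energy, k, j_d)
```

With these inputs the drift velocity is $3\times10^7$ cm/sec, the deuteron energy is a little under 1 kev, the dielectric constant is close to $3.8\times10^4$, and the drift-current density is about 35 amp/cm², in line with the rounded figures quoted in the text.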

Post has shown that under certain conditions a charged particle is reflected if it moves into a region of increasing magnetic field, so that particles can be contained if the magnetic field is made to increase at both ends.⁸ Post has also shown that in order for a particle to pass through the mirror, we must have
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where W, and W, are the perpendicular and parallel 
kinetic energies, and c refers to the center and m to 
the mirror. If the plasma is rotating, the centrifugal 
force tends to keep the particles away from the axis, 
where they must go to escape, and thus ‘the mirror 
containment efficiency is enhanced.’ An elegant theo- 
retical treatment shows that for a rotating plasma the 
foregoing inequality is to be replaced by

$$W_{\parallel}(c) > W_{\perp}(c)[(B_m/B_c)-1] + \tfrac{1}{2}MV_E^2(c)[1-(B_c/B_m)], \tag{10}$$

where $M$ is the mass of the particle and $V_E$ is given by Eq. (2). In this case $W_{\perp}$ is to be measured in the rotating frame of reference. Thus a rotating plasma is more efficiently contained by the magnetic mirror.
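
The escape condition of Eq. (10) is easy to evaluate directly. The following sketch uses a hypothetical helper function and hypothetical particle energies; only the form of the inequality is taken from the text. Setting the rotational energy to zero recovers the condition for a nonrotating plasma.

```python
def escapes_mirror(w_parallel, w_perp, mirror_ratio, rotational_energy=0.0):
    """Return True if a particle can pass through the mirror according to Eq. (10).

    w_parallel, w_perp : parallel and perpendicular kinetic energies at the center
    mirror_ratio       : B_m / B_c
    rotational_energy  : (1/2) M V_E^2 at the center, in the same energy units
    """
    threshold = (w_perp * (mirror_ratio - 1.0)
                 + rotational_energy * (1.0 - 1.0 / mirror_ratio))
    return w_parallel > threshold

# Hypothetical example: mirror ratio 20/9, energies in electron volts.
ratio = 20.0 / 9.0
print(escapes_mirror(300.0, 100.0, ratio))            # no rotation
print(escapes_mirror(300.0, 100.0, ratio, 1000.0))    # with 1-kev rotational energy
```

In this example the particle passes through the mirror without rotation but is contained once the rotational term is included, which is the enhancement described above.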

Finally, we consider the possible formation of an 
anode sheath for the case in which the anode surface 
is parallel to the magnetic field lines. In a conventional 
electron-discharge tube the sheath forms at the cathode 
because of the high mobility of the electrons. However, 
for motion perpendicular to a magnetic field the ions 
have a greater mobility, while the electrons are tightly 
confined to their field lines. The ions acquire their 
electric drift energy by moving a short distance in a 
direction parallel to the electric field. As they move 
away from the surface of the anode a region with net 
negative space charge is left behind. In this narrow 
sheath a large voltage drop may be present. In some of 
the experiments the presence of the anode sheath is 
considered desirable, while in others it is avoided. 


(10) 


THE HOMOPOLAR EXPERIMENT


In the Homopolar experiment the emphasis has been on the study of the basic properties of rotating plasmas; therefore argon has been used for the most part. In this experiment³ the object is to create a spinning disk of plasma which is somewhat similar to the flywheel of a conventional Homopolar generator. The experimental geometry is shown in Fig. 1. The vacuum chamber, which has a diameter of 30 cm and a height of 5 cm, is placed between the pole tips of a magnet so that there is an axial magnetic field of up to 18 000 gauss. A radial electric field exists between the inner electrode and the concentric cylindrical outer electrode. By operation at a comparatively high gas pressure (~100 μ) the anode sheath effects are largely removed, so that the electric
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Fic. 2. Current and voltage wave forms in the Homopolar device. 




field exists through the body of the plasma, and the entire plasma is set spinning with the electric drift velocity. Initially a large pulse of current is required to ionize the gas and supply the energy of rotation. Once the plasma has been set in rotation a much smaller current will be drawn, due only to the viscous drag of the plasma. In Fig. 2 the heavy trace shows the behavior of the voltage, while the positive light trace, which extends from 0 to 6 μsec, is the charging-current pulse. The negative current pulses are discussed below.

The equivalent circuit of the Homopolar experiment is shown in Fig. 3. Thus the spinning plasma is represented by a capacitance $C_H$, which arises from the dielectric effect discussed above, with a parallel resistance $R_H$, which represents viscous effects. Indeed this geometry may have application as a very-low-inductance capacitor for driving controlled-fusion experiments.⁹

Plasma rotation has been demonstrated by three 
observations. If we consider the rotating Homopolar as 
a charged capacitance, then it should be possible to 
recover the initial charge that was fed in by the pulse 
of charging current. For this purpose the terminals of 
the Homopolar are rapidly shorted with the crowbar 
circuit shown in Fig. 3. The resulting current pulse is in 
the opposite direction to the charging current, and 
resembles one of the negative traces shown in Fig. 2. 
By application of the crowbar at various times, the 
amount of energy still stored in the system after any 
interval can be measured. The dissipation of stored
energy is indicated by the decreased crowbar current 
traces at longer times. The crowbar method has shown 
that under favorable conditions more than half of the 
input energy can be recovered. 

Plasma rotation can also be observed by the Doppler 
shift of the emitted radiation, as shown in Fig. 4. 
A spectrometer views the rotating plasma through a 
tangential port. If the direction of the magnetic field is 
reversed, then, according to Eq. (1), the direction of 
rotation will be reversed. The rotational velocities calcu- 
lated from the Doppler shift agree with the velocities 
calculated from the crowbar current. Rotational energies of a few hundred electron volts are measured during operation with argon.

FIG. 4. Doppler-shifted spectra from Homopolar device.

9 Anderson, Baker, Bratenahl, Furth, and Kunkel, J. Appl. Phys. 30, 188 (1959).
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Fic. 3. Equivalent circuit of Homopolar device. 


SOURCE CROWBAR (OPTIONAL) 


A third observation of plasma rotation is the "spin loop" data. As shown above, the centrifugal force from the plasma rotation gives rise to a diamagnetic current in the θ direction, and thus to a radial component of magnetic field. We may think of the rotating particles as "leaning" on the magnetic lines and bending them outward. This effect is independent of the direction of rotation. Thus the signal in a magnetic probe coil which is oriented to detect $H_r$ should be independent of the sense of rotation. The observed spin-loop signal is shown in Fig. 5, and indeed has the same polarity for both directions of rotation. A slight misalignment of the pickup coil is believed to account for the difference in amplitude between the two signals.

Thus a considerable storage of rotating energy has 
been demonstrated in the Homopolar experiment in a 
geometry which is stable for times of the order of 
100 μsec. For controlled-fusion applications some of
this energy will have to be converted to random or 
heat energy, which presumably can be accomplished 
with suitable perturbations in the electric or magnetic 
fields. 

The same basic ideas have been incorporated in the 
improved geometry shown in Fig. 6. In this case the 
centrifugal force is more effective in pulling the plasma 
away from the insulators. Experiments are just be- 
ginning with this geometry; however, a decay time of 
several hundred microseconds has been observed.

FIG. 5. Spin-loop signal from Homopolar device: 100 μ Ar; 17.4 kilogauss; 7 kv.


IXION 


This experiment is named for the king in Greek 
mythology who aspired to the love of a goddess and 
was therefore sentenced to be forever bound to a 
flaming rotating wheel. 

The Ixion* geometry as shown in Fig. 7 is long 
(86 cm) in comparison with the flat-disk geometry of 
the Homopolar. The pulsed magnetic field increases 
from 9 kilogauss at the median plane to 20 kilogauss at 
the mirror throat, and the pressure between pulses is 
2X10- mm Hg. The first experiments were done with 
metallic central electrodes, but the latest model has a 
plasma “center rod.” A puff of deuterium gas corre- 
sponding to one micron pressure in the apparatus is 
ionized and driven into the machine by a magnetic 
shock coil. This plasma electrode makes electrical 
contact with a pair of tungsten ring electrodes located 
at the mirror throats. After the plasma electrode has 
arrived, the high voltage (10 kv) is pulsed on. For about
200 μsec negligible current is drawn, and then a sudden
impulse of current of about 20000 amp is drawn for
25 μsec, as shown in Figs. 8(A) and 8(B), which have a
sweep speed of 200 μsec/cm. The voltage falls to about
one-half its original value as the discharge occurs, and
then decays to zero in about 500 μsec. This current
spike and the voltage behavior are interpreted as being
caused by the development of a rotating plasma. The
subsequent decay of the voltage is attributed to the loss


bar technique and by Doppler-shift observations. When 
Ixion is short-circuited, about one-fifth of the original 
charge can be recovered. In the Homopolar experiment 
one-half the original charge was recovered, but this was 

FIG. 6. Later version of Homopolar device.


with operation with argon, and these machines are 
found experimentally to operate better with heavy 
gases than with deuterium. The capacitance of Ixion 
can be calculated from the plasma dielectric constant 
given in Eq. (8), increased by a factor of the order 
of 1», which corrects for the diamagnetic weakening 
of the applied magnetic field. Using this capacitance 
and the observed amount of charge that can be re-
covered, one can compute that the rotating deuterium
plasma would have an equivalent pressure of 5.6 μ. The
pressure of the injected deuterium is 1.5 μ; the dis-
crepancy may be accounted for by the presence of
impurities or by an error in the geometrical correction.
The Doppler shift that occurs when the magnetic 
field is reversed is shown in Fig. 8(D), which is a profile
of a carbon line. The drift velocity of 410° cm/sec 
calculated herefrom is consistent with the value calcu- 
lated from the applied electric and magnetic fields. 
The spectral analysis also shows many other impurity 
lines during the current pulse, and H and D lines lasting 
for 100 μsec after the current pulse.

FIG. 7. Diagram of Ixion.


The high-impedance delay for 200 μsec after the
voltage is pulsed on was examined with an earlier model
having a metallic central electrode. By electrostatic
probing it was found that most of the voltage drop
occurred just outside the anode, corresponding to the
presence of an electron sheath as discussed above. Thus
it is seen that the electric field does not penetrate the
main body of the plasma and bulk rotation does
not occur.

If the magnetic field is lowered, the voltage raised, 
or the pressure raised excessively, the discharge is 
oscillatory as shown in Fig. 8(C). The voltage falls to 
zero almost at once. Neutrons occur during the current 
spike in this mode of operation, with about 10° neutrons 
per discharge at 40 kv and a detectable yield at 10 kv.
If the magnetic field is increased so as to obtain the 
capacitive-type operation shown in Figs. 8(A) and 8(B), 
the neutrons disappear. The origin of these neutrons 
has not been satisfactorily explained. 


In an Ixion with a metal center rod, large diamagnetic

FIG. 8. Experimental data from Ixion.


signals were observed with a magnetic probe coil inside 
the center rod during the oscillatory discharge of 
Fig. 8(C). These signals corresponded to almost com- 
plete removal of the axial magnetic field from the 
center of the machine. The diamagnetic effect decreased 
to zero as the probe was moved away from the center 
plane to the mirror throat. Diamagnetic signals have 
also been observed on Ixion with the plasma center rod. 
The diamagnetic decrease for voltage-holding operation 
is of the order of 30% at the center, and decreases to 
zero at the mirror throat. 


BERKELEY ION MAGNETRON 


This experiment® utilizes a long solenoidal magnetic 


field (120 cm) having a mirror throat at each end and a 
metal center electrode, as shown in Fig. 9. However, 
the philosophy on which this is based is quite different 
from that in the two preceding experiments. The tube 
is operated at the comparatively low pressure of a 

FIG. 10. Experimental data from Berkeley ion magnetron.

fraction of a micron, and thus the anode sheath develops
about the central electrode. Since this sheath contains 

a considerable density of ionizing electrons circulating 
around the axis, a neutral deuterium molecule that 
enters the sheath volume has a good chance of being 
ionized. The ion thus formed will be rapidly accelerated 
out of the sheath by the electric field, and will circulate 
around the axis in magnetronlike orbits with a kinetic 
energy equal to the potential energy at the point in the 
sheath at which it was ionized. The magnetic field is 
considerably stronger than the magnetron cutoff value, 
so that the circulating ions cannot reach the outer wall. 
Each time the ion returns toward the axis it is reflected 
by the sheath potential; the sheath protects the central 
electrode. After several revolutions this fast ion under- 
goes a charge-exchange collision which sends a fast 
neutral into the outer wall and leaves behind a thermal 
ion. This process limits the circulating current to a 
rather low value in the present experiments. In contrast 
to the two pulsed experiments described above, the ion 
magnetron operates under continuous conditions, with
a magnetic field of 3 kilogauss at the center and 12
kilogauss at the mirror throats, and an applied voltage
of 10 kv. The ion magnetron operates in a high-im-
pedance mode which may be comparable to the 200-μsec

violent oscillations of the center electrode. This elec-
trode was mechanically supported at only one end, so 
that it was free to oscillate as a pendulum. This motion 
could be viewed through the quartz port shown in 
Fig. 9. Several exploratory experiments supported the
view that the force responsible for these oscillations
came from the recoil of ions accelerated in the sheath. 
From the magnitude of the forces involved, the circu- 
lating current was estimated to be equivalent to about 
100 amp when the current drawn by the tube was about 
1 amp. 

When the tube voltage is turned on, the behavior of 
the gauge pressure and tube current is as shown in 
Fig. 10. The size and duration of the initial current 
pulse can be computed by considering that the neutral 
gas initially present flows into the sheath with thermal 
velocity. 

With steady operation at a tube voltage of 10 kv and 
a tube current of 1 amp, neutrons are produced at a 
rate of a few times 10! per sec. They are probably 
produced by head-on collisions between ions entering 
the sheath and ions leaving the sheath. When the 
geometry shown in Fig. 9 was modified to include 
vacuum pumping at both ends of the tube, the neutron 
rate increased by an order of magnitude, since the 
circulating current was able to reach a higher value 
before being limited by charge exchange. 


MOSCOW ION MAGNETRON


This experiment’ also utilizes a solenoidal magnetic
field with mirror throats on both ends. However, a 
direct-current plasma beam serves as the central elec- 
trode, which is positive at voltages up to 40 kv, direct 
current or pulsed. The vacuum chamber is 170 cm long 
and 50 cm in diameter, which makes it the largest 
among the experiments discussed here. The system 
geometry shown in Fig. 11 has been drawn from a sketch 
made by Dr. M. S. Yoffe. The central magnetic field is 
8 kilogauss, rising to 12 kilogauss at the mirror throats. 
Oil diffusion pumps at the ends and evaporated-
titanium pumps in the middle give a large vacuum-
pumping rate and a base pressure of 10⁻⁷ mm Hg.

High-energy neutrals from the charge-exchange proc-
ess previously mentioned are detected by secondary-
electron emission, as shown in Fig. 11. Neutron pro-
duction is observed with a plastic phosphor viewed by
a phototube. In one type of experiment the high voltage
is pulsed on with a square wave for from 10 μsec to
2 msec. At the end of the square wave the voltage is
returned to ground in about 10⁻⁷ sec. The high-energy
neutral-detector response decreases approximately ex-
ponentially with a lifetime of 1 or 2 msec, and neutron
pulses are seen for a few msec after the voltage has
gone to zero. Thus the decay time of the system appears
to be 1 or 2 msec, somewhat longer than has been ob-
served with the Homopolar or Ixion; however, the
system dimensions are larger in this experiment, and
the mode of operation is considerably different.

This ion magnetron makes 10⁷ to 10⁸ neutrons per
second at 40 kv, with a tube current of 1 or 2 amp. The
curve for neutron yield vs voltage follows the general
shape of the D-D cross section.

In an earlier smaller model, extensive measurements 
were made with hot and cold probes of the radial dis- 
tribution of potential. Three modes of operation may 
be distinguished. If the applied voltage is very high, 
all the ions supplied by the plasma source are acceler- 
ated radially as soon as they leave the mouth of the 
plasma source, so that there is no central beam. As the 
applied voltage is reduced, a second mode of operation 
is reached in which the central beam reaches a con- 
siderable distance into the machine, but not all the way 
across. In this case a stable radial potential distribution 
is observed, and most of the electric field is concentrated
in the vicinity of the axis, corresponding to the forma-
tion of the anode sheath discussed above. Finally, if the
applied voltage is decreased still more, a mode of
operation is attained in which the central plasma
column reaches across the entire length of the machine.
In this regime the radial potential distribution oscillates
at frequencies of 200 to 400 kc between a steep distri-
bution which corresponds to the presence of an anode
sheath and a sloping distribution which corresponds to
the absence of a sheath voltage drop. It is not known at 
present whether these oscillations can be avoided. 
Since a plasma central electrode offers many advan- 
tages, the question of its stability is an important one. 


SUMMARY 




By emphasis of various aspects of the fundamental 
concepts of electric drift velocity, diamagnetic azi- 
muthal current, dielectric constant of rotating plasma, 
magnetic mirror enhancement, and anode sheath forma- 
tion, the four experimental approaches have been 
developed. It is clear that this work has only begun to 
define the problems and possibilities of rotating plasmas. 
We hope that this article has pointed out some of the 
challenging experimental and theoretical problems that 
remain to be solved. 
I. INTRODUCTION


STUDY of ultrasonic absorption due to chemical re-
laxation in electrolytes has recently had consider-
able development. As a result of the comprehensive
work done by Professor Tamm and his associates at
the University of Göttingen and Professor Leonard and
his associates at the University of California in Los
Angeles, this aspect of sound absorption is understood
somewhat better than thermal or structural
relaxation in fluids. Recently Litovitz and his associates
have contributed much to the understanding of the
basic mechanisms responsible for the sound absorption
in chemical solutions.

Einstein¹ was the first to point out that chemical
processes can account for sound absorption in partially
dissociated gases. He made a theoretical estimate of its
contribution in nitrogen tetroxide.² According to him,
the excess sound absorption in partially dissociated
gases such as nitrogen tetroxide is due to dissociation.

Liebermann³ in 1949 considered sound absorption in
a chemically active fluid. According to him, chemically
active media may exhibit unusual physical properties 
such as dynamic compressibility whose value depends 
upon the compression rate in contrast to the conven- 
tional static value. The case is similar for dynamic 
specific heat, which depends upon the rate of change of 
temperature. These distinctions between static and 
dynamic values become prominent when high-frequency 
sound waves pass through the medium which is in 
chemical equilibrium, provided the equilibrium is pres- 
sure and temperature dependent. Such chemical re- 
actions occurring in liquids are expected to affect the 
propagation of sound waves too. 

Acetic acid is the first liquid in which the chemical 
process was found to be responsible for excess sound 
absorption. Spakowski⁴ and Richards² suggested that
the absorption of ultrasonic waves arises from the per-
turbation of the equilibrium between the dimer and
monomer forms of acetic acid. Pinkerton⁵ in 1948 con-
firmed experimentally the existence of the relaxation
phenomenon in acetic acid and, with Lamb,⁶ studied
the absorption and dispersion of ultrasonic waves in
acetic acid in great detail. They showed that a dis-
persion and maximum in the value of absorption per
wavelength occur and that a relaxation process which
arises from the perturbation of the chemical equilibrium
due to passage of sound waves does exist. Recent meas-
urements of Piercy and Lamb⁷ at low concentrations of
acetic acid indicate the presence of a second relaxation
region with a characteristic frequency an order of mag-
nitude higher than that for the pure acid. This second
relaxation is ascribed to the monomer-dimer reaction,
whereas the first relaxation in pure acid is still un-
explained. Litovitz and Carnevale⁸ have studied effects
of pressure on ultrasonic relaxation in pure acid, and
their results, however, are in agreement with monomer-
dimer theory.

2 W. T. Richards, Revs. Modern Phys. 11, 36 (1939). 

3 L. Liebermann, Phys. Rev. 76, 1520 (1949). 

4 B. Spakowski, Compt. rend. acad. sci. U.R.S.S. 18, 169 (1938). 

5 J. M. M. Pinkerton, Nature 162, 106 (1948). 

6 J. Lamb and J. M. M. Pinkerton, Proc. Roy. Soc. (London) 
A199, 114 (1949). 

7 J. E. Piercy and J. Lamb, Trans. Faraday Soc. 52, 930 (1956).

8 T. A. Litovitz and E. H. Carnevale, J. Acoust. Soc. Am. 30,
134 (1958).
Another indication of the existence of such per-
turbation effects of sound waves on chemical equi-
librium resulting in sound absorption is provided by
measurements of the ultrasonic absorption in sea water
and its comparison with fresh water below one mega-
cycle per second by several workers. Leonard dis-
covered that the cause of excess sound absorption in sea
water is the presence of MgSO₄. Above 1 Mc the absorp-
tion is similar to that of fresh water. Below 1 Mc the
value of α/ν², where α is the absorption coefficient and
ν the frequency, is found to increase with decrease in
frequency. The absorption coefficient is governed by an
empirical relation of the type


$$2\alpha = A\,[K\omega^2/(\omega^2 + K^2)] + A'\omega^2,$$


where ω is the angular frequency and the K's and A's are
constants. The first term has the form characteristic of
a relaxation phenomenon, and this led to the suggestion
that chemical relaxation is the process responsible for
the excess sound absorption. Wilson¹⁵ later reported
that the relaxation frequency does exist near 130 kc.
According to Liebermann,³ the relaxation term arises
from a pressure-dependent equilibrium of a chemical re-
action within the liquid, and the cause of the excess
sound absorption in MgSO₄ solutions lies in the dissocia-
tion reaction AB ⇌ A + B involving magnesium-sulfate
molecules or ions. He developed a theory of sound prop-
agation in chemically active media, but its application
to MgSO₄ suffered from some errors with regard to con-
centration dependence; for example, he suggested that
the absorption is proportional to the square root of the
concentration. Wilson and Tamm and Kurtze¹⁶ have
shown independently that absorption is linearly propor-
tional to concentration. This led to the explanation that
the process is unimolecular, A ⇌ B, with a relaxing
transition between two energy states of the ion pairs.
Barthel¹⁷ later discussed the concentration behavior
and concluded that if the activity coefficients for ionic
reactants are taken into consideration the experimental
results of Wilson which suggest unimolecular reaction
will fit equally well with the dissociation reaction. Bies¹⁸
also considered the dependence of the activities of the
ions with the passage of sound waves. He studied the
effect of the dielectric constant of the solvent on the
absorption and concluded in favor of a dissociation

9 F. A. Everest and O. H. O'Neill, J. Acoust. Soc. Am. 19, 255 (1946).
10 D. A. Wilson and L. N. Liebermann, J. Acoust. Soc. Am. 19, 286 (1947).
11 R. W. Leonard, J. Acoust. Soc. Am. 20, 254 (1948).
12 L. N. Liebermann, J. Acoust. Soc. Am. 20, 868 (1948).
13 Leonard, Combs, and Skidmore, J. Acoust. Soc. Am. 21, 63 (1949).
14 R. W. Leonard, Tech. Rept. No. 1, Physics Department,
University of California, Los Angeles, California (1950).
15 O. B. Wilson, Jr., Tech. Rept. No. 4, Physics Department,
University of California, Los Angeles, California (195
16 (a) K. Tamm and G. Kurtze, Acustica 3, 33 (1953); (b) ibid.,
p. 42; (c) ibid., p. 41.
17 R. Barthel, J. Acoust. Soc. Am. 24, 313 (1952).
18 D. A. Bies, Tech. Rept. No. 6, Physics Department, Univer-
mechanism as responsible for the excess sound
absorption.

Tamm and Kurtze¹⁹ assume the existence of the
complex MeR·H₂O, where Me = metal and R = radical.
For this complex two different possibilities of dissocia-
tion are assumed in order to explain the presence of two
relaxation frequencies. The other possibilities which
may cause the anomalous sound absorption, according
to Tamm, are ion association or a complicated com-
bination of molecular dissociation and hydrolysis.
Thermal relaxation is ruled out, simply because there is
no appreciable change in the absorption near 4°C where
isothermal propagation is expected to take place. The
assumption of ion association, i.e., the association of
anion and cation to form an ion pair (ion pair ⇌ anion
and cation) gives a similar dependence of absorption
and relaxation on frequency as in the case of the dis-
sociation process. But this assumption explains only one
relaxation mechanism. It does not explain the second
increase in the product of absorption cross section and
wavelength, which is observed in the case of 2-2-valent
electrolytes.

Litovitz and Smithson²⁰ suggest that all observed
data can be explained on the basis of the assumption 
that the reaction is between two dipole molecules. They 
rule out the reaction between two ions of unlike sign as 
the rate-determining step in the primary relaxation 
process. Their results also indicate that the hydrogen 
ion is not involved in the rate-determining reaction 
causing the observed relaxational effects. 


II. THEORY
(a) Excess Sound Absorption 


A plane acoustic wave in an absorbing medium can
be represented in the form

$$\xi = \xi_0 \exp[i\omega(t - x/V_0) - \alpha x], \tag{1}$$

where ξ is the amplitude at a distance x and α is the
amplitude absorption coefficient. This expression can be
written in the form

$$\xi = \xi_0 \exp[i\omega(t - x/V)], \tag{2}$$

$$1/V = (1/V_0) - (i\alpha/\omega). \tag{3}$$
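
The equivalence of the damped-wave form of Eq. (1) and the complex-velocity form of Eqs. (2) and (3) can be checked numerically. The following is a minimal sketch; the function names and all numerical values (frequency, velocity, absorption coefficient) are illustrative assumptions and do not come from the text.

```python
import cmath
import math


def wave_direct(t, x, omega, v0, alpha, xi0=1.0):
    """Eq. (1): plane wave with real velocity v0 and amplitude absorption alpha."""
    return xi0 * cmath.exp(1j * omega * (t - x / v0) - alpha * x)


def wave_complex_velocity(t, x, omega, v0, alpha, xi0=1.0):
    """Eqs. (2)-(3): the same wave written with a complex reciprocal velocity 1/V."""
    inv_v = 1.0 / v0 - 1j * alpha / omega  # Eq. (3)
    return xi0 * cmath.exp(1j * omega * (t - x * inv_v))


# Assumed, illustrative values (cgs units): 100 kc, 1.5e5 cm/sec, 0.01 per cm.
OMEGA = 2.0 * math.pi * 1.0e5
V0 = 1.5e5
ALPHA = 0.01

if __name__ == "__main__":
    a = wave_direct(3.0e-6, 10.0, OMEGA, V0, ALPHA)
    b = wave_complex_velocity(3.0e-6, 10.0, OMEGA, V0, ALPHA)
    print(abs(a - b), abs(a))
```

The imaginary part of 1/V carries the absorption, which is why a complex compressibility leads directly to an attenuation coefficient.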


If it is assumed that the chemical equilibrium is 
pressure dependent, then a change in the pressure will 
cause a shift in the equilibrium. Such a medium has a 
complex compressibility. The isothermal compressi-
bility of a solution may be assumed to be equal to the
isothermal compressibility of the solvent (assumed real)
plus a complex factor due to the perturbed chemical
equilibrium. Thus

$$\beta = \beta_0 + \beta'. \tag{4}$$


19 K. Tamm and G. Kurtze, Acustica 4, 380 (1954). 
20 S. R. Smithson and T. A. Litovitz, J. Acoust. Soc. Am.
The contribution to the intensity attenuation co-
efficient per wavelength of the complex compressibility
is given by

$$2\alpha\lambda = -2\pi\,\mathrm{Im}(\beta'/\beta_0). \tag{5}$$
Consider the case of a two-state equilibrium reaction

$$A \underset{k_2}{\overset{k_1}{\rightleftharpoons}} B.$$

If N represents the molal fraction in the excited state,
then at equilibrium

$$dN/dt = k_1 - (k_1 + k_2)N = 0. \tag{6}$$

Similarly in the case of a dissociation reaction

$$AB \underset{k_2}{\overset{k_1}{\rightleftharpoons}} A + B$$

at equilibrium,

$$dN/dt = k_1(1 - N) - k_2 N^2. \tag{7}$$


If the equilibrium of the reaction is perturbed only
slightly by the change of acoustic pressure, the rate of
return towards equilibrium is, to a first-order ap-
proximation,

$$d(\Delta N)/dt = -K(\Delta N - \Delta N_0), \tag{8}$$

where ΔN₀ represents the magnitude of the small perturba-
tion and K is defined as the equilibrium reaction rate.
In the case where ΔN₀ is a constant the solution is
given by

$$\Delta N = \Delta N_0[1 - \exp(-Kt)], \tag{9}$$

where 1/K is called the time constant or relaxation time
τ of the reaction in the vicinity of equilibrium. However,
if the perturbing force is the acoustic wave, so that the pertur-
bation varies periodically with time, for example, as
exp(iωt), then the solution is given by

$$\Delta N = \Delta N_0/(1 + i\omega/K). \tag{10}$$
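
The two solutions quoted above can be checked against the first-order rate law. The sketch below assumes the restoring form d(ΔN)/dt = −K(ΔN − ΔN₀), which is the form consistent with Eqs. (9) and (10); the function names, the Runge-Kutta integrator, and the numerical values are illustrative choices, not part of the original treatment.

```python
import math


def step_response(K, dN0, t):
    """Eq. (9): response to a constant perturbation dN0 switched on at t = 0."""
    return dN0 * (1.0 - math.exp(-K * t))


def rk4_relaxation(K, dN0, t_end, steps=2000):
    """Integrate d(dN)/dt = -K (dN - dN0) from dN(0) = 0 with classical RK4."""
    def rate(y):
        return -K * (y - dN0)

    h = t_end / steps
    y = 0.0
    for _ in range(steps):
        k1 = rate(y)
        k2 = rate(y + 0.5 * h * k1)
        k3 = rate(y + 0.5 * h * k2)
        k4 = rate(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


def harmonic_amplitude(K, dN0, omega):
    """Eq. (10): complex amplitude for a perturbation varying as exp(i omega t)."""
    return dN0 / (1.0 + 1j * omega / K)


if __name__ == "__main__":
    print(rk4_relaxation(2.0, 1.0, 1.5), step_response(2.0, 1.0, 1.5))
    print(harmonic_amplitude(2.0, 1.0, 3.0))
```

The negative imaginary part of the harmonic amplitude expresses the phase lag of the reaction behind the driving pressure, which is the origin of the absorption.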


Liebermann³ showed that the dynamic partial iso-
thermal compressibility β′ is given by

$$\beta' = \beta_0'/(1 + i\omega/K), \tag{11}$$

where β₀′ is the portion of the static isothermal com-
pressibility which arises from the supposed chemical
reaction.

Neglecting the small specific heat term, the con-
tribution to the sound absorption coefficient arising
from the chemical reaction is given by

$$2\alpha = \frac{\beta_0'}{\beta_0}\,\frac{\omega^2 K}{(\omega^2 + K^2)\,c_0}. \tag{12}$$
Leonard's measurements indicated β₀′/β₀ = 0.5×10⁻⁴
and K = 0.9×10⁶ sec⁻¹ in 0.02 M MgSO₄ at 25°C.
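
The single-relaxation form of β′ in Eq. (11), inserted into Eq. (5), gives an absorption per wavelength that peaks at ω = K with the value π β₀′/β₀. The sketch below evaluates this numerically. The ratio 0.5×10⁻⁴ is the value quoted for Leonard's measurements; the rate K = 0.9×10⁶ sec⁻¹ is an assumed reading of the quoted figure (chosen because it corresponds to a relaxation frequency near the 130 kc reported above), and the function names are hypothetical.

```python
import math

RATIO = 0.5e-4   # beta0'/beta0, as quoted for 0.02 M MgSO4
K_RATE = 0.9e6   # sec^-1; assumed reading of the quoted rate


def absorption_per_wavelength(omega, K=K_RATE, ratio=RATIO):
    """2*alpha*lambda = -2*pi*Im(beta'/beta0), with beta' from Eq. (11)."""
    beta_rel = ratio / (1.0 + 1j * omega / K)
    return -2.0 * math.pi * beta_rel.imag


def closed_form(omega, K=K_RATE, ratio=RATIO):
    """Same quantity written out: 2*pi*ratio*omega*K/(omega^2 + K^2)."""
    return 2.0 * math.pi * ratio * omega * K / (omega ** 2 + K ** 2)


if __name__ == "__main__":
    for factor in (0.1, 0.5, 1.0, 2.0, 10.0):
        omega = factor * K_RATE
        print(factor, absorption_per_wavelength(omega))
    # Relaxation frequency in cycles per second implied by the assumed K.
    print(K_RATE / (2.0 * math.pi))
```

Dividing the per-wavelength value by λ = 2πc₀/ω recovers the ω²K/(ω² + K²) dependence of Eq. (12).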

Liebermann considered two possibilities of the chemi-
cal reaction: unimolecular and dissociation type. Cor-
responding to these types Liebermann derives the fol-
lowing equations for β₀′. For a reaction of the type A ⇌ B,

$$\beta_0' = (V/RT)[\Delta V/V]^2 N(1 - N), \tag{13}$$

where V is the volume of solution containing 1 mole of
solute, R is the gas constant, T is the absolute tempera-
ture, N is the fraction of molecules in state B at equi-
librium, and ΔV is dV/dN. For a reaction of the type
AB ⇌ A + B,

$$\beta_0' = (V/RT)[\Delta V/V]^2[N(1 - N)/(2 - N)]. \tag{14}$$

Here N is the fraction of AB dissociated.

Liebermann concluded that anomalous absorption in 
MgSO₄ had its origin in a dissociation reaction and
stated that for this reaction K is proportional to m
while β₀′ would vary as $m^{1/2}$, where m is the molar
concentration. 

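According to Barthel¹⁷ the concentration behavior of
β₀ and K is represented as follows. For a reaction of the
type

$$A \underset{k_2}{\overset{k_1}{\rightleftharpoons}} B,$$

$$K = k_1 + k_2 \tag{15}$$

and

$$\beta_0 \propto k_1 k_2 (k_1 + k_2)^{-2}\, m; \tag{16}$$

and for

$$AB \underset{k_2}{\overset{k_1}{\rightleftharpoons}} A + B,$$

mostly dissociated,

$$K \propto k_1, \tag{17}$$

$$\beta_0 \propto [(k_2/k_1)\gamma_A\gamma_B m]\, m; \tag{18}$$

and for

$$AB \underset{k_2}{\overset{k_1}{\rightleftharpoons}} A + B,$$

only slightly dissociated,

$$K \propto 2(k_1 k_2 \gamma_A\gamma_B m)^{1/2}, \tag{19}$$

$$\beta_0 \propto [(k_2/k_1)\gamma_A\gamma_B m]^{-1/2}\, m, \tag{20}$$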

where m represents the molar concentration, k₁ is the
rate constant for the forward reaction, k₂ is the rate
constant for the reverse reaction, and the γ's are activity
coefficients. Activity coefficients for ionic reactants
generally decrease rapidly with increase of concentra-
tion. If MgSO₄ ⇌ Mg⁺⁺ + SO₄⁻⁻ is taken as the mecha-
nism corresponding to the case AB ⇌ A + B, mostly
dissociated, the data of Harned and Owen²¹ show that
the mean activity coefficient for the magnesium and
sulfate ions varies as $m^{-1/2}$, with the result that the con-
centration dependence approximately cancels out of the
factor γ_Aγ_B m, making β₀ proportional to m and K inde-
pendent of m even in the dissociation case, similar to
the unimolecular reaction.

21 H. S. Harned and B. Owen, The Physical Chemistry of
Electrolytic Solutions (Reinhold Publishing Corporation, New
York, 1950).

Wilson¹⁵ and Tamm and Kurtze¹⁶ have applied re-
laxation theory to chemical equilibrium and obtain the
following expressions for the absorption coefficient per
wavelength. For the unimolecular reaction

$$A \underset{k_2}{\overset{k_1}{\rightleftharpoons}} B,$$

the absorption per wavelength is

$$\alpha\lambda = \frac{\pi(\Delta V)^2}{\beta_0 RT}\,\frac{CK}{(1+K)^2}\,\frac{\omega\tau}{1+\omega^2\tau^2}, \tag{21}$$

with the angular frequency of relaxation

$$1/\tau = k_1 + k_2. \tag{22}$$

In these equations C is the total concentration of the
solute, ΔV is the difference between the partial molar
volumes, K = C_B/C_A = k₁/k₂ is the equilibrium constant,
k₁ and k₂ are the forward and reverse reaction rates,
respectively, β₀ is the static compressibility, ω is the
angular frequency, and τ is the relaxation time.

For the dissociation reaction

$$AB \underset{k_2}{\overset{k_1}{\rightleftharpoons}} A + B,$$

the results are

$$\alpha\lambda = \frac{\pi(\Delta V)^2}{\beta_0 RT}\,\frac{C\delta(1-\delta)}{(2-\delta)}\,\frac{\omega\tau}{1+\omega^2\tau^2} \tag{23}$$

and

$$1/\tau = k_1 + 2C\delta\Pi k_2 = k_1\{1 + [2(1-\delta)/\delta]\}, \tag{24}$$

where δ is the fractional dissociation and Π is the mean
activity coefficient product.
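
The two forms of the relaxation rate for the dissociation reaction, and the limiting behavior for mostly and slightly dissociated solutes, can be illustrated numerically. The sketch below assumes the mass-action condition k₁(1 − δ) = k₂ΠCδ² at equilibrium, which is the condition under which the two expressions for 1/τ agree; the function names and the numerical rate constants are hypothetical.

```python
import math


def fractional_dissociation(k1, k2, conc, act=1.0):
    """Solve k1*(1 - d) = k2*act*conc*d**2 for the root 0 < d <= 1.

    'act' stands for the mean activity coefficient product (Pi in the text).
    The rationalized form avoids cancellation when k2*act*conc is small.
    """
    a = k2 * act * conc
    return 2.0 * k1 / (k1 + math.sqrt(k1 * k1 + 4.0 * a * k1))


def relaxation_rate(k1, k2, conc, act=1.0):
    """1/tau = k1 + 2*C*delta*Pi*k2 for AB <-> A + B; returns (rate, delta)."""
    d = fractional_dissociation(k1, k2, conc, act)
    return k1 + 2.0 * conc * d * act * k2, d


if __name__ == "__main__":
    for conc in (1.0e-6, 1.0e-2, 1.0, 1.0e4, 1.0e8):
        rate, d = relaxation_rate(1.0, 1.0, conc)
        print(conc, d, rate, 1.0 * (1.0 + 2.0 * (1.0 - d) / d))
```

For small k₂ΠC the solute is almost fully dissociated and the rate tends to k₁; for large k₂ΠC it tends to 2(k₁k₂ΠC)^{1/2}, the square-root concentration dependence discussed above.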

Bies¹⁸ considers a small unit cube in a homogeneous
volume of solution. Supposing that no molecules of
solvent nor of solute get in or out of the cube and that
the dimensions of the cube, which is in a plane sound
field, are small compared to wavelength, then the rate
equation for a dissociation reaction AB ⇌ A + B can be
written in terms of the numbers of molecules and ions
per unit volume as

$$dn_1/dt = k_1 n - k_2 f_\pm^2 n_1 n_2, \tag{25}$$

where AB = C(1−δ) = n/N, A = Cδ = n₁/N, B = Cδ = n₂/N,
and f±² is the mean squared activity coefficient, provided
C is the total concentration of A in either the molecular
or the ionized state and δ the fraction of A in the ionized
state.

At equilibrium

$$dn/dt = -dn_1/dt = -dn_2/dt = 0. \tag{27}$$

Here k₁ and k₂ are assumed to be independent of
concentration while f±² is assumed to be a function of
concentration. All the parameters n, n₁, n₂, k₁, k₂, and
f±² are assumed to be pressure dependent and have a
harmonic time dependence of the form exp(iωt) in a
harmonic sound field. When sound waves pass through
the medium, the concentration of the molecules and the
ions varies because of the variation in their numbers
and the volume. The contribution of the latter to ab-
sorption is, however, negligible.

Thus, the general irreversible rate equation may be 
represented by a Taylor series in terms of small varia- 
tions in all of the parameters. 

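If we consider only linear terms,

$$\begin{aligned} i\omega\,\Delta n_1 ={}& n\,\Delta k_1 - f_\pm^2 n_1 n_2\,\Delta k_2 + k_1\,\Delta n \\ & - k_2 f_\pm^2 (n_1\,\Delta n_2 + n_2\,\Delta n_1) - k_2 n_1 n_2\,\Delta f_\pm^2 \\ & - (k_1 n - 2k_2 n_1 n_2 f_\pm^2)\,\Delta v. \end{aligned} \tag{28}$$

By using the mass action equilibrium law

$$K = k_1/k_2 = (f_\pm^2 n_1 n_2)/n, \tag{29}$$

we get

$$n\,\Delta k_1 - f_\pm^2 n_1 n_2\,\Delta k_2 = k_2 f_\pm^2 n_1 n_2\,(\Delta K/K). \tag{30}$$

From well-known thermodynamic relations

$$\partial(\Delta F_0)/\partial p = \Delta V_0 \tag{31}$$

and

$$\Delta F_0 = -RT \ln K, \tag{32}$$

the following expression for ΔV₀ is obtained:

$$\Delta V_0 = (-RT/K)(\partial K/\partial n)(\partial n/\partial p)_T. \tag{33}$$

Here ΔV₀ is considered to be an empirical constant as
the associated energy change ΔF₀ represents the free
energy change when all the reactants are in their
standard state.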

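The contribution β′ to the isothermal compressibility
due to a perturbation of the dissociation process is

$$\beta' = -(\partial v/\partial n)(\partial n/\partial p)_T. \tag{34}$$

We define

$$\Delta V = N(\partial v/\partial n)_T, \tag{35}$$

$$\beta' = -(\Delta V/N)(\partial n/\partial p)_T. \tag{36}$$

Hence,

$$\Delta V_0 = (+RT/K)(\Delta K/\Delta n)(N\beta'/\Delta V) \tag{37}$$

or

$$\Delta K/K = +[(\Delta V_0)(\Delta V)]\Delta n/(N\beta' RT) = W\Delta n/(N\beta' RT), \tag{38}$$

where W = (ΔV₀)(ΔV). Thus

$$n\,\Delta k_1 - \Delta k_2 f_\pm^2 n_1 n_2 = n k_1 W \Delta n/(N\beta' RT). \tag{39}$$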

To estimate the variation in the unit volume with
the passage of sound waves we note that the variation
in volume Δv must lag behind the acoustic


FIG. 1. Vectorial representation of the variations in volume,
pressure, and number of molecules, with the passage of a sound
wave.


since this gives the work done on the system as positive. 
Δv₀ is the variation in volume of the solvent which is
assumed dissipationless so that it lags behind the
acoustic pressure by 180°. The variation in volume Δv′
due to a shift in the equilibrium is assumed to be in
phase with the variation in the number of molecules Δn.
The resultant change in the volume of the solution is
Δv. A decrease in volume associated with the dissocia-
tion process has been assumed. The foregoing can be 
represented vectorially in Fig. 1. 
From Fig. 1 it follows that 


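$$\Delta v = -(\beta_0 + \beta')\gamma p, \tag{40}$$

$$\Delta v' = -\beta'\gamma p. \tag{41}$$

Since the process is assumed to be adiabatic, the
ratio of the specific heats γ has been introduced.
Also it may be assumed that

$$\Delta v' = (\partial v/\partial n_2)\Delta n_2 = (\Delta V)\Delta n/N.$$

Eliminating Δv′ and γp between the last three expres-
sions, we have

$$\Delta v = \Delta V(\beta_0 + \beta')\Delta n/(N\beta'). \tag{42}$$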


To estimate Δf±², consider a function F defined such
that

$$F = -(Z_1^2 + Z_2^2)\, d(\ln f_\pm^2)/d\Gamma, \tag{43}$$

where Γ is the ionic concentration and is given by

$$\Gamma = N^{-1}\sum_{i} n_i z_i^2. \tag{44}$$

Thus, it can be shown that

$$\Delta f_\pm^2 = -f_\pm^2 F(\Delta n/N - \Gamma'\Delta v), \tag{45}$$

where Γ′ = Γ/(Z₁² + Z₂²).
After substituting Eqs. (39), (42), and (45) into Eq.
(28) and canceling out Δn, the following expression
for β′ is obtained:
F: I= hyn N= EW (RT) — (AV)Bo(PL'—1) Lio- KP, 


r 


Iy! P 
en (ms en efAV—FPIAY)] (40) 
If we assume that (ΔV) is very small, such that
(ΔV) ≪ F and (ΔV) ≪ (ΔV₀), all the terms containing
(ΔV) may be neglected. Thus,

$$\beta' = k_1 n W/[NRT(i\omega - K)],$$

$$K = k_1[1 + n(n_1^{-1} + n_2^{-1} - FN^{-1})]. \tag{47}$$

β′ has a maximum value at ω = K and thus K is the
relaxation frequency.
The maximum value of (2αλ) is given by

$$(2\alpha\lambda)_{\max} = -2\pi\,[\mathrm{Im}(\beta'/\beta_0)]_{\max},$$

where

$$\beta' = n k_1 W/[NRT(i\omega - K)].$$

Thus, the maximum value is

$$(2\alpha\lambda)_{\max} = \pi n k_1 W/(NRTK\beta_0). \tag{48}$$


Expressing in terms of concentration and the degree
of dissociation δ, the maximum value of the energy
absorption per wavelength is given by

$$(2\alpha\lambda)_{\max} = \mu_{\max} = \pi C(1-\delta)k_1 W/(K\beta_0 RT) \tag{49}$$

and

$$K = k_1[1 + (1-\delta)\delta^{-1}(2 - Fc\delta)]. \tag{50}$$


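As the function F always appears as Fcδ, it is better
to consider the latter. The function F, as expressed
earlier, is

$$F = -(Z_1^2 + Z_2^2)\,(\partial/\partial\Gamma)(\ln f_\pm^2) = -8\,(\partial/\partial\Gamma)(\ln f_\pm^2),$$


where Z₁ = −Z₂ = 2 and


$$\ln f_\pm^2 = -4.606\,S\,\Gamma^{1/2}(1 + A\Gamma^{1/2})^{-1} \tag{51}$$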
according to Debye-Hückel theory. If the ionic diameter
is considered equal to 3.08 Å, S and A are
given by

$$S = 5.132 \times 10^{6}\,(DT)^{-3/2}, \tag{52}$$


$$A = 109.6\,(DT)^{-1/2}, \tag{53}$$


where D is the dielectric constant, and T the temperature.
Thus


$$Fc\delta = 18.424\,S\,c\,\delta\,\Gamma^{-1/2}(1 + A\Gamma^{1/2})^{-2}. \tag{54}$$
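
As a numerical illustration of Eqs. (51)-(54), the following Python sketch evaluates S, A, ln f±², and Fcδ. It assumes the exponents read here as 10⁶, −3/2, and −1/2 (the usual Debye-Hückel forms); the function and argument names are hypothetical, and `gamma` stands for the ionic concentration Γ.

```python
def debye_huckel_constants(D, T):
    """Return (S, A) of Eqs. (52)-(53) for dielectric constant D and temperature T (K).

    Assumes an ionic diameter of 3.08 angstroms, as stated in the text.
    """
    dt = D * T
    S = 5.132e6 * dt ** -1.5
    A = 109.6 * dt ** -0.5
    return S, A


def ln_f_squared(gamma, D, T):
    """Eq. (51): ln(f_pm**2) as a function of the ionic concentration gamma."""
    S, A = debye_huckel_constants(D, T)
    root = gamma ** 0.5
    return -4.606 * S * root / (1.0 + A * root)


def F_c_delta(c, delta, gamma, D, T):
    """Eq. (54): the product F*c*delta for a 2-2 electrolyte."""
    S, A = debye_huckel_constants(D, T)
    root = gamma ** 0.5
    return 18.424 * S * c * delta / (root * (1.0 + A * root) ** 2)
```

Differentiating `ln_f_squared` with respect to `gamma` and multiplying by −8 reproduces the coefficient 18.424 = 8 × 4.606 / 2, which is a quick consistency check between Eqs. (51) and (54).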


Eigen²² has considered the general case, which is characterized
by a consecutive reaction of the form given
below:


A⁺⁺ + B⁻⁻ ⇌ (AB)₁ ⇌ (AB)₂ ⇌ ··· ⇌ (AB)ₙ.


According to him, the recombination of oppositely
charged ions is hindered by hydration layers and cannot
be treated as a one-step mechanism. It has to proceed
stepwise. This system has a spectrum of relaxation
times, in which the number of time constants corresponds
to the number of independent steps. The relaxation
times have to be calculated from the whole


22 M. Eigen, Discussions Faraday Soc. 17, 194 (1954); 24, 25
(1957); Chemische Relaxation (Steinkopff, Darmstadt, 1958);
Eigen, Kurtze, and Tamm, Z. Elektrochem. 57, 103 (1953).

system of linearized rate equations. For example, in the
case of a reaction consisting of two steps,


A⁺⁺ + B⁻⁻ ⇌ AB′ ⇌ AB
(states I, II, III)


with rate constants k₁₂, k₂₁, k₂₃, k₃₂, the linearized rate equations are

$$\dot{x}_1 = -k_{12}'x_1 + k_{21}x_2,$$
$$\dot{x}_2 = k_{12}'x_1 - (k_{21} + k_{23})x_2 + k_{32}x_3, \tag{55}$$
$$\dot{x}_3 = k_{23}x_2 - k_{32}x_3,$$

where δC_B = δC_A = x₁, δC_AB′ = x₂, δC_AB = x₃, and k₁₂′
= k₁₂(C_A + C_B), where the C's are the equilibrium concentrations.
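
The relaxation rates of such a two-step system can be sketched numerically. The Python example below is an illustration under stated assumptions: it takes the linearized rate equations in the form ẋ₁ = −k₁₂′x₁ + k₂₁x₂, ẋ₂ = k₁₂′x₁ − (k₂₁ + k₂₃)x₂ + k₃₂x₃, ẋ₃ = k₂₃x₂ − k₃₂x₃, uses the conservation condition x₁ + x₂ + x₃ = 0 to remove one variable, and solves the remaining quadratic for the two nonzero rates 1/τ. The function name is hypothetical.

```python
import math


def two_step_relaxation_rates(k12p, k21, k23, k32):
    """Return (fast, slow) relaxation rates 1/tau for A + B <-> AB' <-> AB.

    k12p is the effective first-order rate k12' = k12*(C_A + C_B).
    The rates are the roots of  r**2 - s*r + p = 0  with
      s = k12' + k21 + k23 + k32,
      p = k12'*(k23 + k32) + k21*k32.
    """
    s = k12p + k21 + k23 + k32
    p = k12p * (k23 + k32) + k21 * k32
    # The discriminant equals (k12'+k21-k23-k32)**2 + 4*k21*k23, so it is never negative.
    disc = math.sqrt(s * s - 4.0 * p)
    fast = 0.5 * (s + disc)
    slow = p / fast  # avoids cancellation when the two rates differ greatly
    return fast, slow
```

When the first step is much faster than the second, the fast rate approaches k₁₂′ + k₂₁ and the slow rate approaches k₃₂ + k₂₃k₁₂′/(k₁₂′ + k₂₁), which is the limiting behavior one expects for a rapidly equilibrated first step.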

The transformation to a new system, ẏᵢ = −yᵢ/τᵢ,
where yᵢ is a new set of concentration variables instead
of the given variables xᵢ, and τᵢ the corresponding relaxation
time, may be written in the vectorial forms

y = Mx
and
x = M⁻¹y, (56)
where the matrices M and M⁻¹ are given by
1 1 1 
1 1 —A 
5 o M= ET 
| hs! en 
. 2 i ee 0 
Ryo! +Ro kiz + kor 
Roikse 1 hoy 
hyo! kag 1+A kız? +ko1 
ksz 1 kiz” 
M= | — —— 1 
hog 1+A Ryo! thor 
4 1 
> — — 0 
ik 1+4 
with
$$A = (k_{32}/k_{23})[1 + (k_{21}/k_{12}')].$$
The relaxation times result as eigenvalues of the functional
equation
1 1 kiz (Rost+Rs2)t+Roiks2 1 
Eige 12 (kos + he) +haiks2 PE C) 
TI m 2 Ryo! + Rar 73 
The relaxing parts of the compressibility are given by 
,_ CBB) 1+ (hi2'/k21) (AV ah 
; 0(2) =— E eee 
5 ee 2-6 1+ (Roof hes) Rie : 
5 ` FAY Roy 
AV2=AVir-n ae A 
s z z Aalto 
- (o) 
o n e a 
a Cp (1—7) (4V3)? ee
80) —=—$—$<—$— ea” Ve 


(58) 
RT ee 


where ΔV_{I-II} = V_A + V_B − V_{AB′}, ΔV_{II-III} = V_{AB′} − V_{AB}.
β and γ are degrees of dissociation: β = C_A/C₀, (1−β)
= (C_{AB′} + C_{AB})/C₀, γ = C_A/(C₀ − C_{AB}), (1−γ) = C_{AB′}/(C₀ − C_{AB}).
The excess absorption per wavelength is given by


2—¥ 


WT3 
ee |. 
(ad) Al “| Boe —— fee Boa)” Aa (59) 


If k₁₂′, k₂₁ ≫ k₂₃, k₃₂, the first step of the reaction is
identical with the relaxation effect due to τ₃. For
C_{AB} → 0 (γ = β = δ) it becomes identical with the binary
one-step equilibrium treated by Bies. The other effect
(τ₂) corresponds to the final equilibrium between the
state AB and the two states A⁺⁺, B⁻⁻ and AB′. The
relaxation time τ₂ is almost independent of concentration,
except in a small transition range at concentrations
C_A for which k₁₂′ ≈ k₂₁. Such a treatment seems to
describe the experimental facts over a more extended
concentration range, which Bies's theory fails to do. That theory could
not explain the behavior for concentrations above
10⁻² M, where the relaxation time as well as the
absolute value of (αλ/C) becomes almost independent
of concentration.


(b) Activation Energy for the Rate Process 


The temperature dependence of the relaxation fre- 
quency may be utilized to determine an experimental 
activation energy for the rate process. 

Consider the reaction A → M‡ → B, where M‡ represents
the activated or transition state for the reaction.

According to Eyring's reaction rate theory, the specific
rate for the forward reaction is given by


$$k = (k'T/h)(C^{\ddagger}/C_A) = (k'T/h)K^{\ddagger}, \tag{60}$$


where K‡ is the equilibrium constant for the activation
reaction, k′ is Boltzmann's constant, h is Planck's constant,
T is temperature, and C‡ and C_A are the concentrations
of the indicated species. k′T/h represents
the rate at which molecules pass from the activated
state. The true equilibrium constant for the activation
reaction is


$$K_0^{\ddagger} = K^{\ddagger}\nu^{\ddagger}/\nu_A, \tag{61}$$


where ν‡ and ν_A are the activity coefficients. Then the
reaction rate is given by

$$k = (k'T/h)(\nu_A/\nu^{\ddagger})K_0^{\ddagger} = (k'T/h)(\nu_A/\nu^{\ddagger})\exp(-\Delta F^{\ddagger}/RT), \tag{62}$$


where ΔF‡ is the standard free energy of activation.
Hence the specific rates for the forward and
FIG. 2. A two-state potential-energy diagram (abscissa: reaction coordinate).


the reverse reaction are

$$k_f = (k'T/h)\exp(-\Delta F_f^{\ddagger}/RT), \tag{63}$$
$$k_r = (k'T/h)\exp(-\Delta F_r^{\ddagger}/RT), \tag{64}$$

considering the activity coefficients as unity. Figure 2
illustrates the relation between the energies in the three
states.
The relaxation time τ is given by


$$\tau = 1/(k_f + k_r),$$


where
$$k_f + k_r = (k'T/h)[\exp(-\Delta F_f^{\ddagger}/RT) + \exp(-\Delta F_r^{\ddagger}/RT)].$$
If we assume that k_f ≪ k_r,
$$\tau = 1/k_r = (h/k'T)\exp(\Delta F_r^{\ddagger}/RT). \tag{65}$$
Hence the relaxation frequency f_m is given by


$$f_m = (k'T/2\pi h)\exp(-\Delta F_r^{\ddagger}/RT). \tag{66}$$


Assuming the activation energy to be independent of the
temperature,


$$\log(f_m/T) = -(\Delta H_r^{\ddagger}/RT)\log e + \text{constant}, \tag{67}$$

with

$$\Delta F_r^{\ddagger} = \Delta H_r^{\ddagger} - T\Delta S_r^{\ddagger},$$


where ΔH_r‡ is the enthalpy and ΔS_r‡ is the entropy of
activation.
The activation enthalpy can be obtained from the
slope of the curve log(f_m/T) vs 1/T. Since the p dV term
is negligible, ΔH is approximately the activation energy.
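
The slope method can be sketched as a short least-squares fit. The Python example below is illustrative only: it assumes log₁₀(f_m/T) is linear in 1/T with slope −(ΔH/R) log₁₀ e, as in Eq. (67); the function name and the choice R = 8.314 J mol⁻¹ K⁻¹ are assumptions made for the example.

```python
import math

R_GAS = 8.314  # gas constant in J/(mol K); assumed units for this sketch


def activation_enthalpy(temps_kelvin, relaxation_freqs_hz):
    """Estimate the activation enthalpy (J/mol) from f_m measured at several T.

    Fits log10(f_m/T) against 1/T by ordinary least squares and converts
    the slope using  slope = -(dH/R) * log10(e).
    """
    xs = [1.0 / t for t in temps_kelvin]
    ys = [math.log10(f / t) for f, t in zip(relaxation_freqs_hz, temps_kelvin)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return -slope * R_GAS / math.log10(math.e)
```

At least two distinct temperatures are required; with real data the scatter about the fitted line gives a rough indication of how well the assumption of a temperature-independent activation energy holds.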

(c) Different Mechanisms


(a) The simplest dissociation reaction for the primary
relaxation, which has been suggested by Liebermann
and supported by Leonard, Bies, and Verma
and Kor²³ for a 2-2-valent metallic sulfate MSO₄, is

MSO₄ ⇌ M⁺⁺ + SO₄⁻⁻. (68)


23 G. S. Verma and G. K. Kor, Proc. Phys. Soc. (London) 72,
[page illegible]; J. Chem. Phys. 29, 9 (1958).

(b) Barthel has suggested the following mechanism:

M⁺⁺ + H₂O ⇌ MOH⁺ + H⁺. (69)


(c) Tamm and Kurtze assume the existence of a
complex of the form MeR·H₂O, where Me = metal and
R = radical. For example, in the case of MgSO₄ the
complex is represented by


[Structural diagram of the MgSO₄·H₂O complex.]


For this complex two different possibilities of dissociation
are assumed.


(1) The formation of two univalent ions,


(MgSO₄, H₂O) → MgOH⁺ + HSO₄⁻, (70)

which each split up in a subsequent step:

MgOH⁺ + HSO₄⁻ → Mg⁺⁺ + OH⁻ + H⁺ + SO₄⁻⁻. (71)


The first step is considered to be acoustically inefficient. 
The two dissociation processes are independent of each 
other and explain the two relaxation frequencies which 
are observed. For example, in MgSO₄ the dissociation
of MgOH⁺ explains the relaxation frequency at 130 kc
and the dissociation of HSO₄⁻ the second relaxation
frequency at 200 Mc.

(2) Another possibility exists in the separation
either of the metal ion or of the radical ion from the
complex in the form


MgSO₄, H₂O → Mg⁺⁺ + H₂O·SO₄⁻⁻


→ Mg·H₂O⁺⁺ + SO₄⁻⁻. (72)


(d) Smithson and Litovitz assume the reaction to
be between two dipole molecules, i.e., the reaction between
the solute molecules (ion pairs) and the dipole
solvent. The great excess of the solvent assures that
this reaction is of first order. The reaction might be of
the type


MnSO₄ + H₂O ⇌ Mn·H₂O⁺⁺ + SO₄⁻⁻. (73)


(e) According to Eigen,²² the simplest possible mechanism
allowing an explanation of all hitherto found results
may be represented by the following scheme:


A⁺⁺ + B⁻⁻ ⇌ AB′ ⇌ AB″ ⇌ AB‴ (74)
(states I, II, III, IV)


The first equilibrium (I-II) corresponds to diffusion-controlled
steps of interaction of the completely hydrated
ions, which is present in all electrolytic solutions,
especially in those which do not show any detectable
absorption below 300 Mc. A relaxation effect due to this
interaction should be expected at frequencies higher
than 10³ Mc. The relaxation effects found in 2-2-valent
electrolytes are connected with further consecutive reaction
steps in the above scheme (II-III, III-IV). The
physical reality of such discrete steps is supported by
the fact that the maxima, at least at lower frequencies,
have the shape of monochromatic relaxation curves.
However, there exists the possibility of more complicated
mechanisms in which not all the relaxation steps are
detectable.

(f) Recently Verma²⁴ has proposed a mechanism
which shows that when H₂O is replaced by D₂O as the
solvent, the primary relaxation frequency should not
change, whereas the rate of the process responsible for the
secondary relaxation is altered considerably. The mechanism
is as follows:


MSO₄, nH₂O ⇌ [M, (n−1)H₂O]⁺⁺ + [SO₄, H₂O]⁻⁻ (a)
M⁺⁺ + 2H₂O ⇌ (MOH)⁺ + (H₃O)⁺ (b)
SO₄⁻⁻ + (H₃O)⁺ ⇌ (HSO₄)⁻ + H₂O (c)
MOH⁺ + (HSO₄)⁻ ⇌ MSO₄ + H₂O. (d)


The process (a) with higher activation energy corre- 
sponds to the primary relaxation frequency and the 
process (d) with the lower activation energy to the 
secondary relaxation frequency. Processes (b) and (c) 
are acoustically inefficient. 


II. METHODS FOR MEASURING 
THE ABSORPTION 


The methods for measuring sound absorption can be
classified into two groups depending upon the technique
utilized, namely, (1) the progressive wave technique and
(2) the standing wave technique. In the former, absorption
is determined from the decrement in the intensity of the
plane progressive sound waves, governed by the relation


$$I_x = I_0 e^{-2\alpha x}, \tag{75}$$


where I₀ is the intensity at the source, I_x the intensity
at a distance x from the source, and α the attenuation
constant. This is only applicable to measurements
above 1 Mc, as the absorption, being proportional to
the square of the frequency, decreases to such an extent
below 1 Mc that the measurements are not reliable.
How low the absorption in water is at frequencies
below 1 Mc is evident from the fact that the amplitude
of a plane wave is reduced only by 10% in a distance
of 1 km. Thus, in order to have accurate and reliable
results there should be very large volumes of the liquid
available in the laboratory, which is hardly possible on
such a large scale. Moreover, in order to have the
proper directivity of the sound beam, the transducer
should be very large.
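
The progressive-wave determination can be illustrated with a few lines of Python. This is a sketch under the assumption that α is the amplitude attenuation constant, so that the intensity falls as exp(−2αx); the function names are hypothetical.

```python
import math

NEPER_TO_DB = 20.0 / math.log(10.0)  # 1 neper of amplitude attenuation in decibels


def alpha_from_intensities(i1, i2, x1, x2):
    """Amplitude attenuation constant from intensities i1, i2 at distances x1, x2.

    Assumes I(x) = I0 * exp(-2 * alpha * x); the result is in nepers per
    unit of the distance used for x1 and x2.
    """
    if i1 <= 0 or i2 <= 0 or x1 == x2:
        raise ValueError("intensities must be positive and distances distinct")
    return math.log(i1 / i2) / (2.0 * (x2 - x1))


def alpha_from_amplitude_fraction(fraction_remaining, distance):
    """Attenuation constant when the amplitude falls to a given fraction over a distance."""
    if not 0.0 < fraction_remaining <= 1.0 or distance <= 0:
        raise ValueError("need 0 < fraction <= 1 and distance > 0")
    return -math.log(fraction_remaining) / distance
```

For instance, a 10% amplitude reduction over 1 km corresponds to `alpha_from_amplitude_fraction(0.9, 1000.0)`, roughly 1.05 × 10⁻⁴ neper per meter, which shows why a laboratory-sized path is far too short at low frequencies.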

These difficulties are considerably reduced if absorption
is measured by the standing wave technique.
In this the absorption is determined from the decay
with time of the energy E of a standing wave between


24 G. S. Verma, J. Chem. Phys. 29, 1186 (1958).
the boundaries of the liquid. The decay is governed by
the relation
$$E = E_0 e^{-2\delta t}, \tag{76}$$


where δ = αc, c being the velocity of propagation of sound
waves. The difficulty here lies in the elimination of the
boundary losses, which influence the decay constant considerably.
The decay constant can be determined either
by the resonance method or by the reverberation method.
In the resonance method the decay constant is determined
from the decay time of only a single mode of
oscillation of the system consisting of the liquid plus
the container, whereas in the reverberation method the
system is excited to a large number of normal modes
of oscillation.


(a) Methods Based upon Progressive 
Wave Technique 


Methods which are based upon the progressive wave 
technique can be further classified into three main 
groups which are (1) mechanical methods, (2) optical 
methods, and (3) electrical methods. 


(1) Mechanical Methods


The mechanical method, which is in frequent use, 
depends upon the measurement of sound radiation 
pressure. When sound waves meet an obstacle the 
pressure at the front differs from that at the back. The 
pressure behind the obstacle is assumed to be the 
pressure in the medium at rest, i.e., in the absence of the 
sound beam. According to Lord Rayleigh the difference
of the pressure, which is known as radiation pressure, is
given by


$$P = \tfrac{1}{2}(\gamma + 1)(I/C), \tag{77}$$


where γ is the ratio of specific heats, C the velocity,
and I the intensity of sound waves. Thus if the radiation
pressure is measured at different distances, the
absorption coefficient for the intensity is easily found
from the variation with distance.

The actual radiation pressure, however, as encountered
in an acoustic beam under ordinary experimental
conditions is different from Rayleigh's pressure.
The mean radiation pressure P per unit area of a plane
compressional wave is defined by


$$P = \overline{p} + \overline{\rho u^2}, \tag{78}$$


where p is the excess pressure of the acoustic wave,
ρ the density of the medium, and u the velocity. By
generalization of Eq. (78), the mean radiation pressure
upon an obstacle of arbitrary shape in an acoustic field
can be written


$$\overline{\mathbf{P}} = \overline{p\,\mathbf{n}} + \overline{\rho\,\mathbf{u}(\mathbf{u}\cdot\mathbf{n})}, \tag{79}$$

where u now stands for the vector of velocity
u(x,y,z) and n is the unit vector normal to the
surface element under consideration. According to
Borgnis,²⁵ the radiation pressure for a beam of finite
width in any fluid and on any plane reflecting surface
is given by

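$$\overline{P} = 2\langle\langle E\rangle\rangle_{kin} = \langle\langle E\rangle\rangle_{total} + (\langle\langle E\rangle\rangle_{kin} - \langle\langle E\rangle\rangle_{pot}). \tag{80}$$
At small amplitudes ⟨⟨E⟩⟩_pot = ⟨⟨E⟩⟩_kin and, therefore,
$$2\langle\langle E\rangle\rangle_{kin} = \langle\langle E\rangle\rangle_{total} = \overline{E}_i(1 + \beta^2), \tag{81}$$


where Ē_i is the mean energy density of the incident
waves and β the reflection coefficient. Hence


$$\overline{P} = 2\overline{E}_{kin} = \overline{E}_i(1 + \beta^2) = \tfrac{1}{2}\rho_0\omega^2\xi_0^2(1 + \beta^2) = (I/C)(1 + \beta^2). \tag{82}$$


The above expression is valid both in liquids and gases
when terms of third and higher order in kξ₀ = (2π/λ)ξ₀
are excluded.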

There are various ways to measure the radiation 
pressure. One way uses a torsion balance, which carries 
on one side a plate or cone which is struck by sound 
waves, while on the other side weights can be placed to 
balance the upthrust upon the plate or cone. Sometimes 
the displacement of the plate or the bead at the end of 
a long wire when a sound beam traveling horizontally 
impinges upon it, is used as a measure of sound radia- 
tion pressure. Another convenient way to measure the 
intensity uses a movable vane forming one plate of the 
condenser, and variations in the capacitance of the 
system are noted. W. Rufer, employing the above 
technique, successfully measured the absorption in a 
large number of electrolytes over a frequency range of 
2.9-8.85 Mc for concentrations of 10% and 1 mole/liter. 

Ultrasonic absorption in aqueous solutions of MnSO₄
has been studied by the pressure balance method in the
frequency range 12-43 Mc by Smith, Barrett, and
Beyer²⁷ with the purpose of examining the behavior of
aqueous solutions at higher frequencies.

McNamara and Beyer²⁸ have suggested a variation
of the radiation pressure method which possesses an accuracy
comparable to that of pulse and optical methods.
It consists in modulating the transducer at an audio
frequency and detecting the consequent low-frequency
variations in the radiation pressure with a condenser
microphone.

According to Markham, Beyer, and Lindsay,²⁹ the
measurements with this method below 3 Mc are not
reliable, since difficulties arise such as those due to retarding
forces which might be comparable to the forces
which are being measured, specular reflections from the
walls of the container, hydrodynamic flow, failure of
the detector to intercept the divergent sound beam at
low frequencies, and cavitation at high intensities.
25 F. E. Borgnis, Revs. Modern Phys. 25, 653 (1953).
26 W. Rufer, Ann. Physik 41, 301 (1942).
27 Smith, Barrett, and Beyer, J. Acoust. Soc. Am. 23, 71 (1951).
28 McNamara and Beyer, J. Acoust. Soc. Am. 25.

Another convenient and accurate method of measuring
the absorption coefficient is provided by the velocity
of acoustic streaming in a fluid. It is now a well-
established fact that there exists a time-independent
gradient of radiation pressure associated with the propagation
of plane progressive waves of sound through
an attenuating fluid medium. This constant gradient
produces an acceleration of the fluid and hence streaming.
Viscous forces arise which resist the streaming until
the steady-state condition is reached, when these forces
exactly balance the gradient of radiation pressure. According
to Eckart³⁰ (1948), Liebermann³¹ (1949), Karim
and Rosenhead³² (1952), and Karim³³ (1953), measurements
of the streaming velocity provide an independent
method of determining the ratio of the second
coefficient of viscosity to the coefficient of shear viscosity.
On the other hand, Nyborg³⁴ (1953) and Piercy and
Lamb³⁵ (1954) have established both by theory and
experiment that the sound attenuation rather than
the second viscosity is the factor responsible for
streaming. This conclusion was originally pointed out
by Fox and Herzfeld³⁶ (1950). Markham³⁷ (1952) too
supported this view.

The experimental arrangement for measuring the
sound absorption in liquids at frequencies around 1 Mc
is shown in Fig. 3. The main tube is completely filled
by the beam of progressive sound waves, and the side
tube, into which the progressive sound waves do not
extend, provides the return path of the streaming.

According to Poiseuille, the flow velocity v₀ of a fluid
along the axis of a tube of radius r and length l is given by


$$v_0 = \Delta p\,r^2/4\eta l, \tag{83}$$


where Δp is the difference in pressure between the ends
of the side tube and η is the coefficient of shear viscosity
of the fluid, provided the effect of the corners of the side
FIG. 3. Sound absorption measurement by acoustic
streaming method.


30 C. Eckart, Phys. Rev. 73, 68 (1948).
31 L. Liebermann, Phys. Rev. 75, 1415; 76, 440 (1949).
32 S. M. Karim and L. Rosenhead, Revs. Modern Phys. 24, 108
(1952).
33 S. M. Karim, J. Acoust. Soc. Am. 25, 997 (1953).
34 W. L. Nyborg, J. Acoust. Soc. Am. 25, 68 (1953).
35 J. E. Piercy and J. Lamb, Proc. Roy. Soc. (London) A226,
43 (1954).
36 F. E. Fox and K. F. Herzfeld, Phys. Rev. 78, 156 (1950).
37 J. J. Markham, Phys. Rev. 86, 497 (1952).
tube on the flow is neglected, which is true for small velocities
of streaming. The difference in pressure Δp
arises because of the change in radiation pressure due to
attenuation of the sound waves over a length l in the main
tube. Thus, the streaming velocity on the axis of the
tube is given by


$$v_0 = E_0 r^2 (1 - e^{-2\alpha l})/4\eta l, \tag{84}$$


where E₀ is the acoustic energy density at the entrance
to the side tube nearer the source, which is experimentally
known by using the radiation pressure vane.
Small aluminum particles are suspended in the liquid,
and the velocity of flow is determined by noting their
motion through a microscope; η is taken from published
tables. According to Piercy, absorptions accurate
within 5% can be measured. Recently Hall and Lamb³⁸
have extended the frequency range of this method down
to 100 kc, with an accuracy of ±11% at 130 kc.
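
The streaming relation can be inverted for the absorption coefficient. The Python sketch below assumes the form v₀ = E₀r²(1 − e^(−2αl))/(4ηl), i.e., the Poiseuille expression with Δp = E₀(1 − e^(−2αl)); the function names are hypothetical, and consistent units (for example SI) are assumed for all arguments.

```python
import math


def streaming_velocity(alpha, e0, r, l, eta):
    """Axial streaming velocity for attenuation alpha over a length l of the main tube.

    e0: acoustic energy density at the entrance, r: side-tube radius,
    l: tube length, eta: shear viscosity.
    """
    return e0 * r * r * (-math.expm1(-2.0 * alpha * l)) / (4.0 * eta * l)


def alpha_from_streaming(v0, e0, r, l, eta):
    """Invert the streaming relation to obtain the attenuation constant alpha."""
    x = 4.0 * eta * l * v0 / (e0 * r * r)  # equals 1 - exp(-2*alpha*l)
    if not 0.0 <= x < 1.0:
        raise ValueError("measured velocity is inconsistent with the model")
    return -math.log1p(-x) / (2.0 * l)
```

When 2αl is small the expression reduces to v₀ ≈ E₀r²α/(2η), so the measured velocity is then directly proportional to the absorption coefficient.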


(2) Optical Method 


The optical method is based on the Debye-Sears
effect and was first developed by Bicquard.³⁹
Recently many improvements in the general technique
have been made by Burton,⁴⁰ Willis,⁴¹ and Sette.⁴²
Kurtze and Tamm used the optical method for the
measurement of ultrasonic absorption in electrolytic
solutions in the frequency range 3-200 Mc.

The experimental arrangement used by Tamm and
Kurtze for the frequency range 3-15 Mc is shown in
Fig. 4. The sound beam from the quartz crystal is crossed
perpendicularly by a parallel beam of light which, by
moving the mirrors, can be shifted along it. The liquid
tank is 1 m in length and has rubber wedges placed at
its ends to avoid the formation of standing waves. The
first order of the resulting diffraction pattern is given
to a photoelectric cell, which is followed by a multiplier.
The output of this multiplier is connected to a logarithmic
level recorder. The sound is modulated and the
multiplier is also tuned to the modulation frequency.
In the frequency range 20-100 Mc the principle of the
method is the same except that the vessel itself is
moved instead of the light beam.

FIG. 4. Sound absorption measurement by the optical method.


(3) Electrical Methods 


There are three general electrical methods for meas- 
urement of ultrasonic absorption in fluids based upon 
the progressive wave technique. They are (1) inter- 
ferometric methods, (2) direct methods, and (3) pulse 
method. 

The interferometric methods consist essentially of a 
plane quartz crystal which acts as a transducer and a 
reflector set parallel and opposite to the radiating face 
of the crystal. The distance between them can be varied 
either by moving the reflector or the transducer itself. 
Sound waves emanating from the crystal are reflected 
from the reflector, and stationary waves are formed in 
the space in between them. If the reflected waves reach- 
ing the radiating face are 180° out of phase with the 
signal just coming out, the disturbance just at the radiating
face will be reduced to zero and the plate current in the
output stage of the driver oscillator will increase. As
the reflector is moved towards or away from the source,
the plate current increases every time the reflector is
moved away a half wavelength. According to Pielemeier,⁴³
the current reading in the output stage of the driving
oscillator is proportional to the excess pressure in the sound
beam at the face of the crystal. Thus, if x represents the
distance through which the reflector is moved, the increase
in path length will be 2x and the pressure of the
returning wave is decreased by e^(−2αx). Hence the absorption
can be determined by noting the current at
two maxima and its decrement with distance.
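
In its simplest form, the interferometric estimate can be sketched as follows. The Python example is an idealization, not the full interferometer theory: it assumes the varying part of the current is proportional to the pressure of the returning wave, which falls as exp(−2αx) when the reflector is moved by x. The function name and arguments are hypothetical.

```python
import math


def alpha_from_current_maxima(i_first, i_later, n_half_wavelengths, wavelength):
    """Estimate alpha from the current at two maxima of the interferometer.

    i_first, i_later: current readings (above the no-reflection level) at
    two maxima separated by n_half_wavelengths reflector steps of
    wavelength / 2 each. Returns nepers per unit of wavelength's length unit.
    """
    if i_first <= 0 or i_later <= 0 or n_half_wavelengths <= 0 or wavelength <= 0:
        raise ValueError("all inputs must be positive")
    reflector_travel = n_half_wavelengths * wavelength / 2.0
    # The path length grows by twice the reflector travel.
    return math.log(i_first / i_later) / (2.0 * reflector_travel)
```

Using maxima many half wavelengths apart reduces the effect of reading errors on the logarithm of the ratio.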

The whole picture is not so simple, for it has been
shown by Hubbard⁴⁴ and others that the current is a
complicated function of the absorption coefficient, the
reflection coefficient of the reflecting surface, and the
distance moved by the reflector. Hubbard used a buffer
amplifier between the oscillator and the interferometer
to reduce the effects of load variations on the frequency
of the oscillator. Alleman⁴⁵ showed that lack of perfect
alignment between the transducer and the reflector
leads to erroneous results. Pumper⁴⁶ has discussed the
departure of the sound beam from a plane wave. Some
uncertainty regarding the reflection also remains, as it
is alleged to be modified by heat conduction. This
method has been used by several workers⁴⁷⁻⁴⁹ including
the author.⁵⁰
In the direct method the excess pressure or the intensity
is measured by the microphone, which is placed
along the axis of the sound beam. The method is not so
simple as it appears to be, as many difficulties are
associated with it. In the case of the transducer, which
is assumed to be a pistonlike source, Fresnel diffraction
takes place within a distance of a²/λ from the
crystal, where a is the radius of the transducer, and the
typical Fraunhofer pattern is observed at much greater distances.
King⁵¹ and Born⁵² have deduced approximate
theoretical expressions for the radiation field which
can be used for the determination of the absorption
coefficient. A correction has to be made for the finite size
of the microphone in case its dimensions are not small
compared to the wavelength. Standing waves, as usual,
offer a problem, which can be reduced by tilting the
microphone so that the face of the microphone is not
perpendicular to the axis of the beam.

The direct method has been successfully utilized by
Smith, Barrett, and Beyer²⁷ in the determination of
ultrasonic absorption in MgSO₄ in the neighborhood of
3 Mc. The detector consisted of a barium titanate
crystal connected in parallel with a selenium detector.
These two units are mounted in the edge of a probe and
connected with a Rubicon galvanometer. The response
of the rectifier as read on the galvanometer is proportional
to the average intensity of the sound beam over
the crystal face.

Any method which utilizes continuous waves suffers
from two defects. First, there always exists the possibility
of stationary waves being formed, which leads to
erroneous results of absorption, and secondly, local
heating due to sound energy in the main beam may
cause temperature waves from the center of the sound
beam, which is at a higher temperature, towards the
edges of the beam, which are at a comparatively lower
temperature. Because of the change in temperature,
the temperature recorded by the thermometers kept at
a certain distance from the main beam is not the true
temperature at which absorption is being measured.
But these difficulties are removed in the pulse method.

The experimental arrangement used by Pellam and
Galt⁵³ for the measurement of ultrasonic absorption in
liquids is shown in Fig. 5. The arrangement consists of
a quartz transducer and polished reflector set up
parallel to each other. The pulsed radio-frequency
voltage at 15 Mc with a repetition rate of the order of
1000 per sec and pulse width of 1 μsec was applied to
the quartz crystal. Thus the average power is only
1/1000 that of a continuous signal at the same amplitude.
The same crystal which acts as a transducer is
used as a receiver. The distance between the crystal
and the reflector can be arranged such that the reflected
signal arrives at the instant when there is no transmission
of pulse, so that stationary waves are not
formed. "The scheme is essentially to use the liquid
sample as a storage medium for short sound pulses and
to measure the time delay and the attenuation undergone
by the sound in traversing a known path within
the liquid. The acoustical pulses are generated from
electrical pulses by means of a crystal transducer and
are converted back to electrical form upon completing
a transit through the liquid. The effect of changing the
acoustic length can be compensated for electrically.
Hence, the measured delay produced by the increase in
path length gives a measure of the sound velocity; the
attenuation which must be removed from the electrical
circuit to balance the acoustical losses in additional distances
provides a measure of the absorption. With the
help of suitable electronic equipment the initial and
the reflected pulses are compared on an oscilloscope.
The initial pulse is generally allowed to pass through a
calibrated attenuator and the attenuation required to
reduce the size of the initial pulse to match that of the
reflected pulse measures the power loss in transmission
plus losses due to reflection. When the reflector is
moved through a distance x it will mean an increase in
path length by 2x. Thus, if the attenuation noted is
plotted against the distance 2x, the absorption coefficient
can be determined from the graph."⁵³

FIG. 5. Sound absorption measurement by the pulse method.

Litovitz and his associates have studied ultrasonic
absorption in MnSO₄ and NH₄OH by the pulse method.
Lamb and his associates have utilized the pulse technique
in studying the relaxation due to rotational isomerism
in unsaturated aldehydes and ketones⁵⁴ and triethylamine.⁵⁵
Young and Petrauskas have also studied rotational
isomers of methylbutanes⁵⁶ by the pulse method.
Recently, the details of the pulse technique have been
reviewed by Lamb and others.⁵⁷


43 W. H. Pielemeier, Phys. Rev. 34, 1184 (1929).
44 J. C. Hubbard, Phys. Rev. 38, 1011 (1931); 41, 523 (1932).
45 R. S. Alleman, Phys. Rev. 55, 87 (1939).
46 E. J. Pumper, J. Phys. U.S.S.R. 1, 41% (1939).
47 F. E. Fox, Phys. Rev. 52, 973 (1937).
48 J. L. Hunter, J. Acoust. Soc. Am. [volume illegible], 36 [year illegible].
49 J. L. Hunter and F. E. Fox, J. Acoust. Soc. Am. [illegible].
50 G. S. Verma, J. Chem. Phys. 18, 1352 (1950).
51 L. V. King, Can. J. Research 11, 135; 11, 484 (1934).
52 Born, Z. Physik 120, 388 (1948).
53 J. R. Pellam and J. K. Galt, J. Chem. Phys. 14, 608 (1946).
54 M. S. de Groot and J. Lamb, Trans. Faraday Soc. 51, 1676
(1955).
55 E. L. Heasell and J. Lamb, Proc. Roy. Soc. (London) A237,
233 (1956).
56 J. M. Young and A. A. Petrauskas, J. Chem. Phys. 25, 943
(1956).
57 Andreae, Bass, Heasell, and Lamb, Acustica 8, 131 (1958).


(b) Methods Based upon Standing
Wave Technique


As stated earlier, there are two methods based upon
the standing wave technique, namely, the reverberation
method and the resonance method. These are frequently
used for the measurement of ultrasonic absorption
below 1 Mc. The reverberation technique has been used
by Mulders,⁵⁸ Moen,⁵⁹ and Tamm and Kurtze. The
resonance method has been used by Leonard, Bies, and
Tamm and Kurtze.


(1) Resonance Method


The resonance method differs from the reverberation
method in several aspects, the most important of which
is the use of a single resonant mode of the spherical
resonator. "A small piezo-electric crystal transducer
attached to the wall of a liquid-filled spherical resonator
is used to excite one of the natural modes of the resonator
until the sound intensity reaches a sufficiently high
level. The transducer is then switched from the exciting
amplifier to a receiving amplifier. The sound energy
level now in the resonator decays exponentially at a
rate determined by the energy losses in the system. A
high-speed power level recorder which plots the logarithm
of the decaying receiver output voltage against
time, yields a straight decay curve, the slope of which
is easily measured. If the losses of the sound energy at
the walls and supports of the resonator are negligible,
the decrease in sound level results only from the absorption
of sound in the contained liquid, and thus the
decay rate is a measure of the sound absorption coefficient
for the liquid. The decay rate in decibels per second
divided by the sound velocity yields the plane wave
absorption coefficient in decibels per unit of distance."
The schematic diagram of the apparatus is shown in
Fig. 6.
FIG. 6. Sound absorption measurement by the resonance method.

The system is normally in the receiving condition as
shown. Boundary losses, which are an important factor,
can be eliminated by subtracting the decay constant
corresponding to the case when the resonator is filled
with a standard liquid. Thus the true absorption of the
electrolyte is known from


$$\alpha_{electrolyte} = \alpha - \alpha_{standard} = (\delta - \delta_{standard})/C. \tag{85}$$
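
The reduction of a decay record to an absorption coefficient can be sketched as follows. This Python example is a minimal illustration, assuming that the recorded level in decibels falls linearly with time and that the boundary losses are the same for the electrolyte and for the standard liquid; the function names are hypothetical.

```python
def decay_rate_db_per_s(times_s, levels_db):
    """Least-squares decay rate (positive, dB per second) of a level-vs-time record."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_l = sum(levels_db) / n
    stl = sum((t - mean_t) * (lv - mean_l) for t, lv in zip(times_s, levels_db))
    stt = sum((t - mean_t) ** 2 for t in times_s)
    return -stl / stt


def electrolyte_absorption(decay_rate, decay_rate_standard, sound_velocity):
    """Absorption of the electrolyte in dB per unit distance.

    Subtracts the decay rate measured with the standard liquid, then
    divides by the sound velocity (dB/s divided by m/s gives dB/m).
    """
    return (decay_rate - decay_rate_standard) / sound_velocity
```

For example, decay rates of 3 dB/sec with the electrolyte and 1 dB/sec with the standard liquid, at a sound velocity of 1500 m/sec, give (3 − 1)/1500 dB per meter.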
The advantages of having a spherical glass resonator 
are many. First, the deformation and wall frictional 
losses are minimized by making the walls extremely 
thin, which is possible in this stable shape. Secondly, 
there exist certain radial modes of vibrations of which 
boundary losses can be neglected. Thirdly, it is possible 
to estimate theoretically the boundary losses consider- 
ing the deviations of the sound field in the case of an 
ideal spherical resonator. Bies used round-bottom 
twelve-liter Pyrex boiling flasks as resonators. 


(2) Reverberation Method 


In the reverberation method the vessel which is filled 
with the liquid is excited to a large number of normal 
modes of oscillations so that a diffuse sound field occurs. 
The absorption coefficient is determined from the rate 
of decay of the sound. The decay of sound energy is 
exponential provided the boundary losses affect all the 
normal modes of vibration nearly uniformly. The na- 
ture of the sound field does not matter so long as the 
sound field is almost uniform throughout the volume of 
the liquid. If the walls of the container are irregular 
or very symmetrical, standing wave patterns do not 
offer any serious problem. Scattering of the sound field, 
instead of leading to erroneous absorption, helps in 
achieving the diffuse sound field. The vessel used by 
Tamm and Kurtze is a seamless cylindrical 100-liter 
vessel (54 cm in diam and in height) made of pure 
anodized and varnished aluminum, 3.5 mm thick. 

According to Mulders the time constant of the
decay of sound amplitude not only depends upon the 
absorption of ultrasonic waves in the liquids but also 
on other causes such as the radiation of sound energy 
into the air and the energy losses by the friction of 
sound waves along the walls. According to him the 
decay constant can be expressed as a function of the 
dimensions of the vessel. In the case of a cylindrical 
vessel of radius R and height H filled with a liquid of 
density ρ and shear viscosity coefficient η, the observed
decay constant δ at frequency f is given by

$$\delta = \alpha C + 1.18\left[(2/R) + (1/H)\right](f\eta/\rho)^{1/2} + (140/\rho H), \qquad (86)$$


neglecting the absorption in the attachment of the vessel
and the microphone, etc. The second term represents
corrections due to the friction of the walls. The third
term represents the correction due to radiation into air
from the liquid surface, a factor which is not very important.
In the case of a cylindrical vessel filled with
water, Mulders found at 25°C and 1010 kc δ = 58.5
sec⁻¹, 1.18[(2/R) + (1/H)](fη/ρ)^{1/2} = 26 sec⁻¹, (140/ρH)
= 1.5, using the constants R = 0.14 m, H = 0.10 m, ρ
= 1000, η = 0.89×10⁻³, C = 1500 m sec⁻¹. Thus αC comes
out to be equal to 31.0 sec⁻¹ and (α/ν²)×10¹⁵ m⁻¹ sec²
= 20.5, in agreement with measurements of other workers
utilizing other methods. Thus, all that is necessary is
one measurement of the reverberation time, and other
corrections can be calculated from the dimensions of
the vessel and the physical constants of the liquid,
namely, the viscosity η and the density ρ.


TABLE I. Values of the absorption coefficient α in decibels per unit length as referred to 0.1-M/l
solution at 20°C in the electrolytes investigated so far.

                10 kc/sec   100 kc/sec   1 Mc/sec   10 Mc/sec   100 Mc/sec
Electrolytes    db/km       db/km        db/m       db/cm       db/cm
H₂O             0.0217      2.17         0.217      0.217       21.7
Al₂(SO₄)₃       20.1        100          1.0        1.0         57
BeSO₄           1.75        4.8          0.29       0.29        28
CaCrO₄          ...         23.1         0.31       0.31        29
Ca(CH₃COO)₂     0.036       3.6          0.36       0.36        36
CoSO₄           1.77        162          2.3        0.39        31
CuSO₄           0.282       23           0.43       0.39        39
MgCrO₄          20.72       54.7         0.48       0.29        29
MgSO₄           3.92        247          0.9        0.3         29
MgS₂O₃          1.42        90           0.57       0.28        23.8
MnSO₄           0.55        51           5.2        0.74        29
NiSO₄           31.0        86           0.48       0.39        29
ZnSO₄           20.05       5            0.5        0.5         36
H₂SO₄           ...         ...          0.24       0.24        24
AlCl₃           ...         ...          0.223      0.223       22.9
K₂SO₄           ...         ...          0.227      0.227       22.7
K₂CrO₄          ...         ...          0.226      0.226       22.6
La(NO₃)₃        ...         ...          ...        0.5         34
Li₂CO₃          ...         ...          0.238      0.238       23.8
Li₂SO₄          ...         ...          0.229      0.229       22.9
Na₂CO₃          ...         ...          0.24       0.24        24
Na₃PO₄          ...         3.9          0.39       0.35        30.7
Na₂SO₄          ...         ...          0.25       0.25        25
(NH₄)₂SO₄       ...         ...          0.227      0.227       22.7
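The bookkeeping of Eq. (86) can be sketched as follows. This is a rough illustration, not Mulders' own procedure: it assumes the wall term has the form 1.18[(2/R) + (1/H)](fη/ρ)^{1/2} with f the sound frequency, and it uses the constants quoted for the water-filled vessel in what are assumed to be mks units. The function names are hypothetical. Evaluating the terms directly gives values close to, but not identical with, the rounded figures quoted in the text.

```python
import math


def mulders_corrections(f_hz, radius_m, height_m, eta, rho):
    """Wall-friction and air-radiation terms of Eq. (86), both in 1/sec."""
    wall = 1.18 * (2.0 / radius_m + 1.0 / height_m) * math.sqrt(f_hz * eta / rho)
    radiation = 140.0 / (rho * height_m)
    return wall, radiation


def liquid_absorption(delta_obs, sound_speed, f_hz, radius_m, height_m, eta, rho):
    """Return (alpha*C, alpha, alpha/nu^2) after removing both corrections."""
    wall, radiation = mulders_corrections(f_hz, radius_m, height_m, eta, rho)
    alpha_c = delta_obs - wall - radiation
    alpha = alpha_c / sound_speed
    return alpha_c, alpha, alpha / f_hz ** 2


# Constants quoted for the water-filled cylindrical vessel (units assumed mks).
wall, radiation = mulders_corrections(1.01e6, 0.14, 0.10, 0.89e-3, 1000.0)
alpha_c, alpha, alpha_over_nu2 = liquid_absorption(
    58.5, 1500.0, 1.01e6, 0.14, 0.10, 0.89e-3, 1000.0
)
print(f"wall term      : {wall:.1f} 1/sec")
print(f"radiation term : {radiation:.1f} 1/sec")
print(f"alpha*C        : {alpha_c:.1f} 1/sec")
print(f"alpha/nu^2     : {alpha_over_nu2:.2e} sec^2/m")
```

The point of the sketch is that a single measured decay constant suffices once the vessel dimensions, the viscosity, and the density are known.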

The schematic arrangement of the apparatus is 
shown in Fig. 7. After the sound source has been stopped, 
the sound reverberates in the vessel. This can be ob- 
served by a crystal microphone and an amplifier. The 
time constant of decay of sound can be measured at 
the output of the amplifier. Tamm and Kurtze used an
excitation frequency band of width 20 kc. This is
generated by pulsing a variable carrier frequency which
is restricted to five definite values, i.e., 50, 100, 200, 500,
and 1000 kc. The electric output of a piezoelectric receiver,
which is a measure of the sound energy in the
vessel (i.e., after amplification, frequency conversion,
and filtering), is recorded by a Neumann-type high-speed
level recorder or by a log amplifier combined with an
oscilloscope.

Lamb and Sherwood and Karpovich used this
technique in the study of relaxations due to rotational
isomers in cyclohexane derivatives.

J. Lamb and J. Sherwood, Trans. Faraday Soc. 51, 1674 (1955).


IV. RESULTS AND DISCUSSIONS


(a) Values of Absorption and
Relaxation Frequencies


1-1-valent electrolytes, for example, NaCl, NaBr,
and KBr, have very low absorption, i.e., the solutions
of these electrolytes have almost the same absorption
as that of pure water. 2-1- and 3-1-valent electrolytes
as well as 1-2-valent electrolytes behave almost
in a similar way to 1-1-valent electrolytes below 10 Mc.
Only above 10 Mc do they exhibit measurable
absorption.

2-2-valent electrolytes, in general, possess a considerably
high absorption in the whole frequency range
which has been investigated. In view of the fact that
sulfates of bivalent metals are soluble in water and are
obtained in sufficient quantities with the necessary degree
of purity, mostly the sulfates have been investigated;
for example, BeSO₄, NiSO₄, MgSO₄, CoSO₄, MnSO₄,
ZnSO₄, CuSO₄, etc. Thiosulfates and chromates such as
MgS₂O₃, MgCrO₄, and CaCrO₄, which have been investigated,
also possess high absorption. Al₂(SO₄)₃,
which is a 3-2-valent electrolyte, has a strikingly
large absorption.

The values of the absorption coefficient α in decibels
per unit length as referred to 0.1-M/l solution at 20°C
in the electrolytes investigated so far are given in Table
I. Table II gives the values of the primary and secondary
relaxation frequencies.

Figure 8(a) represents a summary of all measured
frequency curves by Tamm and Kurtze at 0.1 M/l.
Figure 8(b) shows α/ν² for 0.1-M/l solutions plotted
against frequency (Tamm and Kurtze).


Fig. 8. (a) Sound absorption spectra of electrolytes in aqueous
solution at 20°C according to measurements of Tamm, Kurtze,
and Kaiser (relative to water) for 10 to 10 M solutions.
(b) α/ν² for 0.1-M/l solutions of some electrolytes plotted against
frequency. [In Fig. 8(a) read MgSO₄ for NgSO₄ and replace “//”
by “X” before and “02” by “QN” after wavelength.]


(b) Effect of Concentration on Absorption


In nearly all of the 2-2-valent electrolytes investigated
the absorption is approximately linearly dependent
upon the concentration in the concentration
range from 0.001 to 0.1 M/l (Tamm and Kurtze). The
absorption cross section Q is defined by

$$Q = 2\alpha/nL, \qquad (87)$$

where n is the concentration in moles per liter and L is
the number of molecules per mole. This can be written as

$$Q\,[\text{cm}^2] = 1.66\times10^{-21}\,[2\alpha(\text{cm}^{-1})/n(\text{M/l})]. \qquad (88)$$
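The numerical factor in Eq. (88) follows from converting moles per liter into molecules per cubic centimeter. A small sketch, assuming α is the amplitude absorption coefficient in cm⁻¹ and taking L = 6.022×10²³ per mole (the function name is hypothetical):

```python
AVOGADRO = 6.022e23  # molecules per mole (L in Eq. 87)


def cross_section_cm2(alpha_per_cm, conc_mol_per_liter):
    """Absorption cross section Q in cm^2 from Eqs. (87) and (88)."""
    # One liter is 1000 cm^3, so n moles per liter is n*L/1000 molecules per cm^3.
    molecules_per_cm3 = conc_mol_per_liter * AVOGADRO / 1000.0
    return 2.0 * alpha_per_cm / molecules_per_cm3


# The prefactor of Eq. (88) is just 1000/L.
prefactor = 1000.0 / AVOGADRO
print(f"1000/L = {prefactor:.3e} cm^3 liter^-1 mole")

# If the absorption is proportional to concentration, Q does not change.
q_dilute = cross_section_cm2(1.0e-5, 0.01)   # hypothetical alpha at 0.01 M/l
q_strong = cross_section_cm2(1.0e-4, 0.10)   # ten times alpha at ten times n
print(q_dilute, q_strong)
```

A constant Q over a concentration range is therefore another way of stating that the excess absorption is linear in concentration over that range.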


The absorption cross section is thus found to be constant
in the range 0.001 to 0.1 M/l. Figure 9 shows the
dependence of the absorption cross section Q of MgSO₄,
MnSO₄, ZnSO₄, and CuSO₄ on concentration (Tamm
and Kurtze). Wilson carried out measurements of
the concentration behavior of the excess sound absorption
in MgSO₄ in the concentration range from 0.003 to
0.02M. According to him the relationship between the
excess sound absorption and the concentration is nonlinear;
in fact, the absorption cross section tends to zero
as the concentration goes to zero. Table III gives the
values of the maximum absorption per wavelength in
MgSO₄ solutions at several concentrations, the temperature
being 23° to 25°C.

Fig. 9. Dependence of the absorption cross section (Q) of
MgSO₄, MnSO₄, and CuSO₄ on concentration.

Kor and Verma have studied the concentration behavior
of MnSO₄ in great detail from 0.0025 to 1.0M.
The relationship between the excess sound absorption 
and the concentration is found to be linear from 0.02 to 
0.1M (Fig. 10), but becomes nonlinear both below 
0.02M (Fig. 10) and above 0.1M (Fig. 11). The ab- 
sorption cross section is constant over the range 0.02 
to 0.1M and then diminishes as the concentration in- 
creases beyond 0.1M. As the concentration is lowered 
below 0.02M, the absorption cross section increases. 


(c) Effect of Concentration on 
Relaxation Frequency 


According to Wilson’s measurements in MgSO₄, the
relaxation frequency remains approximately constant
in the range of concentration from 0.003 to 0.02M.


TABLE II. Values of primary and secondary relaxation
frequencies in some of the electrolytes.


Electrolyte    Primary relaxation frequency    Secondary relaxation frequency
FeSO, 10° cps 300 Mc 
NiSO4 10‘ cps SEE 
MgSO₄          130 kc                          200 Mc
MgS₂O₃         180 kc                          No relaxation in 100-300 Mc
MgCrO₄         180 kc                          260 Mc
So a 
MnSO,, c -~ 30M 
a CER 
uSO; c . >100 Mce. 
Ale (SÒ): 20 Me » eee N 
(very low conc.) , = Sa 
La(NOs)2 ~ 50 Mc D a : Rites a 
TABLE III. Values of maximum absorption per wavelength and
relaxation frequency in MgSO₄ solutions at different concentrations
at 23°-25°C.


Conc. M/l    μm × 10⁶    νm (kc)
0.003        4.6         130
0.0053       15.5        140
0.008        25.5        140
0.01         35.5        130
0.014        48          138
0.02         80          150

Tamm and Kurtze also report that the relaxation frequency
is approximately constant in the range from 0.01
to 0.1M. In MnSO₄ Smithson and Litovitz report no
effect of the concentration on the secondary relaxation
frequency but, according to them,
there is indication of a small increase beyond 0.1M.
According to Kor and Verma the relaxation frequency 
is independent of the concentration in the range 0.02 
to 0.1M but increases with concentration in the range 
0.0025 to 0.02M. It again increases with concentration 
above 0.1M. 

Smithson and Litovitz argue that the specific reaction
rate k₁ is concentration dependent too (which has
been neglected in the past) and that the effect of increasing
the ionic strength by increasing the concentration is
always to increase the reaction rate, whatever may be
the nature of the reaction, whether it is between two
ions of the same sign or of opposite signs or between ions and
neutral molecules. Hence the very small concentration dependence
of the reaction rate may favor a reaction
between two dipoles. Dissociation of an ion pair also
remains a possibility, but at present there is no theory
for the effect of ionic strength on the dissociation of an
ion pair.


(d) Effect of Dielectric Constant 


According to Tamm and Kurtze the electrolytic dissociation
is caused by the dipole forces of the molecules
of the solvent; a measure of these forces is the dielectric
constant of the liquid. In liquids with smaller
dielectric constant than water, the dissociation energy
therefore is larger and the relaxation frequencies are shifted
to lower values. Tamm and Kurtze used water-alcohol
mixtures to vary the dielectric constant of the solvent
and report lowering of the relaxation frequency with the
decrease in the dielectric constant of the solvent. However,
the maximum absorption increases in magnitude,
the reason for which, according to them, most probably
lies in the increase in the percentage of the undissociated
molecules.

Fig. 11. Nonlinearity of the absorption coefficient with concentration
at various frequencies above 0.1 M in MnSO₄ at 25°C.

Bies used dioxane-water mixtures as a solvent. According
to his measurements there is a small change in
the relaxation frequency with concentration. By extrapolating
it to zero concentration, he determined the
forward reaction rate k₁ on the assumption that the
mechanism responsible for the excess sound absorption
is of the dissociation type MgSO₄ ⇌ Mg⁺⁺ + SO₄⁻⁻. He
found that the reaction rate k₁ first went up and then
down as the dielectric constant of the solvent was
lowered. He calculated the reverse reaction rate k₂
utilizing the values of the equilibrium constant K measured
experimentally with a conductivity bridge and the
above extrapolated values of k₁. According to Laidler,
the rate of change of the specific rate of reaction of a
pair of reacting ions of valence Z₁ and Z₂ with the
reciprocal of the dielectric constant is given by the
equation

$$-\,d\ln k_2/d(1/D) = Z_1 Z_2 e^2/(KTr), \qquad (89)$$

where Z₁ = −Z₂ = 2 in the above dissociation, r is equal
to twice the mean ionic radius, 3.08×10⁻⁸ cm, K is the
Boltzmann constant, and e is the electrostatic unit of
charge. Bies reported that the specific rate of association
does increase with the reciprocal of the dielectric
constant of the medium, as it should if ions of opposite
signs are involved, but apparently not in a linear
fashion as expected from the above expression. However,
Bies gave no explanation for his observation that
the forward reaction rate went first up and then down
as the dielectric constant was lowered. At present there
does not seem to be any explanation for this anomalous behavior
of the reaction rate with respect to the dielectric
constant. However, Kor and Verma report a continuous
increase in the reaction rate k₁ with the lowering of the
dielectric constant in MnSO₄ solutions. They also used
water-dioxane mixtures to vary the dielectric constant
of the solvent.
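The size of the slope predicted by Eq. (89) can be estimated in a few lines. This is only an order-of-magnitude sketch: it assumes cgs-esu values for e and the Boltzmann constant, takes r = 3.08×10⁻⁸ cm as the full separation, and assumes a temperature of 298 °K, which is not stated with the equation.

```python
E_ESU = 4.803e-10        # electronic charge, esu (assumed cgs value)
K_BOLTZMANN = 1.381e-16  # Boltzmann constant, erg per degree


def dlnk_by_dinvD(z1, z2, r_cm, temp_k):
    """Slope d(ln k2)/d(1/D) from Eq. (89): -Z1*Z2*e^2/(K*T*r)."""
    return -z1 * z2 * E_ESU ** 2 / (K_BOLTZMANN * temp_k * r_cm)


# Z1 = -Z2 = 2 for Mg++ and SO4--; the temperature is an assumed value.
slope = dlnk_by_dinvD(2, -2, 3.08e-8, 298.0)
print(f"d ln k2 / d(1/D) = {slope:.0f}")

# Predicted change of ln k2 between two illustrative dielectric constants.
D_WATER, D_MIXTURE = 78.5, 60.0   # hypothetical values for the two solvents
delta_ln_k = slope * (1.0 / D_MIXTURE - 1.0 / D_WATER)
print(f"change in ln k2 = {delta_ln_k:.2f}")
```

The slope comes out positive for ions of opposite sign, so the association rate should rise linearly with 1/D, which is the expectation against which the observations of Bies are compared.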

Smithson and Litovitz used methanol-water mix- 
tures for varying the dielectric constant. These mixtures 
have the advantage that water and methanol are similar 
and the absorption in such mixtures changes only 
slightly and almost linearly with percentage methanol 
(Burton), but in this mixture both the molecules are
polar, and the interaction between the polar constituents
of the mixture may affect the absorption due to the main
reaction if it is of the dipole-dipole type. They find a decrease
in the primary relaxation frequency by a small amount 
with decreasing dielectric constant. However, the 
secondary relaxation peak was observed to increase with 
decreasing dielectric constant. The magnitude of the 
absorption increased considerably as D was lowered. 

The theory of the influence of the dielectric constant
on the rates of reaction, which has experimental evidence
in the case of slow reactions, has been discussed
in several books; see Amis, Laidler, or Glasstone,
Laidler, and Eyring. Table IV, presented by the
latter authors, shows the effect of dielectric constant
and ionic strength on the rates of a reaction. According
to Smithson and Litovitz, the lowering of the primary
relaxation frequency with the decrease in the dielectric
constant suggests that the forward reaction is either
between ions of like sign or between dipole molecules.
Either of these reactions would be of the first order if
one of the reactants were present in great excess and
would behave like a dissociation process.


TABLE IV. Effect of dielectric constant and ionic strength
on the rates of a reaction.


                               Effect of increasing    Effect of increasing
Reaction type                  dielectric constant     ionic strength
1. Two dipole molecules        increase                no effect
2. Two ions
   (a) same sign               increase                increase
   (b) opposite sign           decrease                increase
3. Ion and neutral molecule    decrease                increase

However, the reaction involving two charged ions of
the same sign does not seem to satisfy the condition of
large excess. Hence, they conclude in favor of a forward
reaction between solute molecules (ion pairs) and the
dipole solvent molecules, and the excess of the solvent
assures that the reaction is of the first order.

E. S. Amis, Kinetics of Chemical Change in Solution (The
Macmillan Company, New York, 1949).

65 Glasstone, Laidler, and Eyring, Theory of Rate Processes
(McGraw-Hill Book Company, Inc., New York, 1941).


(e) Effect of Temperature 


The relaxation frequency νm increases with increase in
temperature, while the maximum excess absorption is
not altered appreciably. A plot of log(νm/T) vs 1/T gives
a straight line, as found by Wilson for MgSO₄, by Tamm
and Kurtze for MgSO₄, CoSO₄, and NiSO₄ (Fig. 12),
and by Smithson and Litovitz, and Kor and Verma for
MnSO₄. The experimental activation energies for the
relaxation process obtained from the slope of the above
curve are about 6.5 kcal/M for MgSO₄, 8.6 kcal/M for
NiSO₄, and about 6 kcal/M for CoSO₄. In MnSO₄ Smithson
and Litovitz found the activation energy to be 8.68 kcal/M
in the isodielectric solution. Isocomposition measurements
made with two different solvents, pure
water and 29.52% methanol, gave energies of 7.85 and
8.05 kcal/M, respectively, both of which are lower than
the isodielectric activation energy.
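The way an activation energy is read off such a plot can be sketched with a least-squares fit. The example assumes that νm/T is proportional to exp(−ΔE/RT), uses natural logarithms, and works on synthetic data generated from an activation energy of 6.5 kcal/M; the temperatures, the prefactor, and the function name are hypothetical and are not measured values.

```python
import math

R_GAS = 1.987e-3  # gas constant, kcal per mole per degree


def activation_energy_kcal(temps_k, relax_freqs):
    """Activation energy from the slope of ln(nu_m/T) against 1/T."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(f / t) for f, t in zip(relax_freqs, temps_k)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx            # equals -E/R for the assumed rate law
    return -R_GAS * slope


# Synthetic relaxation frequencies built from an assumed energy of 6.5 kcal/M.
E_ASSUMED = 6.5
PREFACTOR = 3.0e7                # arbitrary scale, cancels in the slope
temps = [278.0, 293.0, 308.0, 323.0, 338.0]
freqs = [PREFACTOR * t * math.exp(-E_ASSUMED / (R_GAS * t)) for t in temps]

recovered = activation_energy_kcal(temps, freqs)
print(f"recovered activation energy: {recovered:.2f} kcal/M")
```

With measured relaxation frequencies in place of the synthetic ones, the same fit gives the experimental activation energy; departures from a straight line would show up as scatter about the fitted slope.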

According to Amis and Jaffe the activation energy
ΔE for an ionic reaction consists of two parts: ΔE₁,
which represents a contribution to the activation energy
due to electrostatic forces, and ΔE₂, a contribution due
to nonelectrostatic forces, ΔE = ΔE₁ + ΔE₂. According
to Amis, ΔE₁, in the case of a reaction between two
bivalent ions, should be equal to 50% of the total
activation energy. However, Smithson and Litovitz,
using the methods outlined by Amis and utilizing their
values of d(ln k)/d(1/T) and d(ln k)/d(1/D), find ΔE₁ to
be just equal to 8% of the total activation energy. Such
a low value of the electrostatic
contribution, according to them, rules out an ion-ion reaction
as the rate-determining step. However, such a
low value may fit with the dipole-dipole reaction.


Fig. 12. Temperature dependence of the relaxation frequency:
νm/T vs 1/T for CoSO₄, MgSO₄, and NiSO₄.

E. S. Amis and G. Jaffe, J. Chem. Phys. 10, 646 (1942).


(f) Effect of Heavy Water When
Used as Solvent


When heavy water is used as a solvent in place of
water, the rate of the reaction in the solution is expected
to change if the rate-determining step involves
making or breaking a bond between a hydrogen and an
oxygen atom, apart from the pure solvent effect, which
is very small, since the dielectric constant of heavy
water differs by less than 1% and the viscosity by
about 20%. The half-quantum zero-point energy in the
case of D₂O is smaller than that of H₂O since the
heavier deuterium atom oscillates more slowly compared
to the lighter protium atom. If the total energy required
to make a molecule react is almost the same whether it
is a deuterium or a protium compound, the critical
increment is larger and hence the reaction rate smaller
for a deuterium compound. Smithson and Litovitz,
however, notice no measurable shift in the lower relaxation
peak. This means that the rate-determining step
does not involve the OH bond. It rules out such possibilities
as Eqs. (69) or (70). However, a secondary relaxation
peak is not observed in D₂O solvent. This
might be because the relaxation involving the OH bond has shifted
out of the frequency range, or because the relaxation which
is normally observed at about 200 Mc in H₂O solution is
shifted down so as to obscure the small peak.


(g) Effect of Pressure 


Fisher for the first time reported that acoustic absorption
in MgSO₄ solutions shows a considerable decrease
with elevated pressure. Absorption measurements
were made near 100 kc by means of a resonant cylindrical
steel cavity which could be pressurized. Observations
at several frequencies in this region give the
pressure dependence of the relaxation frequency. Carnevale
and Litovitz have also studied the effect of pressure
on ultrasonic relaxation in manganous sulfate and
ammonium hydroxide. Absorption data were obtained
for MnSO₄ at frequencies from 9 to 75 Mc/sec up to
pressures of 3930 kg/cm² and a temperature of 60°C.
The relaxation frequency increased from 19.1 Mc/sec
at atmospheric pressure to 30.7 Mc/sec at 3930 kg/cm².
Absorption measurements in NH₄OH were obtained at
frequencies from 15 to 75 Mc/sec up to pressures of
2030 kg/cm² and a temperature of 45°C. The relaxation
frequency increased from 21.2 Mc/sec at atmospheric
pressure to 52.0 Mc/sec at 2030 kg/cm². The results are
explained on the basis of chemical dissociation.


ACKNOWLEDGMENTS


The author wishes to acknowledge with thanks the
helpful discussions he has had with Professor Ernest
Yeager. He is especially grateful to Professor K. Bannerji
for his valuable suggestions. Special thanks are also
due to Professor Frank Hovorka for the grant which made
possible his stay at Western Reserve University and for
the facilities which enabled him to complete this
article. Finally, the author also wishes to thank Mr.
Jacob Schrope for his help in the preparation of the
diagrams.

F. H. Fisher, J. Acoust. Soc. Am. 29, 656 (1957); 30, 442 (1958).

E. H. Carnevale and T. A. Litovitz, J. Acoust. Soc. Am. 29,
769 (1957); 30, 610 (1958).


OTHER REFERENCES ON ULTRASONIC ABSORPTION
IN LIQUIDS SINCE 1951


Earlier references found in Reviews of Modern Physics are 


(a) W. T. Richards, “Supersonic phenomena,” Revs. Modern 
Phys. 11, 36 (1939); 

(b) Markham, Beyer, and Lindsay, “Absorption of sound in
fluids,” Revs. Modern Phys. 23, 353 (1951);

(c) S. M. Karim and L. Rosenhead, “The second coefficient 
of viscosity of liquids and gases,’’ Revs. Modern Phys. 24, 108 
(1952); 


and in the following textbooks: 


(d) Ludwig Bergmann, Der Ultraschall und seine Anwendung in
Wissenschaft und Technik (S. Hirzel Verlag, Stuttgart, 1954),
sixth edition;

(e) E. G. Richardson, Ultrasonic Physics (Elsevier Publishing 
Company, London, 1952); 

(f) P. Vigoureux, Ultrasonics (Chapman and Hall Ltd.,
London, 1950);

(g) E. G. Richardson, Technical Aspects of Sound (Elsevier
Publishing Company, London, 1957), Vols. I and II;

(h) A. E. Crawford, Ultrasonic Engineering (Academic Press,
Inc., New York, 1955);

(i) T. F. Hueter and R. H. Bolt, Sonics (John Wiley and Sons,
Inc., New York, 1955);

(j) E. G. Richardson, Relaxation Spectrometry (Interscience
Publishers, Inc., New York, 1957);

(k) K. F. Herzfeld and T. A. Litovitz, Absorption and Dispersion
of Ultrasonic Waves (Academic Press, Inc., New York, to
be published).


Further references include:



(1) T. A. Litovitz, “Ultrasonic absorption of glycerin in the 
liquid and vitreous state,” J. Acoust. Soc. Am. 23, 75 (1951). 

(2) K. Tamm and G. Kurtze, “Absorption of sound in aqueous 
solutions of electrolytes,” Nature 168, 346 (1951). 

(3) E. Freedman, “On the propagation of ultrasonic waves in 
acetic acid,” J. Chem. Phys. 19, 1318 (1951). 

(4) A. G. Chynoweth and W. T. Schneider, “Ultrasonic prop- 
agation in binary liquid system near their critical solution tem- 
perature,” J. Chem. Phys. 19, 1566 (1951). 

(5) R. E. Barrett and R. T. Beyer, “Anomalous effect in the
ultrasonic absorption of an electrolytic solution,” Phys. Rev. 84,
1061 (1951).

(6) A. B. Pippard, “Ultrasonic propagation in liquid helium
near lambda point,” Phil. Mag. 42, 1209 (1951).

(7) A. N. Hunter, “Ultrasonic absorption in supercooled 
liquids,” Proc. Phys. Soc. (London) B64, 288 (1952). 

(8) K. F. Herzfeld, “Origin of ultrasonic absorption in liquids,” 
J. Chem. Phys. 20, 288 (1952). 

(9) D. Sette, “On the elastic relaxation in CS2,” Ricerca sci. 
22, 467 (1952). 

(10) D. Sette, “Elastic relaxation and structure of liquids,”
Ricerca sci. 22, 461 (1952).

(11) P. A. Hudson and M. Eisner, “Debye-Frenkel theory of 
ultrasonic absorption,” Phys. Rev. 85, 746 (1952). 

(12) H. J. McSkimin, “Measurements of dynamic shear vis- 
cosity and stiffness of viscous liquids by means of traveling tor- 
sional waves,” J. Acoust. Soc. Am. 24, 355 (1952). 

(13) Y. Wada and S. Shumbo, “Experimental investigation on 
the origin of anomalous absorption of ultrasonic waves in liquids,” 
J. Acoust. Soc. Am. 24, 199 (1952). 

(14) L. H. Hall, “Attenuation of sound resulting from ionic
relaxation,” J. Acoust. Soc. Am. 24, 704 (1952).

(15) J. Meixner, “General theory of sound absorption in gases
and liquids with allowance of transport phenomena,” Acustica 2,
101 (1952).



(16) I. M. Chalatnikov, “Thermal conductivity 3 
sorption in hełium II,” J. Exptl. Theoret. Phys. Bai 41952) 
(17) M. Manes, “On the relaxation time of equilibrium system
as related to ultrasonic absorption and reaction kinetics,” J. Chem.
Phys. 20, 1658 (1952).

(18) Parthasarthy, Chari, and Srinivasan, “Ultrasonic absorption
in various organic liquids at 15 Mcs,” Naturwissenschaften
39, 483 (1952).

(19) Nomoto, Kishimoto, and Ikeda, “Ultrasonic absorption in
castor oil,” Bull. Kobayashi Inst. Phys. Research 2, 72 (1952).

(20) T. A. Litovitz and D. Sette, “Dielectric and ultrasonic 
relaxation in glycerol,” J. Chem. Phys. 21, 17 (1953). 

(21) T. Kishimoto and O. Nomoto, “Absorption of ultrasonic 
waves in CS2,” Bull. Kobayashi. Inst. Phys. Research 2, 63 (1952). 

(22) A. K. Dutta, “Ultrasonic absorption and relaxation 
mechanism,” Indian J. Phys. 35, 279 (1952). 

(23) D. Sette, “Ultrasonic absorption in liquid mixtures and
structural effects,” J. Chem. Phys. 21, 558 (1953).

(24) K. Tamm, “Sound absorption measurement in water and
in aqueous solutions of electrolytes in the frequency range 5 kc
to 1 Mc,” Nachr. Akad. Wiss. Göttingen, Math.-physik. Kl. 80,
110 (1952).

(25) Parthasarthy, Srinivasan, and Chari, “Absorption of
ultrasonic waves in liquids at 5 Mc from thermal considerations,”
Nuovo cimento 10, 264 (1953).

(26) Parthasarthy, Chari, and Mahendroo, “Calorimetric determination
of the absorption coefficient in organic liquids,”
J. phys. radium 14, 366 (1953).

(27) Parthasarthy, Chari, and Mahendroo, “Determination of
ultrasonic absorption coefficient in liquids by a new technique,”
Z. Naturforsch. 8a, 272 (1953).

(28) C. E. Chase, “Ultrasonic measurements in liquid helium,” 
Phys. Rev. 91, 489 (1953). 

(29) F. E. Fox and T. M. Marion, “Ultrasonic dispersion in
water solutions of MgSO₄,” J. Acoust. Soc. Am. 25, 661 (1953).

(30) P. L. Connoly and F. E. Fox, “Ultrasonic absorption in
MgSO₄ solutions,” J. Acoust. Soc. Am. 25, 658 (1953).

(31) M. Pancholy, “Temperature variation of velocity and
absorption coefficient of ultrasonic waves in heavy water,” J.
Acoust. Soc. Am. 25, 1003 (1953).

(32) S. Parthasarthy and A. F. Chapghar, “Viscosity as a
factor in the anomalous absorption of ultrasonic waves in liquids,”
Ann. Physik 12, 316 (1953).

(33) I. G. Mikhailov, “Absorption of ultrasonic waves in certain 
viscous liquids,” Doklady Akad. Nauk. S.S.S.R. 89, 991 (1953). 

(34) S. Parthasarthy and P. P. Mahendroo, “Absorption co- 
efficient in some liquids determined by a new thermal technique,” 
Nuovo cimento 10, 1196 (1953). 

(35) Parthasarthy, Chari, and Srinivasan, “Variation of ultrasonic
absorption with frequency in organic liquids,” Acustica 3,
363 (1953).

(36) I. G. Mikhailov and V. A. Soloviev, “Ultrasonic absorption
in liquids and the molecular mechanism of bulk viscosity,”
Uspekhi Fiz. Nauk. 50, 3 (1952).

(37) M. Manes, “Relationship between kinetics and acoustic
phenomenon in equilibrium systems,” J. Chem. Phys. 21, 1791
(1953).

(38) E. Freedman, “On the use of ultrasonic absorption for the
determination of very rapid reaction rates at equilibrium,” J.
Chem. Phys. 23, 1784 (1953).

(39) G. F. Alfrey and W. G. Schneider, “Ultrasonic absorption
in binary liquid systems near the critical region of temperature,”
Discussions Faraday Soc. 15, 218 (1953).

(40) A. W. Pryor and R. Roscoe, “The velocity and absorption 
of sound in aqueous sugar solutions,” Proc. Phys. Soc. (London) 
B67, 70 (1954). 

(41) Parthasarthy, Srinivasan, and Chari, “The thermal effects
of ultrasonic waves in liquids and its relation to their absorption
coefficient,” Acustica 3, 407 (1954).

(42) C. E. Chase, “Ultrasonic measurements in liquid helium,”
Proc. Roy. Soc. (London) A220, 116 (1953).

(43) J. F. Mifsud, “Absorption of ultrasonic waves in liquids
under pressure,” Phys. Rev. 94, 812 (1954).

(44) Litovitz, Higgs, and Meister, “Ultrasonic propagation and
its relation to molecular structure in the diols,” J. Chem. Phys.
22, 1281 (1954).

(45) Litovitz, Lyon, and Peselnick, “Ultrasonic relaxation and
its relation to structure in viscous liquids,” J. Acoust. Soc. Am.
26, 566 (1954).

(46) Barret, Beyer, and McNamara, “Ultrasonic absorption in
acetate solutions,” J. Acoust. Soc. Am. 26, 966 (1954).

(47) E. L. Carstensen, “Relaxational processes in aqueous solutions
of MnSO₄ and CoSO₄,” J. Acoust. Soc. Am. 26, 862 (1954).

(48) W. J. Fry and R. B. Fry, “Determination of absolute
sound levels and acoustic absorption coefficients by thermocouple
probes—experiment and theory,” J. Acoust. Soc. Am. 26, 311
(1954); 26, 294 (1954).

(49) F. E. Fox and W. A. Wallace, “Absorption of finite amplitude
sound waves,” J. Acoust. Soc. Am. 26, 994 (1954).

(50) S. Parthasarthy, “Absorption and dispersion of ultrasonic
waves in liquids,” J. Acoust. Soc. Am. 26, 611 (1954).

(51) I. G. Mikhailov and L. I. Savina, “The absorption of 
ultrasonic waves in binary mixtures of liquid with one relaxing 
component,” Doklady Akad. Nauk. S.S.S.R. 96, 1147 (1954). 

(52) N. I. Koshkin and V. F. Nozdrev, “Investigation on the 
absorption of ultrasound in the series of hydrocarbons using the 
pulse method,” Doklady Akad. Nauk. S.S.S.R. 92, 793 (1953). 

(53) T. Kishimoto and O. Nomoto, “Absorption of ultrasonic 
waves in organic liquids I. Liquids with positive temperature co- 
efficient of absorption,” J. Phys. Soc. Japan 9, 620 (1954). 

(54) N. S. Taskoprulu, “The laws for the excess absorption and
the dispersion of sound in MgSO₄ solutions at frequencies about
3 Mc,” Istanbul Univ. Fen Fak. Mecmuasi C19, 10 (1954).

(55) E. Grossetti, “Determination of the ultrasonic absorption 
coefficient of benzol with thermal method,” Nuovo cimento 1, 
525 (1955). 

(56) D. Sette, “Ultrasonic absorption in water-methyl alcohol 
mixtures,” Ricerca. sci. 25, 576 (1955). 

(57) D. Sette, “Ultrasonic absorption in aniline-ethyl alcohol 
mixtures,” Acustica 5, 195 (1955). 

(58) S. Parthasarthy and A. P. Deshmukh, “Sound absorption
in liquids in relation to light scattering data,” Ann. Physik 15,
417 (1955).

(59) T. A. Litovitz and E. H. Carnevale, “The effect of pres- 
sure on sound propagation in water,” J. Appl. Phys. 26, 816 (1955). 

(60) C. E. Chase and M. A. Herlin, “Ultrasonic measurements 
in magnetically cooled liquid helium,” Phys. Rev. 99, 669 (1955). 

(61) A. Cilesiz, “A new pulse method to measure absorption in
transparent liquids,” Istanbul Univ. Fen Fak. Mecmuasi C20, 94
(1955).

(62) R. T. Beyer, “Ultrasonic absorption in toluene,” J. 
Acoust. Soc. Am. 27, 1 (1955). 

(63) E. H. Carnevale and T. A. Litovitz, “Pressure dependence 
of sound propagation in the primary alcohols,” J. Acoust. Soc. 
Am. 27, 547 (1955). 

(64) Fox, Culey, and Larson, “Phase velocity and absorption
measurements in water containing air bubbles,” J. Acoust. Soc.
Am. 27, 534 (1955).

(65) M. M. Kannuna, “Measurements of the absorption of
ultrasonic waves in liquids by the method of isochromates,” J.
Acoust. Soc. Am. 27, 5 (1955).

(66) Stakutis, Morse, Dill, and Beyer, “Attenuation of ultra- 
sound in aqueous suspensions,” J. Acoust. Soc. Am. 27, 539 (1955). 

(67) D. M. Towle and R. B. Lindsay, “Absorption and ve- 
locity of ultrasonic waves of finite amplitude in liquids,” J. Acoust. 
Soc. Am. 27, 530 (1955). 

(68) E. G. Richardson, “Acoustic experiments relating to the 
coefficients of viscosity of various liquids,” Proc. Roy. Soc. 
(London) A226, 16 (1954).

(69) Parthasarthy, Chari, and Srinivasan, “The dependence of 
intensity of sound at source on its absorption coefficient in liquids,” 
J. Chem. Phys. 21, 185 (1953); Physik. Ber. 34, 2031 (1953). 

(70) A. W. Pryor and E. G. Richardson, “Velocity and absorp- 
tion of ultrasonics in liquid sulphur,” J. Phys. Chem. 59, 14 (1955). 

(71) H. Krishnamuthi, “Relaxational effects in aqueous solutions
of sulphur dioxide,” Proc. Indian Acad. Sci. 41, 125 (1955).


(72) S. Parthasarthy and M. Pancholy, “Ultrasonic absorption
constants in liquids by an improved optical method,” Z. Physik
138, 635 (1954); Physik. Ber. 34, 2032 (1954).
(73) Parthasarthy, Tipnis, and Pancholy, “Relation between
optically determined absorption of ultrasonic waves and specific
heats,” Z. Physik 140, 504 (1955).


(74) Y. Nozdrev and A. Sultanov, “On the establishment and
study of two relaxation regions during the passage of ultrasonic
waves in ethyl acetate,” Doklady Akad. Nauk. S.S.S.R. 104, Bha
(1958).


(75) A. W. Nolle, “The effect of hydrostatic pressure in the two
state model of compressional relaxation in liquids,” Phys. Rev.
94, 812 (1954).
(76) S. Parthasarthy and D. S. Guruswamy, “Sound absorption
in liquids in relation to their specific heats,” Ann. Physik 16, 31
(1955).
(77) I. Gabrielli and L. Verdini, “Velocity of propagation and
absorption coefficient of ultrasonics in mesomorphic liquids,”
Nuovo cimento 2, 426 (1955).
(78) I. Lyon and T. A. Litovitz, “Ultrasonic relaxation in 
normal propyl alcohol,” J. Appl. Phys. 27, 179 (1956).
(79) J. Lamb and D. H. A. Huddart, “The absorption of ultrasonic
waves in propionic acid,” Trans. Faraday Soc. 46, 540
(1950); Physik. Ber. 34, 5836 (1955).
(80) T. Kishimoto and O. Nomoto, “Absorption of ultrasonic 
waves in organic liquids II. Liquids with negative temperature co- 
efficient of absorption,” Bull. Kobayashi Inst. Phys. Research 4, 
175 (1954). 
(81) A. Van Itterbeek, “Acoustische Relaxatieversch ynselen 
in fluida by lage temperaturen,” Tijdschr. Natuurk. 21, 349 
(1955). 
(82) K. Eppler, “Absorption of ultrasonics in liquid mixtures,”
Z. Naturforsch. 10a, 744 (1955).
(83) B. B. Deo, “Ultrasonic absorption in solutions,” Indian J. 
Phys. 38, 352 (1955). 
(84) C. E. Chase, “Ultrasonic propagation in liquid helium,” 
Am. J. Phys. 24, 136 (1956). 
(85) S. Parthasarthy and A. F. Chapghar, “Sound absorption 
in liquids in relation to their physical properties—viscosity and 
specific heats,” Ann. Physik 6, 297 (1955). 
(86) S. Parthasarthy and D. S. Guruswamy, “Sound absorp- 
tion in liquids in relation to their specific heats II,” Ann. Physik 
16, 287 (1955).
(87) L. E. Lawley and R. D. C. Reed, “A reverberation method 
for the measurement of the absorption of ultrasonics in liquids,” 
Acustica 5, 316 (1955). 
(88) T. Kishimoto and O. Nomoto, “Absorption of ultrasonic
waves in organic liquids III. Liquids with negative temperature
coefficient of sound absorption,” J. Phys. Soc. Japan 10, 933 (1955).
(89) C. R. Rao, “Sound absorption coefficients based on in- 
tensity measurements of diffraction orders,” Proc. Indian Acad. 
Sci. A42, 331 (1955). 
(90) Parthasarthy, Pancholy, and Tipnis, “A new thermal 
method for sound absorption in liquids,” Nature 176, 611 (1955). 
(91) A. Mer and W. Maier, “Absorption of ultrasonics and
association in solutions of phenol in CCl4, C6H12, and C6H5Cl,”
Z. Naturforsch. 10a, 997 (1955).
(92) I. G. Mikhailov and G. N. Fep-Fanaov, “Differential 
method of measuring ultrasonic absorption in fluids,” Akust. Z. 
2, 194 (1956). 
(93) E. Grossetti, “Determination of the ultrasonic absorption 
coefficient by thermal methods,” Nuovo cimento 3, 673 (1956). 
(94) Bormosoy, Nozdrev, Sobolev, and Sultanov, “Experimental
investigation of relaxational processes arising in the
passage of ultrasonic waves through fluids,” Akust. Z. 2, 118
(1956).
(95) H. J. McSkimin, “Wave propagation and the measurement
of the elastic properties of liquids and solids,” J. Acoust.
Soc. Am. 28, 1288 (1956).
(96) J. F. Mifsud and A. W. Nolle, “Velocity and absorption of
ultrasonic waves in several nonassociated liquids under high
pressure,” J. Acoust. Soc. Am. 28, 469 (1956).
(97) M. Mokhtar and M. Youssef, “Sensitive electrodynamic
balance for measurement of absorption of ultrasonic waves in
liquids,” J. Acoust. Soc. Am. 28, 651 (1956).

(98) L. Liebermann, “On the pressure dependence of sound
absorption in liquids,” J. Acoust. Soc. Am. 28, 1253 (1956).
(99) V. Narsimhan and R. T. Beyer, “Attenuation of ultrasonic
waves of finite amplitude in liquids,” J. Acoust. Soc. Am.
28, 1233 (1956).

(100) M. S. Cohen, “An ultrasonic spectrometer for measuring
acoustic absorption,” Proc. Natl. Electronics Conf. p. 490 (1956).
(101) S. Parthasarthy and S. S. Mathur, “Ultrasonic absorption
in liquids from thermal steady states,” Nature 178, 378
(1956).

5 i tan a ar “ ion i . Tashi . Ki 14 i ati 
E saa nyse sal sate crap a Ks eR 
+ AT guds,” Anp. Phys da E. Meyer, “Absorption. measurements In 107 QESD: A. Litovitz, “Relati f ; ak 
fe (103) R. Cerf mee interferometry. Determination of the in (132) T. A. Litovitz, “Relation of nuclear spin-lattice relaxa- 
Cee liquids: by qijirason a =


VERMA 


trinsic absorption of a macromolecule in solution,” Compt. rend. 
243, 148 (1956).

(104) M. S. de Groot and J. Lamb, “Ultrasonic relaxation in 
unsaturated aldehydes,” Nature 177, 1231 (1956). 

(105) Zarembo, Krassilnikov, and Schklovskaya-Kordi, “Absorption
of ultrasonic waves of finite amplitude in liquids,”
Doklady Akad. Nauk. S.S.S.R. 109, 731 (1956).

(106) J. A. Newell and J. Wilks, “Absorption of sound in liquid 
helium below 1°K,” Phil. Mag. 1, 588 (1956). 

(107) J. H. Andreae and J. Lamb, “Ultrasonic relaxation theory
for liquids,” Proc. Phys. Soc. (London) B69, 814 (1956). 

(108) R. T. Beyer, “Ultrasonic absorption: theory of Gierer
and Wirtz,” J. Chem. Phys. 25, 219 (1956).

(109) S. Parthasarthy, “Ultrasonic absorption in liquids in
relation to their physical properties,” Communs. congr. intern.
traitments ultrasons, Marseille, p. 71 (May, 1956).

(110) Parthasarthy, Pancholy, and Mathur, “Sound absorption 
in liquids—thermal methods,” Ann. Physik 18, 220 (1956). 

(111) V. F. Nozdrev and V. D. Sobolev, “Investigation of the
ultrasonic properties of ethyl acetate in the critical region,” 
Akust. Zhur. 2, 379 (1956). 

(112) O. Nomoto, “Theory of ultrasonic absorption in aqueous 
solutions I. Introduction and general theory,” J. Phys. Soc. Japan 
11, 827 (1956). 

(113) J. Lamb, “Ultrasonic relaxation phenomenon in organic 
liquids,” Communs. congr. intern. traitments ultrasons, Marseilles, 
p. 61 (May, 1956). 

(114) A. K. Dutta and K. Samal, “Real and apparent absorption
coefficient of ultrasonic waves in liquids,” Nature 179, 95
(1957).

(115) A. N. Hunter, “Measurements of the velocity and ab- 
sorption of high frequency ultrasonic waves in supercooled 
liquids,” Proc. Phys. Soc. (London) B69, 965 (1956). 

(116) E. L. Heasell and J. Lamb, “The absorption of ultrasonic 
waves in a number of pure liquids over the frequency range 100 
to 200 Mc/sec,” Proc. Phys. Soc. (London) B69, 869 (1956). 

(117) L. Van Craeynest, “Absorption measurements of ultra- 
sonic waves in water and electrolytic solutions,” Verhandel. 
Koninkl. Vlaam. Acad. Wetenschap. 49 (1955).

(118) G. D. Mikhailov, “On the question of interaction of 
“sce, waves in fluids,” Zhur. Eksptl. i Teoret. Fiz. 30, 1142
(1956).

(119) Sarembo, Krasilinikov, and Skhlovskaya-Kordi, “On the 
propagation of ultrasonic large amplitude waves in liquids,” 
Akust. Z. 3, 29 (1957). 

(120) R. T. Beyer, “Recent research in ultrasonics & physical 
acoustics in U.S.S.R.,” Nuovo cimento 4, 31 (1956).

(121) F. A. Levi, “On the measurement of ultrasonic intensities 
by means of a radiation pressure balance,” Nuovo cimento 4, 
1073 (1956). 

(122) S. Parthasarthy and P. P. Mahendroo, “Relation between 
efficiency of quartz transducers and ultrasonic absorption co- 
efficients of liquids. I, II.” Z. Physik 147, 573 (1957). 

(123) S. Parthasarthy and S. S. Mathur, “Ultrasonic absorption 
by steady thermal methods,” Ann. Physik 19, 242 (1956). 

(124) O. Nomoto, “Theory of the ultrasonic absorption in
aqueous solutions II. Aqueous solutions of some alcohols,” J. Phys.
Soc. Japan 12, 300 (1957).

(125) W. Maier and H. D. Rudolph, “Ultrasonic absorption
measurements in dilute solutions for determining the rates of
association and dissociation of benzoic acid in CCl4,” Z. physik.
Chem. 10, 83 (1957).

(126) T. A. Litovitz, “Theory of ultrasonic thermal relaxation 
in liquids,” J. Chem. Phys. 26, 469 (1957). 

(127) Litovitz, Carnevale, and Kendall, “Effects of pressure on 
ultrasonic relaxation in liquids I,” J. Chem. Phys. 26, 465 (1957). 

(128) J. H. Andreae, “Ultrasonic relaxation in methylene
chloride,” Proc. Phys. Soc. (London) B70, 71 (1957). 

(129) R. Piccirelli and T. A. Litovitz, “Ultrasonic shear and 
compressional relaxation in liquid glycerol,” J. Acoust. Soc. Am. 
29, 1009 (1957).

(130) H. J. McSkimin, “Ultrasonic pulse technique for measur- 
ing acoustic losses and velocities of propagation in liquids as a 
function of temperature and hydrostatic pressure,” J. Acoust.
Soc. Am. 29, 1185 (1957). 




ULTRASONIC ABSORPTION IN ELECTROLYTES 


(132) T. A. Litovitz, “Relation of nuclear spin-lattice relaxation
to ultrasonic viscous relaxation in liquids,” J. Acoust. Soc.
Am. 29, 648 (1957).
(133) Krassilnikov, Shklovskaya-Kordy, and Zarembo, “On the
propagation of ultrasonic waves of finite amplitude in liquids,”
J. Acoust. Soc. Am. 29, 642 (1957).

(134) K. F. Herzfeld, “On the origin of ultrasonic absorption
in [illegible],” J. Acoust. Soc. Am. 29, 1180 (1957).

(135) Schuele, Gutowski, and Carome, “Interferometric determination
of ultrasonic absorption in castor oil,” J. Acoust. Soc.
Am. 29, 1081 (1957).

(136) R. T. Beyer, “Double relaxation effects,” J. Acoust.
Soc. Am. 29, 247 (1957).

(137) D. Tabuchi, “Dispersion and absorption of sound in
liquids in general chemical equilibrium and its application to 
chemical kinetics,” J. Chem. Phys. 26, 993 (1957). 

(138) L. K. Sarembo, “On the temperature dependence of high
intensity sound absorption in viscous liquids,” Akust. Zhur. 3,
163 (1957).

(139) V. F. Nozdrev and V. D. Sobolev, “Measurement of
ultrasonic wave absorption in ethyl acetate at the saturation line
by a pulse method with a double fixed distance,” Doklady Akad.
Nauk. S.S.S.R. 111, 808 (1956).

(140) R. E. Nettleton, “Theory of the equation of state and 
ultrasonic absorption in associated liquids at ordinary tempera- 
tures,” Phys. Rev. 106, 631 (1957).

(141) I. G. Mikhailov, “Ultrasonic absorption in viscous 
liquids,” Akust. Zhur. 3, 177 (1957). 

(142) O. Nomoto, “Phenomenological theory of the molecular 
absorption and dispersion of sound in fluids and the relation be- 
tween the relaxation time of the internal energy and the relaxation 
time of the internal specific heat,” J. Phys. Soc. Japan 12, 85 
(1957). 

(143) A. Carreli and F. S. Gaeta, “A new method for the deter- 
mination of the acoustic absorption coefficient in liquids,” Nuovo 
cimento 5, 773 (1957). 

(144) R. I. Tait, “An apparatus for the study of ultrasonic 
propagation in liquids under pressure,” Acustica 7, 193 (1957).

(145) E. G. Richardson and R. I. Tait, “Ratio of specif 
and high frequency viscosities in organic liquids under 
derived from ultrasonic propagation,” Phil. Mag. 2, +41 (19 

%146) “Molecular aspects of sound propagation in gases 
liquids,” Proc. Intern. Comm. Acoust. Congr. 2nd. Cambridge 
Massachusetts, 1956 (published 1957). 

(147) R. O. Davies and J. Lamb, “Ultrasonic anal 
molecular relaxational processes,’ Chem. Soc. London 
Revs. 11, 134 (1957). 

(148) W. Craya, “Two fluid model and sound abso 
helium II,” Helv. Phys. Acta 30, 447 (1957). 

(149) V. P. Glotov, “The theory of the relaxation a 
and dispersion of sound in strong electrolytes which are 
dissociated,” Soviet Phys. Acoustics 3, 236 (1957). 

(150) G. Laville, “Apparatus for measuring the ab 
ultrasound in liquids,” Compt. rend. 245, 1523 (195 

(151) M. S. de Groot and J. Lamb, “Ultrasonic re i 
the study of rotational isomers,” Proc. Roy. Soc. (London) AM, 
36 (1957). 

(152) O. Nomoto, ‘Ultrasonic absorption and dispe 

eliquids in relation to liquid structure Il. Liquid mixturs 
solutiotrey J. Acoust. Soc. Japan 13, 156 (1957). 

(153) R. S. Musa, “Two crystal interferon 
measuring ultya¥onic absorption coeficient tn lì 
Soc. Am. 30, 215 (1958). t 

(154) T. A.eLitovitz, “Origin of ultrasonic volume vir ib 
associated liquids,” J. Acoust. Soc. Am, 30, 210 aAa, 

(155) Dransfeld, Newell, and Wilks, “The absorption Å wand 
in liquid helium II,” Proc. Roy. Soc. (London) AAR, VO RNV 

(156) M„ Mokhtar and H. Youssef, “Ultwasoy ic adsorp, Xa 
liquids I, J. Acoust. Soc. Am. 30, 549 (1938), 




(157) D. Tabuchi, “Dispersion and absorption of sound in
ethyl formate and study of the rotational isomers,” J. Chem.
Phys. 28, 1014 (1958).


(158) R. M. Mazo, “Absorption and dispersion of sound in
chemically reacting fluid,” J. Chem. Phys. 28, 1223 (1958).

(159) N. Hirai and H. Eyring, “Bulk viscosity of liquids,” J. 
Appl. Phys. 29, 810 (1958).

(160) M. Cevolani and S. Petralia, “Ultrasonic absorption in 
aniline and in mixtures,” Nuovo cimento 7, 866 (1958). 

(161) K. Herzfeld, “Bulk viscosity and shear viscosity in fluids
according to the theory of irreversible processes,” J. Chem. Phys.
28, 595 (1958).

(162) S. Parthasarthy and M. Pancholy, “Ultrasonic velocity 
and absorption coefficient in a chemically reactive medium,” 
Z. angew. Phys. 10, 193 (1958). 

(163) H. Singh, “Sound absorption in relation to free volume
of liquids,” Nuovo cimento 9, 545 (1958). 

(164) B. R. Ramachandra Rao and H. S. Rama Rao, “A 
method of studying the variation with temperature of ultrasonic 
absorption in liquids,” Nature 182, 1794 (1958). 

(165) S. Parthasarthy and M. Pancholy, “Studies in ultrasonic
propagation in mixtures of ethyl alcohol and water,” Z.
angew. Phys. 10, 453 (1958).

(166) P. White and G. C. Benson, “Ultrasonic absorption in 
(pure) butyric acid,” Can. J. Chem. 36, 1135 (1958). 

(167) Murphy, Ganison, and Potter, “Sound absorption at 50
to 500 kc from transmission measurements in the sea,” J. Acoust. 
Soc. Am. 30, 871 (1958). 

(168) K. Barron and L. E. Lawley, “Absorption of sound in 
sediments of small particles in liquids,” Proc. Phys. Soc. (London) 
72, 933 (1958). 

(169) R. Bass and J. Lamb, “Ultrasonic relaxation of the vibrational
specific heat of carbon dioxide, sulphur hexafluoride,
nitrous oxide, cyclopropane and methyl chloride in the liquid
state,” Proc. Roy. Soc. (London) A247, 168 (1958).
(170) R. T. Beyer and V. Narsimhan, “On the absorption of
ultrasonic waves of finite amplitude in liquids,” Soviet Phys.-Acoust.
4, 196 (1958).
(171) V. P. Glotov, “Reverberation tank method for the study
of sound absorption in the sea,” Soviet Phys.-Acoust. 4, 243
(1958).


I. NaVianov and V. F. Nozdrev, “Investigation of the 
rature dependence of the ultrasonic absorption 
cal region of methyl acetate,” Soviet Phys.- 


ithailov, “On the absorption of ultrasonic waves 
F Phys.-Acoust. 4, 200 (1958). 

. “Qn the absorption of ultrasonic waves in 
-s-Acoust. 4, 205 (1958). 

ancholy, and Chapghar, “Ultrasonic ab- 
cous series of organic liquids,” Nuovo 


zation of ultrasonic waves in suspensions 


£ Phys. Soc. Japan 13, 1390 (1958). 


` “Compressional relaxation in liquids,” 


em) FS, 767 (1959). 
% 


1 ane ). Lam), “The effect of polar and non- 
wiss dvete em De Nemeric reaction in ethyl formate demon- 
asat by wimani measerements,” Trans. Faraday Soc. 55, 
TSR IRSA. 
“Sp A Litera, “Ultrasonic spectroscopy in liquids,” J. 
RR Vo ka, B_ VT GW 


REVIEWS OF MODERN PHYSICS VOLUME 31, NUMBER 4 OCTOBER, 1959


Anisotropic Light Scattering of Streaming
Suspensions and Solutions


WILFRIED HELLER 


I. INTRODUCTION 


IN recent years considerable use has been made of
light scattering for determining the shape of dissolved
macromolecules and of dispersed colloidal
particles. The favored method uses the variation of
scattering with the angle of observation. The theories
used are due to Gans¹ and Debye.² An alternative
method is that of Krishnan.³ Here, the relative intensities
of the “four Krishnan components” of light
scattered at 90° with respect to the primary beam are
the experimental data. The former method uses
incident natural light; the latter, incident plane
polarized light. Both have in common the investigation
of light scattering by systems with randomly
oriented molecules or particles. There is no need for a
review of either of these methods since several excellent
summaries are already available.⁴ A much neglected,
yet more powerful and older third method, is due to
Diesselhorst and Freundlich.⁵ In this, the light scattering
is observed subsequent to an orientation of the
scattering bodies, e.g., by streaming. Such an orientation
produces an anisotropy of light scattering: the scattered
intensity varies with the orientation of the electric
vector of the incident beam with respect to the morphologically
extraordinary direction of the scattering
bodies. This review is concerned with the three principal
phenomena of anisotropic light scattering observed or
to be expected: dityndallism, conservative dichroism,
and “bidissymmetry.”


II. METHODS AND PROBLEMS OF ORIENTATION


The necessary orientation of particles or molecules
may be produced by streaming, or by an electric or
magnetic field. Flow orientation is due exclusively to
anisometric (nonspherical) shape and is therefore of
primary interest here. It is “kinematic”: although the
particles rotate continuously due to the velocity
gradient q and to the rotatory Brownian movement,
they spend most of their time in a position defined by
the angle χ. This is the angle between the most probable
direction of the longest dimension of the particles and
the direction of the velocity gradient. If the particles or
molecules are deformable, the phenomenon is complicated
by dilation at an angle of 45° with respect to q
and the direction of flow, and compression at an
orthogonal angle. The direction of flow is generally
referred to as the extraordinary direction e of the system.
It will actually be an extraordinary direction only if
q → ∞, i.e., when the longest dimension of the particles
is parallel to the direction of flow (χ = 90°). If the particles
are uniaxial, the direction of flow then becomes,
according to the definition in crystal optics, the optic
axis of the system and, therefore, the optically extraordinary
direction.

Magnetic and electric orientation are simpler in 
nature since the direction of orientation always coincides 
with the direction of the field or with the direction 
normal to it. The degree of orientation is governed here 
by the torque acting upon the particles and the super- 
imposed randomizing effect of the rotatory Brownian 
movement. Complications arise from the fact that 
magnetic and electric orientation may be due to both 
anisometry and intrinsic (magnetic or electric) anisot- 
ropy. Since the orienting torque due to anisometry is 
generally very weak the orientation is, in general, 
mainly due to a permanent or induced magnetic or 
electric dipole. The existence of the two torques can 
lead to considerable complication if they are orthogonal. 
The orientation may then change direction with field 
strength, temperature, or frequency. Some of the result- 
ing optical complications are described in detail 
elsewhere.⁶ Of particular interest is the application of
alternating magnetic⁷ or electric⁸ fields. Whenever
magnetic orientation is applicable, it deserves preference
over electric orientation because of a variety of possible
secondary complications of the latter.⁸


III. DITYNDALLISM
1. Classical Experiments and Definitions
Dityndallism was discovered by Diesselhorst and
Freundlich⁵ on various streaming colloidal solutions.


¹ R. Gans in Handbuch der Experimentalphysik (Akademische
Verlagsgesellschaft, Leipzig, 1928), Vol. 19, p. 34; Ann. Physik


² P. Debye, lecture given at the Polytechnic Institute of
Brooklyn, Brooklyn, New York (November [date illegible]).
³ R. S. Krishnan, Proc. Indian Acad. Sci. [volume and page illegible] and
numerous later papers.

⁴ See, e.g., H. C. van de Hulst, Light Scattering by Small Particles


⁶ W. Heller, “Nouvelles recherches sur les propriétés magnéto-optiques
des solutions colloidales” in Actualités sci. et ind.
(Hermann et Cie., editors, Paris, 1939), Vol. 806.

⁷ Selected references may be found in W. Heller, Kolloid physik


FIG. 1. Dityndallism produced by flow orientation of colloidal
rods and plates. ●, scattered intensity is strong. ○, scattered
intensity is weak.


Figure 1 illustrates their procedure and results. The
colloidal solution flows through a rectangular cell in
the x direction. The principal velocity gradient is in the
z direction and a secondary gradient is in the y direction.
Consequently, rod-like particles will, at a sufficiently
high velocity gradient, direct their morphologically
extraordinary direction, i.e., the major axis, preferentially
toward x, while disk-like particles will direct
theirs, the shortest dimension, preferentially toward z.
The direction of the linearly polarized beam and of its
electric vector and the direction of observation (always
perpendicular to the incident beam) are varied to the
fullest possible extent.


The black and white disks indicate that the observed
scattered intensity is weak or strong, respectively.⁹
It follows that, for one particle, the intensity of light
scattered at 90° with respect to the incident beam is
strongest if the electric vector is perpendicular to the
plane of observation and parallel to the largest dimension
of the particle. The laterally observed Tyndall
beam of a dispersed system of oriented rods has therefore
maximal intensity if the electric vector of the
incident beam vibrates, for q → ∞, parallel to the
direction of flow and has minimal intensity for the
orthogonal direction of vibration. This effect, called
“dityndallism,”¹⁰ᵃ could be defined in analogy to birefringence
and dichroism by

$$(D)_\theta = (I_e - I_o)_\theta, \tag{1}$$

where I is the intensity of light scattered at an angle θ,
per unit solid angle, per unit volume of unit concentration,
and per unit intensity of the incident beam, and o
refers to the ordinary direction of the system. The usefulness
of this definition is limited, however, since
$(I_e - I_o)$ may be finite also in the unoriented system.
An alternate definition,

$$(D)_\theta = (I_e' - I_e'') - (I_o' - I_o'') = \Delta I_e - \Delta I_o, \tag{1a}$$

therefore may be introduced as preferable. The superscripts
′ and ″ refer to the oriented and unoriented
system, respectively. For practical work, definition (1a)
may be expressed in terms of more convenient intensity
ratios.¹⁰ᵇ

⁹ A quantitative investigation can be expected to show that
there are four different intensities for platelets and three for
rodlets.
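The two definitions can be compared numerically with a minimal sketch. It assumes that the garbled definitions read $(I_e - I_o)$ for (1) and $(I_e' - I_e'') - (I_o' - I_o'')$ for (1a), with ′ the oriented and ″ the unoriented system; the function name, argument names, and sample intensities are illustrative only and do not come from the text.

```python
def dityndallism(i_e_oriented, i_o_oriented,
                 i_e_unoriented=None, i_o_unoriented=None):
    """Dityndallism at one scattering angle (hypothetical helper).

    Only oriented intensities given: definition (1), I_e - I_o.
    Unoriented intensities also given: definition (1a),
    (I_e' - I_e'') - (I_o' - I_o'').
    """
    if i_e_unoriented is None and i_o_unoriented is None:
        return i_e_oriented - i_o_oriented
    if i_e_unoriented is None or i_o_unoriented is None:
        raise ValueError("give both unoriented intensities or neither")
    delta_e = i_e_oriented - i_e_unoriented  # change on orientation, e direction
    delta_o = i_o_oriented - i_o_unoriented  # change on orientation, o direction
    return delta_e - delta_o


# Illustrative numbers only: the unoriented system already shows I_e != I_o,
# so definition (1) overstates the effect that is due to orientation.
d1 = dityndallism(5.0, 2.0)
d1a = dityndallism(5.0, 2.0, 3.0, 2.5)
```

With these made-up intensities, definition (1a) subtracts the difference already present in the unoriented system, which is the limitation of definition (1) noted in the text.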

Flow dityndallism is an exclusive form effect only if
produced by deformation of isotropic fluid material
(e.g., emulsion droplets) or by orientation of anisometric
intrinsically isotropic bodies. Anisometric bodies
are rarely intrinsically isotropic. In general, therefore,
an intrinsic dityndallism of equal or opposite sign will
be superimposed upon the form dityndallism. A quantitative
differentiation between the two effects should be
possible by varying the refractive index of the medium.


FIG. 2. Geometrical definitions pertinent to flow dityndallism
and conservative flow dichroism.


2. Physical Optics of Dityndallism


The physical optics of dityndallism is somewhat
intricate. This is discussed with the aid of Fig. 2,
where the principal drawing is in perspective, while two
auxiliary drawings pertain to the xz plane viewed in the
direction of the light source. The symbols x, y, and z
indicate the direction of flow, of the primary beam, and
of the velocity gradient, respectively. The angle χ between
the major axis of the rod-like particle considered
and the velocity gradient is, for simplicity, assumed to
be 90°. This corresponds to q → ∞. The angle between
the electric vector of the incident beam and the yz
plane is ψ, that between the vector of the scattered
beam, or of the scattered component considered, and
the xz plane is ψ′ (symbol not shown in Fig. 2). The
angle between the variable plane of observation and
the direction of flow is ω, that between the direction of
observation and the direction of the primary beam is θ.

Consider first the situation pictured in Fig. 2, in
which θ = ω = χ = 90°, and let the angle ψ be 90° and 0°
in two consecutive measurements. The amplitude of the
component of the incident beam vibrating in the
extraordinary direction, $a_e^i$, gives rise to an amplitude
of the scattered component $a_e^s$, and, in the more general
case, of a component $a_o^{s\prime}$. The latter is not considered
in the drawing. Similarly, $a_o^s$ (and a neglected component
$a_e^{s\prime}$) will result from $a_o^i$.¹¹ Since $a_e^s \neq a_o^s$, the
consecutively determined amplitudes of the components
of the transmitted beam, $a_e^t \neq a_o^t$. Next
assume that θ = ω = χ = 90°, but 0° ≠ ψ ≠ 90°. The scattering
process is then equivalent to that expected for
two coherent components with the amplitudes $a_e^i$ and
$a_o^i$. Consequently, the scattered linearly polarized beam,
of amplitude $a^s$, resulting from their interference, vibrates
at an angle ψ′ ≠ ψ. Measurement of the dityndallism
is therefore reduced to a single measurement,
that of the angle ψ′ for which the scattered intensity
transmitted through an analyzer is maximal.¹² The
difference between ψ and ψ′ reaches a maximum when
ψ = 45°.

When θ = 90°, ω ≠ 90°, χ = 90° (and 0° ≠ ψ ≠ 90°), the
scattered beam is elliptically polarized, the ellipticity
being due to phase differences between the orthogonal
coherent components of wavelets of scattered light
originating in different volume elements of the particle.
The situation is the same when θ = ω = 90°, ψ = 0°,
but χ ≠ 90°.

This discussion allows one to anticipate a few physical
phenomena which have yet to be verified experimentally.
It should be possible to determine χ by
rotating the direction of observation, in the xz plane,
to the particular ω ≠ 90° value at which the ellipticity
vanishes, i.e., at which the scattered beam represents
linearly polarized light. The rate of change of the state


¹¹ Owing to the interference of each of the two disregarded
components with the respective coherent orthogonal components,
[illegible] in reality [illegible] in each instance which vibrates
[illegible] ψ′ ≠ 90°, respectively.
¹² It is surprising that in depolarization measurements on resting solutions
no attention has been paid either to the
[illegible] linearly polarized scattered beams or to
[illegible] instead of the four generally used
of polarization during this rotation should vary with
the shape of the particles. This should provide an
auxiliary method for the determination of shapes.
Assume that the proper ω value yielding linearly
polarized light is found: a transition from θ = 90° to
θ ≠ 90° will then also produce ellipticity and the rate of
its change with θ also should vary with the dimensions
of the particles or molecules.

The case may arise that ellipticity is observed even
for θ = ω = 90°, χ = 90°, and 0° ≠ ψ ≠ 90°. This would be
indicative of intrinsic anisotropy and the degree of
ellipticity would be indicative of the magnitude of
intrinsic anisotropy. Ellipticity would be due here to
the difference in retardation of the orthogonal components
of scattered light within the scattering particle
by virtue of differences in its refractive indices parallel
and perpendicular to the rod axis. This effect would
represent the true complement, in the scattering process,
to intrinsic birefringence.


3. Theory of Dityndallism 


The theory of dityndallism is concerned with the
scattering process of nonspherical particles and with the
distribution function of their principal axes. The latter
was treated first by Langevin¹³ for the relatively simple
case of magnetic (or electric) orientation. This theory
yields the relationship

$$(\mu_e - \mu_u)/(\mu_u - \mu_o) = 2.0,$$

where the subscript u refers to the unoriented system
and μ is the refractive index. The relationship should, of
course, hold also for turbidity and true absorption. It
was verified for the former effect.¹⁴ The distribution
function for electric orientation of flexible macromolecules
was derived, more recently, by Isihara.¹⁵
The more complicated case of flow orientation of rigid
spheroids has been treated by Boeder¹⁶ and, more
rigorously, by Peterlin and Stuart.¹⁷ Finally, the
distribution function for the end-to-end distance of
flow deformed molecules has been developed recently.¹⁸
The theory of anisotropic scattering for the case in
which intrinsically isotropic and rigid anisometric
particles are small compared to the wavelength of the
incident beam is implicitly contained in an early
treatment by Rayleigh.¹⁹ A recent extension of this
theory by A. F. Stevenson includes the dimensions
commonly encountered with proteins and synthetic
macromolecules. This theory will be published in the
near future.
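As a worked consequence of the Langevin relationship quoted above, and assuming the garbled equation reads $(\mu_e - \mu_u)/(\mu_u - \mu_o) = 2$, the refractive index of the unoriented system is a weighted mean of the two principal indices of the oriented system. The same algebra would apply to turbidity if the relationship holds there as the text states.

```latex
\begin{align*}
\frac{\mu_e - \mu_u}{\mu_u - \mu_o} &= 2 \\
\mu_e - \mu_u &= 2\mu_u - 2\mu_o \\
\mu_u &= \tfrac{1}{3}\left(\mu_e + 2\mu_o\right)
\end{align*}
```

Under this reading, the extraordinary value departs from the unoriented value by twice as much as the ordinary value does, and in the opposite direction.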


¹³ P. Langevin, Radium 7, 249 (1910).

¹⁴ W. Heller and G. Quimfe, Phys. Rev. 61, 382 (1942).

¹⁵ Isihara, Koyama, Yamada, and Nishioka, J. Polymer Sci. 17,
341 (1955); see also C. Wippler, J. Polymer Sci. 23, 199 (1957).

¹⁶ P. Boeder, Z. Physik 75, 258 (1932).

¹⁷ A. Peterlin and H. A. Stuart, Hand- und Jahrbuch d. Chem.
Physik, A. Eucken and K. L. Wolff, editors (Akademische
Verlagsgesellschaft, Leipzig, 1943), Vol. VIII, Sec. IB.

¹⁸ Peterlin, Heller, and Nakagaki, J. Chem. Phys. 28, 470 (1958).

¹⁹ Lord Rayleigh, Phil. Mag. 41, 447 (1871); 44, 28 (1897).
FIG. 3. Conservative magnetic dichroism of an α-FeOOH sol.


The treatment of anisotropic light scattering to be 
expected from random coils deformed by flow is difficult, 
and suitable approximations are therefore necessary. 
Exploratory investigations to that effect have been pub- 
lished elsewhere.??:! 


IV. THE CONSERVATIVE DICHROISM


1. Definitions and Methods of Measurement 


Figure 2 shows that the turbidity 7 of a dispersed 
systen#@qf oriented anisometric particles varies with 
the directio of the electric vector of the incident 
linearly polarized beam. Consequently, the difference 
(Te—To) ~”! The existence of this effect was explicitly 
proved only rather recently.® It was subsequently desig- 
nated as “conservative” dichroism.” This effect, like 
dityndallism, is primarily an effect of anisometry, but 


20 A. F. Stevenson and H. L. Bhatnagar, J. Chem. Phys. 29, 1336 
(1958). 

21 “Turbidity ratios” may often be easier to obtain or may con- 
vey supplemental information. Thus, the ratios τₑ/τᵤ or τₒ/τᵤ, 
where u refers to the unoriented system, give the turbidity of the 
oriented system, for the particular direction of the electric vector, 
relative to that of the isotropic unoriented system. 


22 See reference 11 in Heller, Quimfe, and Yeou-Ta, Phys. Rev. 

again the intrinsic anisotropy may make a contribution 
of equal or opposite sign. If the form effect predomi- 
nates, then the sign of the total effect immediately indi- 
cates the orientation of the longest particle dimension 
with respect to the extraordinary direction. In order to 
evaluate the relative contribution of the intrinsic effect, 
one can—as with birefringence—eliminate the form 
effect by suitably varying the refractive index of the 
medium.” 
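
The definitions above can be illustrated by a minimal numerical sketch. It is not a procedure taken from the work cited. It assumes that each turbidity is obtained from the attenuation of the transmitted beam, I = I0 exp(−τl), over a path length l, in a spectral range of negligible absorption. The function names, the path length, and the intensity readings are hypothetical.

```python
import math


def turbidity(i_incident, i_transmitted, path_cm):
    """Turbidity tau (1/cm), assuming I = I0 * exp(-tau * l)."""
    if i_incident <= 0 or i_transmitted <= 0 or path_cm <= 0:
        raise ValueError("intensities and path length must be positive")
    return math.log(i_incident / i_transmitted) / path_cm


def conservative_dichroism(tau_e, tau_o):
    """Turbidity difference (tau_e - tau_o) of the oriented system."""
    return tau_e - tau_o


def turbidity_ratios(tau_e, tau_o, tau_u):
    """Ratios tau_e/tau_u and tau_o/tau_u relative to the unoriented system."""
    if tau_u <= 0:
        raise ValueError("tau_u must be positive")
    return tau_e / tau_u, tau_o / tau_u


if __name__ == "__main__":
    # Hypothetical readings: incident intensity 100, path length 1 cm.
    path_cm = 1.0
    tau_e = turbidity(100.0, 62.0, path_cm)  # electric vector along e
    tau_o = turbidity(100.0, 70.0, path_cm)  # electric vector along o
    tau_u = turbidity(100.0, 67.0, path_cm)  # unoriented system
    print("tau_e - tau_o =", conservative_dichroism(tau_e, tau_o))
    print("ratios        =", turbidity_ratios(tau_e, tau_o, tau_u))
```

The sign of the computed difference is what the text uses to infer the orientation of the longest particle dimension when the form effect predominates.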

Investigation of the conservative dichroism requires 
that one work in a spectral range of negligible absorp- 
tion. Otherwise, the effect observed is complex in 
nature because of the simultaneous presence of true 
dichroism, which is due to a difference in the extinction 
coefficients εₑ and εₒ. The two effects are physically 
not separable. 

Fig. 4. Anisotropic interference of light scattered by oriented 
rods and plates not small compared to the wavelength of the 
incident beam. 

The theory of conservative dichroism, like that of 
dityndallism, is implicitly contained in the paper by 
Rayleigh¹⁹ when the particles are small compared to the 
wavelength. The theory for larger particles is, at 
present, under preparation. The latter case is of far 
greater importance since the smallness of the turbidity 
in systems with very small particles makes it difficult 
to obtain a reliable turbidity difference (τₑ − τₒ). 
Although the attractive feature of conservative di- 
chroism, like that of dityndallism, is primarily the 
possibility of obtaining numerical data on the dimen- 
sions of nonspherical bodies, other applications also 
are interesting: one can easily obtain semiquantitative 
information on changes in size, shape, and structure 
of aggregates formed during coagulation of anisometric 
particles. Furthermore, the mechanism and kinetics 
of sol-gel transformation and the change in rotatory 
mobility of particles during this process can be studied 
successfully.²⁶ 

Several methods are available for experimental 
determination of (τₑ − τₒ). First, by conventional spec- 
trophotometric procedure τₑ and τₒ may be measured 
in succession. The plane of polarization is rotated, 
between measurements, by 90° and the oriented sys- 
tem and the reference medium (solvent) are brought 
into the light path consecutively. In the case of mag- 
netic orientation, one may use a split beam of orthog- 
onally polarized components, each of which traverses 
one compartment of a twin cell filled with the oriented 
system and solvent, respectively. The second measure- 
ment is made after rotating the twin cell by 180° about 
an axis perpendicular to the field and incident twin 
beam.’ Figure 3 shows the magnetic conservative 
dichroism thus obtained in an old colloidal solution of 
FeOOH.²⁷ (The results, to which the true dichroism 
makes a negligible contribution at 5400 Å, show 
immediately that the plate-like particles of submicro- 
scopic size orient themselves parallel to the lines of 
force.) These two methods give τᵤ, in addition to τₑ and 
τₒ. This is not the case for the third method, which is 
the fastest, and also the most sensitive if the effects 
are small. Here, y=45° (Fig. 2). The conservative 


Phys. Chem. 41, 1041 (1937). 

G. Quimfe, Compt. rend. 205, 1152, 1394 

cted and used for this purpose has 
where¹⁴ and will be discussed in detail 

dichroism then produces a rotation of the plane of 
polarization²⁸ from which (τₑ − τₒ) is easily derived. 
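
How the turbidity difference might follow from the measured rotation can be sketched as below, under assumptions that are not stated in the text: the incident beam is linearly polarized at 45° to the e and o directions, the two amplitude components are attenuated as exp(−τₑl/2) and exp(−τₒl/2), and any phase difference between them (birefringence) is neglected. The function names and the sign convention (positive rotation toward the less attenuated direction) are hypothetical.

```python
import math


def rotation_from_dichroism(delta_tau, path_cm):
    """Rotation (degrees) of the plane of polarization for a given
    turbidity difference delta_tau = tau_e - tau_o (1/cm).

    Assumes incidence at 45 deg to the e and o directions and no
    phase difference between the two components.
    """
    # Amplitude ratio E_o / E_e after the path l is exp(delta_tau * l / 2).
    angle_from_e = math.degrees(math.atan(math.exp(delta_tau * path_cm / 2.0)))
    return angle_from_e - 45.0


def dichroism_from_rotation(rotation_deg, path_cm):
    """Inverse relation: (tau_e - tau_o) from the measured rotation."""
    if not -45.0 < rotation_deg < 45.0:
        raise ValueError("rotation must lie strictly between -45 and 45 deg")
    return 2.0 / path_cm * math.log(math.tan(math.radians(45.0 + rotation_deg)))


if __name__ == "__main__":
    # Hypothetical example: a rotation of 0.3 deg in a 1 cm cell.
    print("tau_e - tau_o =", dichroism_from_rotation(0.3, 1.0), "per cm")
```

For small effects the relation is nearly linear, the rotation in radians being about (τₑ − τₒ)l/4, which is consistent with the statement that this method is the most sensitive when the effects are small.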


V. BIDISSYMMETRY IN UNPOLARIZED LIGHT 


The rods and disks, shown schematically in Fig. 4 
in the three possible orientations with respect to an 
incident unpolarized beam, lead to radiation diagrams 
of which two vectors are shown schematically in each 
case: the radius vector at 90° and that at 0° with 
respect to the direction of the primary beam. The 
former vector applies to observation in the plane of the 
paper. The reduction, by interference, in the intensity 
of laterally scattered light is more pronounced if the 
longest direction of the particle is perpendicular to the 
primary beam and is directed towards the observer. 
Again, a differentiation between rods and disks should 
be possible, although the quantitative differences in 
the optical effects can be expected to be smaller than 
for incident polarized light.” Intrinsic anisotropy also 
makes a contribution here, since interference varies 
with the refractive index. 

Fig. 5. Conventional dissymmetry (left) and bidissymmetry 
on orientation or deformation (right). 

Of particular interest is the fact that here light 
scattering exhibits a twofold dissymmetry. This is 
shown schematically in Fig. 5. The cross sections of the 
schematic radiation diagrams of spheres, random coils, 
or randomly oriented nonspherical bodies (left) and of 
oriented nonspherical bodies or deformed random coils 


28 The superficial similarity with rotation due to rotatory power 
has led a few authors to confuse dichroism with optical rotation. 
Rotation of the polarizer allows a very easy differentiation between 
the two effects. 

29 W. Heller and H. Zocher, Z. physik. Chem. (Leipzig) A164, 
55 (1933). 

If interference in the forward direction cannot be neglected 
(nearly microscopic particle size or large refractive index difference 
between particles and medium), all three configurations of both 
rods and disks will lead to different scattering intensities. 

thesis of G. Quimfe, Faculté des 


(right), not negligibly small compared to the wave- 
length, are given by the upper drawings for the direc- 
tion of the primary beam indicated by the top arrows. 
The cross sections of the diagrams in the lower pair 
of drawings apply to incidence of the primary beam 
perpendicular to the plane of the paper. The radiation 
diagram on the left exhibits merely the well-known 
“longitudinal” dissymmetry, while that on the right 
has, in addition, a “transverse” dissymmetry. Light 
scattering here is therefore “bidissymmetrical.” The 
dityndallism in polarized light also exhibits bidissym- 
metry, but its features are more complicated. 

lost when unpolarized light is used; the resulting 
greater simplicity of the phenomena allows simplifi- 
cation of both the optical theory and the apparatus. 
One may, therefore, consider approximate treatments 
which would fail for polarized light. Thus, Debye’s 
“interference” treatment? was used in order to establish 
the theory of bidissymmetry expected for flow-de- 
formed random coils.¹⁸ This theory allows one to derive, 
from the bidissymmetry, important information on 
molecular parameters which cannot be obtained easily 
by any other means. 
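
For orientation only, the conventional (longitudinal) dissymmetry of an undeformed random coil can be sketched with Debye's particle scattering factor, P = (2/x²)(e⁻ˣ − 1 + x), where x = q²Rg² and q = (4πn/λ₀) sin(θ/2). The theory for flow-deformed coils¹⁸ is not reproduced here. The choice of the angle pair 45°/135°, the refractive index of the medium, and all numerical values are assumptions made for illustration.

```python
import math


def debye_coil_factor(theta_deg, rg_nm, wavelength_nm, n_medium=1.333):
    """Debye scattering factor P(theta) of an undeformed Gaussian coil.

    rg_nm is the radius of gyration and wavelength_nm the vacuum
    wavelength; n_medium is an assumed refractive index of the medium.
    """
    q = (4.0 * math.pi * n_medium / wavelength_nm
         * math.sin(math.radians(theta_deg) / 2.0))
    x = (q * rg_nm) ** 2
    if x < 1e-4:
        # Series expansion avoids cancellation error for very small x.
        return 1.0 - x / 3.0 + x * x / 12.0
    return 2.0 * (math.exp(-x) - 1.0 + x) / (x * x)


def dissymmetry(rg_nm, wavelength_nm, n_medium=1.333):
    """Conventional dissymmetry ratio P(45 deg) / P(135 deg)."""
    forward = debye_coil_factor(45.0, rg_nm, wavelength_nm, n_medium)
    backward = debye_coil_factor(135.0, rg_nm, wavelength_nm, n_medium)
    return forward / backward


if __name__ == "__main__":
    # Hypothetical coil with a radius of gyration of 50 nm, green light.
    print("dissymmetry =", dissymmetry(50.0, 546.0))
```

A ratio close to unity corresponds to coils small compared to the wavelength, for which neither kind of dissymmetry is observable.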


While significant details of the “weal process are 